Automation Prerequisites · Lesson 1

HTML & CSS for Testers

Every locator you will ever write in automation points at a piece of a web page. To write one that works — and keeps working — you need to read the page underneath. This lesson teaches you just enough HTML and CSS to do exactly that.

Prerequisites Foundations for Non-Coders · Lesson 1 of 3 · ~25 min read · ~60 min with exercises

1 The Hook

Aroha had been a manual tester at a Wellington govtech shop for four years. She was good — sharp eye, clear bug reports, trusted on the hardest flows. When the team started automating their regression pack, she put her hand up to learn. Her first task was simple: write a test that clicks the “Submit application” button on a rates-rebate form.

She copied a locator a teammate had used elsewhere and the test passed. Two weeks later the same test started failing — on a build where nothing about the button had visibly changed. The button was right there on screen. She could see it, click it, use it by hand. But the test could not find it. She had no idea why, because she had never actually read the HTML behind the button. The locator pointed at something that had quietly moved.

It turned out the locator was pinned to the button’s position in the page — “the third button inside the fourth box” — and a developer had added a new box above it. The button had not changed. Its address had. Aroha had been writing locators blind, copying shapes she did not understand, and the first time the page shifted underneath her, she was stuck.

Here is the lesson hidden in that story: you cannot write a reliable locator for a page you cannot read. The automation tool is not magic — it finds elements by their address in the page’s structure. Learn to read that structure and locators stop being something you copy and start being something you understand. That is what this lesson teaches.

2 The Rule

A locator is an address for one piece of a web page. Automation finds elements by reading the page’s structure — its HTML — so if you can read that structure, you can write locators that point at the right thing and keep working when the page changes around them. You do not need to build web pages. You need to read them.

3 The Analogy

A street address versus “the third house on the left.”

Imagine sending a courier to a house in Ponsonby. You could give them the actual address — “14 Vinegar Lane” — and they will find it every time, even if a new house is built next door. Or you could say “the third house on the left after the dairy.” That works fine today, but the moment someone builds a new house in the gap, “the third on the left” is now the wrong house, and the courier delivers to a stranger.

A web page is the street. Each element — a button, a field, a heading — is a house. An id or a stable attribute is the real street number: it points at the same element no matter what gets built around it. A position-based locator like “the third button in the fourth box” is “third house on the left” — it works until the page changes, then it silently points at the wrong thing. Reading HTML is learning to find the street numbers.

4 The DOM & HTML Elements

A web page is built from HTML — a set of nested tags that describe what is on the page. When the browser loads that HTML it builds a live tree of those tags in memory. That tree is called the DOM (Document Object Model). When your automation “finds an element”, it is searching that tree. So the DOM is the thing your locators actually search.

An element is one node in that tree. In HTML it is usually written as an opening tag, some content, and a closing tag:

<button>Submit application</button>

Here <button> is the opening tag, Submit application is the text content, and </button> closes it. Elements nest inside each other, which is what builds the tree. A small login form might look like this:

<form>
  <label>Email</label>
  <input type="email">
  <button>Log in</button>
</form>

Read that as a tree: the form is the parent; the label, input, and button are its three children, sitting side by side inside it. Notice that input has no closing tag: a few elements, input among them, never hold content, so they are written as a single tag. The common tags you will meet again and again are div (a generic box used for layout), span (a small inline piece of text), a (a link), input (a field), button, and headings h1 through h6. You do not need to know all of HTML. You need to recognise a tag, its content, and what it sits inside.
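To see that tree shape for yourself, here is a minimal JavaScript sketch. It assumes only that each node has a tagName and a children list, which real DOM elements do. The function name describeTree is made up for this lesson and is not part of any browser or tool.

```javascript
// Sketch: turn an element and its descendants into indented lines of text.
// `node` can be any object with `tagName` and `children` (a DOM element has both).
function describeTree(node, depth = 0) {
  // Two spaces of indent per level of nesting.
  const lines = ['  '.repeat(depth) + node.tagName.toLowerCase()];
  // `children` holds child elements only, so text content is skipped.
  for (const child of Array.from(node.children)) {
    lines.push(...describeTree(child, depth + 1));
  }
  return lines;
}
```

In the DevTools Console, on a page containing the login form above, running console.log(describeTree(document.querySelector('form')).join('\n')) should print form on the first line, with label, input, and button indented beneath it.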

5 Attributes, IDs & Classes

An element can carry extra information inside its opening tag. These are attributes — written as name="value" pairs. For testers, attributes are gold, because they are how you tell one element apart from another.

<button id="submit-btn" class="btn btn-primary" data-testid="submit-application">Submit</button>

That single button carries three attributes you care about:

  • id: a unique name for one element. An id is supposed to appear only once on a page, which makes it one of the most reliable addresses you can get. id="submit-btn" should point at this one button and nothing else.
  • class: a shared label, often several at once. Here the button has two classes, btn and btn-primary, separated by a space. Classes are mostly used for styling, so many elements can share the same class. Useful, but not unique on their own.
  • data-testid: an attribute added on purpose for testing. When developers add a data-testid (or data-test), they are handing you a stable address that exists only so tests can find the element. If a page has these, prefer them: they are the street numbers put up specifically for you.

The whole game of writing a good locator is finding an attribute — ideally an id or a data-testid — that uniquely and stably identifies the element you want.

6 CSS Selectors — the Basis of Locators

A CSS selector is a short pattern that picks out elements from the page. CSS was invented to style pages — “make every primary button teal” — but the exact same patterns are what most automation tools use to locate elements. Learn CSS selectors and you have learned the language of locators. Here are the ones that cover the vast majority of real cases:

button → every <button> on the page (by tag name)
#submit-btn → the element whose id is "submit-btn" (# means id)
.btn-primary → every element with the class "btn-primary" (. means class)
[data-testid="submit-application"] → the element with that exact attribute
form button → every <button> that sits anywhere inside a <form>
input[type="email"] → every <input> whose type attribute equals "email"

Read each one out loud and it makes sense: # means “the id”, . means “the class”, square brackets mean “the attribute”, and putting two selectors with a space between them means “this one, somewhere inside that one.” You can combine them — form input[type="email"] means “the email input inside the form.”
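One way to make those reading rules concrete is to build selectors out of small pieces. The helper names below (byId, byClass, byAttr, inside) are hypothetical, invented for this sketch rather than taken from any automation tool, and they do no escaping, so they only suit simple values.

```javascript
// Hypothetical helpers: each one mirrors a rule from the list above.
const byId = (id) => `#${id}`;                            // # means "the id"
const byClass = (name) => `.${name}`;                     // . means "the class"
const byAttr = (name, value) => `[${name}="${value}"]`;   // brackets mean "the attribute"
// A space means "this one, somewhere inside that one".
const inside = (ancestor, descendant) => `${ancestor} ${descendant}`;

// Combine them: "the email input inside the form".
const emailInForm = inside('form', 'input' + byAttr('type', 'email'));
```

Here emailInForm holds the string form input[type="email"], the same combined selector described above.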

Pro tip: You can test any selector live without writing a single line of automation. Open a page, press F12 to open DevTools, click the Console tab, and type document.querySelectorAll('#submit-btn'). The browser shows you exactly which elements that selector matches — one, none, or many. If it matches more than one when you wanted one, your locator is ambiguous.
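The one, none, or many check can be wrapped in a small function. This is a sketch only: checkSelector is a made-up name, and root is assumed to be anything with a querySelectorAll method, such as document in the browser.

```javascript
// Sketch: count what a selector matches and label the result.
function checkSelector(root, selector) {
  const count = root.querySelectorAll(selector).length;
  let verdict;
  if (count === 0) {
    verdict = 'none';        // the selector finds nothing
  } else if (count === 1) {
    verdict = 'unique';      // exactly one match: what a locator wants
  } else {
    verdict = 'ambiguous';   // several matches: the locator may grab the wrong one
  }
  return { selector, count, verdict };
}
```

In the Console you could call checkSelector(document, '#submit-btn') and read the verdict instead of counting matches by eye.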

7 Reading Page Structure in DevTools

You do not read HTML out of a file — you read it live, in the browser, using DevTools. This is the single most useful skill in this lesson, and every browser has it built in. Press F12 (or right-click an element and choose Inspect) and a panel opens showing the page’s live HTML.

The workflow that automation testers use every day:

  • Inspect the element you care about. Right-click the “Submit application” button on the page and choose Inspect. DevTools jumps straight to that element’s tag in the HTML and highlights it.
  • Read its attributes. Look at the highlighted tag. Does it have an id? A data-testid? What classes does it carry? This tells you what addresses are available.
  • Look at where it sits. Note its parent — the tag it is nested inside. That tells you whether a path like form button would reach it.
  • Test a selector in the Console. Switch to the Console tab and run document.querySelectorAll('your-selector') to confirm it matches the one element you meant.

That loop — inspect, read the attributes, check the parent, test the selector — is how you go from “I can see the button” to “I have a locator I trust.” It is also exactly how you debug a failing locator: inspect the element, and you will usually see that an attribute changed or the element moved.
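The reading steps of that loop can be sketched as one function that reports what addresses an element offers. The name summariseElement is hypothetical. It relies only on standard element properties (tagName, id, classList, parentElement) and the getAttribute method.

```javascript
// Sketch: summarise the "addresses" available on one element.
function summariseElement(el) {
  return {
    tag: el.tagName.toLowerCase(),
    id: el.id || null,                         // null when there is no id
    testId: el.getAttribute('data-testid'),    // null when the attribute is absent
    classes: Array.from(el.classList),
    // The parent tells you whether a path like "form button" would reach it.
    parentTag: el.parentElement ? el.parentElement.tagName.toLowerCase() : null,
  };
}
```

In most browsers' DevTools, $0 in the Console refers to the element currently selected in the inspector, so after choosing Inspect on a button you could run summariseElement($0).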

8 Writing Locators That Don’t Break

Two locators can both find the right element today and have completely different lifespans. A robust locator keeps working as the page evolves; a brittle one breaks the next time a developer touches the page. The difference is what the locator depends on. Prefer, in roughly this order:

1. A test-specific attribute: [data-testid="submit-application"]. Added on purpose for tests, so it only changes if someone means to change it. Most robust.
2. A stable id: #submit-btn. Unique and meaningful, unlikely to change. Very good.
3. A meaningful attribute or short path: form button[type="submit"]. Readable and tied to what the element is. Good.
4. A position or deep path: “the 3rd button in the 4th div”. Breaks the moment the layout shifts. Avoid.
5. An auto-generated class: .css-1a2b3c. Random-looking classes are often regenerated on every build. Avoid.

The test that broke for Aroha was a number 4 — it depended on position, so a new box above it changed the address. The fix was to ask a developer to add a data-testid, then point the locator at that. The button looked identical to a user the whole time; what changed was whether the locator depended on something stable.
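The preference order above can be sketched as a small function. This is an illustration under assumptions, not a rule from any tool: suggestLocator is a made-up name, and using the name attribute as the "meaningful attribute" fallback is one reasonable choice among several.

```javascript
// Sketch: pick the most stable address an element offers, in preference order.
function suggestLocator(el) {
  const testId = el.getAttribute('data-testid');
  if (testId) return `[data-testid="${testId}"]`;   // 1. test-specific attribute
  if (el.id) return `#${el.id}`;                    // 2. stable id
  const name = el.getAttribute('name');
  if (name) {
    // 3. a meaningful attribute (assumption: `name` is the fallback tried here)
    return `${el.tagName.toLowerCase()}[name="${name}"]`;
  }
  return null;  // nothing stable found: worth raising with a developer
}
```

A suggestion from a function like this still needs the Console check: confirm with querySelectorAll that it matches exactly one element before trusting it.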

Pro tip: If the page has no good id or data-testid for an element you need to test often, that is a finding worth raising. Asking a developer to add a data-testid is a normal, welcome request — it makes the page more testable and costs them almost nothing. Good automation starts before the test, in the HTML.

9 Common Mistakes

🚫 Copying a locator you do not understand

Why it happens: A teammate’s locator works, so you paste it and move on.
The fix: Read what it actually points at first. Inspect the element, find the id or attribute, and write a locator you can explain. A locator you understand is one you can fix when it breaks — a copied one leaves you stuck, like Aroha.

🚫 Pinning locators to position in the page

Why it happens: “The third button” is easy to point at and works right now.
The fix: Position-based locators break the moment a developer adds or moves anything above your element. Prefer a stable id or data-testid — the street number, not “third house on the left.”
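A tiny model shows why. The sketch below treats a page as a plain ordered list of buttons, which is a simplification, and the button names are invented for the example.

```javascript
// A simplified "page": just the buttons, in the order they appear.
const pageBefore = [
  { testId: 'save-draft', text: 'Save draft' },
  { testId: 'preview', text: 'Preview' },
  { testId: 'submit-application', text: 'Submit application' },
];

// A developer adds one new button above the others.
const pageAfter = [{ testId: 'help', text: 'Help' }, ...pageBefore];

// "The third button": depends on position.
const byPosition = (buttons) => buttons[2];
// "The button with this test id": depends on the element itself.
const byTestId = (buttons) => buttons.find((b) => b.testId === 'submit-application');

console.log(byPosition(pageBefore).text); // Submit application
console.log(byPosition(pageAfter).text);  // Preview (silently the wrong button)
console.log(byTestId(pageAfter).text);    // Submit application
```

Nothing about the Submit application button changed between the two pages. Only the position-based lookup started returning a different element.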

🚫 Treating a class as if it were unique

Why it happens: A class like .btn-primary matches your button, so it feels specific.
The fix: Classes are shared for styling — the same class is usually on many elements. Check in the Console with querySelectorAll; if it matches more than one element, your locator is ambiguous and will eventually grab the wrong one.

🚫 Trusting auto-generated class names

Why it happens: A class like .css-1a2b3c is right there in the HTML, so it looks usable.
The fix: Random-looking class names are often generated fresh on every build, so they change without warning. Never depend on them — reach for an id, a data-testid, or a meaningful attribute instead.
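If you want a quick warning sign, a rough heuristic can flag class names that look machine-made. The function below is a hypothetical sketch, not a reliable detector: it simply looks for a chunk of the name that mixes several digits and letters, and it will misjudge some names in both directions.

```javascript
// Rough heuristic: does any chunk of the class name look like a random hash?
function looksAutoGenerated(className) {
  return className.split(/[-_]/).some((part) => {
    const digits = (part.match(/[0-9]/g) || []).length;
    const letters = (part.match(/[a-z]/gi) || []).length;
    // Assumption: a chunk of 5+ characters mixing 2+ digits and 2+ letters is suspicious.
    return part.length >= 5 && digits >= 2 && letters >= 2;
  });
}
```

With these thresholds, looksAutoGenerated('css-1a2b3c') returns true and looksAutoGenerated('btn-primary') returns false. Treat a true result as a prompt to look for an id or data-testid, not as proof.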

10 Now You Try

Three graded exercises: spot it, fix it, build it. Write your answer, then compare it to the model answer.

🔍 Exercise 1 of 3 — What Does This Selector Match?

Read the snippet of HTML from a fictional RealMe login form below. For each of the three CSS selectors, say in plain English which element(s) it matches, and whether it uniquely identifies one element.

<form id="login-form">
  <input id="username" class="field" type="text">
  <input id="password" class="field" type="password">
  <button class="btn btn-primary" data-testid="login-submit">Log in</button>
</form>

Explain what each selector matches and whether it is unique:

Selector A: .field
Selector B: #password
Selector C: [data-testid="login-submit"]

Model answer
Selector A: .field — Matches BOTH inputs (the username field and the password field), because both carry class="field". NOT unique — it matches two elements, so it is a poor choice for locating one specific field.

Selector B: #password — Matches exactly ONE element, the password input, because # means "the id" and an id should appear only once. Unique — a good, reliable locator.

Selector C: [data-testid="login-submit"] — Matches exactly ONE element, the Log in button, by its test-specific attribute. Unique — and the best choice of the three, because data-testid is added on purpose for testing and is the least likely to change.

Key point: A and B both technically "work" on the inputs, but only B (and C for the button) pins down a single element. .field is shared, so it is ambiguous.

🔧 Exercise 2 of 3 — Fix the Broken Locator

A test on a fictional Trade Me listing page uses the brittle locator below to find the “Buy Now” button. It keeps breaking whenever the page layout changes. Using the HTML provided, rewrite it as a robust locator and explain why yours is better.

Brittle locator (keeps breaking):
div.listing > div:nth-child(4) > div > button:nth-child(2)

HTML of the listing:
<div class="listing">
  ...
  <button class="btn btn-watch">Watchlist</button>
  <button id="buy-now" class="btn btn-buy" data-testid="buy-now-button">Buy Now</button>
</div>

Write a robust locator and explain why it is better:

Model answer
My robust locator: [data-testid="buy-now-button"]
(equally good: #buy-now)

Why it is more robust than the original:
The original locator is a chain of positions — "the 4th div, then a div, then the 2nd button". It depends entirely on the page's layout staying exactly the same. The moment a developer adds, removes, or reorders any box above the button, nth-child counting shifts and the locator either fails or — worse — silently points at a different button.

The Buy Now button carries a data-testid="buy-now-button" and an id="buy-now". Both are unique, stable addresses that travel WITH the element no matter where it sits in the layout. data-testid is the best choice because it exists specifically for testing and will not be changed for styling reasons. id is the next best. Either one means the test finds the button by what it IS, not by where it happens to sit today.

Trap to avoid: do not use .btn (matches both buttons) or .btn-buy (a styling class that could be renamed). Reach for the id or data-testid.

🏗️ Exercise 3 of 3 — Build Locators for a Form

Here is the HTML for a fictional IRD myIR contact-details form. Write a robust CSS-selector locator for each of the four targets listed, choosing the most stable option available, and note why you chose it.

<form id="contact-form">
  <input id="email" class="field" type="email" data-testid="contact-email">
  <input id="mobile" class="field" type="tel">
  <input class="field" type="text" name="postcode">
  <button class="btn btn-primary" data-testid="save-contact">Save</button>
</form>

Targets: the email input, the mobile input, the postcode input, and the Save button.

Model answer
Target 1 — the email input: [data-testid="contact-email"] (or #email)
Why: It has a test-specific attribute, the most stable option. #email is an equally fine fallback.

Target 2 — the mobile input: #mobile
Why: No data-testid here, but it has a unique id, which is the next most reliable address. Do NOT use .field — three elements share that class.

Target 3 — the postcode input (no id, no data-testid): input[name="postcode"]
Why: It has neither an id nor a data-testid, so use the next most meaningful, unique attribute — name="postcode" is unique on this form and tied to what the field IS. Better than position. (Worth raising: ask a developer to add an id or data-testid to make it as testable as the others.)

Target 4 — the Save button: [data-testid="save-contact"]
Why: A test-specific attribute, the best option. Avoid .btn-primary — it is a shared styling class.

Key skill shown: pick the MOST stable address available for each element, and recognise when none is good (Target 3) so you can raise it.

11 Self-Check

Q1: What is the DOM, and why does it matter to a tester?

The DOM (Document Object Model) is the live tree of HTML elements the browser builds when it loads a page. It matters because when your automation “finds an element”, it is searching that tree — so the DOM is exactly what your locators point at.

Q2: What do # and . mean at the start of a CSS selector?

# means “the element with this id” — e.g. #submit-btn. . means “every element with this class” — e.g. .btn-primary. An id should be unique on the page; a class is usually shared by many elements.

Q3: Why is a position-based locator like “the third button in the fourth div” brittle?

Because it depends on the page’s layout staying exactly the same. The moment a developer adds, removes, or reorders anything above the element, the position shifts and the locator either fails or silently points at the wrong element. It is the “third house on the left” problem.

Q4: You have an element with id, class, and data-testid. Which do you reach for first, and why?

The data-testid. It is added on purpose for testing, so it changes only when someone deliberately changes the test target — not for styling or layout reasons. A stable id is the next best. A class is usually shared and used for styling, so it is the least reliable.

Q5: How can you confirm a selector matches exactly one element without writing any automation?

Open DevTools (F12), go to the Console tab, and run document.querySelectorAll('your-selector'). The browser shows every element that matches. If it returns more than one when you wanted one, your locator is ambiguous and needs to be more specific.

12 Interview Prep

Real questions asked in NZ QA interviews for junior automation roles. Read the model answers, then practise your own version.

“What makes a locator robust versus brittle?”

A robust locator points at something stable about the element — ideally a data-testid added for testing, or a unique id — so it keeps working as the page changes around it. A brittle locator depends on something that changes easily: the element’s position in the layout (nth-child paths), a shared styling class, or an auto-generated class name. The test for me is: if a developer adds a new box above this element, does my locator still find it? If yes, it is robust.

“A test can’t find a button that is clearly visible on the page. How would you start debugging it?”

I’d open DevTools and inspect the button to read its current HTML — its id, classes, and any data-testid — and compare that against what my locator expects. Often an attribute has changed or the element has moved, breaking a position-based locator. Then I’d test my selector live in the Console with querySelectorAll to see whether it matches zero elements (locator is wrong) or many (it is ambiguous). That usually points straight at the cause, and I’d fix it by switching to a stable id or data-testid.

“The element you need to test has no id and no data-testid. What do you do?”

First I’d look for the next most stable, unique attribute — something like name, or a short meaningful path tied to what the element is, such as form button[type="submit"] — rather than falling back to a fragile position-based path. Then I’d raise it: I’d ask a developer to add a data-testid. It is a cheap change that makes the page more testable, and good automation often starts before the test, in making the HTML easy to locate against.