BDD & Gherkin for Automation Engineers
Write human-readable test scenarios in Gherkin, wire them to Playwright with Cucumber, and collaborate with product owners and developers in a shared language that prevents ambiguity.
1 The Hook — Why This Matters
An Auckland fintech is building a login feature with a "remember me" checkbox. Product described it in Jira as "users should stay logged in across browser restarts." Developers implement it as a session token with a 30-day lifetime. QA tests it by checking that the token exists right after login.
Three teams. Three different interpretations. When QA later finds that the session doesn't survive a browser restart (the developer kept the token in session storage, which the browser clears on close, rather than in a persistent cookie), it's escalated as a bug. The developer says they met the spec, and product isn't sure who is right.
A Three Amigos session using Gherkin would have surfaced this ambiguity before a single line of code was written. Turning the word "remember" into a scenario forces precise answers: what event triggers persistence? How long does it last? What clears it? Everyone signs off on the Gherkin before development begins. No surprises at the end.
2 The Rule — The One-Sentence Version
Gherkin is not a test automation format — it is a communication format. The automation is secondary. The primary goal is a shared, unambiguous definition of behaviour that developers, testers, and product owners all agree on.
When you write Gherkin, you are not writing tests. You are writing a specification in plain English that happens to be executable. If the Three Amigos can't agree on the Gherkin wording, you have found an ambiguity before it becomes a defect.
3 The Analogy — Think Of It Like...
A Gherkin scenario is a contract.
Given sets the preconditions (the terms). When describes the action (the triggering event). Then defines the expected outcome (the obligation). Everyone — developer, tester, product owner — signs the contract before the work begins. No one can claim they were unaware of what was agreed.
A contract that is vague is unenforceable. A Gherkin scenario that uses fuzzy language like "the user sees a success message" is just as unenforceable — what message? where? how long does it show?
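To make the three roles concrete, here is a minimal TypeScript sketch that labels each step line with its effective keyword, treating And and But as continuations of the keyword before them (standard Gherkin behaviour). The `resolveKeywords` helper and the `ResolvedStep` type are hypothetical names used only for illustration; they are not part of Cucumber.

```typescript
type Primary = 'Given' | 'When' | 'Then';

interface ResolvedStep {
  keyword: Primary; // effective keyword after resolving And/But
  text: string;     // step text without the keyword
}

// Hypothetical helper: maps raw step lines to their effective keyword.
function resolveKeywords(stepLines: string[]): ResolvedStep[] {
  const resolved: ResolvedStep[] = [];
  let current: Primary | null = null;
  for (const line of stepLines) {
    const match = /^\s*(Given|When|Then|And|But)\s+(.*)$/.exec(line);
    if (!match) continue; // not a step line (blank line, header, comment)
    const [, keyword, text] = match;
    if (keyword === 'And' || keyword === 'But') {
      // And/But inherit the previous keyword, so one must already exist.
      if (current === null) {
        throw new Error(`"${keyword}" has no preceding step: ${line}`);
      }
    } else {
      current = keyword as Primary;
    }
    resolved.push({ keyword: current, text });
  }
  return resolved;
}

const steps = resolveKeywords([
  'Given I am on the Resync Store login page',
  'And I have a registered account with email "kiri@example.co.nz"',
  'When I click the "Log in" button',
  'Then I should be redirected to the dashboard',
  'And I should see the message "Welcome back, Kiri"',
]);
const keywords = steps.map((s) => s.keyword).join(' ');
// keywords === 'Given Given When Then Then'
```

Read this way, every line of the contract is either a term (Given), a triggering event (When), or an obligation (Then); And and But never introduce a fourth category.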
4 Watch Me Do It — Step by Step
Here is a complete Gherkin feature file for a NZ e-commerce login, followed by the Playwright + Cucumber step definitions that wire the scenarios to real browser automation.
Feature: Customer Login
  As a returning NZ customer
  I want to log in with my email and password
  So that I can view my order history

  Scenario: Successful login redirects to dashboard
    Given I am on the Resync Store login page
    And I have a registered account with email "kiri@example.co.nz"
    When I enter my email "kiri@example.co.nz" and password "ValidPass123!"
    And I click the "Log in" button
    Then I should be redirected to the dashboard
    And I should see the message "Welcome back, Kiri"

  Scenario: Invalid password shows error
    Given I am on the Resync Store login page
    When I enter my email "kiri@example.co.nz" and password "WrongPass!"
    And I click the "Log in" button
    Then I should see an error message "Invalid email or password"
    And I should remain on the login page

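import { Given, When, Then } from '@cucumber/cucumber';
import { expect } from '@playwright/test';

// Assumes a custom Cucumber World that exposes a Playwright `page`
// (for example, created in a Before hook) and a browser context with
// `baseURL` set, so that relative paths such as '/login' resolve.

Given('I am on the Resync Store login page', async function () {
  await this.page.goto('/login');
});

// The feature file also uses this step. How the account is created depends
// on your test environment; this placeholder assumes it is already seeded.
Given('I have a registered account with email {string}', async function (email: string) {
  this.registeredEmail = email; // remember the account for later steps
});

When('I enter my email {string} and password {string}',
  async function (email: string, password: string) {
    await this.page.getByLabel('Email address').fill(email);
    await this.page.getByLabel('Password').fill(password);
  }
);

// One definition serves every button: the name comes from the Gherkin step.
When('I click the {string} button', async function (buttonName: string) {
  await this.page.getByRole('button', { name: buttonName }).click();
});

Then('I should be redirected to the dashboard', async function () {
  await expect(this.page).toHaveURL(/.*dashboard/);
});

Then('I should see the message {string}', async function (msg: string) {
  await expect(this.page.getByText(msg)).toBeVisible();
});

// Error messages are expected in an element with the ARIA role "alert".
Then('I should see an error message {string}', async function (msg: string) {
  await expect(this.page.getByRole('alert')).toContainText(msg);
});

Then('I should remain on the login page', async function () {
  await expect(this.page).toHaveURL(/.*login/);
});
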
Tip: prefer Cucumber Expressions such as {string} over regular expressions where possible; they are more readable and self-documenting. Because the parameter is extracted from the Gherkin step, one step definition can serve every scenario whose step matches the pattern.
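To show what "the parameter is extracted from the Gherkin step" means in practice, here is a deliberately simplified sketch of how a `{string}` placeholder can be turned into a capture group. It is not Cucumber's actual implementation: real Cucumber Expressions also handle single-quoted strings and other parameter types such as `{int}` and `{word}`. The `compileExpression` and `matchStep` names are hypothetical.

```typescript
// Simplified illustration only; supports double-quoted {string} parameters.
function compileExpression(expression: string): RegExp {
  // Escape regex metacharacters in the literal text (braces are left alone),
  // then swap each {string} placeholder for a capture group.
  const escaped = expression.replace(/[.*+?^$()|[\]\\]/g, '\\$&');
  const pattern = escaped.replace(/\{string\}/g, '"([^"]*)"');
  return new RegExp(`^${pattern}$`);
}

// Returns the extracted arguments, or null when the step does not match.
function matchStep(expression: string, stepText: string): string[] | null {
  const match = compileExpression(expression).exec(stepText);
  return match ? match.slice(1) : null;
}

const loginArgs = matchStep(
  'I enter my email {string} and password {string}',
  'I enter my email "kiri@example.co.nz" and password "ValidPass123!"',
);
// loginArgs is ['kiri@example.co.nz', 'ValidPass123!']

const buttonArgs = matchStep(
  'I click the {string} button',
  'I click the "Add to Cart" button',
);
// buttonArgs is ['Add to Cart']
```

The same `I click the {string} button` expression matches a "Log in" step and an "Add to Cart" step, which is why a parameterised definition is reusable without extra work.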
5 When to Use It / When NOT to Use It
| Context | BDD adds value? | Reason |
|---|---|---|
| New feature with unclear requirements | Yes | Three Amigos surfaces ambiguity early |
| Cross-team collaboration (dev + QA + product) | Yes | Shared language, shared ownership |
| Regulated domain (banking, health, IRD integrations) | Yes | Scenarios become living audit-ready documentation |
| Acceptance criteria for complex user journeys | Yes | Gherkin is the definition of done |
| Solo automation engineer, no PO involvement | Overkill | Communication benefit is lost; plain tests are faster |
| Low-complexity CRUD pages with obvious behaviour | Overkill | Gherkin adds ceremony without reducing ambiguity |
| Performance or load testing | No | BDD describes functional behaviour, not throughput |
6 Common Mistakes — Don't Do This
🚫 Writing implementation-specific steps
Bad: I click the green button at coordinates 300,400
Good: I click the "Add to Cart" button
Gherkin describes what the user intends, not how the UI implements it. Coordinates and colours change; intent doesn't. If a product owner can't read the step and understand what a user does, it's too implementation-specific.
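A team could back this rule with a rough review aid. The sketch below flags step wording that mentions colours or coordinates, the two examples named above. The patterns are hypothetical heuristics and will produce false positives (for instance, "1,000" in a price), so treat the output as a prompt for discussion rather than a verdict.

```typescript
interface Hint {
  label: string;
  pattern: RegExp;
}

// Hypothetical heuristics; tune the word lists to your own product.
const implementationHints: Hint[] = [
  { label: 'colour', pattern: /\b(red|green|blue|yellow|orange|grey|gray)\b/i },
  { label: 'coordinates', pattern: /\bcoordinates?\b|\b\d+\s*,\s*\d+\b/i },
];

// Returns the labels of every hint that the step text triggers.
function findImplementationDetails(stepText: string): string[] {
  return implementationHints
    .filter((hint) => hint.pattern.test(stepText))
    .map((hint) => hint.label);
}

const bad = findImplementationDetails(
  'I click the green button at coordinates 300,400',
);
// bad is ['colour', 'coordinates']

const good = findImplementationDetails('I click the "Add to Cart" button');
// good is []
```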
🚫 Creating one giant feature file with 50+ scenarios
Keep each feature file focused on a single area of behaviour and aim for 10–15 scenarios max. A feature file with 50 scenarios becomes a maintenance burden and loses the communication benefit. Split by user journey, not by test type.
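The size guideline is easy to check mechanically. As a minimal sketch, the hypothetical `countScenarios` and `isOversized` helpers below count `Scenario:` and `Scenario Outline:` headers in a feature file's text and compare the total with the 15-scenario ceiling suggested above. It assumes English Gherkin keywords and counts an outline once, however many example rows it has.

```typescript
// Counts scenario headers in the raw text of a .feature file.
function countScenarios(featureText: string): number {
  return featureText
    .split(/\r?\n/)
    .filter((line) => /^\s*(Scenario|Scenario Outline):/.test(line))
    .length;
}

// True when the file exceeds the chosen ceiling (15 by default).
function isOversized(featureText: string, maxScenarios = 15): boolean {
  return countScenarios(featureText) > maxScenarios;
}

const loginFeature = [
  'Feature: Customer Login',
  '  Scenario: Successful login redirects to dashboard',
  '    Given I am on the Resync Store login page',
  '  Scenario: Invalid password shows error',
  '    Given I am on the Resync Store login page',
].join('\n');

const total = countScenarios(loginFeature); // 2
const tooBig = isOversized(loginFeature);   // false
```

A check like this could run in CI as a warning; the decision about where to split the file still belongs to the team.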
🚫 Skipping the Three Amigos session and writing Gherkin alone
Writing Gherkin without a developer and product owner defeats the entire purpose of BDD. You end up with executable specs that only one person understands — which is just test automation with extra ceremony. The Three Amigos session IS the value. The Gherkin is the output of that session, not the input to it.
7 Now You Try — Prompt Lab
Write a Gherkin Feature file with 3 scenarios for a NZ banking app password reset flow. Include:
- Reset link sent by email
- Link expiry (24 hours)
- Successful password change
8 Self-Check — Can You Actually Do This?
Three from three means you're ready for the interview round.
Q1. What does each keyword — Given, When, Then — represent in a Gherkin scenario?
Given establishes the precondition or context before any action. When describes the specific action or event the user (or system) performs. Then describes the observable outcome that should result. And / But continue the previous keyword to avoid repetition.
Q2. What makes a good step definition?
A good step definition is reusable (uses parameters like {string} rather than hard-coded values), does one thing, keeps assertions in Then steps (Given and When steps set up state and perform actions), and uses stable, accessible locators like roles and labels rather than CSS selectors or coordinates. Its pattern should read naturally in every Gherkin step it matches.
Q3. Why is BDD described as a communication technique rather than an automation technique?
BDD's primary value is the conversation it forces between developers, testers, and product owners before any code is written. The Gherkin scenarios are the artefact of that conversation — a shared, executable specification everyone has agreed on. The automation layer (Cucumber + Playwright) is a secondary benefit. Teams that skip the Three Amigos session and write Gherkin in isolation get the overhead without the communication benefit.
9 Interview Prep — What They'll Ask
Q1. "Walk me through how you'd introduce BDD to a team that's currently doing manual testing with Word documents."
I'd start by running a single Three Amigos session on one upcoming feature — not the whole product. I'd show the team that Gherkin is just structured plain English; no one needs to learn a new tool. Once the first feature file exists, I'd wire a handful of scenarios to Playwright with Cucumber to demonstrate that the spec is executable. I'd keep the first cycle short and show the team the time saved when the feature ships without ambiguity bugs. In NZ teams, the hardest part is usually getting product owners in the room — framing it as "catching bugs in the spec, not the code" tends to land well.
Q2. "What's the difference between a Gherkin step and a step definition?"
A Gherkin step is a plain-English line in a .feature file, such as When I click the "Log in" button. It describes behaviour in business language. A step definition is the TypeScript (or other language) function that Cucumber maps to that step — it contains the actual Playwright commands that perform the action in the browser. One step definition can match multiple Gherkin steps if they share the same pattern, making step definitions reusable across scenarios and feature files.
10 Next Step
You can now write Gherkin scenarios and wire them to Playwright with Cucumber. The next module covers catching visual regressions automatically — layout shifts, colour changes, and rendering bugs that functional tests miss entirely.