
Natural Language QA: Testing in English

Write tests in plain English; let AI convert them to Playwright, Selenium, or API calls.

A natural-language QA workflow showing English test steps, AI interpretation, generated Playwright code, review by QA and SDET, and CI execution.

Overview

Test automation has a gatekeeping problem. Only engineers who know Playwright, Selenium, or Python can write tests. Business analysts, non-technical QA, and product owners are locked out.

Natural Language QA lowers that barrier. You describe what you want to test in plain English, and an AI model converts the description into automation code:

code
English:
"Click the 'Add to Cart' button, enter quantity 5, then verify the price is $150"

AI converts to:
await page.click("button:has-text('Add to Cart')")
await page.fill("input[type='number']", "5")
await expect(page.locator(".total-price")).toContainText("$150")

This lesson teaches you how to:

  • Write tests in plain English (Gherkin-style or natural prose)
  • Use AI tools to convert English → automation code
  • Integrate natural language tests into your CI/CD
  • Combine with SDETs for edge cases and complex workflows
  • Scale testing to non-technical team members

A Practical Note for QA Learners

This is the most "team-bridging" lesson in Level 6.

After learning prompt patterns, tool choice, visual AI, self-healing, and test-data generation, we now ask:

  • How can more people participate in QA?
  • How can manual QA, BAs, and product people contribute test intent?
  • Where should SDETs still take over?

Learning Goals

  • Write tests in natural English without coding knowledge
  • Understand AI → code conversion and when it works well
  • Integrate Gherkin + AI (Given/When/Then format)
  • Use no-code tools like agent-qa and quorvex_ai
  • Combine human-written English + AI execution for scalable testing

Core Concepts

The Natural Language Workflow

code
Step 1: Write English Specification
"Log in with valid email, then verify dashboard loads"

Step 2: AI Understands Intent
Agent: "I need to: 1) Navigate to login, 2) Enter credentials, 3) Submit, 4) Wait for dashboard, 5) Assert dashboard visible"

Step 3: AI Generates Code
Playwright script:
  await page.goto('/login')
  await page.fill('input[name="email"]', 'test@example.com')
  await page.fill('input[name="password"]', 'password123')
  await page.click('button:has-text("Sign In")')
  await page.waitForSelector('[data-testid="dashboard"]')
  await expect(page.locator('[data-testid="dashboard"]')).toBeVisible()

Step 4: Run & Report
Results: PASS in 2.3 seconds
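
The four steps above can be wired together as a small pipeline. This is a minimal sketch, not any tool's actual implementation: `generate_code` and `run_test` are hypothetical callables standing in for the AI model and the test runner, and both are stubbed so the example runs without external services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    spec: str
    code: str
    passed: bool


def run_pipeline(
    spec: str,
    generate_code: Callable[[str], str],
    run_test: Callable[[str], bool],
) -> PipelineResult:
    """Take an English spec, generate code from it, run it, and report."""
    code = generate_code(spec)  # Steps 2-3: interpret intent and emit code
    passed = run_test(code)     # Step 4: execute the generated test
    return PipelineResult(spec=spec, code=code, passed=passed)


# Stubs (hypothetical): a real setup would call a model and a Playwright runner.
def fake_generate(spec: str) -> str:
    return f"// test generated from: {spec}"


def fake_run(code: str) -> bool:
    return code.startswith("// test generated")


result = run_pipeline(
    "Log in with valid email, then verify dashboard loads",
    fake_generate,
    fake_run,
)
print("PASS" if result.passed else "FAIL")
```

Keeping the generator and runner as injected functions makes it easy to swap models or frameworks without touching the workflow itself.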

Three Natural Language Formats

Format 1: Gherkin (BDD Style)

gherkin
Scenario: User adds item to cart

  Given the user is on the product page
  And the product price is $50
  When the user clicks "Add to Cart"
  And the user enters quantity 3
  Then the cart total should be $150
  And the "Checkout" button should be enabled
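
Before any AI is involved, a Gherkin scenario like this one can be split into structured steps. The sketch below is a simplified illustration (the function name `parse_steps` is made up for this example, and it ignores tables, tags, and backgrounds); it resolves each `And`/`But` to the keyword that precedes it.

```python
from typing import List, Optional, Tuple

KEYWORDS = ("Given", "When", "Then", "And", "But")


def parse_steps(scenario_text: str) -> List[Tuple[str, str]]:
    """Return (keyword, step text) pairs; And/But inherit the previous keyword."""
    steps: List[Tuple[str, str]] = []
    last: Optional[str] = None
    for raw in scenario_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword not in KEYWORDS:
            continue  # skip non-step lines such as "Scenario: ..."
        if keyword in ("And", "But"):
            if last is None:
                raise ValueError(f"'{keyword}' has no preceding step: {line}")
            keyword = last
        else:
            last = keyword
        steps.append((keyword, rest))
    return steps


scenario = """Scenario: User adds item to cart
  Given the user is on the product page
  And the product price is $50
  When the user clicks "Add to Cart"
  And the user enters quantity 3
  Then the cart total should be $150
  And the "Checkout" button should be enabled
"""
steps = parse_steps(scenario)
for keyword, text in steps:
    print(f"{keyword:5} | {text}")
```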

Format 2: Natural Prose

code
Test: Complete checkout flow

1. Navigate to shop.example.com
2. Search for "wireless headphones"
3. Click the first result
4. Verify price is between $50 and $200
5. Add 2 units to cart
6. Proceed to checkout
7. Fill shipping address: "123 Main St, Boston, MA"
8. Select "Express Shipping"
9. Enter payment details
10. Click "Place Order"
11. Verify order confirmation page shows order number
12. Verify email was sent

Format 3: User Story + Test Cases

code
User Story: As a buyer, I want to search for products so I can find what I need

Test Case 1: Search returns results
- Open search
- Type "laptop"
- Press Enter
- Verify results appear
- Verify each result has image, title, price

Test Case 2: Search with no results
- Type "zzzzzzzzzz"
- Press Enter
- Verify "No results found" message

How AI Converts English → Code

Step 1: Parse Intent

code
Input: "Click the add button"

Parse:
- Action: click
- Target: add button
- Context: unclear which button
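
To make the action/target split concrete, here is a rule-based stand-in for what the model does. It is only a sketch: a language model handles far more phrasings than these three hypothetical regex patterns, and `parse_intent` is a name invented for this example.

```python
import re
from typing import Optional, TypedDict


class Intent(TypedDict):
    action: str
    target: str
    value: Optional[str]


# Hypothetical patterns covering a few common phrasings only.
ACTION_PATTERNS = [
    ("click", re.compile(r"^click (?:on )?(?:the )?(?P<target>.+)$", re.I)),
    (
        "fill",
        re.compile(
            r'^(?:type|enter) "(?P<value>[^"]*)" (?:in|into) (?:the )?(?P<target>.+)$',
            re.I,
        ),
    ),
    ("assert_visible", re.compile(r"^verify (?:the )?(?P<target>.+) is visible$", re.I)),
]


def parse_intent(step: str) -> Optional[Intent]:
    """Return the parsed intent, or None when no pattern recognises the step."""
    text = step.strip().rstrip(".")
    for action, pattern in ACTION_PATTERNS:
        match = pattern.match(text)
        if match:
            groups = match.groupdict()
            return {
                "action": action,
                "target": groups["target"],
                "value": groups.get("value"),
            }
    return None


print(parse_intent("Click the add button"))
print(parse_intent('Type "laptop" into the search box'))
```

Note that the parser still cannot say which "add button" is meant. That ambiguity is exactly what the selector-inference step has to resolve.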

Step 2: Infer Selectors

code
Possibilities:
- button:has-text('Add')
- button[aria-label='Add']
- button.add-btn
- [data-testid='add-button']

AI picks most likely: button:has-text('Add')
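
A reviewer may prefer a different candidate than the one the AI picked. The sketch below ranks the same four candidates by an assumed stability order (test IDs first, then accessibility attributes, then visible text, then CSS classes). That ordering is a common convention rather than the behaviour of any particular tool, and `stability_rank` is a hypothetical helper.

```python
from typing import List


def stability_rank(selector: str) -> int:
    """Lower is better. The ordering is an assumption, not a tool's real logic."""
    if "data-testid" in selector:
        return 0  # dedicated test hooks rarely change with styling or copy
    if "aria-label" in selector or selector.startswith("role="):
        return 1  # accessibility attributes are fairly stable
    if ":has-text(" in selector or selector.startswith("text="):
        return 2  # visible text breaks when copy changes
    return 3      # CSS classes and positional selectors change most often


def rank_selectors(candidates: List[str]) -> List[str]:
    """Sort candidates from most to least stable (ties keep input order)."""
    return sorted(candidates, key=stability_rank)


candidates = [
    "button:has-text('Add')",
    "button[aria-label='Add']",
    "button.add-btn",
    "[data-testid='add-button']",
]
for selector in rank_selectors(candidates):
    print(selector)
```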

Step 3: Add Waits & Assertions

code
Input: "Click add button and verify price updated"

Generated:
await page.click("button:has-text('Add')")
await page.waitForLoadState('networkidle')
// expect() retries until the text matches or the timeout expires
await expect(page.locator('.price')).toContainText(expectedPrice)
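
Conceptually, a retrying assertion polls a condition until it holds or a timeout passes. The sketch below shows that idea in plain Python. `wait_until` is an illustrative helper, not a Playwright API, and the "price update" is simulated with a counter.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated page state: the price "updates" on the third poll.
polls = {"count": 0}


def price_updated() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


ok = wait_until(price_updated, timeout=2.0, interval=0.01)
print("price updated" if ok else "timed out")
```

Polling on the condition you actually care about is generally more reliable than a fixed sleep, because it returns as soon as the condition holds and fails clearly when it never does.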

Step 4: Handle Conditionals & Loops

code
Input: "For each product in the list, click 'Star' to favorite it"

Generated:
const products = page.locator('[data-testid="product-card"]')
const count = await products.count()
for (let i = 0; i < count; i++) {
  const product = products.nth(i)
  await product.locator('button[aria-label="Star"]').click()
  // Fixed pause as generated; asserting on the button's state is usually more reliable
  await page.waitForTimeout(500)
}

Practical Natural Language QA

Example 1: Gherkin → Playwright (agent-qa)

agent-qa is a tool that reads Gherkin and generates Playwright code.

python
# NOTE: the agent_qa class names below are shown as given in this lesson;
# check the tool's own documentation for its actual API before relying on them.
from agent_qa import FeatureParser, PlaywrightCodegen

# Write feature file
feature_text = """
Feature: Shopping Cart

Scenario: Add multiple items to cart
  Given I am on the products page
  When I add "Laptop" to cart with quantity 1
  And I add "Mouse" to cart with quantity 2
  Then the cart should have 2 items
  And the total should be $1,249.99
  And the "Checkout" button should be enabled

Scenario: Remove item from cart
  Given I have "Keyboard" in my cart
  When I click "Remove" next to "Keyboard"
  Then the cart should not contain "Keyboard"
  And the total price should update
"""

# Parse feature
parser = FeatureParser()
scenarios = parser.parse(feature_text)

# Generate Playwright code
codegen = PlaywrightCodegen()
code = codegen.generate(scenarios)

# Output: Playwright TypeScript ready to run
print(code)

Generated code example:

typescript
import { test, expect } from '@playwright/test'

test('Add multiple items to cart', async ({ page }) => {
  // Given I am on the products page
  await page.goto('https://shop.example.com/products')

  // When I add "Laptop" to cart with quantity 1
  await page.click('button:has-text("Laptop")')
  await page.fill('input[type="number"]', '1')
  await page.click('button:has-text("Add to Cart")')

  // And I add "Mouse" to cart with quantity 2
  await page.click('button:has-text("Mouse")')
  await page.fill('input[type="number"]', '2')
  await page.click('button:has-text("Add to Cart")')

  // Then the cart should have 2 items
  await expect(page.locator('[data-testid="cart-count"]')).toContainText('2')

  // And the total should be $1,249.99
  await expect(page.locator('.cart-total')).toContainText('$1,249.99')

  // And the "Checkout" button should be enabled
  await expect(page.locator('button:has-text("Checkout")')).toBeEnabled()
})
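
For recurring step phrasings, the mapping from Gherkin to code does not have to go through a model at all. The sketch below shows a deterministic template approach, similar in spirit to BDD step definitions. The patterns and the `step_to_code` helper are hypothetical, and the emitted lines simply mirror the generated Playwright code above.

```python
import re
from typing import List, Pattern, Tuple

# Each entry pairs a step pattern with the Playwright lines it expands to.
STEP_TEMPLATES: List[Tuple[Pattern[str], List[str]]] = [
    (
        re.compile(r'^I add "(?P<item>[^"]+)" to cart with quantity (?P<qty>\d+)$'),
        [
            "await page.click('button:has-text(\"{item}\")')",
            "await page.fill('input[type=\"number\"]', '{qty}')",
            "await page.click('button:has-text(\"Add to Cart\")')",
        ],
    ),
    (
        re.compile(r'^the "(?P<label>[^"]+)" button should be enabled$'),
        ["await expect(page.locator('button:has-text(\"{label}\")')).toBeEnabled()"],
    ),
]


def step_to_code(step: str) -> List[str]:
    """Expand one step into code lines, or raise if no template matches."""
    for pattern, lines in STEP_TEMPLATES:
        match = pattern.match(step)
        if match:
            return [line.format(**match.groupdict()) for line in lines]
    raise LookupError(f"No template matches step: {step!r}")


for line in step_to_code('I add "Mouse" to cart with quantity 2'):
    print(line)
```

Templates are predictable but only cover phrasings someone has anticipated. AI generation covers free-form text at the cost of needing review.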

Example 2: English Prose → API Tests

python
import anthropic


def generate_api_tests_from_english() -> str:
    """Convert an English description into pytest code via the Anthropic API."""

    # Reads the API key from the ANTHROPIC_API_KEY environment variable.
    client = anthropic.Anthropic()

    english_spec = """
    Test login API endpoint (POST /api/auth/login)

    Test Case 1: Valid login
    - Send username and password
    - Expect 200 OK response
    - Response should include JWT token and user object
    - User object should have id, email, roles

    Test Case 2: Invalid password
    - Send valid username, wrong password
    - Expect 401 Unauthorized
    - Response should include error message "Invalid credentials"

    Test Case 3: Account locked
    - User has failed 5 login attempts
    - Send correct credentials
    - Expect 423 Locked
    - Response should say "Account locked. Try again in 30 minutes"
    """

    prompt = f"""Convert this English test specification into Python pytest code:

{english_spec}

Requirements:
1. Use requests library for HTTP calls
2. Use pytest for assertions
3. Each test case should be a separate function
4. Mock the API responses if needed (or use a real test server URL)
5. Include setup/teardown for test data
6. Use meaningful variable names

Generate ONLY the Python code, no explanation."""

    response = client.messages.create(
        # Model ID as given in this lesson; substitute one available to your account.
        model="claude-opus-4-7",
        max_tokens=2000,
        messages=[{"role": "user", "content": prompt}],
    )

    # The first content block holds the generated text.
    return response.content[0].text


# Usage
if __name__ == "__main__":
    api_test_code = generate_api_tests_from_english()
    print(api_test_code)

# Example output:
# def test_login_valid():
#     response = requests.post(...)
#     assert response.status_code == 200
#     ...

Example 3: Collaborative Testing (Human + AI)

python
import anthropic


class CollaborativeTestSuite:
    """
    Manual QA writes English tests.
    AI converts to code.
    SDET reviews and refines.
    """

    def __init__(self):
        self.ai = anthropic.Anthropic()
        self.manual_qa_tests = []
        self.generated_code = []
        self.sdet_reviewed = []

    def submit_english_test(self, description: str, author: str):
        """Manual QA submits English test."""
        test = {
            "id": len(self.manual_qa_tests) + 1,
            "description": description,
            "author": author,
            "status": "pending_generation",
        }
        self.manual_qa_tests.append(test)

        # AI generates code
        code = self._generate_code(description)
        self.generated_code.append({
            "test_id": test["id"],
            "code": code,
            "status": "pending_review",
        })
        # Keep the submission's status in step with the generated code
        test["status"] = "pending_review"

    def _generate_code(self, description: str) -> str:
        """Ask Claude to generate test code."""
        response = self.ai.messages.create(
            # Model ID as given in this lesson; substitute one available to your account.
            model="claude-opus-4-7",
            max_tokens=1000,
            messages=[{
                "role": "user",
                "content": f"Convert to Playwright TypeScript:\n{description}",
            }],
        )
        return response.content[0].text

    def sdet_review(self, test_id: int, feedback: str, approved: bool):
        """SDET reviews generated code."""
        self.sdet_reviewed.append({
            "test_id": test_id,
            "approved": approved,
            "feedback": feedback,
        })

    def report(self):
        """Show workflow progress."""
        return {
            "submitted_by_qa": len(self.manual_qa_tests),
            "generated_by_ai": len(self.generated_code),
            "reviewed_by_sdet": len(self.sdet_reviewed),
            "approved": sum(1 for r in self.sdet_reviewed if r["approved"]),
        }


# Usage
suite = CollaborativeTestSuite()

# Manual QA submits tests
suite.submit_english_test(
    "Verify checkout calculates shipping correctly for Boston, MA",
    author="Alice (QA)",
)

suite.submit_english_test(
    "Test that cart persists after page refresh",
    author="Bob (QA)",
)

# SDET reviews
suite.sdet_review(
    test_id=1,
    approved=True,
    feedback="Good coverage. Added shipping address validation.",
)

# Report
print(suite.report())
# Output:
# {'submitted_by_qa': 2, 'generated_by_ai': 2, 'reviewed_by_sdet': 1, 'approved': 1}
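
The suite above records reviews but does not yet decide which tests may run in CI. One way to add that gate is sketched below, using the same record shapes as `generated_code` and `sdet_reviewed`. The function name and the "latest review wins" rule are assumptions made for illustration.

```python
from typing import Dict, List


def approved_for_ci(generated_code: List[Dict], reviews: List[Dict]) -> List[Dict]:
    """Return generated tests whose most recent SDET review is an approval."""
    latest: Dict[int, bool] = {}
    for review in reviews:  # later reviews overwrite earlier ones
        latest[review["test_id"]] = review["approved"]
    # Tests with no review at all are treated as not approved.
    return [g for g in generated_code if latest.get(g["test_id"], False)]


generated = [
    {"test_id": 1, "code": "// shipping test", "status": "pending_review"},
    {"test_id": 2, "code": "// cart persistence test", "status": "pending_review"},
]
reviews = [
    {"test_id": 1, "approved": True, "feedback": "Good coverage."},
]

for record in approved_for_ci(generated, reviews):
    print(record["test_id"], record["code"])
```

Treating unreviewed tests as unapproved keeps the default safe: nothing reaches the pipeline without an explicit SDET decision.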

QA/SDET Relevance

Manual QA Perspective

Major impact. You can now write tests without waiting for automation engineers:

  • Write tests in English
  • AI converts to code
  • Tests run automatically
  • You focus on *what* to test, not *how* to code

Automation Engineer Perspective

Frees up time otherwise spent translating requirements into code. Instead of spending 3 hours writing Playwright for one feature, you can focus on:

  • Architecting the test framework
  • Maintaining complex tests
  • Optimization

SDET Perspective

Build a review + refinement pipeline:

code
1. QA writes English tests (5 min each)
2. AI generates code (30 sec each)
3. SDET reviews + enhances (15 min each)
   - Add edge cases
   - Optimize selectors
   - Add error handling

Result: 10 tests/day instead of 2 tests/day

Examples and Use Cases

Use Case 1: Non-Technical Team Members

Scenario: Product manager wants to verify new checkout flow works.

Before: "Can you write a test for this?" → Waits for engineer → Engineer has 20 other tasks

After: PM writes in English:

code
"Verify checkout:
1. Add 3 items to cart
2. Proceed to checkout
3. Enter Boston, MA address
4. Select Express Shipping ($25)
5. Verify total = product subtotal + $25
6. Click Place Order
7. See confirmation"

AI generates Playwright. SDET reviews (5 min). Test runs on every PR.

Use Case 2: Exploratory Testing Automation

Scenario: Exploratory tester finds an interesting edge case.

Before: Manual testing, can't easily repeat or scale

After:

code
"Click discount code field, paste <script>alert('xss')</script>, verify it's escaped"
→ AI generates test
→ Test runs on every build
→ Prevents regression of that specific bug
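
The "verify it's escaped" step is worth pinning down precisely, because a vague assertion can pass on a vulnerable page. The sketch below shows one way to express it. It assumes the application escapes HTML the same way Python's `html.escape` does (an app may encode quotes differently), and `is_safely_rendered` is a hypothetical helper.

```python
import html

PAYLOAD = "<script>alert('xss')</script>"


def is_safely_rendered(payload: str, page_source: str) -> bool:
    """True if the escaped payload is present and the raw payload is absent."""
    return html.escape(payload) in page_source and payload not in page_source


# Simulated page sources; a real test would read them from the browser.
safe_page = f"<p>Code applied: {html.escape(PAYLOAD)}</p>"
unsafe_page = f"<p>Code applied: {PAYLOAD}</p>"

print(is_safely_rendered(PAYLOAD, safe_page))
print(is_safely_rendered(PAYLOAD, unsafe_page))
```

Checking both conditions matters: the raw payload being absent is not enough, since the field might simply have dropped the input.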

Use Case 3: Business Analyst Test Cases

Scenario: Business analyst writes requirements as test cases.

code
"If customer spends > $100 in one month, they get 10% loyalty bonus
Test:
- Create customer with 5 purchases totaling $120 in April
- Verify bonus is 10% of $120 = $12
- In May, bonus is applied to first order"

→ AI converts to code
→ SDET enhances with edge cases
→ Runs in CI/CD
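
The analyst's rule translates directly into a small executable check. The sketch below is illustrative only: the function name, the use of `Decimal`, and the five purchase amounts (chosen to total $120) are assumptions, and the boundary case is the kind of edge an SDET would add.

```python
from decimal import Decimal
from typing import Iterable

THRESHOLD = Decimal("100")
BONUS_RATE = Decimal("0.10")


def loyalty_bonus(purchases: Iterable[Decimal]) -> Decimal:
    """Monthly bonus: 10% of spend when spend is strictly greater than $100."""
    total = sum(purchases, Decimal("0"))
    if total > THRESHOLD:
        return (total * BONUS_RATE).quantize(Decimal("0.01"))
    return Decimal("0.00")


# Five hypothetical April purchases totaling $120
april = [Decimal("20"), Decimal("30"), Decimal("25"), Decimal("15"), Decimal("30")]
print(loyalty_bonus(april))

# Edge case: exactly $100 does not qualify, because the rule says "> $100"
print(loyalty_bonus([Decimal("100")]))
```

Writing the rule this way surfaces a question the English version leaves implicit: whether a customer who spends exactly $100 qualifies.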

Hands-On Exercise

Exercise 1: Write Your First English Test

Your task: Write a test in plain English (no code).

Steps:

  1. Pick any website (Gmail, GitHub, Amazon)
  2. Write 5 test steps in English:
code
     Test: Search functionality
     1. Click the search box
     2. Type "laptop"
     3. Press Enter
     4. Wait for results
     5. Verify results contain "laptop"
  3. Try a more complex test with assertions and conditions:
code
     Test: Shopping cart persistence
     1. Add item with price $50 to cart
     2. Verify cart shows 1 item
     3. Refresh page
     4. Verify cart still shows 1 item
     5. Verify price still shows $50

Exercise 2: Ask AI to Convert to Code

Your task: Have Claude convert your English test to Playwright code.

Steps:

  1. Go to Claude.ai
  2. Paste your English test from Exercise 1
  3. Add this instruction:
code
     Convert this test to Playwright TypeScript code.

     Use:
     - page.goto() for navigation
     - page.click() for clicks
     - page.fill() for typing
     - expect() for assertions
     - page.waitForSelector() for waits

     Include comments explaining each step.
  4. Review the generated code: Is it correct? Would it work?
  5. Try a second test to see variation in generated code

Exercise 3: Estimate Impact for Your Team

Your task: Calculate potential efficiency gain.

Questions:

  1. How many tests does your team write per sprint? ___
  2. Average time per test: ___ hours (manual writing by engineer)
  3. If AI wrote tests 70% faster, new time per test: ___ hours
  4. Total time saved per sprint: ___ hours
  5. How many additional tests could you write with that time? ___
  6. Current test coverage: ___ % of features
  7. Expected coverage with 2x more tests: ___ %
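
Questions 1 to 5 can be worked through with a small calculator. The sketch below reads "70% faster" as "70% less time per test", which is an interpretation, and the input values (20 tests per sprint, 3 hours each) are placeholders to replace with your team's numbers.

```python
from typing import Tuple


def sprint_savings(
    tests_per_sprint: int,
    hours_per_test: float,
    speedup: float = 0.70,
) -> Tuple[float, float, int]:
    """Return (new hours per test, hours saved per sprint, extra tests possible)."""
    new_hours_per_test = hours_per_test * (1 - speedup)
    hours_saved = tests_per_sprint * (hours_per_test - new_hours_per_test)
    if new_hours_per_test > 0:
        extra_tests = int(hours_saved // new_hours_per_test)
    else:
        extra_tests = 0
    return new_hours_per_test, hours_saved, extra_tests


# Placeholder inputs: replace with your own answers to questions 1 and 2.
new_hours, saved, extra = sprint_savings(tests_per_sprint=20, hours_per_test=3.0)
print(f"New time per test: {new_hours:.1f} hours")
print(f"Time saved per sprint: {saved:.1f} hours")
print(f"Additional tests possible: {extra}")
```

The result is an upper bound: it does not subtract SDET review time, which the pipeline in this lesson estimates at 15 minutes per test.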

When NOT to Use Natural Language QA

⚠️ Limitations

code
DON'T use natural language QA for:

1. Complex workflows with many conditionals
   - "For each user, if they have > 5 orders, apply discount"
   - Hard for AI to understand complex business logic
   - SDET should hand-code these

2. Tests requiring deep DOM knowledge
   - "Click the 3rd button in the form, not the one that looks like it"
   - AI may pick the wrong button
   - Manual code needed

3. Sensitive data/security tests
   - Too risky to let AI generate payment flows unreviewed
   - Manual SDET review + approval required

4. Performance/load testing
   - "Simulate 1000 concurrent users"
   - Needs specialized load-testing tools, not just English-to-code conversion

✓ Best For

code
DO use natural language QA for:

1. Happy path testing
   - Clear steps, obvious success criteria

2. Smoke tests (basic sanity)
   - Verify page loads, buttons work

3. Regression testing
   - "Verify login still works"

4. Non-technical stakeholders
   - QA, BA, PM can write tests

5. High-volume test cases
   - 50+ similar tests with slight variations
   - Natural language + AI = faster than manual coding

Key Takeaways

  • Natural Language QA democratizes test automation. Non-engineers can now write tests, scaling testing effort across the team.
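  • English → Code works best for straightforward tests. Treat the 70–80% accuracy figure as a rough rule of thumb rather than a measured benchmark. Complex logic still needs manual SDET engineering.
  • Collaborative model works best: QA writes English (5 min), AI generates code (30 sec), SDET reviews (15 min). Result: several times the throughput of hand-coding, estimated earlier in this lesson as 10 tests/day instead of 2.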
  • Integration with CI/CD is critical. Generated tests are only valuable if they run on every commit and report failures.
  • Combine with other AI strategies: Natural language + self-healing + data generation = full AI-assisted testing lifecycle.
  • Human review is essential. AI-generated code can be wrong. Always have an SDET verify before merging to main.

Next Steps

  • Start with simple tests: Login, search, navigation (high confidence for AI)
  • Use agent-qa or quorvex_ai to experiment with Gherkin → code
  • Build a review workflow: QA submits → AI generates → SDET approves → Run in CI/CD
  • In Level 7, combine natural language with full test suite generation: describe feature → AI writes all tests → integrated into pipelines