What Is Generative AI?

Compare traditional AI and generative AI across text, image, video, code, and music use cases.

A visual comparison of traditional AI classification on the left and generative AI creation on the right.

Overview

Generative AI is the branch of AI that creates new content, instead of only classifying or scoring existing data. It can produce text, code, images, audio, and structured outputs from natural-language prompts.

For QA professionals, this changes daily work. You can generate test ideas, draft automation code, synthesize test data, summarize defects, and build assistants that speed up repetitive tasks. At the same time, you must test these systems carefully because generated output can be plausible but wrong.

This lesson builds directly on the deep-learning foundations from the previous level. Neural networks and deep learning explain how modern AI systems became powerful enough to generate new content instead of only classifying existing data.

A Practical Note for QA Learners

If this topic starts to feel dense, you do not need to master every internal detail right now. The most important takeaway is that generative AI systems predict and assemble new content from patterns they learned during training.

For day-to-day QA work, three things matter most:

  • These tools can accelerate test design, analysis, and automation drafting.
  • Their output can still be wrong or inconsistent.
  • Human review and validation remain essential.

If you prefer, skim the deeper sections and focus on the comparison table, failure modes, QA relevance, and practical exercise. That will give you a strong working understanding without needing to go too deep into model internals yet.

Learning Goals

  • Define generative AI in practical, non-academic terms.
  • Explain how generative AI differs from traditional AI systems.
  • Identify common generative model types and where they appear in QA workflows.
  • Recognize typical failure modes such as hallucinations and prompt sensitivity.
  • Apply a simple evaluation rubric to generated outputs.

Core Concepts

1. Traditional AI vs Generative AI

Traditional AI usually predicts a label, score, or action from input data.

Examples:

  • A spam detector predicts spam or not spam.
  • A fraud model predicts a risk score.
  • A defect classifier predicts a severity bucket.

Generative AI creates new artifacts from prompts.

Examples:

  • Drafting test cases from requirements.
  • Generating Playwright test skeletons.
  • Writing release note summaries from defect data.

| Dimension | Traditional AI | Generative AI |
| --- | --- | --- |
| Primary output | Label, score, decision | New content |
| Common tasks | Classification, regression, ranking | Text, code, image, audio generation |
| QA usage | Risk prediction, anomaly detection | Test ideation, automation drafts, analysis summaries |
| Main risk | Misclassification | Hallucinated or unsafe output |

2. What Generative AI Models Actually Learn

Most text-based generative systems learn patterns in sequences. During training, the model repeatedly predicts likely next tokens from context.

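A simplified objective is to maximize the probability the model assigns to each token, given the tokens that precede it:

$$\max_{\theta} \sum_{t} \log P_{\theta}(x_t \mid x_1, \dots, x_{t-1})$$

Here $x_t$ is the token at position $t$ and $\theta$ stands for the model's parameters.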

That does not mean the model understands truth the way a human does. It means the model learns statistical structure from large datasets.

For QA, this is a key mindset shift:

  • The model is not a deterministic rule engine.
  • It can produce fluent, well-structured answers that are still wrong on factual details.
  • You need verification workflows, not blind trust.
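
To make next-token prediction concrete, here is a toy sketch in Python. It is not how production models work internally: it only counts which word follows which in a tiny, made-up corpus and then samples from those counts. The corpus and the function names (`train_bigram`, `next_token_probs`, `generate`) are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which token follows which across the training sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def next_token_probs(counts, context_token):
    """Turn follower counts into a probability distribution."""
    followers = counts.get(context_token, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {token: n / total for token, n in followers.items()}


def generate(counts, start, max_tokens=8, seed=0):
    """Sample one token at a time until no follower is known."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(max_tokens - 1):
        probs = next_token_probs(counts, output[-1])
        if not probs:
            break
        tokens, weights = zip(*probs.items())
        output.append(rng.choices(tokens, weights=weights, k=1)[0])
    return " ".join(output)


# Tiny made-up training corpus (an assumption for illustration only).
corpus = [
    "user enters valid password",
    "user enters invalid password",
    "user enters expired otp",
]
model = train_bigram(corpus)
print(next_token_probs(model, "enters"))
print(generate(model, "user", seed=1))
```

The sketch has no notion of whether a generated sentence is true. It only reproduces statistical patterns from its training data, which is why changing the seed can change the output and why the result still needs verification.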

3. Multimodal Generative AI

Modern systems are often multimodal, meaning they can accept and generate more than one data type, such as text, images, and audio.

| Input | Output | QA-relevant use case |
| --- | --- | --- |
| Text prompt | Test plan or script draft | Generate smoke test checklist |
| Requirement docs | BDD scenarios | Convert stories to Given/When/Then |
| Screenshot | Bug explanation | Explain probable UI issue from visual evidence |
| Logs + prompt | Root-cause summary | Suggest likely failure cluster |

Manual QA angle:

  • Use generated checklists to improve exploratory testing depth.

Automation QA/SDET angle:

  • Use generated code as a first draft, then refactor to team standards.

4. Common Generative Model Types

You do not need deep mathematical detail yet, but it helps to know the major families you will hear about in practice:

| Model Type | What It Commonly Generates | QA-Relevant Example |
| --- | --- | --- |
| Large language models (LLMs) | Text, summaries, code, structured answers | Generate test cases, draft automation code, summarize defects |
| Diffusion models | Images and visual variations | Generate UI concept images or synthetic visual assets for discussion |
| Speech and audio generation models | Spoken responses, transcription, audio synthesis | Convert spoken bug reports into text or generate voice test samples |
| Multimodal models | Mixed text + image or text + audio workflows | Explain a screenshot, summarize logs, and suggest likely issues together |

For QA teams, the practical point is simple: different generative systems fail in different ways. A text model may hallucinate facts, while an image model may ignore visual constraints. Your evaluation strategy should match the output type.

5. Common Failure Modes You Must Test

Generative AI systems are useful but fragile under poor prompts, ambiguous context, and out-of-domain inputs.

Frequent failure modes:

  • Hallucination: generated facts or references that are not real.
  • Prompt brittleness: small wording changes produce very different outputs.
  • Incomplete reasoning: output looks correct but skips critical constraints.
  • Format drift: model ignores required JSON or schema format.
  • Hidden bias: output quality differs by language, locale, or user profile.
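
Format drift is one of the easier failure modes to catch automatically. The sketch below assumes the model was asked to return a JSON list of test cases. The required keys in `REQUIRED_KEYS` are a hypothetical schema chosen for illustration, not a standard.

```python
import json

# Hypothetical schema: the keys each generated test case must contain.
REQUIRED_KEYS = {"id", "scenario", "type", "expected_result"}


def check_format(raw_output):
    """Return a list of format problems; an empty list means the check passed."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return ["top-level value must be a list of test cases"]

    problems = []
    for index, case in enumerate(data):
        if not isinstance(case, dict):
            problems.append(f"case {index}: not an object")
            continue
        missing = REQUIRED_KEYS - set(case)
        if missing:
            problems.append(f"case {index}: missing keys {sorted(missing)}")
    return problems


good = '[{"id": "TC-1", "scenario": "Valid login", "type": "positive", "expected_result": "200 OK"}]'
drifted = "Sure! Here are the test cases:\n" + good  # chatty preamble breaks parsing

print(check_format(good))     # []
print(check_format(drifted))  # one "not valid JSON" problem
```

A check like this covers structure only. It says nothing about whether the scenarios themselves are correct, so it complements human review rather than replacing it.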

6. Prompt Quality Directly Impacts Output Quality

Prompting is part of system design, not just user behavior.

A practical prompt template for QA tasks:

    Role: You are a senior QA engineer.
    Task: Generate boundary-focused test cases for the login API.
    Context: Auth supports email + password, 5 failed attempts lockout, MFA optional.
    Constraints: Include positive, negative, security, and rate-limit cases.
    Output format: Markdown table with columns ID, Scenario, Type, Expected Result.

This structure reduces ambiguity and improves repeatability.
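
If your team reuses the five-part template above, one option is to keep it in code so every prompt has the same fields in the same order. The following is a minimal sketch. The `QAPrompt` class and its field names are assumptions for illustration and are not tied to any particular AI tool.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QAPrompt:
    """Holds the five parts of the template and renders them as text."""
    role: str
    task: str
    context: str
    constraints: str
    output_format: str

    def __post_init__(self):
        # Reject empty parts early so an incomplete prompt is never sent.
        for name, value in asdict(self).items():
            if not value.strip():
                raise ValueError(f"prompt field '{name}' must not be empty")

    def render(self):
        return "\n".join([
            f"Role: {self.role}",
            f"Task: {self.task}",
            f"Context: {self.context}",
            f"Constraints: {self.constraints}",
            f"Output format: {self.output_format}",
        ])


template = QAPrompt(
    role="You are a senior QA engineer.",
    task="Generate boundary-focused test cases for the login API.",
    context="Auth supports email + password, 5 failed attempts lockout, MFA optional.",
    constraints="Include positive, negative, security, and rate-limit cases.",
    output_format="Markdown table with columns ID, Scenario, Type, Expected Result.",
)
print(template.render())
```

Keeping the parts as separate fields makes it easy to change one of them, such as the constraints, while holding the rest constant when you compare outputs.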

7. What QA Teams Should Verify Before Trusting a Generative AI Workflow

Before a team adopts a generative AI tool in daily QA work, it helps to validate a few basic things explicitly:

| Area | What to verify | Why it matters |
| --- | --- | --- |
| Accuracy | Does the tool preserve important facts, rules, and constraints? | Plausible language can still hide wrong content |
| Coverage | Does it include positive, negative, edge, and failure scenarios? | AI often produces a polished but incomplete first draft |
| Repeatability | Does the quality remain acceptable across prompt rewrites? | Good output from one phrasing does not guarantee stability |
| Safety | Does it avoid unsafe, non-compliant, or misleading suggestions? | QA teams need safe defaults, not only useful output |
| Reviewability | Can a human quickly check and correct the result? | Fast review is what turns AI output into a practical productivity gain |
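
The repeatability row can be approximated with a simple overlap measure. The sketch below assumes you have already collected the scenario titles produced by two or more rewrites of the same prompt. It compares them with Jaccard similarity (shared items divided by all distinct items). Both the metric and the sample scenario titles are illustrative choices, not a prescribed method.

```python
from itertools import combinations


def normalize(scenarios):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return {" ".join(s.lower().split()) for s in scenarios}


def jaccard(a, b):
    """Shared items divided by all distinct items; 1.0 means identical sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def repeatability(runs):
    """Return the lowest pairwise overlap across all runs (worst case)."""
    sets = [normalize(run) for run in runs]
    scores = [jaccard(a, b) for a, b in combinations(sets, 2)]
    return min(scores) if scores else 1.0


# Made-up scenario titles from two phrasings of the same request.
run_a = ["Valid login", "Wrong password", "Lockout after 5 failures"]
run_b = ["valid login", "Wrong  password", "MFA prompt shown"]

print(repeatability([run_a, run_b]))  # 0.5
```

Exact-title matching is crude, because two differently worded scenarios can test the same thing. Treat a low score as a prompt to look closer, not as a verdict.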

QA/SDET Relevance

Generative AI can accelerate many quality activities if used with verification controls.

Manual QA use cases:

  • Expand exploratory charters from requirement text.
  • Generate edge-case ideas before sprint testing starts.
  • Summarize lengthy defect discussions into triage-ready notes.

QA automation and SDET use cases:

  • Draft API and UI test skeletons from acceptance criteria.
  • Generate test data variants for boundary and negative testing.
  • Auto-suggest assertions, then validate against business rules.
  • Build internal prompt libraries for recurring testing tasks.

Guardrails you should enforce:

  • Human review before merging generated test artifacts.
  • Fact-check references, endpoints, and expected behavior.
  • Track prompt versions for reproducibility.
  • Add evaluation checks for schema and policy compliance.
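
Tracking prompt versions can be as light as storing a short fingerprint of the prompt next to each generated artifact. This is a minimal sketch using a SHA-256 hash. The record fields and the `"example-model"` name are placeholders for illustration.

```python
import hashlib
from datetime import datetime, timezone


def prompt_fingerprint(prompt):
    """Short, stable ID for a prompt; ignores trailing whitespace differences."""
    normalized = "\n".join(line.rstrip() for line in prompt.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def build_record(prompt, model_name, output):
    """Metadata to store alongside a generated artifact (fields are illustrative)."""
    return {
        "prompt_version": prompt_fingerprint(prompt),
        "model": model_name,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "output_chars": len(output),
    }


v1 = "Task: Generate test cases for the login API."
v1_with_spaces = "Task: Generate test cases for the login API.   \n"
v2 = "Task: Generate negative test cases for the login API."

print(prompt_fingerprint(v1) == prompt_fingerprint(v1_with_spaces))  # True
print(prompt_fingerprint(v1) == prompt_fingerprint(v2))              # False
print(build_record(v1, "example-model", "generated text"))
```

With a fingerprint stored per artifact, a reviewer can tell which prompt wording produced a given set of test cases and re-run it later.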

A useful operating principle:

  • Treat generative AI as a fast drafting partner, not as a final authority. The value usually comes from accelerating a first pass while keeping QA ownership over correctness and risk.

Practical Work

Exercise: Evaluate Generated Test Cases with a Rubric

Objective: Learn to evaluate output quality instead of only generating more output.

  1. Choose one requirement from your project (or use the sample requirement below).
  2. Ask an AI tool to generate 12 test cases.
  3. Score the generated cases using the rubric below.
  4. Improve the prompt and run again.
  5. Compare before and after quality.

Sample requirement:

    Users can reset password via email OTP.
    OTP expires in 10 minutes.
    After 5 invalid OTP attempts, account is locked for 15 minutes.
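
When you later judge edge-case depth, it helps to know which boundaries the sample requirement implies. The sketch below lists values just below, at, and just above each numeric limit. It deliberately does not state the expected behavior at each value, because the requirement does not say whether the limits are inclusive. The helper name `boundary_values` is an assumption.

```python
def boundary_values(limit, step=1):
    """Return the values just below, at, and just above a limit."""
    return [limit - step, limit, limit + step]


# Limits taken from the sample requirement, with time expressed in seconds.
otp_age_seconds = boundary_values(10 * 60)
invalid_attempts = boundary_values(5)
lockout_elapsed_seconds = boundary_values(15 * 60)

print(otp_age_seconds)          # [599, 600, 601]
print(invalid_attempts)         # [4, 5, 6]
print(lockout_elapsed_seconds)  # [899, 900, 901]
```

If the generated test cases never touch any of these values, that is a signal for the edge-case criterion in the rubric.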

Rubric:

| Criterion | 0 points | 1 point | 2 points |
| --- | --- | --- | --- |
| Requirement coverage | Misses key rules | Covers most rules | Covers all rules |
| Edge-case depth | No boundaries | Some boundaries | Strong boundary set |
| Negative scenarios | Missing | Partial | Comprehensive |
| Security awareness | Missing | Basic | Includes abuse and lockout checks |
| Clarity | Ambiguous | Mostly clear | Precise and executable |

Score interpretation (total across the five criteria, maximum 10):

  • 0 to 4: low confidence output
  • 5 to 7: usable with strong edits
  • 8 to 10: high-quality draft for review
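
If you record scores in a spreadsheet or script, the rubric arithmetic can be sketched as follows. The criterion keys are shorthand for the five rubric rows, and the function names are illustrative.

```python
# Shorthand keys for the five rubric criteria.
CRITERIA = (
    "requirement_coverage",
    "edge_case_depth",
    "negative_scenarios",
    "security_awareness",
    "clarity",
)


def total_score(scores):
    """Sum the five criterion scores after checking each is 0, 1, or 2."""
    missing = set(CRITERIA) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in CRITERIA:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")
    return sum(scores[name] for name in CRITERIA)


def interpret(total):
    """Map a total from 0 to 10 onto the three bands from the lesson."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total <= 4:
        return "low confidence output"
    if total <= 7:
        return "usable with strong edits"
    return "high-quality draft for review"


first_run = {
    "requirement_coverage": 1,
    "edge_case_depth": 0,
    "negative_scenarios": 1,
    "security_awareness": 1,
    "clarity": 2,
}
total = total_score(first_run)
print(total, interpret(total))  # 5 usable with strong edits
```

Scoring the first and second runs with the same function makes the before-and-after comparison in step 5 straightforward.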

Reflection:

  1. Which criterion improved most after refining the prompt?
  2. Which defects could slip through if you trusted first-output quality?
  3. What review checklist should your team standardize for generated artifacts?

Key Takeaways

  • Generative AI creates new content; traditional AI mostly predicts labels or scores.
  • Output quality depends heavily on prompt quality and context quality.
  • These systems are probabilistic, so reproducibility and verification matter.
  • QA teams should evaluate generated output with explicit rubrics.
  • Human review remains mandatory for correctness, safety, and policy compliance.

YouTube Resources

Google Cloud Tech video thumbnail for a beginner-friendly introduction to generative AI.

What this helps with: A clear, beginner-friendly explanation of what generative AI is, how it differs from earlier AI systems, and where it is used in practice.

IBM Technology video thumbnail comparing AI, machine learning, deep learning, and generative AI.

What this helps with: A useful comparison of AI, machine learning, deep learning, and generative AI so learners can place this lesson in the bigger picture.

Next Step

Next, continue with "How AI Systems Learn" to understand the pretraining, fine-tuning, and feedback loops that shape model behavior and failure patterns.