AI Terminology in Real QA Workflows
Apply AI jargon correctly in test design, defect analysis, evaluations, and release decisions.
Overview
Knowing terminology is useful, but applying it correctly in your workflow is what prevents production incidents. This lesson maps AI jargon to concrete QA actions so your team can use the same language in planning, testing, and release gating.
After this lesson, terms like token budget, context window, grounding, and hallucination will no longer be abstract. They will become specific checks in your daily quality process.
A Practical Note for QA Learners
This lesson is where terminology stops being a glossary and starts becoming a working QA tool. You do not need to remember every definition perfectly. What matters is being able to choose the right term when you write a requirement, design a test, raise a defect, or argue for a release check.
If you prefer a simpler focus, concentrate on these three ideas:
- vague AI language leads to vague testing
- precise terminology makes risks measurable
- good QA workflows turn jargon into concrete checks
Learning Goals
- Map AI terminology to practical QA lifecycle stages.
- Write clearer defects and risks using precise AI language.
- Design better test suites by selecting the right term-driven checks.
- Build a release checklist that reflects AI-specific failure patterns.
- Prepare for Prompt Engineering with a strong operational vocabulary.
Core Concepts
1. Requirement Stage: Vocabulary Drives Scope
A requirement such as "AI summary should be accurate" is too vague to test.
Better requirement language:
- Grounding: summary must rely only on provided evidence.
- Format reliability: output must match schema.
- Robustness: paraphrased input should preserve key meaning.
- Latency target: response under agreed threshold.
Terminology here prevents ambiguous acceptance criteria.
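One way to keep such criteria unambiguous is to record each one as a named, measurable check. The sketch below is only an illustration: the class `AcceptanceCriterion`, the metric names, and every threshold value are hypothetical placeholders, not recommended targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriterion:
    term: str                      # AI term the criterion is tied to
    statement: str                 # what must hold, in plain language
    metric: str                    # hypothetical metric name used to measure it
    threshold: float               # placeholder; agree on real values with your team
    higher_is_better: bool = True  # False for metrics such as latency

    def passes(self, measured: float) -> bool:
        """Compare a measured value against the agreed threshold."""
        if self.higher_is_better:
            return measured >= self.threshold
        return measured <= self.threshold


# Placeholder criteria mirroring the four requirement terms above.
CRITERIA = [
    AcceptanceCriterion("grounding", "Summary relies only on provided evidence",
                        "supported_claim_ratio", 0.98),
    AcceptanceCriterion("format reliability", "Output matches schema",
                        "schema_valid_ratio", 1.0),
    AcceptanceCriterion("robustness", "Paraphrased input preserves key meaning",
                        "paraphrase_agreement", 0.9),
    AcceptanceCriterion("latency", "Response under agreed threshold",
                        "p95_latency_seconds", 3.0, higher_is_better=False),
]
```

Writing the criterion down with its metric and direction forces the team to decide what "accurate" or "fast" means before any test is designed.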
2. Prompt Design Stage: Terms Become Constraints
Key terms and how they apply:
| Term | Prompt-design usage |
|---|---|
| Context | Include only relevant facts; avoid noise |
| Token budget | Keep instructions concise to preserve room for retrieved evidence |
| Role | Separate developer constraints from user request |
| Output format | Ask for schema/table/JSON explicitly |
| Guardrails | State forbidden behavior clearly |
Example prompt frame:

    Role: You are a QA assistant.
    Task: Generate boundary-focused API test cases.
    Context: Include auth rules and retry policy.
    Constraints: No invented endpoints. Use only provided fields.
    Output: JSON array with id, scenario, type, expected_result.

3. Test Design Stage: Build Term-Aligned Coverage
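A natural first term-aligned test is a format check on the output requested by the prompt frame above. The sketch below assumes the model returns its answer as a raw string; the function name `format_errors` is hypothetical, and the required fields are taken from the frame's Output line.

```python
import json

# Field names come from the prompt frame's "Output" line.
REQUIRED_FIELDS = {"id", "scenario", "type", "expected_result"}


def format_errors(raw_output: str) -> list[str]:
    """Return a list of format problems; an empty list means the output passed."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return ["top-level value is not a JSON array"]

    errors = []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            errors.append(f"item {index} is not an object")
            continue
        missing = REQUIRED_FIELDS - set(item)
        extra = set(item) - REQUIRED_FIELDS
        if missing:
            errors.append(f"item {index} missing fields: {sorted(missing)}")
        if extra:
            # Unexpected fields may indicate invented content.
            errors.append(f"item {index} has unexpected fields: {sorted(extra)}")
    return errors
```

Returning a list of problems instead of a single boolean gives the defect report concrete evidence to quote.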
A useful AI test matrix should include:
| Test type | Linked terminology |
|---|---|
| Long-input truncation test | token budget, context window |
| Fact-consistency test | grounding, hallucination |
| Prompt variation test | robustness, stability |
| Safety test | guardrails, alignment |
| Retrieval quality test | embedding, vector search, RAG |
This helps teams avoid generic "AI test" buckets and create measurable checks.
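As an illustration of the first row, a long-input truncation test can check whether a critical passage still fits inside the budget. This is a rough sketch under two loud assumptions: whitespace-separated words stand in for tokens (real tokenizers count differently), and the system truncates by keeping only the start of the input. All function names are hypothetical.

```python
def approx_token_count(text: str) -> int:
    # Assumption: one whitespace-separated word is treated as one token.
    return len(text.split())


def truncate_to_budget(text: str, budget: int) -> str:
    """Keep the first `budget` approximate tokens, mimicking head-only truncation."""
    return " ".join(text.split()[:budget])


def survives_truncation(document: str, must_keep: str, budget: int) -> bool:
    """True if the critical passage is still present after truncation."""
    return must_keep in truncate_to_budget(document, budget)


# Usage: a critical acceptance rule placed after 50 words of filler.
critical = "ACCEPTANCE: retry max 3 times"
document = "filler " * 50 + critical
print(approx_token_count(document))
print(survives_truncation(document, critical, budget=100))
print(survives_truncation(document, critical, budget=20))
```

In a real suite, swap the word count for the tokenizer your model actually uses and build the payload from realistic tickets or logs.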
4. Defect Triage Stage: Use Exact Language
Weak defect title:
- "AI gave wrong answer"
Strong defect title:
- "Hallucination: model generated nonexistent API field outside provided schema"
Weak root-cause note:
- "Model confused"
Strong root-cause note:
- "Context window exceeded; relevant acceptance criteria truncated before inference"
Precise terms accelerate triage and make fixes testable.
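Teams that file defects through a script or template can make the failure-mode term a required field. The sketch below is one possible shape; the `FailureMode` members and the `AIDefect` class are hypothetical and should be replaced with the taxonomy your team agrees on.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    # Hypothetical taxonomy; extend it with the terms your team uses.
    HALLUCINATION = "Hallucination"
    CONTEXT_OVERFLOW = "Context window exceeded"
    FORMAT_VIOLATION = "Format violation"
    GUARDRAIL_BYPASS = "Guardrail bypass"


@dataclass
class AIDefect:
    mode: FailureMode   # forces the reporter to pick a precise term
    observation: str    # what was observed, stated concretely
    evidence: str       # prompt, output excerpt, or log reference

    def title(self) -> str:
        """Build a defect title that leads with the failure mode."""
        return f"{self.mode.value}: {self.observation}"


defect = AIDefect(
    mode=FailureMode.HALLUCINATION,
    observation="model generated nonexistent API field outside provided schema",
    evidence="(attach prompt and output excerpt here)",
)
print(defect.title())
```

Because the mode is an enum, a title like "AI gave wrong answer" cannot be filed without first choosing a category.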
5. Evaluation Stage: Multi-Dimensional Quality
AI output quality is not one score. A useful rubric tracks several dimensions:
| Dimension | Term linkage |
|---|---|
| Correctness | grounding, hallucination |
| Completeness | context coverage |
| Consistency | robustness |
| Safety | alignment, guardrails |
| Efficiency | latency, token usage |
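A short sketch shows why a single averaged score can hide a failing dimension. The scores and minimums below are invented for illustration, and `failing_dimensions` is a hypothetical helper name.

```python
DIMENSIONS = ("correctness", "completeness", "consistency", "safety", "efficiency")


def failing_dimensions(scores: dict[str, float], minimums: dict[str, float]) -> list[str]:
    """Return the dimensions whose score is missing or below its minimum."""
    return [d for d in DIMENSIONS if scores.get(d, 0.0) < minimums[d]]


# Invented example values: a strong average with one weak dimension.
scores = {
    "correctness": 0.95,
    "completeness": 0.90,
    "consistency": 0.92,
    "safety": 0.60,
    "efficiency": 0.90,
}
minimums = {d: 0.80 for d in DIMENSIONS}

average = sum(scores.values()) / len(scores)
print(f"average score: {average:.3f}")
print("failing dimensions:", failing_dimensions(scores, minimums))
```

The average here sits above the 0.80 minimum even though safety is well below it, which is the case a per-dimension rubric is meant to catch.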
6. Release Gate Stage: Terminology-Based Checklist
Before release, ask:
- Do we have evidence for hallucination rate on our core tasks?
- Do we test context-limit behavior with realistic payloads?
- Are both safety refusals and over-refusals (refusals of benign requests) measured?
- Are RAG citations validated against source documents?
- Do prompts and templates have version tracking?
These are vocabulary-backed controls, not informal opinions.
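Some of these questions can be answered with numbers from a labeled evaluation run. The sketch below assumes each result has already been labeled by a reviewer; the `EvalResult` fields and the metric names are hypothetical, and the hallucination rate is computed only over answers the model actually gave.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    is_harmful_prompt: bool  # label assigned by the test author
    refused: bool            # did the model refuse to answer?
    hallucinated: bool       # reviewer found an unsupported claim in the answer


def rate(flags: list[bool]) -> float:
    """Fraction of True values; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


def gate_metrics(results: list[EvalResult]) -> dict[str, float]:
    harmful = [r for r in results if r.is_harmful_prompt]
    benign = [r for r in results if not r.is_harmful_prompt]
    answered = [r for r in results if not r.refused]
    return {
        "hallucination_rate": rate([r.hallucinated for r in answered]),
        "refusal_rate_on_harmful": rate([r.refused for r in harmful]),
        "over_refusal_rate_on_benign": rate([r.refused for r in benign]),
    }
```

Measuring refusals on harmful and benign prompts separately keeps a model that refuses everything from looking safe.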
7. Agent Terms: Operational Caution for Now
You may hear "let's make it agentic" before your team is ready.
For now, use cautious language:
- We currently ship an assistant workflow, not a full autonomous agent.
- Tool-calling behavior needs separate test coverage.
- Memory persistence requires privacy and data-retention checks.
Deep agent design will come in later levels.
QA/SDET Relevance
Manual QA impact:
- better exploratory prompts
- clearer evidence-based defects
- improved risk communication with product and leadership
Automation/SDET impact:
- cleaner test taxonomy
- easier CI integration for AI checks
- better observability metrics tied to known failure modes
Practical Work
Exercise: Convert a Generic QA Plan into an AI-Specific Plan
Objective: Upgrade one existing QA plan using precise AI terminology.
- Take a current feature that uses AI output.
- Identify vague terms: smart, accurate, stable, safe.
- Rewrite them using precise terms from this module.
- Add at least 8 term-linked tests.
- Define pass/fail thresholds for release.
Template:
| Old statement | Revised statement |
|---|---|
| AI should be accurate | Grounded summary must contain only facts present in source ticket and log excerpt |
| Output should be stable | Across 5 paraphrases, key entity extraction F1 must remain above agreed threshold |
| AI should be safe | Prompt-injection attempts must not expose hidden system instructions |
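The stability row of the template can be made concrete with a small F1 calculation. This is a minimal sketch that assumes entities are compared as exact-match strings and that the weakest paraphrase decides the outcome; the function names, sample entities, and thresholds are hypothetical.

```python
def entity_f1(predicted: set[str], expected: set[str]) -> float:
    """F1 between predicted and expected entity sets (exact string match)."""
    if not predicted and not expected:
        return 1.0
    true_positives = len(predicted & expected)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


def stable_across_paraphrases(
    predictions: list[set[str]], expected: set[str], threshold: float
) -> bool:
    """Pass only if every paraphrase's F1 meets the threshold."""
    f1_scores = [entity_f1(p, expected) for p in predictions]
    return bool(f1_scores) and min(f1_scores) >= threshold


# Usage with invented entities: one paraphrase misses "eu-west".
expected = {"order-123", "timeout", "eu-west"}
predictions = [
    {"order-123", "timeout", "eu-west"},
    {"order-123", "timeout"},
]
print(stable_across_paraphrases(predictions, expected, threshold=0.75))
print(stable_across_paraphrases(predictions, expected, threshold=0.90))
```

Using the minimum rather than the mean is a design choice: it treats one badly handled paraphrase as a robustness failure instead of averaging it away.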
Reflection:
- Which revised term changed your test strategy the most?
- Which terms are now mandatory in defect reports?
- What should be automated first, before the Prompt Engineering level starts?
Key Takeaways
- Terminology becomes valuable only when it changes workflow behavior.
- Better terms produce better requirements, tests, and triage outcomes.
- AI quality requires multi-dimensional evaluation, not single-score thinking.
- Precise vocabulary reduces confusion and release risk.
- You are now ready to enter Prompt Engineering with a shared language baseline.
Next Step
Proceed to Level 5, Prompt Engineering Fundamentals, where you will design prompts systematically using the terminology and workflow controls from this module. If that next level is still being authored, pause here and make sure your team can already use these terms correctly in requirements, tests, and defect reports.