diff --git a/docs/explanation/features/tea-overview.md b/docs/explanation/features/tea-overview.md
index cacd25e3..6e9a7127 100644
--- a/docs/explanation/features/tea-overview.md
+++ b/docs/explanation/features/tea-overview.md
@@ -6,6 +6,10 @@ description: Understanding the Test Architect (TEA) agent and its role in BMad M
 The Test Architect (TEA) is a specialized agent focused on quality strategy, test automation, and release gates in BMad Method projects.
 
+:::tip[Design Philosophy]
+TEA was built to solve the problem of AI-generated tests that rot in review. For the problem statement and design principles, see [Testing as Engineering](/docs/explanation/philosophy/testing-as-engineering.md). For setup, see [Setup Test Framework](/docs/how-to/workflows/setup-test-framework.md).
+:::
+
 ## Overview
 
 - **Persona:** Murat, Master Test Architect and Quality Advisor focused on risk-based testing, fixture architecture, ATDD, and CI/CD governance.

diff --git a/docs/explanation/philosophy/testing-as-engineering.md b/docs/explanation/philosophy/testing-as-engineering.md
new file mode 100644
index 00000000..bb270ad6
--- /dev/null
+++ b/docs/explanation/philosophy/testing-as-engineering.md
@@ -0,0 +1,119 @@
---
title: "AI-Generated Testing: Why Most Approaches Fail"
description: How Playwright-Utils, TEA workflows, and Playwright MCPs solve AI test quality problems
---

AI-generated tests frequently fail in production because they lack systematic quality standards. This document explains the problem and presents a solution combining three components: Playwright-Utils, TEA (Test Architect), and Playwright MCPs.

:::note[Source]
This article is adapted from [The Testing Meta Most Teams Have Not Caught Up To Yet](https://dev.to/muratkeremozcan/the-testing-meta-most-teams-have-not-caught-up-to-yet-5765) by Murat K Ozcan.
:::

## The Problem with AI-Generated Tests

When teams use AI to generate tests without structure, they often produce what can be called "slop factory" outputs:

| Issue | Description |
|-------|-------------|
| Redundant coverage | Multiple tests covering the same functionality |
| Incorrect assertions | Tests that pass but don't actually verify behavior |
| Flaky tests | Non-deterministic tests that randomly pass or fail |
| Unreviewable diffs | Generated code too verbose or inconsistent to review |

The core problem is that prompt-driven testing paths lean into nondeterminism, which is the exact opposite of the determinism that testing exists to protect.

:::caution[The Paradox]
AI excels at generating code quickly, but testing requires precision and consistency. Without guardrails, AI-generated tests amplify the chaos they're meant to prevent.
:::

## The Solution: A Three-Part Stack

The solution combines three components that work together to enforce quality:

### Playwright-Utils

Playwright-Utils bridges the gap between Cypress ergonomics and Playwright's capabilities by standardizing commonly reinvented primitives as utility functions.

| Utility | Purpose |
|---------|---------|
| api-request | API calls with schema validation |
| auth-session | Authentication handling |
| intercept-network-call | Network mocking and interception |
| recurse | Retry logic and polling |
| log | Structured logging |
| network-recorder | Record and replay network traffic |
| burn-in | Smart test selection for CI |
| network-error-monitor | HTTP error detection |
| file-utils | CSV/PDF handling |

These utilities eliminate the need to reinvent authentication, API calls, retries, and logging for every project.
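As a concrete illustration, here is a minimal sketch of the retry-and-poll pattern that a utility like `recurse` standardizes. The `pollUntil` helper, its options, and the `/api/orders/42` endpoint are assumptions made for this sketch, not the actual `playwright-utils` API; consult the repository for the real signatures.

```typescript
import { expect, test } from '@playwright/test';

// A hand-rolled polling helper in the spirit of `recurse`: retry an async
// check until a predicate passes or the attempt budget runs out.
// Names and options are illustrative, not the library's actual API.
async function pollUntil<T>(
  fn: () => Promise<T>,
  predicate: (value: T) => boolean,
  { attempts = 10, delayMs = 500 } = {},
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    const value = await fn();
    if (predicate(value)) return value;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Condition not met after ${attempts} attempts`);
}

// Hypothetical endpoint; assumes `baseURL` is set in playwright.config.
test('order eventually ships', async ({ request }) => {
  const order = await pollUntil(
    async () => (await request.get('/api/orders/42')).json(),
    (body) => body.status === 'SHIPPED',
  );
  expect(order.status).toBe('SHIPPED');
});
```

Centralizing this logic in one audited helper is what keeps generated tests from each reinventing their own, often flaky, wait loops.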
### TEA (Test Architect Agent)

TEA is a quality operating model packaged as eight executable workflows spanning test design, CI/CD gates, and release readiness. It encodes test architecture expertise into repeatable processes.

| Workflow | Purpose |
|----------|---------|
| `*test-design` | Risk-based test planning per epic |
| `*framework` | Scaffold production-ready test infrastructure |
| `*ci` | CI pipeline with selective testing |
| `*atdd` | Acceptance test-driven development |
| `*automate` | Prioritized test automation |
| `*test-review` | Test quality audits (0-100 score) |
| `*nfr-assess` | Non-functional requirements assessment |
| `*trace` | Coverage traceability and gate decisions |

:::tip[Key Insight]
TEA doesn't just generate tests—it provides a complete quality operating model with workflows for planning, execution, and release gates.
:::

### Playwright MCPs

Model Context Protocol (MCP) servers enable real-time verification during test generation. Instead of inferring selectors and behavior from documentation, MCPs allow agents to:

- Run flows and confirm the DOM against the accessibility tree
- Validate network responses in real time
- Discover actual functionality through interactive exploration
- Verify generated tests against live applications (sketched below)
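To make this concrete, the sketch below shows the shape of test that MCP-backed generation should converge on: selectors taken from the accessibility tree observed in a live run, and the network response validated rather than assumed. The route, field names, status code, and success message are hypothetical.

```typescript
import { expect, test } from '@playwright/test';

// A sketch of the kind of test MCP-backed generation should produce:
// role-based locators mirror the accessibility tree an MCP session exposes,
// and behavior is verified against the actual network response.
test('submitting the form persists the record', async ({ page }) => {
  await page.goto('/records/new');

  // Locate fields by role and accessible name, not brittle CSS selectors.
  await page.getByRole('textbox', { name: 'Title' }).fill('Quarterly report');

  // Capture the response so the assertion checks behavior, not assumptions.
  const responsePromise = page.waitForResponse(
    (res) => res.url().includes('/api/records') && res.request().method() === 'POST',
  );
  await page.getByRole('button', { name: 'Save' }).click();

  const response = await responsePromise;
  expect(response.status()).toBe(201);
  await expect(page.getByRole('status')).toHaveText('Record saved');
});
```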
## How They Work Together

The three components form a quality pipeline:

| Stage | Component | Action |
|-------|-----------|--------|
| Standards | Playwright-Utils | Provides production-ready patterns and utilities |
| Process | TEA Workflows | Enforces systematic test planning and review |
| Verification | Playwright MCPs | Validates generated tests against live applications |

**Before (AI-only):** 20 tests with redundant coverage, incorrect assertions, and flaky behavior.

**After (Full Stack):** Risk-based selection, verified selectors, validated behavior, reviewable code.

## Why This Matters

Traditional AI testing approaches fail because they:

- **Lack quality standards** — No consistent patterns or utilities
- **Skip planning** — Jump straight to test generation without risk assessment
- **Can't verify** — Generate tests without validating against actual behavior
- **Don't review** — No systematic audit of generated test quality

The three-part stack addresses each gap:

| Gap | Solution |
|-----|----------|
| No standards | Playwright-Utils provides production-ready patterns |
| No planning | TEA `*test-design` workflow creates risk-based test plans |
| No verification | Playwright MCPs validate against live applications |
| No review | TEA `*test-review` audits quality with scoring |

This approach is sometimes called *context engineering*—loading domain-specific standards into AI context automatically rather than relying on prompts alone. TEA's `tea-index.csv` manifest loads relevant knowledge fragments so the AI doesn't relearn testing patterns each session.

## Related

- [TEA Overview](/docs/explanation/features/tea-overview.md) — Workflow details and cheat sheets
- [Setup Test Framework](/docs/how-to/workflows/setup-test-framework.md) — Implementation guide
- [The Testing Meta Most Teams Have Not Caught Up To Yet](https://dev.to/muratkeremozcan/the-testing-meta-most-teams-have-not-caught-up-to-yet-5765) — Original article by Murat K Ozcan
- [Playwright-Utils Repository](https://github.com/seontechnologies/playwright-utils) — Source and documentation

diff --git a/docs/reference/glossary/index.md b/docs/reference/glossary/index.md
index 9a42ce39..cd998551 100644
--- a/docs/reference/glossary/index.md
+++ b/docs/reference/glossary/index.md
@@ -363,6 +363,10 @@ Implementation technique for brownfield projects that allows gradual rollout of
 Specific locations where new code connects with existing systems. Must be documented explicitly in brownfield tech-specs and architectures.
 
+### Context Engineering
+
+Loading domain-specific standards and patterns into AI context automatically, rather than relying on prompts alone. In TEA, this means the `tea-index.csv` manifest loads relevant knowledge fragments so the AI doesn't relearn testing patterns each session. This approach ensures consistent, production-ready outputs regardless of prompt variation.
+
 ### Convention Detection
 
 Quick Spec Flow feature that automatically detects existing code style, naming conventions, patterns, and frameworks from brownfield codebases, then asks user to confirm before proceeding.