From 48881f86a6e862024ad513e726ab6c5987f4b5cb Mon Sep 17 00:00:00 2001 From: Murat K Ozcan <34237651+muratkeremozcan@users.noreply.github.com> Date: Fri, 23 Jan 2026 13:00:48 -0600 Subject: [PATCH] doc: test design refinements (#1382) --- .../testarch/test-design/checklist.md | 204 ++++++--- .../testarch/test-design/instructions.md | 359 +++++++++++++-- .../test-design-architecture-template.md | 71 ++- .../test-design/test-design-qa-template.md | 420 ++++++++---------- 4 files changed, 687 insertions(+), 367 deletions(-) diff --git a/src/bmm/workflows/testarch/test-design/checklist.md b/src/bmm/workflows/testarch/test-design/checklist.md index 3dadfbbb..8ed106ec 100644 --- a/src/bmm/workflows/testarch/test-design/checklist.md +++ b/src/bmm/workflows/testarch/test-design/checklist.md @@ -80,23 +80,29 @@ - [ ] Owners assigned where applicable - [ ] No duplicate coverage (same behavior at multiple levels) -### Execution Order +### Execution Strategy -- [ ] Smoke tests defined (<5 min target) -- [ ] P0 tests listed (<10 min target) -- [ ] P1 tests listed (<30 min target) -- [ ] P2/P3 tests listed (<60 min target) -- [ ] Order optimizes for fast feedback +**CRITICAL: Keep execution strategy simple, avoid redundancy** + +- [ ] **Simple structure**: PR / Nightly / Weekly (NOT complex smoke/P0/P1/P2 tiers) +- [ ] **PR execution**: All functional tests unless significant infrastructure overhead +- [ ] **Nightly/Weekly**: Only performance, chaos, long-running, manual tests +- [ ] **No redundancy**: Don't re-list all tests (already in coverage plan) +- [ ] **Philosophy stated**: "Run everything in PRs if <15 min, defer only if expensive/long" +- [ ] **Playwright parallelization noted**: 100s of tests in 10-15 min ### Resource Estimates -- [ ] P0 hours calculated (count × 2 hours) -- [ ] P1 hours calculated (count × 1 hour) -- [ ] P2 hours calculated (count × 0.5 hours) -- [ ] P3 hours calculated (count × 0.25 hours) -- [ ] Total hours summed -- [ ] Days estimate provided (hours / 8) -- [ ] Estimates include setup time +**CRITICAL: Use intervals/ranges, NOT exact numbers** + +- [ ] P0 effort provided as interval range (e.g., "~25-40 hours" NOT "36 hours") +- [ ] P1 effort provided as interval range (e.g., "~20-35 hours" NOT "27 hours") +- [ ] P2 effort provided as interval range (e.g., "~10-30 hours" NOT "15.5 hours") +- [ ] P3 effort provided as interval range (e.g., "~2-5 hours" NOT "2.5 hours") +- [ ] Total effort provided as interval range (e.g., "~55-110 hours" NOT "81 hours") +- [ ] Timeline provided as week range (e.g., "~1.5-3 weeks" NOT "11 days") +- [ ] Estimates include setup time and account for complexity variations +- [ ] **No false precision**: Avoid exact calculations like "18 tests × 2 hours = 36 hours" ### Quality Gate Criteria @@ -126,11 +132,16 @@ ### Priority Assignment Accuracy -- [ ] P0: Truly blocks core functionality -- [ ] P0: High-risk (score ≥6) -- [ ] P0: No workaround exists -- [ ] P1: Important but not blocking -- [ ] P2/P3: Nice-to-have or edge cases +**CRITICAL: Priority classification is separate from execution timing** + +- [ ] **Priority sections (P0/P1/P2/P3) do NOT include execution context** (e.g., no "Run on every commit" in headers) +- [ ] **Priority sections have only "Criteria" and "Purpose"** (no "Execution:" field) +- [ ] **Execution Strategy section** is separate and handles timing based on infrastructure overhead +- [ ] P0: Truly blocks core functionality + High-risk (≥6) + No workaround +- [ ] P1: Important features + Medium-risk (3-4) + Common workflows 
+- [ ] P2: Secondary features + Low-risk (1-2) + Edge cases +- [ ] P3: Nice-to-have + Exploratory + Benchmarks +- [ ] **Note at top of Test Coverage Plan**: Clarifies P0/P1/P2/P3 = priority/risk, NOT execution timing ### Test Level Selection @@ -176,58 +187,90 @@ - [ ] 🚨 BLOCKERS - Team Must Decide (Sprint 0 critical path items) - [ ] ⚠️ HIGH PRIORITY - Team Should Validate (recommendations for approval) - [ ] 📋 INFO ONLY - Solutions Provided (no decisions needed) -- [ ] **Risk Assessment** section +- [ ] **Risk Assessment** section - **ACTIONABLE** - [ ] Total risks identified count - [ ] High-priority risks table (score ≥6) with all columns: Risk ID, Category, Description, Probability, Impact, Score, Mitigation, Owner, Timeline - [ ] Medium and low-priority risks tables - [ ] Risk category legend included -- [ ] **Testability Concerns** section (if system has architectural constraints) - - [ ] Blockers to fast feedback table - - [ ] Explanation of why standard CI/CD may not apply (if applicable) - - [ ] Tiered testing strategy table (if forced by architecture) - - [ ] Architectural improvements needed (or acknowledgment system supports testing well) +- [ ] **Testability Concerns and Architectural Gaps** section - **ACTIONABLE** + - [ ] **Sub-section: 🚨 ACTIONABLE CONCERNS** at TOP + - [ ] Blockers to Fast Feedback table (WHAT architecture must provide) + - [ ] Architectural Improvements Needed (WHAT must be changed) + - [ ] Each concern has: Owner, Timeline, Impact + - [ ] **Sub-section: Testability Assessment Summary** at BOTTOM (FYI) + - [ ] What Works Well (passing items) + - [ ] Accepted Trade-offs (no action required) + - [ ] This section only included if worth mentioning; otherwise omitted - [ ] **Risk Mitigation Plans** for all high-priority risks (≥6) - [ ] Each plan has: Strategy (numbered steps), Owner, Timeline, Status, Verification + - [ ] **Only Backend/DevOps/Arch/Security mitigations** (production code changes) + - [ ] QA-owned mitigations belong in QA doc instead - [ ] **Assumptions and Dependencies** section + - [ ] **Architectural assumptions only** (SLO targets, replication lag, system design) - [ ] Assumptions list (numbered) - [ ] Dependencies list with required dates - [ ] Risks to plan with impact and contingency + - [ ] QA execution assumptions belong in QA doc instead - [ ] **NO test implementation code** (long examples belong in QA doc) +- [ ] **NO test scripts** (no Playwright test(...) blocks, no assertions, no test setup code) +- [ ] **NO NFR test examples** (NFR sections describe WHAT to test, not HOW to test) - [ ] **NO test scenario checklists** (belong in QA doc) -- [ ] **Cross-references to QA doc** where appropriate +- [ ] **NO bloat or repetition** (consolidate repeated notes, avoid over-explanation) +- [ ] **Cross-references to QA doc** where appropriate (instead of duplication) +- [ ] **RECIPE SECTIONS NOT IN ARCHITECTURE DOC:** + - [ ] NO "Test Levels Strategy" section (unit/integration/E2E split belongs in QA doc only) + - [ ] NO "NFR Testing Approach" section with detailed test procedures (belongs in QA doc only) + - [ ] NO "Test Environment Requirements" section (belongs in QA doc only) + - [ ] NO "Recommendations for Sprint 0" section with test framework setup (belongs in QA doc only) + - [ ] NO "Quality Gate Criteria" section (pass rates, coverage targets belong in QA doc only) + - [ ] NO "Tool Selection" section (Playwright, k6, etc. 
belongs in QA doc only) ### test-design-qa.md -- [ ] **Purpose statement** at top (execution recipe for QA team) -- [ ] **Quick Reference for QA** section - - [ ] Before You Start checklist - - [ ] Test Execution Order - - [ ] Need Help? guidance -- [ ] **System Architecture Summary** (brief overview of services and data flow) -- [ ] **Test Environment Requirements** in early section (section 1-3, NOT buried at end) - - [ ] Table with Local/Dev/Staging environments - - [ ] Key principles listed (shared DB, randomization, parallel-safe, self-cleaning, shift-left) - - [ ] Code example provided -- [ ] **Testability Assessment** with prerequisites checklist - - [ ] References Architecture doc blockers (not duplication) -- [ ] **Test Levels Strategy** with unit/integration/E2E split - - [ ] System type identified - - [ ] Recommended split percentages with rationale - - [ ] Test count summary (P0/P1/P2/P3 totals) +**NEW STRUCTURE (streamlined from 375 to ~287 lines):** + +- [ ] **Purpose statement** at top (test execution recipe) +- [ ] **Executive Summary** with risk summary and coverage summary +- [ ] **Dependencies & Test Blockers** section in POSITION 2 (right after Executive Summary) + - [ ] Backend/Architecture dependencies listed (what QA needs from other teams) + - [ ] QA infrastructure setup listed (factories, fixtures, environments) + - [ ] Code example with playwright-utils if config.tea_use_playwright_utils is true + - [ ] Test from '@seontechnologies/playwright-utils/api-request/fixtures' + - [ ] Expect from '@playwright/test' (playwright-utils does not re-export expect) + - [ ] Code examples include assertions (no unused imports) +- [ ] **Risk Assessment** section (brief, references Architecture doc) + - [ ] High-priority risks table + - [ ] Medium/low-priority risks table + - [ ] Each risk shows "QA Test Coverage" column (how QA validates) - [ ] **Test Coverage Plan** with P0/P1/P2/P3 sections - - [ ] Each priority has: Execution details, Purpose, Criteria, Test Count - - [ ] Detailed test scenarios WITH CHECKBOXES - - [ ] Coverage table with columns: Requirement | Test Level | Risk Link | Test Count | Owner | Notes -- [ ] **Sprint 0 Setup Requirements** - - [ ] Architecture/Backend blockers listed with cross-references to Architecture doc - - [ ] QA Test Infrastructure section (factories, fixtures) - - [ ] Test Environments section (Local, CI/CD, Staging, Production) - - [ ] Sprint 0 NFR Gates checklist - - [ ] Sprint 1 Items clearly separated -- [ ] **NFR Readiness Summary** (reference to Architecture doc, not duplication) - - [ ] Table with NFR categories, status, evidence, blocker, next action -- [ ] **Cross-references to Architecture doc** (not duplication) -- [ ] **NO architectural theory** (just reference Architecture doc) + - [ ] Priority sections have ONLY "Criteria" (no execution context) + - [ ] Note at top: "P0/P1/P2/P3 = priority, NOT execution timing" + - [ ] Test tables with columns: Test ID | Requirement | Test Level | Risk Link | Notes +- [ ] **Execution Strategy** section (organized by TOOL TYPE) + - [ ] Every PR: Playwright tests (~10-15 min) + - [ ] Nightly: k6 performance tests (~30-60 min) + - [ ] Weekly: Chaos & long-running (~hours) + - [ ] Philosophy: "Run everything in PRs unless expensive/long-running" +- [ ] **QA Effort Estimate** section (QA effort ONLY) + - [ ] Interval-based estimates (e.g., "~1-2 weeks" NOT "36 hours") + - [ ] NO DevOps, Backend, Data Eng, Finance effort + - [ ] NO Sprint breakdowns (too prescriptive) +- [ ] **Appendix A: Code 
Examples & Tagging** +- [ ] **Appendix B: Knowledge Base References** + +**REMOVED SECTIONS (bloat):** +- [ ] ❌ NO Quick Reference section (bloat) +- [ ] ❌ NO System Architecture Summary (bloat) +- [ ] ❌ NO Test Environment Requirements as separate section (integrated into Dependencies) +- [ ] ❌ NO Testability Assessment section (bloat - covered in Dependencies) +- [ ] ❌ NO Test Levels Strategy section (bloat - obvious from test scenarios) +- [ ] ❌ NO NFR Readiness Summary (bloat) +- [ ] ❌ NO Quality Gate Criteria section (teams decide for themselves) +- [ ] ❌ NO Follow-on Workflows section (bloat - BMAD commands self-explanatory) +- [ ] ❌ NO Approval section (unnecessary formality) +- [ ] ❌ NO Infrastructure/DevOps/Finance effort tables (out of scope) +- [ ] ❌ NO Sprint 0/1/2/3 breakdown tables (too prescriptive) +- [ ] ❌ NO Next Steps section (bloat) ### Cross-Document Consistency @@ -238,6 +281,40 @@ - [ ] Dates and authors match across documents - [ ] ADR and PRD references consistent +### Document Quality (Anti-Bloat Check) + +**CRITICAL: Check for bloat and repetition across BOTH documents** + +- [ ] **No repeated notes 10+ times** (e.g., "Timing is pessimistic until R-005 fixed" on every section) +- [ ] **Repeated information consolidated** (write once at top, reference briefly if needed) +- [ ] **No excessive detail** that doesn't add value (obvious concepts, redundant examples) +- [ ] **Focus on unique/critical info** (only document what's different from standard practice) +- [ ] **Architecture doc**: Concerns-focused, NOT implementation-focused +- [ ] **QA doc**: Implementation-focused, NOT theory-focused +- [ ] **Clear separation**: Architecture = WHAT and WHY, QA = HOW +- [ ] **Professional tone**: No AI slop markers + - [ ] Avoid excessive ✅/❌ emojis (use sparingly, only when adding clarity) + - [ ] Avoid "absolutely", "excellent", "fantastic", overly enthusiastic language + - [ ] Write professionally and directly +- [ ] **Architecture doc length**: Target ~150-200 lines max (focus on actionable concerns only) +- [ ] **QA doc length**: Keep concise, remove bloat sections + +### Architecture Doc Structure (Actionable-First Principle) + +**CRITICAL: Validate structure follows actionable-first, FYI-last principle** + +- [ ] **Actionable sections at TOP:** + - [ ] Quick Guide (🚨 BLOCKERS first, then ⚠️ HIGH PRIORITY, then 📋 INFO ONLY last) + - [ ] Risk Assessment (high-priority risks ≥6 at top) + - [ ] Testability Concerns (concerns/blockers at top, passing items at bottom) + - [ ] Risk Mitigation Plans (for high-priority risks ≥6) +- [ ] **FYI sections at BOTTOM:** + - [ ] Testability Assessment Summary (what works well - only if worth mentioning) + - [ ] Assumptions and Dependencies +- [ ] **ASRs categorized correctly:** + - [ ] Actionable ASRs included in 🚨 or ⚠️ sections + - [ ] FYI ASRs included in 📋 section or omitted if obvious + ## Completion Criteria **All must be true:** @@ -295,9 +372,20 @@ If workflow fails: - **Solution**: Use test pyramid - E2E for critical paths only -**Issue**: Resource estimates too high +**Issue**: Resource estimates too high or too precise -- **Solution**: Invest in fixtures/factories to reduce per-test setup time +- **Solution**: + - Invest in fixtures/factories to reduce per-test setup time + - Use interval ranges (e.g., "~55-110 hours") instead of exact numbers (e.g., "81 hours") + - Widen intervals if high uncertainty exists + +**Issue**: Execution order section too complex or redundant + +- **Solution**: + - Default: Run everything in PRs 
(<15 min with Playwright parallelization) + - Only defer to nightly/weekly if expensive (k6, chaos, 4+ hour tests) + - Don't create smoke/P0/P1/P2/P3 tier structure + - Don't re-list all tests (already in coverage plan) ### Best Practices @@ -305,7 +393,9 @@ If workflow fails: - High-priority risks (≥6) require immediate mitigation - P0 tests should cover <10% of total scenarios - Avoid testing same behavior at multiple levels -- Include smoke tests (P0 subset) for fast feedback +- **Use interval-based estimates** (e.g., "~25-40 hours") instead of exact numbers to avoid false precision and provide flexibility +- **Keep execution strategy simple**: Default to "run everything in PRs" (<15 min with Playwright), only defer if expensive/long-running +- **Avoid execution order redundancy**: Don't create complex tier structures or re-list tests --- diff --git a/src/bmm/workflows/testarch/test-design/instructions.md b/src/bmm/workflows/testarch/test-design/instructions.md index fbee3103..1eae05be 100644 --- a/src/bmm/workflows/testarch/test-design/instructions.md +++ b/src/bmm/workflows/testarch/test-design/instructions.md @@ -157,7 +157,13 @@ TEA test-design workflow supports TWO modes, detected automatically: 1. **Review Architecture for Testability** - Evaluate architecture against these criteria: + **STRUCTURE PRINCIPLE: CONCERNS FIRST, PASSING ITEMS LAST** + + Evaluate architecture against these criteria and structure output as: + 1. **Testability Concerns** (ACTIONABLE - what's broken/missing) + 2. **Testability Assessment Summary** (FYI - what works well) + + **Testability Criteria:** **Controllability:** - Can we control system state for testing? (API seeding, factories, database reset) @@ -174,8 +180,18 @@ TEA test-design workflow supports TWO modes, detected automatically: - Can we reproduce failures? (deterministic waits, HAR capture, seed data) - Are components loosely coupled? (mockable, testable boundaries) + **In Architecture Doc Output:** + - **Section A: Testability Concerns** (TOP) - List what's BROKEN or MISSING + - Example: "No API for test data seeding → Cannot parallelize tests" + - Example: "Hardcoded DB connection → Cannot test in CI" + - **Section B: Testability Assessment Summary** (BOTTOM) - List what PASSES + - Example: "✅ API-first design supports test isolation" + - Only include if worth mentioning; otherwise omit this section entirely + 2. **Identify Architecturally Significant Requirements (ASRs)** + **CRITICAL: ASRs must indicate if ACTIONABLE or FYI** + From PRD NFRs and architecture decisions, identify quality requirements that: - Drive architecture decisions (e.g., "Must handle 10K concurrent users" → caching architecture) - Pose testability challenges (e.g., "Sub-second response time" → performance test infrastructure) @@ -183,21 +199,60 @@ TEA test-design workflow supports TWO modes, detected automatically: Score each ASR using risk matrix (probability × impact). + **In Architecture Doc, categorize ASRs:** + - **ACTIONABLE ASRs** (require architecture changes): Include in "Quick Guide" 🚨 or ⚠️ sections + - **FYI ASRs** (already satisfied by architecture): Include in "Quick Guide" 📋 section OR omit if obvious + + **Example:** + - ASR-001 (Score 9): "Multi-region deployment requires region-specific test infrastructure" → **ACTIONABLE** (goes in 🚨 BLOCKERS) + - ASR-002 (Score 4): "OAuth 2.1 authentication already implemented in ADR-5" → **FYI** (goes in 📋 INFO ONLY or omit) + + **Structure Principle:** Actionable ASRs at TOP, FYI ASRs at BOTTOM (or omit) + 3. 
**Define Test Levels Strategy** + **IMPORTANT: This section goes in QA doc ONLY, NOT in Architecture doc** + Based on architecture (mobile, web, API, microservices, monolith): - Recommend unit/integration/E2E split (e.g., 70/20/10 for API-heavy, 40/30/30 for UI-heavy) - Identify test environment needs (local, staging, ephemeral, production-like) - Define testing approach per technology (Playwright for web, Maestro for mobile, k6 for performance) -4. **Assess NFR Testing Approach** + **In Architecture doc:** Only mention test level split if it's an ACTIONABLE concern + - Example: "API response time <100ms requires load testing infrastructure" (concern) + - DO NOT include full test level strategy table in Architecture doc - For each NFR category: - - **Security**: Auth/authz tests, OWASP validation, secret handling (Playwright E2E + security tools) - - **Performance**: Load/stress/spike testing with k6, SLO/SLA thresholds - - **Reliability**: Error handling, retries, circuit breakers, health checks (Playwright + API tests) +4. **Assess NFR Requirements (MINIMAL in Architecture Doc)** + + **CRITICAL: NFR testing approach is a RECIPE - belongs in QA doc ONLY** + + **In Architecture Doc:** + - Only mention NFRs if they create testability CONCERNS + - Focus on WHAT architecture must provide, not HOW to test + - Keep it brief - 1-2 sentences per NFR category at most + + **Example - Security NFR in Architecture doc (if there's a concern):** + ✅ CORRECT (concern-focused, brief, WHAT/WHY only): + - "System must prevent cross-customer data access (GDPR requirement). Requires test infrastructure for multi-tenant isolation in Sprint 0." + - "OAuth tokens must expire after 1 hour (ADR-5). Requires test harness for token expiration validation." + + ❌ INCORRECT (too detailed, belongs in QA doc): + - Full table of security test scenarios + - Test scripts with code examples + - Detailed test procedures + - Tool selection (e.g., "use Playwright E2E + OWASP ZAP") + - Specific test approaches (e.g., "Test approach: Playwright E2E for auth/authz") + + **In QA Doc (full NFR testing approach):** + - **Security**: Full test scenarios, tooling (Playwright + OWASP ZAP), test procedures + - **Performance**: Load/stress/spike test scenarios, k6 scripts, SLO thresholds + - **Reliability**: Error handling tests, retry logic validation, circuit breaker tests - **Maintainability**: Coverage targets, code quality gates, observability validation + **Rule of Thumb:** + - Architecture doc: "What NFRs exist and what concerns they create" (1-2 sentences) + - QA doc: "How to test those NFRs" (full sections with tables, code, procedures) + 5. **Flag Testability Concerns** Identify architecture decisions that harm testability: @@ -228,22 +283,54 @@ TEA test-design workflow supports TWO modes, detected automatically: **Standard Structures (REQUIRED):** **test-design-architecture.md sections (in this order):** + + **STRUCTURE PRINCIPLE: Actionable items FIRST, FYI items LAST** + 1. Executive Summary (scope, business context, architecture, risk summary) 2. Quick Guide (🚨 BLOCKERS / ⚠️ HIGH PRIORITY / 📋 INFO ONLY) - 3. Risk Assessment (high/medium/low-priority risks with scoring) - 4. Testability Concerns and Architectural Gaps (if system has constraints) - 5. Risk Mitigation Plans (detailed for high-priority risks ≥6) - 6. Assumptions and Dependencies + 3. Risk Assessment (high/medium/low-priority risks with scoring) - **ACTIONABLE** + 4. 
Testability Concerns and Architectural Gaps - **ACTIONABLE** (what arch team must do) + - Sub-section: Blockers to Fast Feedback (ACTIONABLE - concerns FIRST) + - Sub-section: Architectural Improvements Needed (ACTIONABLE) + - Sub-section: Testability Assessment Summary (FYI - passing items LAST, only if worth mentioning) + 5. Risk Mitigation Plans (detailed for high-priority risks ≥6) - **ACTIONABLE** + 6. Assumptions and Dependencies - **FYI** + + **SECTIONS THAT DO NOT BELONG IN ARCHITECTURE DOC:** + - ❌ Test Levels Strategy (unit/integration/E2E split) - This is a RECIPE, belongs in QA doc ONLY + - ❌ NFR Testing Approach with test examples - This is a RECIPE, belongs in QA doc ONLY + - ❌ Test Environment Requirements - This is a RECIPE, belongs in QA doc ONLY + - ❌ Recommendations for Sprint 0 (test framework setup, factories) - This is a RECIPE, belongs in QA doc ONLY + - ❌ Quality Gate Criteria (pass rates, coverage targets) - This is a RECIPE, belongs in QA doc ONLY + - ❌ Tool Selection (Playwright, k6, etc.) - This is a RECIPE, belongs in QA doc ONLY + + **WHAT BELONGS IN ARCHITECTURE DOC:** + - ✅ Testability CONCERNS (what makes it hard to test) + - ✅ Architecture GAPS (what's missing for testability) + - ✅ What architecture team must DO (blockers, improvements) + - ✅ Risks and mitigation plans + - ✅ ASRs (Architecturally Significant Requirements) - but clarify if FYI or actionable **test-design-qa.md sections (in this order):** - 1. Quick Reference for QA (Before You Start, Execution Order, Need Help) - 2. System Architecture Summary (brief overview) - 3. Test Environment Requirements (MOVE UP - section 3, NOT buried at end) - 4. Testability Assessment (lightweight prerequisites checklist) - 5. Test Levels Strategy (unit/integration/E2E split with rationale) - 6. Test Coverage Plan (P0/P1/P2/P3 with detailed scenarios + checkboxes) - 7. Sprint 0 Setup Requirements (blockers, infrastructure, environments) - 8. NFR Readiness Summary (reference to Architecture doc) + 1. Executive Summary (risk summary, coverage summary) + 2. **Dependencies & Test Blockers** (CRITICAL: RIGHT AFTER SUMMARY - what QA needs from other teams) + 3. Risk Assessment (scored risks with categories - reference Arch doc, don't duplicate) + 4. Test Coverage Plan (P0/P1/P2/P3 with detailed scenarios + checkboxes) + 5. **Execution Strategy** (SIMPLE: Organized by TOOL TYPE: PR (Playwright) / Nightly (k6) / Weekly (chaos/manual)) + 6. QA Effort Estimate (QA effort ONLY - no DevOps, Data Eng, Finance, Backend) + 7. 
Appendices (code examples with playwright-utils, tagging strategy, knowledge base refs) + + **SECTIONS TO EXCLUDE FROM QA DOC:** + - ❌ Quality Gate Criteria (pass/fail thresholds - teams decide for themselves) + - ❌ Follow-on Workflows (bloat - BMAD commands are self-explanatory) + - ❌ Approval section (unnecessary formality) + - ❌ Test Environment Requirements (remove as separate section - integrate into Dependencies if needed) + - ❌ NFR Readiness Summary (bloat - covered in Risk Assessment) + - ❌ Testability Assessment (bloat - covered in Dependencies) + - ❌ Test Levels Strategy (bloat - obvious from test scenarios) + - ❌ Sprint breakdowns (too prescriptive) + - ❌ Infrastructure/DevOps/Data Eng effort tables (out of scope) + - ❌ Mitigation plans for non-QA work (belongs in Arch doc) **Content Guidelines:** @@ -252,26 +339,46 @@ TEA test-design workflow supports TWO modes, detected automatically: - ✅ Clear ownership (each blocker/ASR has owner + timeline) - ✅ Testability requirements (what architecture must support) - ✅ Mitigation plans (for each high-risk item ≥6) - - ✅ Short code examples (5-10 lines max showing what to support) + - ✅ Brief conceptual examples ONLY if needed to clarify architecture concerns (5-10 lines max) + - ✅ **Target length**: ~150-200 lines max (focus on actionable concerns only) + - ✅ **Professional tone**: Avoid AI slop (excessive ✅/❌ emojis, "absolutely", "excellent", overly enthusiastic language) - **Architecture doc (DON'T):** - - ❌ NO long test code examples (belongs in QA doc) - - ❌ NO test scenario checklists (belongs in QA doc) - - ❌ NO implementation details (how QA will test) + **Architecture doc (DON'T) - CRITICAL:** + - ❌ NO test scripts or test implementation code AT ALL - This is a communication doc for architects, not a testing guide + - ❌ NO Playwright test examples (e.g., test('...', async ({ request }) => ...)) + - ❌ NO assertion logic (e.g., expect(...).toBe(...)) + - ❌ NO test scenario checklists with checkboxes (belongs in QA doc) + - ❌ NO implementation details about HOW QA will test + - ❌ Focus on CONCERNS, not IMPLEMENTATION **QA doc (DO):** - ✅ Test scenario recipes (clear P0/P1/P2/P3 with checkboxes) - - ✅ Environment setup (Sprint 0 checklist with blockers) - - ✅ Tool setup (factories, fixtures, frameworks) + - ✅ Full test implementation code samples when helpful + - ✅ **IMPORTANT: If config.tea_use_playwright_utils is true, ALL code samples MUST use @seontechnologies/playwright-utils fixtures and utilities** + - ✅ Import test fixtures from '@seontechnologies/playwright-utils/api-request/fixtures' + - ✅ Import expect from '@playwright/test' (playwright-utils does not re-export expect) + - ✅ Use apiRequest fixture with schema validation, retry logic, and structured responses + - ✅ Dependencies & Test Blockers section RIGHT AFTER Executive Summary (what QA needs from other teams) + - ✅ **QA effort estimates ONLY** (no DevOps, Data Eng, Finance, Backend effort - out of scope) - ✅ Cross-references to Architecture doc (not duplication) + - ✅ **Professional tone**: Avoid AI slop (excessive ✅/❌ emojis, "absolutely", "excellent", overly enthusiastic language) **QA doc (DON'T):** - ❌ NO architectural theory (just reference Architecture doc) - ❌ NO ASR explanations (link to Architecture doc instead) - ❌ NO duplicate risk assessments (reference Architecture doc) + - ❌ NO Quality Gate Criteria section (teams decide pass/fail thresholds for themselves) + - ❌ NO Follow-on Workflows section (bloat - BMAD commands are self-explanatory) + - ❌ NO 
Approval section (unnecessary formality) + - ❌ NO effort estimates for other teams (DevOps, Backend, Data Eng, Finance - out of scope, QA effort only) + - ❌ NO Sprint breakdowns (too prescriptive - e.g., "Sprint 0: 40 hours, Sprint 1: 48 hours") + - ❌ NO mitigation plans for Backend/Arch/DevOps work (those belong in Architecture doc) + - ❌ NO architectural assumptions or debates (those belong in Architecture doc) **Anti-Patterns to Avoid (Cross-Document Redundancy):** + **CRITICAL: NO BLOAT, NO REPETITION, NO OVERINFO** + ❌ **DON'T duplicate OAuth requirements:** - Architecture doc: Explain OAuth 2.1 flow in detail - QA doc: Re-explain why OAuth 2.1 is required @@ -280,6 +387,24 @@ TEA test-design workflow supports TWO modes, detected automatically: - Architecture doc: "ASR-1: OAuth 2.1 required (see QA doc for 12 test scenarios)" - QA doc: "OAuth tests: 12 P0 scenarios (see Architecture doc R-001 for risk details)" + ❌ **DON'T repeat the same note 10+ times:** + - Example: "Timing is pessimistic until R-005 is fixed" repeated on every P0, P1, P2 section + - This creates bloat and makes docs hard to read + + ✅ **DO consolidate repeated information:** + - Write once at the top: "**Note**: All timing estimates are pessimistic pending R-005 resolution" + - Reference briefly if needed: "(pessimistic timing)" + + ❌ **DON'T include excessive detail that doesn't add value:** + - Long explanations of obvious concepts + - Redundant examples showing the same pattern + - Over-documentation of standard practices + + ✅ **DO focus on what's unique or critical:** + - Document only what's different from standard practice + - Highlight critical decisions and risks + - Keep explanations concise and actionable + **Markdown Cross-Reference Syntax Examples:** ```markdown @@ -330,6 +455,24 @@ TEA test-design workflow supports TWO modes, detected automatically: - Cross-reference between docs (no duplication) - Validate against checklist.md (System-Level Mode section) +**Common Over-Engineering to Avoid:** + + **In QA Doc:** + 1. ❌ Quality gate thresholds ("P0 must be 100%, P1 ≥95%") - Let teams decide for themselves + 2. ❌ Effort estimates for other teams - QA doc should only estimate QA effort + 3. ❌ Sprint breakdowns ("Sprint 0: 40 hours, Sprint 1: 48 hours") - Too prescriptive + 4. ❌ Approval sections - Unnecessary formality + 5. ❌ Assumptions about architecture (SLO targets, replication lag) - These are architectural concerns, belong in Arch doc + 6. ❌ Mitigation plans for Backend/Arch/DevOps - Those belong in Arch doc + 7. ❌ Follow-on workflows section - Bloat, BMAD commands are self-explanatory + 8. ❌ NFR Readiness Summary - Bloat, covered in Risk Assessment + + **Test Coverage Numbers Reality Check:** + - With Playwright parallelization, running ALL Playwright tests is as fast as running just P0 + - Don't split Playwright tests by priority into different CI gates - it adds no value + - Tool type matters, not priority labels + - Defer based on infrastructure cost, not importance + **After System-Level Mode:** Workflow COMPLETE. System-level outputs (test-design-architecture.md + test-design-qa.md) are written in this step. Steps 2-4 are epic-level only - do NOT execute them in system-level mode. --- @@ -540,12 +683,51 @@ TEA test-design workflow supports TWO modes, detected automatically: 8. 
**Plan Mitigations** + **CRITICAL: Mitigation placement depends on WHO does the work** + For each high-priority risk: - Define mitigation strategy - Assign owner (dev, QA, ops) - Set timeline - Update residual risk expectation + **Mitigation Plan Placement:** + + **Architecture Doc:** + - Mitigations owned by Backend, DevOps, Architecture, Security, Data Eng + - Example: "Add authorization layer for customer-scoped access" (Backend work) + - Example: "Configure AWS Fault Injection Simulator" (DevOps work) + - Example: "Define CloudWatch log schema for backfill events" (Architecture work) + + **QA Doc:** + - Mitigations owned by QA (test development work) + - Example: "Create factories for test data with randomization" (QA work) + - Example: "Implement polling with retry for async validation" (QA test code) + - Brief reference to Architecture doc mitigations (don't duplicate) + + **Rule of Thumb:** + - If mitigation requires production code changes → Architecture doc + - If mitigation is test infrastructure/code → QA doc + - If mitigation involves multiple teams → Architecture doc with QA validation approach + + **Assumptions Placement:** + + **Architecture Doc:** + - Architectural assumptions (SLO targets, replication lag, system design assumptions) + - Example: "P95 <500ms inferred from <2s timeout (requires Product approval)" + - Example: "Multi-region replication lag <1s assumed (ADR doesn't specify SLA)" + - Example: "Recent Cache hit ratio >80% assumed (not in PRD/ADR)" + + **QA Doc:** + - Test execution assumptions (test infrastructure readiness, test data availability) + - Example: "Assumes test factories already created" + - Example: "Assumes CI/CD pipeline configured" + - Brief reference to Architecture doc for architectural assumptions + + **Rule of Thumb:** + - If assumption is about system architecture/design → Architecture doc + - If assumption is about test infrastructure/execution → QA doc + --- ## Step 3: Design Test Coverage @@ -594,6 +776,8 @@ TEA test-design workflow supports TWO modes, detected automatically: 3. **Assign Priority Levels** + **CRITICAL: P0/P1/P2/P3 indicates priority and risk level, NOT execution timing** + **Knowledge Base Reference**: `test-priorities-matrix.md` **P0 (Critical)**: @@ -601,25 +785,28 @@ TEA test-design workflow supports TWO modes, detected automatically: - High-risk areas (score ≥6) - Revenue-impacting - Security-critical - - **Run on every commit** + - No workaround exists + - Affects majority of users **P1 (High)**: - Important user features - Medium-risk areas (score 3-4) - Common workflows - - **Run on PR to main** + - Workaround exists but difficult **P2 (Medium)**: - Secondary features - Low-risk areas (score 1-2) - Edge cases - - **Run nightly or weekly** + - Regression prevention **P3 (Low)**: - Nice-to-have - Exploratory - Performance benchmarks - - **Run on-demand** + - Documentation validation + + **NOTE:** Priority classification is separate from execution timing. A P1 test might run in PRs if it's fast, or nightly if it requires expensive infrastructure (e.g., k6 performance test). See "Execution Strategy" section for timing guidance. 4. **Outline Data and Tooling Prerequisites** @@ -629,13 +816,55 @@ TEA test-design workflow supports TWO modes, detected automatically: - Environment setup - Tools and dependencies -5. **Define Execution Order** +5. **Define Execution Strategy** (Keep It Simple) - Recommend test execution sequence: - 1. **Smoke tests** (P0 subset, <5 min) - 2. **P0 tests** (critical paths, <10 min) - 3. 
**P1 tests** (important features, <30 min) - 4. **P2/P3 tests** (full regression, <60 min) + **IMPORTANT: Avoid over-engineering execution order** + + **Default Philosophy:** + - Run **everything** in PRs if total duration <15 minutes + - Playwright is fast with parallelization (100s of tests in ~10-15 min) + - Only defer to nightly/weekly if there's significant overhead: + - Performance tests (k6, load testing) - expensive infrastructure + - Chaos engineering - requires special setup (AWS FIS) + - Long-running tests - endurance (4+ hours), disaster recovery + - Manual tests - require human intervention + + **Simple Execution Strategy (Organized by TOOL TYPE):** + + ```markdown + ## Execution Strategy + + **Philosophy**: Run everything in PRs unless significant infrastructure overhead. + Playwright with parallelization is extremely fast (100s of tests in ~10-15 min). + + **Organized by TOOL TYPE:** + + ### Every PR: Playwright Tests (~10-15 min) + All functional tests (from any priority level): + - All E2E, API, integration, unit tests using Playwright + - Parallelized across {N} shards + - Total: ~{N} tests (includes P0, P1, P2, P3) + + ### Nightly: k6 Performance Tests (~30-60 min) + All performance tests (from any priority level): + - Load, stress, spike, endurance + - Reason: Expensive infrastructure, long-running (10-40 min per test) + + ### Weekly: Chaos & Long-Running (~hours) + Special infrastructure tests (from any priority level): + - Multi-region failover, disaster recovery, endurance + - Reason: Very expensive, very long (4+ hours) + ``` + + **KEY INSIGHT: Organize by TOOL TYPE, not priority** + - Playwright (fast, cheap) → PR + - k6 (expensive, long) → Nightly + - Chaos/Manual (very expensive, very long) → Weekly + + **Avoid:** + - ❌ Don't organize by priority (smoke → P0 → P1 → P2 → P3) + - ❌ Don't say "P1 runs on PR to main" (some P1 are Playwright/PR, some are k6/Nightly) + - ❌ Don't create artificial tiers - organize by tool type and infrastructure overhead --- @@ -661,34 +890,66 @@ TEA test-design workflow supports TWO modes, detected automatically: | Login flow | E2E | P0 | R-001 | 3 | QA | ``` -3. **Document Execution Order** +3. **Document Execution Strategy** (Simple, Not Redundant) + + **IMPORTANT: Keep execution strategy simple and avoid redundancy** ```markdown - ### Smoke Tests (<5 min) + ## Execution Strategy - - Login successful - - Dashboard loads + **Default: Run all functional tests in PRs (~10-15 min)** + - All Playwright tests (parallelized across 4 shards) + - Includes E2E, API, integration, unit tests + - Total: ~{N} tests - ### P0 Tests (<10 min) + **Nightly: Performance & Infrastructure tests** + - k6 load/stress/spike tests (~30-60 min) + - Reason: Expensive infrastructure, long-running - - [Full P0 list] - - ### P1 Tests (<30 min) - - - [Full P1 list] + **Weekly: Chaos & Disaster Recovery** + - Endurance tests (4+ hours) + - Multi-region failover (requires AWS FIS) + - Backup restore validation + - Reason: Special infrastructure, very long-running ``` + **DO NOT:** + - ❌ Create redundant smoke/P0/P1/P2/P3 tier structure + - ❌ List all tests again in execution order (already in coverage plan) + - ❌ Split tests by priority unless there's infrastructure overhead + 4. 
**Include Resource Estimates** + **IMPORTANT: Use intervals/ranges, not exact numbers** + + Provide rough estimates with intervals to avoid false precision: + ```markdown ### Test Effort Estimates - - P0 scenarios: 15 tests × 2 hours = 30 hours - - P1 scenarios: 25 tests × 1 hour = 25 hours - - P2 scenarios: 40 tests × 0.5 hour = 20 hours - - **Total:** 75 hours (~10 days) + - P0 scenarios: 15 tests (~1.5-2.5 hours each) = **~25-40 hours** + - P1 scenarios: 25 tests (~0.75-1.5 hours each) = **~20-35 hours** + - P2 scenarios: 40 tests (~0.25-0.75 hours each) = **~10-30 hours** + - **Total:** **~55-105 hours** (~1.5-3 weeks with 1 QA engineer) ``` + **Why intervals:** + - Avoids false precision (estimates are never exact) + - Provides flexibility for complexity variations + - Accounts for unknowns and dependencies + - More realistic and less prescriptive + + **Guidelines:** + - P0 tests: 1.5-2.5 hours each (complex setup, security, performance) + - P1 tests: 0.75-1.5 hours each (standard integration, API tests) + - P2 tests: 0.25-0.75 hours each (edge cases, simple validation) + - P3 tests: 0.1-0.5 hours each (exploratory, documentation) + + **Express totals as:** + - Hour ranges: "~55-105 hours" + - Week ranges: "~1.5-3 weeks" + - Avoid: Exact numbers like "75 hours" or "11 days" + 5. **Add Gate Criteria** ```markdown diff --git a/src/bmm/workflows/testarch/test-design/test-design-architecture-template.md b/src/bmm/workflows/testarch/test-design/test-design-architecture-template.md index 3cf8be46..571f6f20 100644 --- a/src/bmm/workflows/testarch/test-design/test-design-architecture-template.md +++ b/src/bmm/workflows/testarch/test-design/test-design-architecture-template.md @@ -108,54 +108,51 @@ ### Testability Concerns and Architectural Gaps -**IMPORTANT**: {If system has constraints, explain them. If standard CI/CD achievable, state that.} +**🚨 ACTIONABLE CONCERNS - Architecture Team Must Address** -#### Blockers to Fast Feedback +{If system has critical testability concerns, list them here. If architecture supports testing well, state "No critical testability concerns identified" and skip to Testability Assessment Summary} -| Blocker | Impact | Current Mitigation | Ideal Solution | -|---------|--------|-------------------|----------------| -| **{Blocker name}** | {Impact description} | {How we're working around it} | {What architecture should provide} | +#### 1. Blockers to Fast Feedback (WHAT WE NEED FROM ARCHITECTURE) -#### Why This Matters +| Concern | Impact | What Architecture Must Provide | Owner | Timeline | +|---------|--------|--------------------------------|-------|----------| +| **{Concern name}** | {Impact on testing} | {Specific architectural change needed} | {Team} | {Sprint} | -**Standard CI/CD expectations:** -- Full test suite on every commit (~5-15 min feedback) -- Parallel test execution (isolated test data per worker) -- Ephemeral test environments (spin up → test → tear down) -- Fast feedback loop (devs stay in flow state) +**Example:** +- **No API for test data seeding** → Cannot parallelize tests → Provide POST /test/seed endpoint (Backend, Sprint 0) -**Current reality for {Feature}:** -- {Actual situation - what's different from standard} +#### 2. Architectural Improvements Needed (WHAT SHOULD BE CHANGED) -#### Tiered Testing Strategy - -{If forced by architecture, explain. If standard approach works, state that.} - -| Tier | When | Duration | Coverage | Why Not Full Suite? 
| -|------|------|----------|----------|---------------------| -| **Smoke** | Every commit | <5 min | {N} tests | Fast feedback, catch build-breaking changes | -| **P0** | Every commit | ~{X} min | ~{N} tests | Critical paths, security-critical flows | -| **P1** | PR to main | ~{X} min | ~{N} tests | Important features, algorithm accuracy | -| **P2/P3** | Nightly | ~{X} min | ~{N} tests | Edge cases, performance, NFR | - -**Note**: {Any timing assumptions or constraints} - -#### Architectural Improvements Needed - -{If system has technical debt affecting testing, list improvements. If architecture supports testing well, acknowledge that.} +{List specific improvements that would make the system more testable} 1. **{Improvement name}** - - {What to change} - - **Impact**: {How it improves testing} + - **Current problem**: {What's wrong} + - **Required change**: {What architecture must do} + - **Impact if not fixed**: {Consequences} + - **Owner**: {Team} + - **Timeline**: {Sprint} -#### Acceptance of Trade-offs +--- -For {Feature} Phase 1, the team accepts: -- **{Trade-off 1}** ({Reasoning}) -- **{Trade-off 2}** ({Reasoning}) -- ⚠️ **{Known limitation}** ({Why acceptable for now}) +### Testability Assessment Summary -This is {**technical debt** OR **acceptable for Phase 1**} that should be {revisited post-GA OR maintained as-is}. +**📊 CURRENT STATE - FYI** + +{Only include this section if there are passing items worth mentioning. Otherwise omit.} + +#### What Works Well + +- ✅ {Passing item 1} (e.g., "API-first design supports parallel test execution") +- ✅ {Passing item 2} (e.g., "Feature flags enable test isolation") +- ✅ {Passing item 3} + +#### Accepted Trade-offs (No Action Required) + +For {Feature} Phase 1, the following trade-offs are acceptable: +- **{Trade-off 1}** - {Why acceptable for now} +- **{Trade-off 2}** - {Why acceptable for now} + +{This is technical debt OR acceptable for Phase 1} that {should be revisited post-GA OR maintained as-is} --- diff --git a/src/bmm/workflows/testarch/test-design/test-design-qa-template.md b/src/bmm/workflows/testarch/test-design/test-design-qa-template.md index a055736b..037856b7 100644 --- a/src/bmm/workflows/testarch/test-design/test-design-qa-template.md +++ b/src/bmm/workflows/testarch/test-design/test-design-qa-template.md @@ -1,314 +1,286 @@ # Test Design for QA: {Feature Name} -**Purpose:** Test execution recipe for QA team. Defines test scenarios, coverage plan, tooling, and Sprint 0 setup requirements. Use this as your implementation guide after architectural blockers are resolved. +**Purpose:** Test execution recipe for QA team. Defines what to test, how to test it, and what QA needs from other teams. **Date:** {date} **Author:** {author} -**Status:** Draft / Ready for Implementation +**Status:** Draft **Project:** {project_name} -**PRD Reference:** {prd_link} -**ADR Reference:** {adr_link} + +**Related:** See Architecture doc (test-design-architecture.md) for testability concerns and architectural blockers. --- -## Quick Reference for QA +## Executive Summary -**Before You Start:** -- [ ] Review Architecture doc (test-design-architecture.md) - understand blockers and risks -- [ ] Verify Sprint 0 blockers resolved (see Sprint 0 section below) -- [ ] Confirm test infrastructure ready (factories, fixtures, environments) +**Scope:** {Brief description of testing scope} -**Test Execution Order:** -1. **Smoke tests** (<5 min) - Fast feedback on critical paths -2. **P0 tests** (~{X} min) - Critical paths, security-critical flows -3. 
**P1 tests** (~{X} min) - Important features, algorithm accuracy -4. **P2/P3 tests** (~{X} min) - Edge cases, performance, NFR +**Risk Summary:** +- Total Risks: {N} ({X} high-priority score ≥6, {Y} medium, {Z} low) +- Critical Categories: {Categories with most high-priority risks} -**Need Help?** -- Blockers: See Architecture doc "Quick Guide" for mitigation plans -- Test scenarios: See "Test Coverage Plan" section below -- Sprint 0 setup: See "Sprint 0 Setup Requirements" section +**Coverage Summary:** +- P0 tests: ~{N} (critical paths, security) +- P1 tests: ~{N} (important features, integration) +- P2 tests: ~{N} (edge cases, regression) +- P3 tests: ~{N} (exploratory, benchmarks) +- **Total**: ~{N} tests (~{X}-{Y} weeks with 1 QA) --- -## System Architecture Summary +## Dependencies & Test Blockers -**Data Pipeline:** -{Brief description of system flow} +**CRITICAL:** QA cannot proceed without these items from other teams. -**Key Services:** -- **{Service 1}**: {Purpose and key responsibilities} -- **{Service 2}**: {Purpose and key responsibilities} -- **{Service 3}**: {Purpose and key responsibilities} +### Backend/Architecture Dependencies (Sprint 0) -**Data Stores:** -- **{Database 1}**: {What it stores} -- **{Database 2}**: {What it stores} +**Source:** See Architecture doc "Quick Guide" for detailed mitigation plans -**Expected Scale** (from ADR): -- {Key metrics: RPS, volume, users, etc.} +1. **{Dependency 1}** - {Team} - {Timeline} + - {What QA needs} + - {Why it blocks testing} ---- +2. **{Dependency 2}** - {Team} - {Timeline} + - {What QA needs} + - {Why it blocks testing} -## Test Environment Requirements +### QA Infrastructure Setup (Sprint 0) -**{Company} Standard:** Shared DB per Environment with Randomization (Shift-Left) +1. **Test Data Factories** - QA + - {Entity} factory with faker-based randomization + - Auto-cleanup fixtures for parallel safety -| Environment | Database | Test Data Strategy | Purpose | -|-------------|----------|-------------------|---------| -| **Local** | {DB} (shared) | Randomized (faker), auto-cleanup | Local development | -| **Dev (CI)** | {DB} (shared) | Randomized (faker), auto-cleanup | PR validation | -| **Staging** | {DB} (shared) | Randomized (faker), auto-cleanup | Pre-production, E2E | +2. **Test Environments** - QA + - Local: {Setup details} + - CI/CD: {Setup details} + - Staging: {Setup details} -**Key Principles:** -- **Shared database per environment** (no ephemeral) -- **Randomization for isolation** (faker-based unique IDs) -- **Parallel-safe** (concurrent test runs don't conflict) -- **Self-cleaning** (tests delete their own data) -- **Shift-left** (test against real DBs early) - -**Example:** +**Example factory pattern:** ```typescript -import { faker } from "@faker-js/faker"; +import { test } from '@seontechnologies/playwright-utils/api-request/fixtures'; +import { expect } from '@playwright/test'; +import { faker } from '@faker-js/faker'; -test("example with randomized test data @p0", async ({ apiRequest }) => { +test('example test @p0', async ({ apiRequest }) => { const testData = { id: `test-${faker.string.uuid()}`, - customerId: `test-customer-${faker.string.alphanumeric(8)}`, - // ... 
unique test data + email: faker.internet.email(), }; - // Seed, test, cleanup + const { status } = await apiRequest({ + method: 'POST', + path: '/api/resource', + body: testData, + }); + + expect(status).toBe(201); }); ``` --- -## Testability Assessment +## Risk Assessment -**Prerequisites from Architecture Doc:** +**Note:** Full risk details in Architecture doc. This section summarizes risks relevant to QA test planning. -Verify these blockers are resolved before test development: -- [ ] {Blocker 1} (see Architecture doc Quick Guide → 🚨 BLOCKERS) -- [ ] {Blocker 2} -- [ ] {Blocker 3} +### High-Priority Risks (Score ≥6) -**If Prerequisites Not Met:** Coordinate with Architecture team (see Architecture doc for mitigation plans and owner assignments) +| Risk ID | Category | Description | Score | QA Test Coverage | +|---------|----------|-------------|-------|------------------| +| **{R-ID}** | {CAT} | {Brief description} | **{Score}** | {How QA validates this risk} | ---- +### Medium/Low-Priority Risks -## Test Levels Strategy - -**System Type:** {API-heavy / UI-heavy / Mixed backend system} - -**Recommended Split:** -- **Unit Tests: {X}%** - {What to unit test} -- **Integration/API Tests: {X}%** - ⭐ **PRIMARY FOCUS** - {What to integration test} -- **E2E Tests: {X}%** - {What to E2E test} - -**Rationale:** {Why this split makes sense for this system} - -**Test Count Summary:** -- P0: ~{N} tests - Critical paths, run on every commit -- P1: ~{N} tests - Important features, run on PR to main -- P2: ~{N} tests - Edge cases, run nightly/weekly -- P3: ~{N} tests - Exploratory, run on-demand -- **Total: ~{N} tests** (~{X} weeks for 1 QA, ~{Y} weeks for 2 QAs) +| Risk ID | Category | Description | Score | QA Test Coverage | +|---------|----------|-------------|-------|------------------| +| {R-ID} | {CAT} | {Brief description} | {Score} | {How QA validates this risk} | --- ## Test Coverage Plan -**Repository Note:** {Where tests live - backend repo, admin panel repo, etc. - and how CI pipelines are organized} +**IMPORTANT:** P0/P1/P2/P3 = **priority and risk level** (what to focus on if time-constrained), NOT execution timing. See "Execution Strategy" for when tests run. -### P0 (Critical) - Run on every commit (~{X} min) +### P0 (Critical) -**Execution:** CI/CD on every commit, parallel workers, smoke tests first (<5 min) +**Criteria:** Blocks core functionality + High risk (≥6) + No workaround + Affects majority of users -**Purpose:** Critical path validation - catch build-breaking changes and security violations immediately +| Test ID | Requirement | Test Level | Risk Link | Notes | +|---------|-------------|------------|-----------|-------| +| **P0-001** | {Requirement} | {Level} | {R-ID} | {Notes} | +| **P0-002** | {Requirement} | {Level} | {R-ID} | {Notes} | -**Criteria:** Blocks core functionality OR High risk (≥6) OR No workaround - -**Key Smoke Tests** (subset of P0, run first for fast feedback): -- {Smoke test 1} - {Duration} -- {Smoke test 2} - {Duration} -- {Smoke test 3} - {Duration} - -| Requirement | Test Level | Risk Link | Test Count | Owner | Notes | -|-------------|------------|-----------|------------|-------|-------| -| {Requirement 1} | {Level} | {R-ID} | {N} | QA | {Notes} | -| {Requirement 2} | {Level} | {R-ID} | {N} | QA | {Notes} | - -**Total P0:** ~{N} tests (~{X} weeks) - -#### P0 Test Scenarios (Detailed) - -**1. {Test Category} ({N} tests) - {CRITICALITY if applicable}** - -- [ ] {Scenario 1 with checkbox} -- [ ] {Scenario 2} -- [ ] {Scenario 3} - -**2. 
{Test Category 2} ({N} tests)** - -- [ ] {Scenario 1} -- [ ] {Scenario 2} - -{Continue for all P0 categories} +**Total P0:** ~{N} tests --- -### P1 (High) - Run on PR to main (~{X} min additional) +### P1 (High) -**Execution:** CI/CD on pull requests to main branch, runs after P0 passes, parallel workers +**Criteria:** Important features + Medium risk (3-4) + Common workflows + Workaround exists but difficult -**Purpose:** Important feature coverage - algorithm accuracy, complex workflows, Admin Panel interactions +| Test ID | Requirement | Test Level | Risk Link | Notes | +|---------|-------------|------------|-----------|-------| +| **P1-001** | {Requirement} | {Level} | {R-ID} | {Notes} | +| **P1-002** | {Requirement} | {Level} | {R-ID} | {Notes} | -**Criteria:** Important features OR Medium risk (3-4) OR Common workflows - -| Requirement | Test Level | Risk Link | Test Count | Owner | Notes | -|-------------|------------|-----------|------------|-------|-------| -| {Requirement 1} | {Level} | {R-ID} | {N} | QA | {Notes} | -| {Requirement 2} | {Level} | {R-ID} | {N} | QA | {Notes} | - -**Total P1:** ~{N} tests (~{X} weeks) - -#### P1 Test Scenarios (Detailed) - -**1. {Test Category} ({N} tests)** - -- [ ] {Scenario 1} -- [ ] {Scenario 2} - -{Continue for all P1 categories} +**Total P1:** ~{N} tests --- -### P2 (Medium) - Run nightly/weekly (~{X} min) +### P2 (Medium) -**Execution:** Scheduled nightly run (or weekly for P3), full infrastructure, sequential execution acceptable +**Criteria:** Secondary features + Low risk (1-2) + Edge cases + Regression prevention -**Purpose:** Edge case coverage, error handling, data integrity validation - slow feedback acceptable +| Test ID | Requirement | Test Level | Risk Link | Notes | +|---------|-------------|------------|-----------|-------| +| **P2-001** | {Requirement} | {Level} | {R-ID} | {Notes} | -**Criteria:** Secondary features OR Low risk (1-2) OR Edge cases - -| Requirement | Test Level | Risk Link | Test Count | Owner | Notes | -|-------------|------------|-----------|------------|-------|-------| -| {Requirement 1} | {Level} | {R-ID} | {N} | QA | {Notes} | -| {Requirement 2} | {Level} | {R-ID} | {N} | QA | {Notes} | - -**Total P2:** ~{N} tests (~{X} weeks) +**Total P2:** ~{N} tests --- -### P3 (Low) - Run on-demand (exploratory) +### P3 (Low) -**Execution:** Manual trigger or weekly scheduled run, performance testing +**Criteria:** Nice-to-have + Exploratory + Performance benchmarks + Documentation validation -**Purpose:** Full regression, performance benchmarks, accessibility validation - no time pressure +| Test ID | Requirement | Test Level | Notes | +|---------|-------------|------------|-------| +| **P3-001** | {Requirement} | {Level} | {Notes} | -**Criteria:** Nice-to-have OR Exploratory OR Performance benchmarks - -| Requirement | Test Level | Test Count | Owner | Notes | -|-------------|------------|------------|-------|-------| -| {Requirement 1} | {Level} | {N} | QA | {Notes} | -| {Requirement 2} | {Level} | {N} | QA | {Notes} | - -**Total P3:** ~{N} tests (~{X} days) +**Total P3:** ~{N} tests --- -### Coverage Matrix (Requirements → Tests) +## Execution Strategy -| Requirement | Test Level | Priority | Risk Link | Test Count | Owner | -|-------------|------------|----------|-----------|------------|-------| -| {Requirement 1} | {Level} | {P0-P3} | {R-ID} | {N} | {Owner} | -| {Requirement 2} | {Level} | {P0-P3} | {R-ID} | {N} | {Owner} | +**Philosophy:** Run everything in PRs unless there's significant infrastructure overhead. 
Playwright with parallelization is extremely fast (100s of tests in ~10-15 min). + +**Organized by TOOL TYPE:** + +### Every PR: Playwright Tests (~10-15 min) + +**All functional tests** (from any priority level): +- All E2E, API, integration, unit tests using Playwright +- Parallelized across {N} shards +- Total: ~{N} Playwright tests (includes P0, P1, P2, P3) + +**Why run in PRs:** Fast feedback, no expensive infrastructure + +### Nightly: k6 Performance Tests (~30-60 min) + +**All performance tests** (from any priority level): +- Load, stress, spike, endurance tests +- Total: ~{N} k6 tests (may include P0, P1, P2) + +**Why defer to nightly:** Expensive infrastructure (k6 Cloud), long-running (10-40 min per test) + +### Weekly: Chaos & Long-Running (~hours) + +**Special infrastructure tests** (from any priority level): +- Multi-region failover (requires AWS Fault Injection Simulator) +- Disaster recovery (backup restore, 4+ hours) +- Endurance tests (4+ hours runtime) + +**Why defer to weekly:** Very expensive infrastructure, very long-running, infrequent validation sufficient + +**Manual tests** (excluded from automation): +- DevOps validation (deployment, monitoring) +- Finance validation (cost alerts) +- Documentation validation --- -## Sprint 0 Setup Requirements +## QA Effort Estimate -**IMPORTANT:** These items **BLOCK test development**. Complete in Sprint 0 before QA can write tests. +**QA test development effort only** (excludes DevOps, Backend, Data Eng, Finance work): -### Architecture/Backend Blockers (from Architecture doc) +| Priority | Count | Effort Range | Notes | +|----------|-------|--------------|-------| +| P0 | ~{N} | ~{X}-{Y} weeks | Complex setup (security, performance, multi-step) | +| P1 | ~{N} | ~{X}-{Y} weeks | Standard coverage (integration, API tests) | +| P2 | ~{N} | ~{X}-{Y} days | Edge cases, simple validation | +| P3 | ~{N} | ~{X}-{Y} days | Exploratory, benchmarks | +| **Total** | ~{N} | **~{X}-{Y} weeks** | **1 QA engineer, full-time** | -**Source:** See Architecture doc "Quick Guide" for detailed mitigation plans +**Assumptions:** +- Includes test design, implementation, debugging, CI integration +- Excludes ongoing maintenance (~10% effort) +- Assumes test infrastructure (factories, fixtures) ready -1. **{Blocker 1}** 🚨 **BLOCKER** - {Owner} - - {What needs to be provided} - - **Details:** Architecture doc {Risk-ID} mitigation plan - -2. **{Blocker 2}** 🚨 **BLOCKER** - {Owner} - - {What needs to be provided} - - **Details:** Architecture doc {Risk-ID} mitigation plan - -### QA Test Infrastructure - -1. **{Factory/Fixture Name}** - QA - - Faker-based generator: `{function_signature}` - - Auto-cleanup after tests - -2. 
**{Entity} Fixtures** - QA - - Seed scripts for {states/scenarios} - - Isolated {id_pattern} per test - -### Test Environments - -**Local:** {Setup details - Docker, LocalStack, etc.} - -**CI/CD:** {Setup details - shared infrastructure, parallel workers, artifacts} - -**Staging:** {Setup details - shared multi-tenant, nightly E2E} - -**Production:** {Setup details - feature flags, canary transactions} - -**Sprint 0 NFR Gates** (MUST complete before integration testing): -- [ ] {Gate 1}: {Description} (Owner) 🚨 -- [ ] {Gate 2}: {Description} (Owner) 🚨 -- [ ] {Gate 3}: {Description} (Owner) 🚨 - -### Sprint 1 Items (Not Sprint 0) - -- **{Item 1}** ({Owner}): {Description} -- **{Item 2}** ({Owner}): {Description} - -**Sprint 1 NFR Gates** (MUST complete before GA): -- [ ] {Gate 1}: {Description} (Owner) -- [ ] {Gate 2}: {Description} (Owner) +**Dependencies from other teams:** +- See "Dependencies & Test Blockers" section for what QA needs from Backend, DevOps, Data Eng --- -## NFR Readiness Summary +## Appendix A: Code Examples & Tagging -**Based on Architecture Doc Risk Assessment** +**Playwright Tags for Selective Execution:** -| NFR Category | Status | Evidence Status | Blocker | Next Action | -|--------------|--------|-----------------|---------|-------------| -| **Testability & Automation** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Test Data Strategy** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Scalability & Availability** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Disaster Recovery** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Security** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Monitorability, Debuggability & Manageability** | {Status} | {Evidence} | {Sprint} | {Action} | -| **QoS & QoE** | {Status} | {Evidence} | {Sprint} | {Action} | -| **Deployability** | {Status} | {Evidence} | {Sprint} | {Action} | +```typescript +import { test } from '@seontechnologies/playwright-utils/api-request/fixtures'; +import { expect } from '@playwright/test'; -**Total:** {N} PASS, {N} CONCERNS across {N} categories +// P0 critical test +test('@P0 @API @Security unauthenticated request returns 401', async ({ apiRequest }) => { + const { status, body } = await apiRequest({ + method: 'POST', + path: '/api/endpoint', + body: { data: 'test' }, + skipAuth: true, + }); + + expect(status).toBe(401); + expect(body.error).toContain('unauthorized'); +}); + +// P1 integration test +test('@P1 @Integration data syncs correctly', async ({ apiRequest }) => { + // Seed data + await apiRequest({ + method: 'POST', + path: '/api/seed', + body: { /* test data */ }, + }); + + // Validate + const { status, body } = await apiRequest({ + method: 'GET', + path: '/api/resource', + }); + + expect(status).toBe(200); + expect(body).toHaveProperty('data'); +}); +``` + +**Run specific tags:** + +```bash +# Run only P0 tests +npx playwright test --grep @P0 + +# Run P0 + P1 tests +npx playwright test --grep "@P0|@P1" + +# Run only security tests +npx playwright test --grep @Security + +# Run all Playwright tests in PR (default) +npx playwright test +``` --- -**End of QA Document** +## Appendix B: Knowledge Base References -**Next Steps for QA Team:** -1. Verify Sprint 0 blockers resolved (coordinate with Architecture team if not) -2. Set up test infrastructure (factories, fixtures, environments) -3. Begin test implementation following priority order (P0 → P1 → P2 → P3) -4. Run smoke tests first for fast feedback -5. 
Track progress using test scenario checklists above +- **Risk Governance**: `risk-governance.md` - Risk scoring methodology +- **Test Priorities Matrix**: `test-priorities-matrix.md` - P0-P3 criteria +- **Test Levels Framework**: `test-levels-framework.md` - E2E vs API vs Unit selection +- **Test Quality**: `test-quality.md` - Definition of Done (no hard waits, <300 lines, <1.5 min) -**Next Steps for Architecture Team:** -1. Monitor Sprint 0 blocker resolution -2. Provide support for QA infrastructure setup if needed -3. Review test results and address any newly discovered testability gaps +--- + +**Generated by:** BMad TEA Agent +**Workflow:** `_bmad/bmm/testarch/test-design` +**Version:** 4.0 (BMad v6)
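
---

## Appendix C: k6 Performance Test Sketch

The Execution Strategy defers k6 load/stress/spike tests to nightly runs and expects SLO thresholds to be encoded in the scripts. The sketch below is a minimal, hedged example of that pattern, not a definitive implementation: the endpoint URL, virtual-user counts, and threshold values (p95 < 500 ms, error rate < 1%) are placeholders to be replaced with the feature's actual SLOs from the Architecture doc.

```javascript
// Minimal k6 load-test sketch (nightly tier). All values below are
// assumptions for illustration -- replace with the feature's real SLOs.
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  // Ramp to a modest load, hold, then ramp down.
  stages: [
    { duration: '2m', target: 50 }, // ramp up to 50 virtual users
    { duration: '5m', target: 50 }, // hold steady load
    { duration: '1m', target: 0 },  // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // SLO placeholder: 95th percentile under 500 ms
    http_req_failed: ['rate<0.01'],   // SLO placeholder: error rate under 1%
  },
};

export default function () {
  // Hypothetical staging endpoint -- substitute the system under test.
  const res = http.get('https://staging.example.com/api/resource');

  check(res, {
    'status is 200': (r) => r.status === 200,
  });

  sleep(1); // pacing between iterations per virtual user
}
```

Run it from the nightly pipeline (or locally) with:

```bash
k6 run perf/load-test.js
```

Thresholds make the run fail automatically when an SLO is breached, so the nightly job can gate on the k6 exit code without extra scripting.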