UAT Validation Integration Architecture
Overview
This document describes how UAT validation integrates with the epic-chain workflow to provide automated quality gates, self-healing fix loops, and comprehensive validation reporting.
Integration Points
┌──────────────────────────────────────────────────────────────────────────────────┐
│ EPIC CHAIN WITH UAT VALIDATION + SELF-HEALING │
├──────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ PER EPIC LOOP │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Phase 1 │──►│ Phase 2 │──►│ Phase 3 │──►│ Phase 4 │──►│ Phase 5 │ │ │
│ │ │ Dev │ │ Review │ │ Commit │ │ UAT │ │ Gate │ │ │
│ │ │ │ │ │ │ │ │ Gen │ │ Check │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └────┬────┘ │ │
│ │ │ │ │
│ └────────────────────────────────────────────────────────────────┼────────────┘ │
│ │ │
│ ┌─────────────┴───────┐ │
│ │ GATE DECISION │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────────────┬──────────────┴──────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────────┐ │
│ │ PASS │ │ FAIL │ │ MAX │ │
│ │ │ │ │ │ RETRIES │ │
│ └──┬───┘ └──┬───┘ └────┬─────┘ │
│ │ │ │ │
│ │ ▼ ▼ │
│ │ ┌────────────────────────┐ ┌────────────┐ │
│ │ │ SELF-HEALING │ │ HALT + │ │
│ │ │ │ │ NOTIFY │ │
│ │ │ ┌──────────────────┐ │ └────────────┘ │
│ │ │ │ Quick Dev Fix │ │ │
│ │ │ │ (Barry Agent) │ │ │
│ │ │ │ │ │ │
│ │ │ │ • Load failures │ │ │
│ │ │ │ • Generate fix │ │ │
│ │ │ │ • Commit changes │ │ │
│ │ │ └────────┬─────────┘ │ │
│ │ │ │ │ │
│ │ │ ▼ │ │
│ │ │ ┌──────────────────┐ │ │
│ │ │ │ Re-validate │ │ │
│ │ │ │ UAT Gate │──┼──► Back to GATE │
│ │ │ └──────────────────┘ │ │
│ │ │ │ │
│ │ └────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────┐ │
│ │ Next │ │
│ │ Epic │ │
│ └──────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ CHAIN COMPLETION │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ Aggregate │──►│ Generate │──►│ Final UAT │ │ │
│ │ │ Metrics │ │ Chain Report │ │ Summary │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────┘
Self-Healing Loop: UAT Failure → Quick Dev → Re-validate
When UAT validation fails, the system automatically triggers a quick-dev session to fix the identified issues, then re-runs the gate. The loop repeats until the gate passes or max_retries is reached.
Flow Detail
UAT Gate Check
│
├── PASS ──────────────────────────────► Continue to Next Epic
│
└── FAIL
│
▼
┌─────────────────────────────────────────────────────────────┐
│ FAILURE ANALYSIS │
│ │
│ 1. Collect failed scenarios with: │
│ - Scenario ID and description │
│ - Expected vs actual output │
│ - Error messages / stack traces │
│ - Related story acceptance criteria │
│ │
│ 2. Generate fix context document: │
│ docs/sprint-artifacts/uat-fix-context-{epic}-{attempt}.md │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ QUICK DEV SESSION │
│ (Barry - Quick Flow Solo Dev) │
│ │
│ Input: uat-fix-context-{epic}-{attempt}.md │
│ │
│ Process: │
│ 1. Load fix context (failed scenarios + error details) │
│ 2. Analyze root cause for each failure │
│ 3. Implement targeted fixes │
│ 4. Run self-check (step-04) │
│ 5. Commit with message: "fix(epic-{id}): UAT fix #{n}" │
│ │
│ Output: │
│ - Code changes committed │
│ - Fix summary in story dev record │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ RE-VALIDATE │
│ │
│ Run UAT Gate Check again on same scenarios │
│ │
│ Outcomes: │
│ - PASS → Continue to next epic │
│ - FAIL + attempts < max_retries → Loop back to Quick Dev │
│ - FAIL + attempts >= max_retries → HALT chain │
└─────────────────────────────────────────────────────────────┘
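The loop above can be sketched in a few lines of Python. This is a minimal illustration rather than the workflow's actual implementation: `GateResult`, `run_self_healing`, and the two callbacks are hypothetical names, and only `max_retries` and the PASS/HALT outcomes come from the flow above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GateResult:
    """Hypothetical result of one UAT gate check."""
    passed: bool
    failed_scenarios: List[str] = field(default_factory=list)


def run_self_healing(
    run_gate: Callable[[], GateResult],
    run_quick_dev: Callable[[List[str], int], None],
    max_retries: int = 2,
) -> str:
    """Return "PASS" once the gate passes, or "HALT" when retries run out."""
    attempts = 0
    result = run_gate()
    while not result.passed:
        if attempts >= max_retries:
            return "HALT"  # FAIL + attempts >= max_retries
        # FAIL + attempts < max_retries: hand the failures to the fix session
        attempts += 1
        run_quick_dev(result.failed_scenarios, attempts)
        result = run_gate()  # re-validate
    return "PASS"
```

With `max_retries=2`, a gate that never passes is checked three times (the initial check plus one re-validation per fix attempt) before the chain halts.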
Configuration
# In epic-chain config
uat:
  gate_enabled: true
  gate_mode: quick # quick | full | skip
  # Self-healing configuration
  self_heal:
    enabled: true
    max_retries: 2 # Maximum fix attempts per epic
    fix_workflow: quick-dev # Workflow to use for fixes
    fix_agent: barry # Agent to invoke
    # What to include in fix context
    include_in_context:
      - failed_scenarios
      - error_output
      - related_stories
      - acceptance_criteria
      - recent_commits # Last 3 commits for context
    # Escalation
    on_max_retries: halt # halt | continue_with_warning | notify_human
    notification_channel: null # Optional: slack, email, etc.
Fix Context Document Template
Generated when UAT fails, consumed by Quick Dev:
# UAT Fix Context - Epic {epic_id} (Attempt {n})
## Failed Scenarios
### Scenario {id}: {name}
**Expected Result:**
{expected}
**Actual Result:**
{actual}
**Error Output:**
{stderr or error message}
**Related Story:** {story_id}
**Acceptance Criteria:**
- {criteria from story}
---
## Fix Instructions
Address the following failures in priority order:
1. **{scenario_id}**: {one-line description of what's broken}
- Root cause hint: {if determinable}
- Files likely involved: {if determinable}
## Constraints
- Only fix the identified failures
- Do not refactor unrelated code
- Run tests after each fix
- Commit with message format: `fix(epic-{id}): {description}`
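As a rough sketch of how the "Failed Scenarios" part of this template might be filled in, the Python below renders one section per failure. The `FailedScenario` dataclass and both function names are hypothetical; only the output path pattern and the headings are taken from this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailedScenario:
    """Hypothetical container for one failed UAT scenario."""
    scenario_id: str
    name: str
    expected: str
    actual: str
    error: str
    story_id: str
    criteria: List[str]


def fix_context_path(epic_id: str, attempt: int) -> str:
    """Path pattern used for fix context documents."""
    return f"docs/sprint-artifacts/uat-fix-context-{epic_id}-{attempt}.md"


def render_fix_context(epic_id: str, attempt: int,
                       failures: List[FailedScenario]) -> str:
    """Render the header and one 'Failed Scenarios' entry per failure."""
    lines = [
        f"# UAT Fix Context - Epic {epic_id} (Attempt {attempt})",
        "## Failed Scenarios",
    ]
    for f in failures:
        lines += [
            f"### Scenario {f.scenario_id}: {f.name}",
            "**Expected Result:**", f.expected,
            "**Actual Result:**", f.actual,
            "**Error Output:**", f.error,
            f"**Related Story:** {f.story_id}",
            "**Acceptance Criteria:**",
        ]
        lines += [f"- {c}" for c in f.criteria]
        lines.append("---")
    return "\n".join(lines) + "\n"
```

The "Fix Instructions" and "Constraints" sections are omitted here; they could be appended the same way.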
UAT Scenario Classification
Based on the UAT sample document, scenarios fall into three categories:
Automatable (Execute via Shell)
| Scenario Type | Example | Automation Method |
|---|---|---|
| CLI commands | `npx heimdall --version` | Shell execution, check exit code + output |
| Build verification | `npm run build` | Shell execution, parse output for success |
| API health checks | `curl /health` | HTTP request, validate JSON response |
| Database status | `npx heimdall db status` | Shell execution, parse structured output |
| Configuration validation | `npx heimdall config validate` | Shell execution, check for "valid" in output |
Semi-Automated (Execute + Manual Verify)
| Scenario Type | Example | Approach |
|---|---|---|
| Email delivery | `npx heimdall test-send` | Execute command, log message ID, flag for inbox check |
| File creation | `heimdall config init` | Execute, verify file exists, show contents for review |
| Worker processes | `heimdall start` | Start, verify startup message, terminate after timeout |
Manual Only
| Scenario Type | Example | Approach |
|---|---|---|
| External service setup | Railway deployment | Document steps, skip in automation |
| Visual verification | UI appearance | Generate screenshots if possible, flag for review |
| Multi-step human flows | Full onboarding journey | Provide checklist, require human sign-off |
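One way to apply these three categories programmatically is a simple keyword check. The sketch below is only an illustration: the marker lists are assumptions drawn from the example rows above, and `classify_scenario` is a hypothetical helper, not part of the workflow.

```python
from typing import Optional

# Assumed markers, taken from the example rows in the tables above.
MANUAL_MARKERS = ("railway", "ui appearance", "onboarding")
SEMI_AUTO_MARKERS = ("test-send", "config init", "heimdall start")


def classify_scenario(description: str, command: Optional[str]) -> str:
    """Return "automatable", "semi_automated", or "manual_only"."""
    text = description.lower()
    # No command to run, or an explicitly manual topic: humans only.
    if command is None or any(m in text for m in MANUAL_MARKERS):
        return "manual_only"
    # Runnable, but the outcome still needs a human check.
    if any(m in command.lower() for m in SEMI_AUTO_MARKERS):
        return "semi_automated"
    return "automatable"
```

A real classifier would likely read explicit tags from the UAT document rather than guess from keywords.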
Gate Check Implementation
Quick Gate (Default)
Runs only automatable scenarios from the "Minimum Requirements" section:
# uat-gate-config.yaml
gate_mode: quick
timeout_per_scenario: 30 # seconds
fail_threshold: 0 # any failure = gate fail
scenarios_to_run:
  - type: "cli_command"
    match: '"Expected Results" sections with CLI commands'
  - type: "health_check"
    match: '"/health endpoint" scenarios'
  - type: "validation"
    match: '"validate" command scenarios'
skip_scenarios:
  - contains: "email inbox"
  - contains: "Railway"
  - contains: "browser"
  - contains: "terminal window"
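To make the timeout and threshold settings concrete, here is a minimal sketch of running one scenario and deciding the gate status. `ScenarioResult`, `run_scenario`, `gate_status`, and the `expect_in_output` parameter are hypothetical names; `timeout` and `fail_threshold` mirror `timeout_per_scenario` and `fail_threshold` from the config above.

```python
import subprocess
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ScenarioResult:
    """Hypothetical record of one executed scenario."""
    command: str
    passed: bool
    output: str


def run_scenario(command: str, expect_in_output: str = "",
                 timeout: int = 30) -> ScenarioResult:
    """Run a shell command; pass on exit code 0 plus expected output text."""
    try:
        # Commands come from the UAT document, so only run trusted documents.
        proc = subprocess.run(command, shell=True, capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return ScenarioResult(command, False, f"timed out after {timeout}s")
    output = proc.stdout + proc.stderr
    passed = proc.returncode == 0 and expect_in_output in output
    return ScenarioResult(command, passed, output)


def gate_status(results: Iterable[ScenarioResult],
                fail_threshold: int = 0) -> str:
    """FAIL when the number of failed scenarios exceeds the threshold."""
    failures = sum(1 for r in results if not r.passed)
    return "PASS" if failures <= fail_threshold else "FAIL"
```

With the default `fail_threshold` of 0, a single failed scenario fails the gate.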
Full Gate
Runs all automatable scenarios and flags semi-automated ones for review:
gate_mode: full
include_semi_automated: true
generate_manual_checklist: true
Data Flow
Per-Epic Metrics Collection
# Written to: {sprint_artifacts}/metrics/epic-{id}-metrics.yaml
epic_id: "1"
epic_name: "Foundation, CLI & Deployment"
execution:
  start_time: "2026-01-02T13:40:00Z"
  end_time: "2026-01-02T15:10:00Z"
  duration_seconds: 5400
stories:
  total: 7
  completed: 7
  failed: 0
  skipped: 0
uat:
  document_generated: true
  document_path: "docs/uat/epic-1-uat.md"
  scenarios:
    total: 9
    automatable: 6
    semi_automated: 2
    manual_only: 1
validation:
  gate_executed: true
  gate_mode: "quick"
  results:
    passed: 6
    failed: 0
    skipped: 3
  gate_status: "PASS"
  blocking_issues: []
  # Self-healing loop tracking
  fix_attempts: 0
  fix_history: []
  # Example when fixes were needed:
  # fix_attempts: 2
  # fix_history:
  #   - attempt: 1
  #     failed_scenarios: ["scenario-3", "scenario-5"]
  #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-1.md"
  #     fix_commit: "abc123"
  #     result: "partial" # 1 of 2 fixed
  #   - attempt: 2
  #     failed_scenarios: ["scenario-5"]
  #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-2.md"
  #     fix_commit: "def456"
  #     result: "success" # all fixed
issues:
  - type: "signaling_mismatch"
    story: "1-3"
    severity: "low"
    resolved: true
Chain Report Aggregation
# Read from: {sprint_artifacts}/metrics/epic-*-metrics.yaml
# Write to: {sprint_artifacts}/chain-execution-report.md
chain:
  total_epics: 8
  total_stories: 58
  total_duration_seconds: 63000
epics:
  - id: "1"
    stories: 7
    duration: 5400
    uat_gate: "PASS"
  # ... etc
uat_summary:
  total_scenarios: 72
  automatable: 48
  auto_passed: 45
  auto_failed: 3
  manual_pending: 24
gate_results:
  passed: 7
  failed: 1
  blocked_chain: false
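The aggregation itself is mostly summation. The sketch below works on already-parsed metrics (plain dictionaries) and assumes each per-epic file has top-level `execution`, `stories`, `uat`, and `validation` keys; that nesting, the `aggregate_chain` name, and the rule that semi-automated plus manual-only scenarios make up `manual_pending` are all assumptions.

```python
from typing import Any, Dict, List


def aggregate_chain(epic_metrics: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Sum per-epic metrics into chain-level totals."""
    chain = {"total_epics": len(epic_metrics),
             "total_stories": 0,
             "total_duration_seconds": 0}
    uat = {"total_scenarios": 0, "automatable": 0,
           "auto_passed": 0, "auto_failed": 0, "manual_pending": 0}
    gates = {"passed": 0, "failed": 0}

    for m in epic_metrics:
        chain["total_stories"] += m["stories"]["total"]
        chain["total_duration_seconds"] += m["execution"]["duration_seconds"]

        scenarios = m["uat"]["scenarios"]
        uat["total_scenarios"] += scenarios["total"]
        uat["automatable"] += scenarios["automatable"]
        # Assumption: everything not automatable still awaits a human.
        uat["manual_pending"] += (scenarios["semi_automated"]
                                  + scenarios["manual_only"])

        validation = m["validation"]
        uat["auto_passed"] += validation["results"]["passed"]
        uat["auto_failed"] += validation["results"]["failed"]
        key = "passed" if validation["gate_status"] == "PASS" else "failed"
        gates[key] += 1

    return {"chain": chain, "uat_summary": uat, "gate_results": gates}
```

Loading the YAML files and rendering the markdown report are left out; any YAML parser that yields dictionaries would fit in front of this function.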
Workflow File Changes
Modified: epic-chain/workflow.yaml
# Add to variables section:
variables:
  # ... existing ...
  # UAT Gate Configuration
  uat_gate_enabled: true
  uat_gate_mode: "quick" # quick | full | skip
  uat_gate_blocking: false # If true, halts chain on failure
  # Report Configuration
  generate_chain_report: true
  chain_report_file: "{sprint_artifacts}/chain-execution-report.md"
  metrics_folder: "{sprint_artifacts}/metrics"
New: step-05-uat-gate.md
# UAT Gate Check
## Purpose
Validate epic implementation against automatable UAT scenarios before proceeding.
## Inputs
- UAT document: `docs/uat/epic-{id}-uat.md`
- Gate config: `{uat_gate_mode}`
## Process
1. Parse UAT document for test scenarios
2. Identify automatable scenarios (CLI commands, API calls, file checks)
3. Execute each in isolated shell
4. Collect results with stdout/stderr evidence
5. Determine gate status
## Outputs
- Gate result: PASS | FAIL
- Metrics update: `{metrics_folder}/epic-{id}-metrics.yaml`
## Exit Conditions
- PASS: Continue to next epic
- FAIL + blocking=false: Log warning, continue
- FAIL + blocking=true: Halt chain, require intervention
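The three exit conditions reduce to a small decision function. This is only a sketch: `next_action` and its return strings are hypothetical, while the inputs correspond to the gate result and the `uat_gate_blocking` variable.

```python
def next_action(gate_result: str, blocking: bool) -> str:
    """Map a gate result and the blocking flag to the chain's next step."""
    if gate_result == "PASS":
        return "continue"
    if blocking:
        return "halt"  # FAIL + blocking=true: require intervention
    return "continue_with_warning"  # FAIL + blocking=false: log and go on
```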
New: step-06-generate-report.md
# Chain Report Generation
## Purpose
Generate comprehensive execution report after chain completion.
## Inputs
- All metrics files: `{metrics_folder}/epic-*-metrics.yaml`
- Chain plan: `{chain_plan_file}`
- Template: `chain-report-template.md`
## Process
1. Load all epic metrics
2. Calculate aggregates (totals, averages, percentages)
3. Build dependency graph visualization
4. Compile issues list
5. Generate token/cost estimates
6. Render report template
## Outputs
- `{chain_report_file}` - Complete execution report
Agent Interaction Model
SM Agent (Orchestrator) - With Self-Healing
User: *epic-chain 1-8
SM: Loading chain plan...
Found 8 epics, 58 stories
Dependencies analyzed: sequential with branches
Starting Epic 1 (Foundation)...
[Dev → Review → Commit → UAT Gen → Gate Check]
Epic 1: COMPLETE (7/7 stories, UAT PASS)
Starting Epic 2 (Event Ingestion)...
[Dev → Review → Commit → UAT Gen → Gate Check]
⚠ UAT GATE FAILED - 2 scenarios failed
Initiating self-healing loop (attempt 1/2)...
Generating fix context → docs/sprint-artifacts/uat-fix-context-2-1.md
[Invoking Barry for quick-dev fix session]
Barry: Loading fix context...
Failure 1: API endpoint returning 404
Failure 2: Missing auth header validation
Implementing fixes...
✓ Fixed route registration in server.ts
✓ Added auth middleware check
Committed: fix(epic-2): UAT fix #1
Re-validating UAT gate...
✓ All scenarios now passing
Epic 2: COMPLETE (5/5 stories, UAT PASS after 1 fix)
Starting Epic 3 (Workflow Engine)...
...
Chain Complete.
Generating execution report...
Report: docs/sprint-artifacts/chain-execution-report.md
Summary:
- 8/8 epics completed
- 2 epics required self-healing fixes
- Total fix attempts: 3
- All UAT gates now passing
UAT Validator Agent (Validation)
User: *uat-validate 1
Quinn: Loading UAT document for Epic 1...
Found 9 scenarios:
- 6 automatable
- 2 semi-automated
- 1 manual only
Executing automatable scenarios...
✓ Scenario 1: Project Initialization
Command: npx heimdall --version
Result: PASS (output: "1.0.0")
✓ Scenario 3: Database Migration
Command: npx heimdall db migrate
Result: PASS (output contains "initialized successfully")
⚠ Scenario 7: Test Email Sending
Command: npx heimdall test-send --to test@example.com
Result: SEMI-AUTO (executed, requires inbox verification)
Message ID: re_abc123
○ Scenario 9: Railway Deployment
Result: SKIPPED (manual only)
Summary: 6/6 automatable PASSED
Recommendation: APPROVE (pending manual verification of 3 scenarios)
Self-Healing Loop Example (Failure → Fix → Pass)
Quinn: UAT Gate Check for Epic 2...
✓ Scenario 1: Event Ingestion Endpoint
Command: curl -X POST http://localhost:3000/api/v1/events
Result: PASS (201 Created)
✗ Scenario 2: API Key Authentication
Command: curl -H "X-API-Key: test" http://localhost:3000/api/v1/events
Expected: 200 OK with auth header validated
Actual: 401 Unauthorized
Error: "Missing authentication middleware"
✗ Scenario 3: Events Database Table
Command: npx heimdall db status
Expected: "events table: exists"
Actual: Exit code 1
Error: "relation 'events' does not exist"
UAT_GATE_RESULT: FAIL
CRITICAL_PASSED: 1/3
BLOCKING_ISSUES: [scenario-2, scenario-3]
Generating fix context for quick-dev...
Output: docs/sprint-artifacts/uat-fix-context-2-1.md
---
Barry: Loading fix context for Epic 2 (Attempt 1)...
Issue 1: API Key Authentication failing
- Root cause: Auth middleware not registered on route
- Fix: Add authMiddleware to route handler chain
Issue 2: Events table missing
- Root cause: Migration file exists but wasn't run
- Fix: Ensure migration runs in db:migrate command
Implementing fixes...
✓ Updated packages/api/src/routes/events.ts
✓ Updated packages/cli/src/commands/db.ts
Running self-check...
✓ Tests passing
Committing: fix(epic-2): add auth middleware, fix migration order
---
Quinn: Re-validating UAT Gate for Epic 2...
✓ Scenario 1: Event Ingestion Endpoint - PASS
✓ Scenario 2: API Key Authentication - PASS (was FAIL)
✓ Scenario 3: Events Database Table - PASS (was FAIL)
UAT_GATE_RESULT: PASS
CRITICAL_PASSED: 3/3
Epic 2 approved after 1 fix attempt.
Configuration Options
Default Configuration
# {project-root}/.bmad/bmm/config.yaml additions
epic_chain:
  # UAT Settings
  uat:
    gate_enabled: true
    gate_mode: quick # quick | full | skip
    gate_blocking: false # Stop chain on failure?
    timeout_seconds: 30 # Per-scenario timeout
  # Report Settings
  report:
    enabled: true
    format: markdown # markdown | html | both
    include_token_estimates: true
    include_dependency_graph: true
  # Metrics Settings
  metrics:
    enabled: true
    per_story_timing: true
    track_retries: true
Per-Run Override
# Override gate settings for a specific run
./bmad/scripts/epic-chain.sh 1-8 --uat-gate=full --uat-blocking=true
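The precedence rule here is that per-run flags win over configured defaults. The actual script is a shell script whose internals are not shown in this document, so the Python below is purely an illustration of that rule: `DEFAULTS`, `parse_bool`, and `resolve_settings` are hypothetical, and only the two flag names and the default values come from the text above.

```python
import argparse
from typing import Any, Dict, List

# Defaults mirroring gate_mode and gate_blocking in the default configuration.
DEFAULTS: Dict[str, Any] = {"gate_mode": "quick", "gate_blocking": False}


def parse_bool(value: str) -> bool:
    """Accept true/false style strings for --uat-blocking."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"not a boolean: {value!r}")


def resolve_settings(argv: List[str]) -> Dict[str, Any]:
    """Merge command-line overrides on top of the defaults."""
    parser = argparse.ArgumentParser(prog="epic-chain")
    parser.add_argument("epics")  # e.g. "1-8"
    parser.add_argument("--uat-gate", choices=["quick", "full", "skip"])
    parser.add_argument("--uat-blocking", type=parse_bool)
    args = parser.parse_args(argv)

    settings = dict(DEFAULTS)
    settings["epics"] = args.epics
    # Only flags that were actually passed replace a default.
    if args.uat_gate is not None:
        settings["gate_mode"] = args.uat_gate
    if args.uat_blocking is not None:
        settings["gate_blocking"] = args.uat_blocking
    return settings
```

Using `None` as the "not provided" sentinel keeps an explicit `--uat-blocking=false` distinguishable from an omitted flag.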
Summary
This integration provides:
- Automated Quality Gates - Verify implementations meet acceptance criteria before proceeding
- Self-Healing Fix Loop - Failed UAT automatically triggers quick-dev to fix issues and re-validate
- Comprehensive Reporting - Generate detailed execution reports with metrics, timing, fix history, and issues
- Flexible Configuration - Adjust gate strictness, retry limits, and escalation behavior per project/run
- Clear Traceability - Every test scenario maps back to story acceptance criteria, and every fix links back to the failure that triggered it
- Graceful Degradation - Semi-automated and manual scenarios are documented but do not block the chain
Agent Responsibilities
| Agent | Role | Key Actions |
|---|---|---|
| SM (Bob) | Chain Orchestrator | Runs epic-chain, coordinates phases, triggers fix loops |
| Quinn | UAT Validator | Executes scenarios, generates fix context on failure |
| Barry | Quick Dev Fixer | Receives fix context, implements targeted fixes, commits |
Self-Healing Flow
UAT Fail → Generate Fix Context → Quick Dev Fix → Re-validate → Pass/Retry/Halt
The maximum retry count (max_retries, default: 2) bounds the loop so it cannot run indefinitely. With the default on_max_retries: halt, the chain stops after the last failed attempt and requires human intervention, so unresolved issues are surfaced rather than ignored.