feat: add UAT validation workflow with self-healing fix loop

- Add uat-validator agent (Quinn) with triggers for validation, reporting, and fix context generation
- Add 5-validation/uat-validate workflow with scenario classification and shell execution
- Add SM agent trigger [UV] for uat-validate workflow
- Add architecture docs for UAT integration with epic-chain
- Support automatic quick-dev fix sessions when UAT gate fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Caleb 2026-01-05 14:52:32 -06:00
parent 66186e1438
commit 2f9dc39c0b
8 changed files with 2050 additions and 0 deletions


@ -0,0 +1,349 @@
# Epic Chain Execution Report Generator - Proposal
## Overview
This proposal describes how to automatically generate a comprehensive Epic Chain Execution Report at the end of each epic chain run, similar to the sample `epic-chain-execution-report.md`.
---
## 1. Report Generation Strategy
### When to Generate
The report should be generated as **Phase 5** of the epic-chain workflow, after all epics complete:
```
Epic 1 → Epic 2 → ... → Epic N → [Report Generation] → [Optional: UAT Gate]
```
### Data Sources
The report aggregates data from multiple sources created during execution:
| Source | Location | Data Extracted |
|--------|----------|----------------|
| Chain Plan | `{sprint_artifacts}/chain-plan.yaml` | Epic order, dependencies, total stories |
| Execution Logs | `{sprint_artifacts}/epic-{id}-execution.md` | Per-epic timing, status, issues |
| Story Files | `docs/stories/*.md` | Story count, completion status |
| UAT Documents | `docs/uat/epic-{id}-uat.md` | UAT generation confirmation |
| Git Log | `git log --oneline` | Commit count per epic |
| Handoffs | `docs/handoffs/*.md` | Cross-epic context transfers |
---
## 2. Workflow Integration
### Option A: Add Phase to Epic Chain (Recommended)
Modify `epic-chain/workflow.yaml` to include a report generation step:
```yaml
# In workflow.yaml variables section
variables:
  # ... existing variables ...
  # Report configuration
  chain_report_file: "{sprint_artifacts}/chain-execution-report.md"
  generate_report: true
  report_detail_level: "full"  # summary | standard | full
# Add step reference
steps:
  # ... existing steps ...
  - step: generate-report
    file: step-06-generate-report.md
    when: "chain_complete"
    outputs:
      - "{chain_report_file}"
```
### Option B: Separate Workflow (Alternative)
Create `epic-chain-report/workflow.yaml` triggered post-chain:
```yaml
name: epic-chain-report
description: "Generate execution report from completed epic chain"
trigger: "post-chain"
input_file_patterns:
  chain_plan:
    path: "{sprint_artifacts}/chain-plan.yaml"
    required: true
  execution_logs:
    pattern: "{sprint_artifacts}/epic-*-execution.md"
    load_strategy: "FULL_LOAD"
```
---
## 3. Report Template Structure
### Proposed Template: `chain-report-template.md`
```markdown
# {project_name} - Epic Chain Execution Report
## Executive Summary
**Project:** {project_name}
**Execution Method:** BMAD Epic Chain (automated AI-driven development)
**Status:** {chain_status}
| Metric | Value |
|--------|-------|
| Total Epics | {epic_count} |
| Total Stories | {story_count} |
| Start Time | {start_time} |
| End Time | {end_time} |
| Total Duration | {duration} |
| Average per Story | {avg_story_time} |
---
## Timeline
### Epic Execution Duration
| Epic | Name | Stories | Duration | Status |
|------|------|---------|----------|--------|
{epic_timeline_rows}
| **Total** | | **{story_count}** | **{duration}** | **{completion_pct}%** |
---
## Dependency Graph
{dependency_graph_mermaid}
### Explicit Dependencies
| Epic | Depends On | Reason |
|------|------------|--------|
{dependency_table_rows}
---
## What Was Built
{per_epic_summary}
---
## Issues Encountered
{issues_section}
---
## Artifacts Generated
| Artifact | Location | Description |
|----------|----------|-------------|
| Story Files | `docs/stories/` | {story_count} completed stories |
| UAT Documents | `docs/uat/` | {epic_count} UAT test documents |
| Epic Files | `docs/epics/` | {epic_count} epic definitions |
| Handoffs | `docs/handoffs/` | Cross-epic context documents |
| Chain Plan | `{chain_plan_file}` | Execution plan with dependencies |
---
## Metrics
### Estimated Token Usage
| Epic | Stories | Est. Calls | Est. Input | Est. Output | Est. Total |
|------|---------|------------|------------|-------------|------------|
{token_estimate_rows}
### Cost Estimates
| Model | Input Cost | Output Cost | Total |
|-------|------------|-------------|-------|
| Claude 3.5 Sonnet | ~${sonnet_input} | ~${sonnet_output} | ~${sonnet_total} |
| Claude Opus | ~${opus_input} | ~${opus_output} | ~${opus_total} |
---
## UAT Validation Status
| Epic | UAT Doc | Automatable | Auto-Passed | Manual Required | Status |
|------|---------|-------------|-------------|-----------------|--------|
{uat_status_rows}
---
## Next Steps
1. **Review UAT Documents** - Review the {epic_count} UAT documents in `docs/uat/`
2. **Execute UAT Validation** - Run `/uat-validator` for automated scenario testing
3. **Manual Acceptance Testing** - Execute manual test scenarios
4. **Code Review** - Review generated code for refinements
5. **Deploy to Staging** - Deploy complete system to staging environment
---
*Report generated: {generation_timestamp}*
*BMAD Method v{bmad_version}*
```
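The `{placeholder}` fields in the template above could be filled with plain string substitution. The following is a minimal bash sketch, assuming values arrive as `key=value` arguments; `render_template` is a hypothetical helper and not part of the existing scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace {key} placeholders read from stdin
# with values passed as key=value arguments.
render_template() {
  local content pair key value placeholder
  content=$(cat)
  for pair in "$@"; do
    key=${pair%%=*}
    value=${pair#*=}
    placeholder="{$key}"
    # Quoting the pattern makes bash match it literally instead of as a glob.
    content=${content//"$placeholder"/"$value"}
  done
  printf '%s\n' "$content"
}

# Example: fill two rows of the metrics table.
printf '| Total Epics | {epic_count} |\n| Total Stories | {story_count} |\n' |
  render_template epic_count=8 story_count=58
```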
---
## 4. Data Collection During Execution
### Metrics to Track Per Epic
Add to `epic-execute` workflow to collect data for the report:
```yaml
# Proposed: epic-metrics.yaml (created per epic)
epic_id: 1
epic_name: "Foundation, CLI & Deployment"
stories:
  total: 7
  completed: 7
  failed: 0
  skipped: 0
timing:
  start_time: "2026-01-02T13:40:00Z"
  end_time: "2026-01-02T15:10:00Z"
  duration_seconds: 5400
  avg_story_seconds: 771
issues:
  - story: "1-3"
    type: "signaling_mismatch"
    description: "Completed but didn't output expected phrase"
    resolution: "manual_status_update"
dependencies:
  requires: []
  enables: ["2", "5"]
artifacts:
  stories_created: 7
  uat_generated: true
  commits: 7
```
### Collection Script Enhancement
The orchestration script (`epic-chain.sh`) should:
1. **Start timer** at chain initialization
2. **Per epic**: Record start/end times, story counts, issues
3. **Write metrics** to `{sprint_artifacts}/epic-{id}-metrics.yaml`
4. **On completion**: Trigger report generation step
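
The timing and metrics steps listed here might look roughly like the sketch below. It assumes epoch-second timestamps from `date +%s`; the helper names `avg_story_seconds` and `write_epic_metrics` are hypothetical, and the file it writes contains only a subset of the proposed fields.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the real epic-chain.sh may structure this differently.

# Average seconds per story, rounded down; prints 0 when no stories ran.
avg_story_seconds() {
  local duration="$1" stories="$2"
  if (( stories > 0 )); then
    echo $(( duration / stories ))
  else
    echo 0
  fi
}

# Write a minimal metrics file for one epic.
write_epic_metrics() {
  local out_file="$1" epic_id="$2" start_epoch="$3" end_epoch="$4" completed="$5"
  local duration=$(( end_epoch - start_epoch ))
  mkdir -p "$(dirname "$out_file")"
  cat > "$out_file" <<EOF
epic_id: "$epic_id"
stories:
  completed: $completed
timing:
  duration_seconds: $duration
  avg_story_seconds: $(avg_story_seconds "$duration" "$completed")
EOF
}

# Typical use inside the per-epic loop:
#   start=$(date +%s)
#   ... run the epic ...
#   write_epic_metrics "$SPRINT_ARTIFACTS/epic-1-metrics.yaml" 1 "$start" "$(date +%s)" 7
```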
---
## 5. UAT Validation Integration
### Gate Check Before Next Epic (Optional)
```yaml
# In epic-chain workflow
chain_mode: "dependency-aware"
uat_gate:
  enabled: true
  mode: "quick"    # quick | full | skip
  blocking: false  # If true, stops chain on UAT failure
# After each epic completes:
# 1. Generate UAT doc (already in epic-execute)
# 2. Run uat-quick validation
# 3. Record results in metrics
# 4. Continue or halt based on blocking setting
```
### Validation Flow
```
Epic Complete
     │
     ▼
Generate UAT Doc
     │
     ▼
Run UAT Quick ──────┐
(automatable only)  │
     │              │
     ▼              ▼
   PASS           FAIL
     │              │
     ▼              ▼
 Continue     blocking=true? ──► HALT CHAIN
                    │
                    ▼ (blocking=false)
               Log Warning
                    │
                    ▼
                Continue
```
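The branch at the end of this flow reduces to a two-input decision. Here is a minimal sketch, assuming the gate reports `PASS` or `FAIL` and `blocking` is the string `true` or `false`; `decide_gate_action` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the validation flow.
# Prints the action and returns non-zero only when the chain should halt.
decide_gate_action() {
  local gate_status="$1" blocking="$2"
  if [ "$gate_status" = "PASS" ]; then
    echo "continue"
  elif [ "$blocking" = "true" ]; then
    echo "halt"
    return 1
  else
    # Non-blocking failure: the caller logs a warning and moves on.
    echo "warn_and_continue"
  fi
}
```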
---
## 6. Implementation Phases
### Phase 1: Metrics Collection
- [ ] Add timing instrumentation to `epic-execute.sh`
- [ ] Create `epic-metrics.yaml` output per epic
- [ ] Store in `{sprint_artifacts}/metrics/`
### Phase 2: Report Generation
- [ ] Create `step-06-generate-report.md` for epic-chain
- [ ] Build `chain-report-template.md` template
- [ ] Add report generation to workflow.yaml
### Phase 3: UAT Integration
- [ ] Create UAT Validator agent (see `uat-validator.agent.yaml`)
- [ ] Add `uat-validate/workflow.yaml`
- [ ] Integrate gate check into epic-chain
### Phase 4: Visualization
- [ ] Add Mermaid dependency graph generation
- [ ] Add timeline visualization
- [ ] Consider HTML report option
---
## 7. Report Generation Agent Action
For the SM agent or a dedicated Report Generator, add this action:
```yaml
- trigger: CR or fuzzy match on chain-report
  action: |
    Generate Epic Chain Execution Report:
    1. Load chain-plan.yaml for epic list and dependencies
    2. For each epic, load epic-{id}-metrics.yaml
    3. Aggregate timing, story counts, issues
    4. Generate dependency graph (Mermaid format)
    5. Calculate token/cost estimates
    6. Load UAT validation results if available
    7. Render template with collected data
    8. Output to {sprint_artifacts}/chain-execution-report.md
  description: "[CR] Generate comprehensive execution report for completed epic chain"
```
---
## 8. Sample Output
See `/epic-chain-execution-report.md` for a complete example of the target output format. Key sections:
- Executive summary with totals
- Timeline table with per-epic duration
- Dependency graph (ASCII or Mermaid)
- What was built (per epic)
- Issues encountered
- Artifacts generated
- Token/cost estimates
- Next steps
---
## Questions for Decision
1. **Report timing**: Generate after each epic (incremental) or only at chain end?
2. **UAT gate**: Should failed UAT block the chain or just warn?
3. **Token tracking**: Actual counts (requires API integration) or estimates?
4. **Report format**: Markdown only, or also HTML/PDF export?
5. **Integration with SM**: Add to SM agent menu, or create dedicated reporter agent?


@ -0,0 +1,361 @@
# Epic Workflows Improvement Plan v1
**Date:** 2026-01-02
**Workflows Reviewed:** epic-execute, epic-chain
**Status:** Active
---
## Overview
This document captures the review findings and improvement roadmap for the epic-execute and epic-chain workflows. These workflows automate story execution with context isolation between development and review phases.
---
## What's Working Well
### 1. Context Isolation Architecture
The decision to run dev and review in separate Claude contexts is the key innovation:
- Prevents reviewer bias from seeing implementation struggles
- Maximizes context window for each phase
- Simulates real code review where reviewers see code "cold"
- Uses git staging as the communication medium between phases
### 2. Severity-Based Fix Policy
The issue severity system (HIGH/MEDIUM/LOW) with threshold-based fixing is pragmatic:
- Prevents over-engineering on minor issues
- Ensures critical issues always get fixed
- Documents low-severity for future cleanup sprints
**Location:** `step-03-code-review.md:17-27`
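
A threshold check of this kind can be sketched in a few lines of bash. The exact rule lives in `step-03-code-review.md` and is not reproduced here, so the sketch assumes that issues at or above a configurable threshold get fixed; `severity_rank` and `should_fix` are hypothetical names.

```bash
#!/usr/bin/env bash
# Map a severity label to a number so labels can be compared.
severity_rank() {
  case "$1" in
    HIGH) echo 3 ;;
    MEDIUM) echo 2 ;;
    LOW) echo 1 ;;
    *) echo 0 ;;  # unknown labels rank below LOW
  esac
}

# Succeeds when the issue severity meets or exceeds the threshold
# (assumed default: MEDIUM).
should_fix() {
  local severity="$1" threshold="${2:-MEDIUM}"
  (( $(severity_rank "$severity") >= $(severity_rank "$threshold") ))
}
```

With a `MEDIUM` threshold, `should_fix HIGH` and `should_fix MEDIUM` succeed while `should_fix LOW` fails, which leaves low-severity findings to be documented rather than fixed.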
### 3. Structured Documentation Trail
The Dev Agent Record and Code Review Record sections create an auditable history that helps with:
- Understanding why decisions were made
- Debugging issues later
- Training/improving the workflow
### 4. Chain Dependency Analysis
Epic-chain's analysis phase detecting both explicit and implicit dependencies shows good foresight.
**Location:** `instructions.md:57-88`
### 5. Shell Scripts Quality
- Clean argument parsing
- Proper error handling with `set -e`
- Good logging with timestamps
- Flexible story discovery (multiple naming conventions)
- Resume capability with `--start-from`
---
## Improvement Areas
### HIGH Priority
#### 1. Security: `--dangerously-skip-permissions` Flag
**Location:** `epic-execute.sh:291-292`
```bash
result=$(claude --dangerously-skip-permissions -p "$dev_prompt" 2>&1) || true
```
**Problem:** The flag disables Claude's permission prompts, so the agent can run commands and edit files without confirmation. That is a real risk for production use.
**Proposed Fix:**
- Document the security implications clearly in README
- Add a `--require-approval` mode that doesn't use this flag
- Have the script detect and prompt for dangerous operations
- Consider environment variable to explicitly opt-in: `BMAD_ALLOW_DANGEROUS=true`
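
The environment-variable opt-in could be wired in as follows. This is a sketch only: it assumes `BMAD_ALLOW_DANGEROUS` must be exactly `true`, and both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Emit the permission-bypass flag only on explicit opt-in.
claude_permission_flags() {
  if [ "${BMAD_ALLOW_DANGEROUS:-false}" = "true" ]; then
    echo "--dangerously-skip-permissions"
  fi
}

# Build the command as an array so an empty flag adds no stray argument.
# Prints one argument per line for inspection instead of executing it.
build_claude_command() {
  local prompt="$1" flag
  local -a cmd=(claude)
  flag=$(claude_permission_flags)
  if [ -n "$flag" ]; then
    cmd+=("$flag")
  fi
  cmd+=(-p "$prompt")
  printf '%s\n' "${cmd[@]}"
}
```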
---
#### 2. Missing Test Execution Validation
**Location:** `epic-execute.sh` (dev phase)
**Problem:** The dev prompt says "Run tests and fix any failures" but the shell script doesn't verify tests actually passed. The completion signal (`IMPLEMENTATION COMPLETE`) is trusted without validation.
**Proposed Fix:**
```bash
# After dev phase, before review
execute_test_verification() {
    local test_cmd="${TEST_COMMAND:-npm test}"
    log ">>> VERIFYING TESTS"
    if ! $test_cmd 2>&1; then
        log_error "Tests failing after dev phase"
        return 1
    fi
    log_success "Tests passing"
    return 0
}
```
---
#### 3. Add Pre-flight Confirmation
**Location:** `epic-execute.sh` (after story discovery)
**Problem:** No validation step shows the user which stories will be executed before starting.
**Proposed Fix:**
```bash
# After discovering stories, before execution
display_execution_plan() {
    echo ""
    log "Execution Plan:"
    for story in "${STORIES[@]}"; do
        echo " - $(basename "$story")"
    done
    echo ""
    if [ "$AUTO_APPROVE" != true ]; then
        read -p "Proceed with execution? (y/n) " -n 1 -r
        echo
        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
            log "Execution cancelled by user"
            exit 0
        fi
    fi
}
```
---
### MEDIUM Priority
#### 4. Context Handoff is Placeholder
**Location:** `epic-chain.sh:411-432`
**Problem:** Current handoff only lists files changed:
```bash
$(git diff --name-only HEAD~${story_count} HEAD 2>/dev/null | head -20)
```
The documented template in `instructions.md:288-312` describes rich context (patterns, decisions, gotchas) but this isn't generated.
**Proposed Fix:**
```bash
generate_rich_handoff() {
    local epic_id="$1"
    local next_epic="$2"
    local handoff_file="$3"
    local handoff_prompt="You are generating a context handoff document.
## Task
Create a handoff from Epic $epic_id to Epic $next_epic.
## Recently Modified Files
$(git diff --name-only HEAD~${story_count} HEAD 2>/dev/null)
## Epic Content
$(cat "${EPIC_FILES_LIST[$current_idx]}")
## Generate a handoff document with:
1. Patterns Established - coding conventions, architectural decisions
2. Key Decisions - major technical choices with rationale
3. Gotchas & Lessons Learned - issues encountered, workarounds
4. Files to Reference - key files that establish patterns
5. Test Patterns - testing conventions used
Output as markdown."
    claude -p "$handoff_prompt" > "$handoff_file"
}
```
---
#### 5. No Rollback Mechanism
**Problem:** If review fails or execution is interrupted mid-story, there's no easy way to roll back.
**Proposed Fix:**
```bash
# At start of epic execution
create_checkpoint() {
    CHECKPOINT=$(git rev-parse HEAD)
    echo "$CHECKPOINT" > "/tmp/bmad-checkpoint-$EPIC_ID"
    log "Checkpoint created: $CHECKPOINT"
}
# On failure or user abort
rollback_to_checkpoint() {
    if [ -f "/tmp/bmad-checkpoint-$EPIC_ID" ]; then
        local checkpoint=$(cat "/tmp/bmad-checkpoint-$EPIC_ID")
        read -p "Rollback to checkpoint $checkpoint? (y/n) " -n 1 -r
        echo
        if [[ $REPLY =~ ^[Yy]$ ]]; then
            git reset --hard "$checkpoint"
            log_success "Rolled back to checkpoint"
        fi
    fi
}
```
---
#### 6. Wire Up Configuration File
**Location:** `config/default-config.yaml` exists but isn't used
**Problem:** The configuration documented in `workflow.md:104-122` isn't actually loaded by the shell script.
**Proposed Fix:**
```bash
# Load configuration
load_config() {
    local config_file="$BMAD_DIR/_cfg/epic-execute.yaml"
    if [ -f "$config_file" ]; then
        # Parse YAML (requires yq or similar)
        AUTO_COMMIT=$(yq '.auto_commit // true' "$config_file")
        RUN_TESTS_BEFORE_REVIEW=$(yq '.run_tests_before_review // true' "$config_file")
        REVIEW_MODE=$(yq '.review_mode // "standard"' "$config_file")
        log "Loaded config from $config_file"
    else
        # Defaults
        AUTO_COMMIT=true
        RUN_TESTS_BEFORE_REVIEW=true
        REVIEW_MODE="standard"
    fi
}
```
---
#### 7. Remove or Implement `--parallel` Flag
**Location:** `epic-execute.sh:11, 93-96`
**Problem:** The `--parallel` flag exists in argument parsing but isn't implemented.
**Proposed Fix:** Either:
- Remove the flag entirely until implemented
- Add a clear error: `log_error "--parallel not yet implemented"`
- Implement parallel execution for independent stories
---
### LOW Priority
#### 8. Prompt Duplication
**Problem:** Prompts are duplicated between step files (documentation) and shell script (execution).
**Proposed Fix:** Source prompts from step files:
```bash
build_dev_prompt() {
    local story_file="$1"
    local template="$WORKFLOW_DIR/steps/step-02-dev-story.md"
    # Extract prompt template section
    # Substitute variables
    export story_id=$(basename "$story_file" .md)
    export story_file_contents=$(cat "$story_file")
    envsubst < "$template"
}
```
---
#### 9. Missing sprint-status.yaml Update
**Location:** `workflow.md:73` mentions this but it's not implemented
**Proposed Fix:** Add after successful completion:
```bash
update_sprint_status() {
    local status_file="$PROJECT_ROOT/docs/sprints/sprint-status.yaml"
    if [ -f "$status_file" ]; then
        # Update epic status to completed
        # This requires yq or similar YAML tool
        yq -i ".epics.\"$EPIC_ID\".status = \"done\"" "$status_file"
        yq -i ".epics.\"$EPIC_ID\".completed_at = \"$(date -Iseconds)\"" "$status_file"
    fi
}
```
---
#### 10. Story Discovery Edge Cases
**Location:** `epic-execute.sh:181-206`
**Problem:**
- Relies on consistent naming conventions
- Content grep could false-positive
- No warning when stories found in unexpected locations
**Proposed Fix:** Add source tracking and validation:
```bash
# Track where each story was found
declare -A STORY_SOURCES
for story in "${STORIES[@]}"; do
    source_dir=$(dirname "$story")
    STORY_SOURCES["$story"]="$source_dir"
done
# Warn about unexpected locations
for story in "${STORIES[@]}"; do
    if [[ "${STORY_SOURCES[$story]}" != "$STORIES_DIR" ]]; then
        log_warn "Story found in non-standard location: $story"
    fi
done
```
---
## Implementation Roadmap
### Phase 1: Critical Fixes
- [ ] Add test verification step
- [ ] Add pre-flight confirmation
- [ ] Document `--dangerously-skip-permissions` risks
### Phase 2: Reliability
- [ ] Implement rollback mechanism
- [ ] Wire up configuration file
- [ ] Fix or remove `--parallel` flag
### Phase 3: Quality of Life
- [ ] Generate rich context handoffs
- [ ] Source prompts from step files
- [ ] Add sprint-status.yaml updates
### Phase 4: Advanced Features
- [ ] Implement parallel story execution
- [ ] Add `--interactive` mode for step-by-step approval
- [ ] Track execution metrics (time per story, fix rate)
---
## Ratings Summary
| Aspect | Rating | Notes |
|--------|--------|-------|
| Architecture | Excellent | Context isolation is the right approach |
| Documentation | Very Good | Clear workflow diagrams, step files |
| Shell Scripts | Good | Well-structured, needs hardening |
| Error Handling | Fair | Basic coverage, needs rollback |
| Security | Needs Work | `--dangerously-skip-permissions` |
| Completeness | Good | Some features documented but not implemented |
---
## References
- `src/modules/bmm/workflows/4-implementation/epic-execute/`
- `src/modules/bmm/workflows/4-implementation/epic-chain/`
- `scripts/epic-execute.sh`
- `scripts/epic-chain.sh`


@ -0,0 +1,670 @@
# UAT Validation Integration Architecture
## Overview
This document describes how UAT validation integrates with the epic-chain workflow to provide automated quality gates, self-healing fix loops, and comprehensive validation reporting.
---
## Integration Points
```
┌──────────────────────────────────────────────────────────────────────────────────┐
│ EPIC CHAIN WITH UAT VALIDATION + SELF-HEALING │
├──────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ PER EPIC LOOP │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Phase 1 │──►│ Phase 2 │──►│ Phase 3 │──►│ Phase 4 │──►│ Phase 5 │ │ │
│ │ │ Dev │ │ Review │ │ Commit │ │ UAT │ │ Gate │ │ │
│ │ │ │ │ │ │ │ │ Gen │ │ Check │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └────┬────┘ │ │
│ │ │ │ │
│ └────────────────────────────────────────────────────────────────┼────────────┘ │
│ │ │
│ ┌─────────────┴───────┐ │
│ │ GATE DECISION │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────────────┬──────────────┴──────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────────┐ │
│ │ PASS │ │ FAIL │ │ MAX │ │
│ │ │ │ │ │ RETRIES │ │
│ └──┬───┘ └──┬───┘ └────┬─────┘ │
│ │ │ │ │
│ │ ▼ ▼ │
│ │ ┌────────────────────────┐ ┌────────────┐ │
│ │ │ SELF-HEALING │ │ HALT + │ │
│ │ │ │ │ NOTIFY │ │
│ │ │ ┌──────────────────┐ │ └────────────┘ │
│ │ │ │ Quick Dev Fix │ │ │
│ │ │ │ (Barry Agent) │ │ │
│ │ │ │ │ │ │
│ │ │ │ • Load failures │ │ │
│ │ │ │ • Generate fix │ │ │
│ │ │ │ • Commit changes │ │ │
│ │ │ └────────┬─────────┘ │ │
│ │ │ │ │ │
│ │ │ ▼ │ │
│ │ │ ┌──────────────────┐ │ │
│ │ │ │ Re-validate │ │ │
│ │ │ │ UAT Gate │──┼──► Back to GATE │
│ │ │ └──────────────────┘ │ │
│ │ │ │ │
│ │ └────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────┐ │
│ │ Next │ │
│ │ Epic │ │
│ └──────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ CHAIN COMPLETION │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ Aggregate │──►│ Generate │──►│ Final UAT │ │ │
│ │ │ Metrics │ │ Chain Report │ │ Summary │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────┘
```
---
## Self-Healing Loop: UAT Failure → Quick Dev → Re-validate
When UAT validation fails, the system automatically triggers a quick-dev session to fix the identified issues.
### Flow Detail
```
UAT Gate Check
├── PASS ──────────────────────────────► Continue to Next Epic
└── FAIL
┌─────────────────────────────────────────────────────────────┐
│ FAILURE ANALYSIS │
│ │
│ 1. Collect failed scenarios with: │
│ - Scenario ID and description │
│ - Expected vs actual output │
│ - Error messages / stack traces │
│ - Related story acceptance criteria │
│ │
│ 2. Generate fix context document: │
│ docs/sprint-artifacts/uat-fix-context-{epic}-{attempt}.md │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ QUICK DEV SESSION │
│ (Barry - Quick Flow Solo Dev) │
│ │
│ Input: uat-fix-context-{epic}-{attempt}.md │
│ │
│ Process: │
│ 1. Load fix context (failed scenarios + error details) │
│ 2. Analyze root cause for each failure │
│ 3. Implement targeted fixes │
│ 4. Run self-check (step-04) │
│ 5. Commit with message: "fix(epic-{id}): UAT fix #{n}" │
│ │
│ Output: │
│ - Code changes committed │
│ - Fix summary in story dev record │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ RE-VALIDATE │
│ │
│ Run UAT Gate Check again on same scenarios │
│ │
│ Outcomes: │
│ - PASS → Continue to next epic │
│ - FAIL + attempts < max_retries → Loop back to Quick Dev │
│ - FAIL + attempts >= max_retries → HALT chain │
└─────────────────────────────────────────────────────────────┘
```
### Configuration
```yaml
# In epic-chain config
uat:
  gate_enabled: true
  gate_mode: quick  # quick | full | skip
  # Self-healing configuration
  self_heal:
    enabled: true
    max_retries: 2           # Maximum fix attempts per epic
    fix_workflow: quick-dev  # Workflow to use for fixes
    fix_agent: barry         # Agent to invoke
    # What to include in fix context
    include_in_context:
      - failed_scenarios
      - error_output
      - related_stories
      - acceptance_criteria
      - recent_commits  # Last 3 commits for context
    # Escalation
    on_max_retries: halt        # halt | continue_with_warning | notify_human
    notification_channel: null  # Optional: slack, email, etc.
```
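The retry behavior that `max_retries` controls can be sketched as a small loop. This is an illustration under assumptions, not the shipped implementation: `run_uat_gate` and `run_quick_dev_fix` are hypothetical hooks standing in for the real gate check and the quick-dev session.

```bash
#!/usr/bin/env bash
# Placeholder hooks (hypothetical); replace with the real gate check
# and quick-dev invocation.
run_uat_gate() { return 0; }
run_quick_dev_fix() { return 0; }

# Validate, fix, and re-validate until the gate passes or the
# number of fix attempts reaches max_retries.
self_heal_gate() {
  local epic_id="$1" max_retries="${2:-2}" attempt=0
  while true; do
    if run_uat_gate "$epic_id"; then
      echo "PASS after $attempt fix attempt(s)"
      return 0
    fi
    if (( attempt >= max_retries )); then
      echo "HALT after $attempt fix attempt(s)"
      return 1
    fi
    attempt=$(( attempt + 1 ))
    run_quick_dev_fix "$epic_id" "$attempt"
  done
}
```

With `max_retries: 2` the gate runs at most three times: once initially and once after each of the two fix attempts.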
### Fix Context Document Template
Generated when UAT fails, consumed by Quick Dev:
```markdown
# UAT Fix Context - Epic {epic_id} (Attempt {n})
## Failed Scenarios
### Scenario {id}: {name}
**Expected Result:**
{expected}
**Actual Result:**
{actual}
**Error Output:**
~~~
{stderr or error message}
~~~
**Related Story:** {story_id}
**Acceptance Criteria:**
- {criteria from story}
---
## Fix Instructions
Address the following failures in priority order:
1. **{scenario_id}**: {one-line description of what's broken}
- Root cause hint: {if determinable}
- Files likely involved: {if determinable}
## Constraints
- Only fix the identified failures
- Do not refactor unrelated code
- Run tests after each fix
- Commit with message format: `fix(epic-{id}): {description}`
```
---
## UAT Scenario Classification
Based on the UAT sample document, scenarios fall into three categories:
### Automatable (Execute via Shell)
| Scenario Type | Example | Automation Method |
|---------------|---------|-------------------|
| CLI commands | `npx heimdall --version` | Shell execution, check exit code + output |
| Build verification | `npm run build` | Shell execution, parse output for success |
| API health checks | `curl /health` | HTTP request, validate JSON response |
| Database status | `npx heimdall db status` | Shell execution, parse structured output |
| Configuration validation | `npx heimdall config validate` | Shell execution, check for "valid" in output |
### Semi-Automated (Execute + Manual Verify)
| Scenario Type | Example | Approach |
|---------------|---------|----------|
| Email delivery | `npx heimdall test-send` | Execute command, log message ID, flag for inbox check |
| File creation | `heimdall config init` | Execute, verify file exists, show contents for review |
| Worker processes | `heimdall start` | Start, verify startup message, terminate after timeout |
### Manual Only
| Scenario Type | Example | Approach |
|---------------|---------|----------|
| External service setup | Railway deployment | Document steps, skip in automation |
| Visual verification | UI appearance | Generate screenshots if possible, flag for review |
| Multi-step human flows | Full onboarding journey | Provide checklist, require human sign-off |
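
For the automatable category, "check exit code + output" might be implemented along these lines. The sketch assumes GNU coreutils `timeout` is installed and treats a substring match as success; `run_scenario` and its 30-second default are hypothetical.

```bash
#!/usr/bin/env bash
# Run one automatable scenario and print PASS or FAIL.
# Arguments: command string, expected substring, optional timeout in seconds.
run_scenario() {
  local command="$1" expected="$2" limit="${3:-30}"
  local output
  # Capture stdout and stderr together; a non-zero exit or timeout fails.
  if ! output=$(timeout "$limit" bash -c "$command" 2>&1); then
    echo "FAIL"
    return 1
  fi
  # Pass only when the expected text appears in the output.
  if [[ "$output" == *"$expected"* ]]; then
    echo "PASS"
  else
    echo "FAIL"
    return 1
  fi
}

# Example: run_scenario "npx heimdall --version" "1.0.0"
```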
---
## Gate Check Implementation
### Quick Gate (Default)
Runs only automatable scenarios from the "Minimum Requirements" section:
```yaml
# uat-gate-config.yaml
gate_mode: quick
timeout_per_scenario: 30 # seconds
fail_threshold: 0 # any failure = gate fail
scenarios_to_run:
  - type: "cli_command"
    match: '"Expected Results" sections with CLI commands'
  - type: "health_check"
    match: '"/health endpoint" scenarios'
  - type: "validation"
    match: '"validate" command scenarios'
skip_scenarios:
  - contains: "email inbox"
  - contains: "Railway"
  - contains: "browser"
  - contains: "terminal window"
```
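The `skip_scenarios` rules amount to a substring filter over each scenario's text. Here is a minimal sketch, assuming case-insensitive matching (the config does not say) and bash 4 or later; `should_skip_scenario` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Phrases copied from skip_scenarios in the quick gate config.
SKIP_PHRASES=("email inbox" "Railway" "browser" "terminal window")

# Succeeds when the scenario text contains any skip phrase.
# Case-insensitive matching is an assumption of this sketch.
should_skip_scenario() {
  local text="${1,,}" phrase
  for phrase in "${SKIP_PHRASES[@]}"; do
    if [[ "$text" == *"${phrase,,}"* ]]; then
      return 0
    fi
  done
  return 1
}
```

A scenario titled "Railway Deployment" would be skipped, while "Project Initialization" would proceed to execution.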
### Full Gate
Runs all automatable scenarios and flags semi-automated ones for review:
```yaml
gate_mode: full
include_semi_automated: true
generate_manual_checklist: true
```
---
## Data Flow
### Per-Epic Metrics Collection
```yaml
# Written to: {sprint_artifacts}/metrics/epic-{id}-metrics.yaml
epic_id: "1"
epic_name: "Foundation, CLI & Deployment"
execution:
  start_time: "2026-01-02T13:40:00Z"
  end_time: "2026-01-02T15:10:00Z"
  duration_seconds: 5400
stories:
  total: 7
  completed: 7
  failed: 0
  skipped: 0
uat:
  document_generated: true
  document_path: "docs/uat/epic-1-uat.md"
  scenarios:
    total: 9
    automatable: 6
    semi_automated: 2
    manual_only: 1
  validation:
    gate_executed: true
    gate_mode: "quick"
    results:
      passed: 6
      failed: 0
      skipped: 3
    gate_status: "PASS"
    blocking_issues: []
    # Self-healing loop tracking
    fix_attempts: 0
    fix_history: []
    # Example when fixes were needed:
    # fix_attempts: 2
    # fix_history:
    #   - attempt: 1
    #     failed_scenarios: ["scenario-3", "scenario-5"]
    #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-1.md"
    #     fix_commit: "abc123"
    #     result: "partial"  # 1 of 2 fixed
    #   - attempt: 2
    #     failed_scenarios: ["scenario-5"]
    #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-2.md"
    #     fix_commit: "def456"
    #     result: "success"  # all fixed
issues:
  - type: "signaling_mismatch"
    story: "1-3"
    severity: "low"
    resolved: true
```
### Chain Report Aggregation
```yaml
# Read from: {sprint_artifacts}/metrics/epic-*-metrics.yaml
# Write to: {sprint_artifacts}/chain-execution-report.md
chain:
  total_epics: 8
  total_stories: 58
  total_duration_seconds: 63000
epics:
  - id: "1"
    stories: 7
    duration: 5400
    uat_gate: "PASS"
  # ... etc
uat_summary:
  total_scenarios: 72
  automatable: 48
  auto_passed: 45
  auto_failed: 3
  manual_pending: 24
  gate_results:
    passed: 7
    failed: 1
    blocked_chain: false
```
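Totals such as `total_duration_seconds` can be derived by summing the per-epic metrics files. The sketch below assumes each file carries a single `duration_seconds:` line and reads it with `awk` rather than a YAML parser; `sum_epic_durations` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sum duration_seconds across all epic metrics files in a directory.
sum_epic_durations() {
  local metrics_dir="$1" total=0 file seconds
  for file in "$metrics_dir"/epic-*-metrics.yaml; do
    # Skip the unexpanded glob when no metrics files exist.
    [ -f "$file" ] || continue
    # Take the first duration_seconds value, whatever its indentation.
    seconds=$(awk '/^[[:space:]]*duration_seconds:/ {print $2; exit}' "$file")
    total=$(( total + ${seconds:-0} ))
  done
  echo "$total"
}

# Example: sum_epic_durations "$SPRINT_ARTIFACTS/metrics"
```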
---
## Workflow File Changes
### Modified: `epic-chain/workflow.yaml`
```yaml
# Add to variables section:
variables:
  # ... existing ...
  # UAT Gate Configuration
  uat_gate_enabled: true
  uat_gate_mode: "quick"    # quick | full | skip
  uat_gate_blocking: false  # If true, halts chain on failure
  # Report Configuration
  generate_chain_report: true
  chain_report_file: "{sprint_artifacts}/chain-execution-report.md"
  metrics_folder: "{sprint_artifacts}/metrics"
```
### New: `step-05-uat-gate.md`
```markdown
# UAT Gate Check
## Purpose
Validate epic implementation against automatable UAT scenarios before proceeding.
## Inputs
- UAT document: `docs/uat/epic-{id}-uat.md`
- Gate config: `{uat_gate_mode}`
## Process
1. Parse UAT document for test scenarios
2. Identify automatable scenarios (CLI commands, API calls, file checks)
3. Execute each in isolated shell
4. Collect results with stdout/stderr evidence
5. Determine gate status
## Outputs
- Gate result: PASS | FAIL
- Metrics update: `{metrics_folder}/epic-{id}-metrics.yaml`
## Exit Conditions
- PASS: Continue to next epic
- FAIL + blocking=false: Log warning, continue
- FAIL + blocking=true: Halt chain, require intervention
```
### New: `step-06-generate-report.md`
```markdown
# Chain Report Generation
## Purpose
Generate comprehensive execution report after chain completion.
## Inputs
- All metrics files: `{metrics_folder}/epic-*-metrics.yaml`
- Chain plan: `{chain_plan_file}`
- Template: `chain-report-template.md`
## Process
1. Load all epic metrics
2. Calculate aggregates (totals, averages, percentages)
3. Build dependency graph visualization
4. Compile issues list
5. Generate token/cost estimates
6. Render report template
## Outputs
- `{chain_report_file}` - Complete execution report
```
---
## Agent Interaction Model
### SM Agent (Orchestrator) - With Self-Healing
```
User: *epic-chain 1-8
SM: Loading chain plan...
Found 8 epics, 58 stories
Dependencies analyzed: sequential with branches
Starting Epic 1 (Foundation)...
[Dev → Review → Commit → UAT Gen → Gate Check]
Epic 1: COMPLETE (7/7 stories, UAT PASS)
Starting Epic 2 (Event Ingestion)...
[Dev → Review → Commit → UAT Gen → Gate Check]
⚠ UAT GATE FAILED - 2 scenarios failed
Initiating self-healing loop (attempt 1/2)...
Generating fix context → docs/sprint-artifacts/uat-fix-context-2-1.md
[Invoking Barry for quick-dev fix session]
Barry: Loading fix context...
Failure 1: API endpoint returning 404
Failure 2: Missing auth header validation
Implementing fixes...
✓ Fixed route registration in server.ts
✓ Added auth middleware check
Committed: fix(epic-2): UAT fix #1
Re-validating UAT gate...
✓ All scenarios now passing
Epic 2: COMPLETE (5/5 stories, UAT PASS after 1 fix)
Starting Epic 3 (Workflow Engine)...
...
Chain Complete.
Generating execution report...
Report: docs/sprint-artifacts/chain-execution-report.md
Summary:
- 8/8 epics completed
- 2 epics required self-healing fixes
- Total fix attempts: 3
- All UAT gates now passing
```
### UAT Validator Agent (Validation)
```
User: *uat-validate 1
Quinn: Loading UAT document for Epic 1...
Found 9 scenarios:
- 6 automatable
- 2 semi-automated
- 1 manual only
Executing automatable scenarios...
✓ Scenario 1: Project Initialization
Command: npx heimdall --version
Result: PASS (output: "1.0.0")
✓ Scenario 3: Database Migration
Command: npx heimdall db migrate
Result: PASS (output contains "initialized successfully")
⚠ Scenario 7: Test Email Sending
Command: npx heimdall test-send --to test@example.com
Result: SEMI-AUTO (executed, requires inbox verification)
Message ID: re_abc123
○ Scenario 9: Railway Deployment
Result: SKIPPED (manual only)
Summary: 6/6 automatable PASSED
Recommendation: APPROVE (pending manual verification of 3 scenarios)
```
### Self-Healing Loop Example (Failure → Fix → Pass)
```
Quinn: UAT Gate Check for Epic 2...
✓ Scenario 1: Event Ingestion Endpoint
Command: curl -X POST http://localhost:3000/api/v1/events
Result: PASS (201 Created)
✗ Scenario 2: API Key Authentication
Command: curl -H "X-API-Key: test" http://localhost:3000/api/v1/events
Expected: 200 OK with auth header validated
Actual: 401 Unauthorized
Error: "Missing authentication middleware"
✗ Scenario 3: Events Database Table
Command: npx heimdall db status
Expected: "events table: exists"
Actual: Exit code 1
Error: "relation 'events' does not exist"
UAT_GATE_RESULT: FAIL
CRITICAL_PASSED: 1/3
BLOCKING_ISSUES: [scenario-2, scenario-3]
Generating fix context for quick-dev...
Output: docs/sprint-artifacts/uat-fix-context-2-1.md
---
Barry: Loading fix context for Epic 2 (Attempt 1)...
Issue 1: API Key Authentication failing
- Root cause: Auth middleware not registered on route
- Fix: Add authMiddleware to route handler chain
Issue 2: Events table missing
- Root cause: Migration file exists but wasn't run
- Fix: Ensure migration runs in db:migrate command
Implementing fixes...
✓ Updated packages/api/src/routes/events.ts
✓ Updated packages/cli/src/commands/db.ts
Running self-check...
✓ Tests passing
Committing: fix(epic-2): add auth middleware, fix migration order
---
Quinn: Re-validating UAT Gate for Epic 2...
✓ Scenario 1: Event Ingestion Endpoint - PASS
✓ Scenario 2: API Key Authentication - PASS (was FAIL)
✓ Scenario 3: Events Database Table - PASS (was FAIL)
UAT_GATE_RESULT: PASS
CRITICAL_PASSED: 3/3
Epic 2 approved after 1 fix attempt.
```
---
## Configuration Options
### Default Configuration
```yaml
# {project-root}/_bmad/bmm/config.yaml additions
epic_chain:
  # UAT Settings
  uat:
    gate_enabled: true
    gate_mode: quick            # quick | full | skip
    gate_blocking: false        # Stop chain on failure?
    timeout_seconds: 30         # Per-scenario timeout
  # Report Settings
  report:
    enabled: true
    format: markdown            # markdown | html | both
    include_token_estimates: true
    include_dependency_graph: true
  # Metrics Settings
  metrics:
    enabled: true
    per_story_timing: true
    track_retries: true
```
### Per-Run Override
```bash
# Override gate settings for a specific run
./bmad/scripts/epic-chain.sh 1-8 --uat-gate=full --uat-blocking=true
```
---
## Summary
This integration provides:
1. **Automated Quality Gates** - Verify implementations meet acceptance criteria before proceeding
2. **Self-Healing Fix Loop** - Failed UAT automatically triggers quick-dev to fix issues and re-validate
3. **Comprehensive Reporting** - Generate detailed execution reports with metrics, timing, fix history, and issues
4. **Flexible Configuration** - Adjust gate strictness, retry limits, and escalation behavior per project/run
5. **Clear Traceability** - Every test scenario maps back to story acceptance criteria, and every fix links back to the failure that triggered it
6. **Graceful Degradation** - Semi-automated and manual scenarios are documented but do not block the chain
### Agent Responsibilities
| Agent | Role | Key Actions |
|-------|------|-------------|
| **SM (Bob)** | Chain Orchestrator | Runs epic-chain, coordinates phases, triggers fix loops |
| **Quinn** | UAT Validator | Executes scenarios, generates fix context on failure |
| **Barry** | Quick Dev Fixer | Receives fix context, implements targeted fixes, commits |
### Self-Healing Flow
```
UAT Fail → Generate Fix Context → Quick Dev Fix → Re-validate → Pass/Retry/Halt
```
The maximum retry count (default: 2) prevents infinite loops. After max retries, the chain halts and requires human intervention, ensuring issues are surfaced rather than ignored.


@ -53,3 +53,7 @@ agent:
    - trigger: EC or fuzzy match on epic-chain
      workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/epic-chain/workflow.yaml"
      description: "[EC] Analyze and execute multiple epics in sequence with dependency detection"
    - trigger: UV or fuzzy match on uat-validate
      workflow: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate/workflow.yaml"
      description: "[UV] Validate epic against UAT scenarios with self-healing fix loop on failure"


@ -0,0 +1,108 @@
# UAT Validator Agent Definition
# Mock/Proposal - integrates UAT validation into epic chain execution
agent:
  metadata:
    id: "_bmad/bmm/agents/uat-validator.md"
    name: Quinn
    title: UAT Validator
    icon:
    module: bmm
    hasSidecar: false
  persona:
    role: User Acceptance Testing Specialist + Quality Gate Enforcer
    identity: Meticulous QA professional with deep experience in end-to-end testing, user journey validation, and acceptance criteria verification. Expert at translating technical implementations into user-facing test scenarios and identifying gaps between requirements and reality.
    communication_style: "Methodical and evidence-based. Every test has a clear purpose, every result documented with proof. Finds issues before users do."
    principles: |
      - UAT validates user value, not implementation details
      - Acceptance criteria are the contract between dev and stakeholder
      - Test execution is repeatable and traceable
      - Issues categorized by business impact, not technical severity
      - Automation where possible, human judgment where necessary
  critical_actions:
    - "Always load the UAT document for the epic being validated before any test execution"
    - "Map each test scenario back to specific story acceptance criteria"
    - "Execute automatable scenarios (CLI commands, API calls, health checks) directly via shell"
    - "Document all test results with pass/fail status and evidence (output, screenshots, logs)"
    - "Generate validation report with clear go/no-go recommendation"
    - "Find if this exists, if it does, always treat it as the bible I plan and execute against: `**/project-context.md`"
  # Knowledge sources for test execution
  conversational_knowledge:
    - name: "uat-automation-patterns"
      description: "Patterns for automating common UAT scenario types"
      path: "{project-root}/_bmad/bmm/data/uat-automation-patterns.yaml"
  menu:
    # Primary: Execute UAT validation for a completed epic
    - trigger: UV or fuzzy match on uat-validate
      workflow: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate/workflow.yaml"
      description: "[UV] Execute UAT scenarios and validate epic against acceptance criteria (triggers self-healing on failure)"
    # Review and summarize UAT results
    - trigger: UR or fuzzy match on uat-report
      exec: "{project-root}/_bmad/bmm/workflows/5-validation/uat-report/workflow.md"
      description: "[UR] Generate UAT validation report with pass/fail summary and recommendations"
    # Quick validation - just check automatable scenarios
    - trigger: UQ or fuzzy match on uat-quick
      action: |
        Execute only the automatable UAT scenarios for the specified epic:
        1. Load UAT document from docs/uat/epic-{id}-uat.md
        2. Identify scenarios that can be automated (CLI commands, API endpoints, health checks)
        3. Execute each automatable scenario in sequence
        4. Document pass/fail for each with output evidence
        5. Report summary: X of Y automatable scenarios passed
        Skip scenarios requiring: manual UI interaction, external service verification, human judgment
      description: "[UQ] Quick validation - execute only automatable UAT scenarios"
    # Gate check - binary pass/fail for chain continuation
    - trigger: UG or fuzzy match on uat-gate
      action: |
        Perform UAT gate check to determine if epic chain should continue:
        1. Load UAT document and success criteria summary
        2. Execute critical path scenarios (marked as required)
        3. Check all "Minimum Requirements for Sign-off" items
        4. Return: GATE_PASS (all critical passed) or GATE_FAIL (any critical failed)
        Output format for script parsing:
        UAT_GATE_RESULT: PASS|FAIL
        CRITICAL_PASSED: X/Y
        BLOCKING_ISSUES: [list if any]
        On FAIL: Generate fix context document for quick-dev self-healing loop.
      description: "[UG] UAT gate check - binary pass/fail for epic chain continuation"
    # Fix context generation for self-healing loop
    - trigger: UF or fuzzy match on uat-fix-context
      action: |
        Generate fix context document from failed UAT scenarios:
        1. Load failed scenario results from last UAT gate check
        2. For each failure:
           - Extract scenario ID, name, expected vs actual
           - Capture error output / stack traces
           - Link to related story and acceptance criteria
        3. Prioritize failures by severity (blocking first)
        4. Generate root cause hints where determinable
        5. Output to: docs/sprint-artifacts/uat-fix-context-{epic}-{attempt}.md
        This document becomes the input for Barry's quick-dev fix session.
      description: "[UF] Generate fix context document for quick-dev self-healing"
    # Scenario generator from stories
    - trigger: US or fuzzy match on uat-scenarios
      action: |
        Generate UAT test scenarios from completed story acceptance criteria:
        1. Load all story files for the specified epic
        2. Extract acceptance criteria from each story
        3. Transform criteria into testable scenarios with:
           - Clear preconditions
           - Step-by-step actions
           - Expected results
           - Pass/fail criteria
        4. Categorize as: automatable | semi-automated | manual-only
        5. Output to docs/uat/epic-{id}-uat.md
      description: "[US] Generate UAT scenarios from story acceptance criteria"
  webskip: true


@ -0,0 +1,290 @@
# UAT Validate Workflow Instructions
## Purpose
Execute User Acceptance Testing scenarios against a completed epic to validate that implementations meet acceptance criteria. On failure, generate fix context for the self-healing quick-dev loop.
## Workflow Overview
```
Load UAT Doc → Classify Scenarios → Execute Automatable → Evaluate Gate → Report/Fix
```
---
## Phase 1: Load and Classify
### 1.1 Load UAT Document
Load the UAT document from `{uat_docs_location}/epic-{epic_id}-uat.md`.
**Required sections to parse:**
- Test Scenarios (numbered list with steps)
- Success Criteria (checkbox items)
- Prerequisites (environment requirements)
### 1.2 Classify Each Scenario
Classify each test scenario by the indicators that appear in its steps:
| Classification | Indicators | Action |
|----------------|------------|--------|
| **Automatable** | `npx`, `npm run`, `curl`, `--version`, `/health`, `db status`, `config validate` | Execute via shell |
| **Semi-automated** | `test-send`, `email`, `inbox`, `check your` | Execute + flag for manual verify |
| **Manual only** | `Railway`, `dashboard`, `browser`, `two terminal`, `visual` | Skip, add to checklist |
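
The indicator table can be read as a keyword classifier. The following is a minimal Python sketch of that idea, not the workflow's actual implementation. The function name `classify_scenario`, the `unclassified` fallback, and the precedence order (manual only, then semi-automated, then automatable) are assumptions; some precedence is needed because a step such as `npx heimdall test-send` contains both an automatable and a semi-automated indicator.

```python
# Indicator lists copied from the classification table above.
INDICATORS = {
    "manual_only": ["Railway", "dashboard", "browser", "two terminal", "visual"],
    "semi_automated": ["test-send", "email", "inbox", "check your"],
    "automatable": ["npx", "npm run", "curl", "--version", "/health",
                    "db status", "config validate"],
}

# Assumed precedence: the most restrictive classification wins.
PRECEDENCE = ["manual_only", "semi_automated", "automatable"]


def classify_scenario(text: str) -> str:
    """Return the first classification whose indicators appear in the scenario text."""
    lowered = text.lower()
    for label in PRECEDENCE:
        if any(indicator.lower() in lowered for indicator in INDICATORS[label]):
            return label
    # Hypothetical fallback for scenarios with no recognizable indicator.
    return "unclassified"
```

Plain substring matching is deliberately naive here; a real classifier would likely need word boundaries or per-step parsing to avoid false positives.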
### 1.3 Output Classification Summary
```yaml
scenarios:
  total: 9
  automatable: 6
  semi_automated: 2
  manual_only: 1
automatable_scenarios:
  - id: 1
    name: "Project Initialization"
    command: "npx heimdall --version"
    expected: "displays a version number"
  - id: 3
    name: "Database Migration"
    command: "npx heimdall db migrate"
    expected: "success message"
  # ...
```
---
## Phase 2: Execute Scenarios
### 2.1 Execute Each Automatable Scenario
For each automatable scenario:
1. **Extract command** from scenario steps (look for code blocks or CLI references)
2. **Set timeout** based on `{timeout_per_scenario}` (default: 30s)
3. **Execute command** via shell
4. **Capture output** (stdout, stderr, exit code)
5. **Evaluate result** against expected outcome
### 2.2 Result Evaluation Rules
| Condition | Result |
|-----------|--------|
| Exit code 0 + output matches expected | PASS |
| Exit code 0 + output doesn't match | FAIL (unexpected output) |
| Exit code non-zero | FAIL (command error) |
| Timeout exceeded | FAIL (timeout) |
| Command not found | FAIL (missing dependency) |
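
The rules in this table map closely onto Python's `subprocess` module. The sketch below is one possible reading, under stated assumptions: the helper name `run_scenario` is hypothetical, the command is passed as an argument list rather than a shell string, and the expected outcome is checked with a simple substring test.

```python
import subprocess
import sys


def run_scenario(argv, expected=None, timeout=30):
    """Run one scenario command and map the outcome to a result label."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        # The executable does not exist on PATH.
        return "FAIL (missing dependency)"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "FAIL (timeout)"
    if proc.returncode != 0:
        return "FAIL (command error)"
    if expected is not None and expected not in proc.stdout:
        return "FAIL (unexpected output)"
    return "PASS"


if __name__ == "__main__":
    # The current Python interpreter stands in for a real CLI such as `npx heimdall`.
    print(run_scenario([sys.executable, "-c", "print('1.0.0')"], expected="1.0.0"))
```

Passing an argument list avoids shell quoting issues, at the cost of not supporting pipes or redirects that a UAT document might contain.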
### 2.3 Expected Output Matching
Use flexible matching:
- **Contains match**: Expected text appears anywhere in output
- **Regex match**: Pattern matches output
- **Exit code only**: Just verify success (exit 0)
Example matching rules:
```yaml
- scenario: "Database Migration"
  command: "npx heimdall db migrate"
  match_type: "contains"
  expected: ["initialized successfully", "migration complete", "already up to date"]
  # PASS if any of these appear in output
```
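
As an illustration of the three matching modes, here is a small Python sketch. Only `contains` appears as a `match_type` value in the example above; the values `regex` and `exit_code`, and the function name `output_matches`, are assumptions made for this sketch.

```python
import re


def output_matches(output, exit_code, match_type, expected=()):
    """Return True when the command output satisfies the matching rule."""
    if exit_code != 0:
        # A non-zero exit code fails regardless of the output.
        return False
    if match_type == "exit_code":
        return True
    if match_type == "contains":
        # PASS if any expected string appears anywhere in the output.
        return any(text in output for text in expected)
    if match_type == "regex":
        return any(re.search(pattern, output) for pattern in expected)
    raise ValueError(f"unknown match_type: {match_type}")
```

Treating `expected` as a list in every mode keeps the "any of these" semantics from the YAML example uniform across `contains` and `regex`.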
### 2.4 Record Execution Results
```yaml
execution_results:
  - scenario_id: 1
    name: "Project Initialization"
    command: "npx heimdall --version"
    status: "PASS"
    exit_code: 0
    output: "1.0.0"
    duration_ms: 1250
  - scenario_id: 3
    name: "Database Migration"
    command: "npx heimdall db migrate"
    status: "FAIL"
    exit_code: 1
    output: ""
    stderr: "Error: relation 'events' does not exist"
    duration_ms: 3400
```
---
## Phase 3: Evaluate Gate
### 3.1 Determine Gate Status
**Gate Mode: Quick** (default)
- Only evaluate automatable scenarios
- All automatable scenarios must pass for GATE_PASS
**Gate Mode: Full**
- Evaluate automatable + semi-automated
- Semi-automated failures are warnings, not blockers
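
The two gate modes can be sketched as a single function. This is an illustration under assumptions, not the workflow's implementation: the result records are assumed to be dicts with `id`, `classification`, and `status` keys, and the `skip` mode is left out because this section does not define its behavior.

```python
def evaluate_gate(results, gate_mode="quick"):
    """Return (gate_result, blocking_ids, warning_ids) for a list of scenario results."""
    if gate_mode not in ("quick", "full"):
        raise ValueError(f"unsupported gate_mode: {gate_mode}")

    # In both modes, only automatable failures block the gate.
    blocking = [r["id"] for r in results
                if r["classification"] == "automatable" and r["status"] != "PASS"]

    # In full mode, semi-automated failures are reported as warnings only.
    warnings = []
    if gate_mode == "full":
        warnings = [r["id"] for r in results
                    if r["classification"] == "semi_automated" and r["status"] != "PASS"]

    gate_result = "PASS" if not blocking else "FAIL"
    return gate_result, blocking, warnings
```

With no automatable scenarios at all, this sketch returns `PASS`, since there is nothing to block on.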
### 3.2 Gate Decision Logic
```
if all_automatable_passed:
    GATE_RESULT = PASS
else:
    GATE_RESULT = FAIL
    if self_heal_enabled and attempts < max_retries:
        generate_fix_context()
        trigger_quick_dev()
    elif attempts >= max_retries:
        halt_chain()
```
### 3.3 Generate Fix Context (On Failure)
When gate fails, create fix context document at `{fix_context_file}`:
**Include for each failure:**
1. Scenario ID and name
2. Command that was executed
3. Expected result
4. Actual result (stdout/stderr)
5. Exit code
6. Related story ID (if determinable)
7. Acceptance criteria from story
**Use template:** `{installed_path}/uat-fix-context-template.md`
---
## Phase 4: Report Results
### 4.1 Update Metrics File
Update or create `{metrics_file}` with validation results:
```yaml
validation:
  gate_executed: true
  gate_mode: "quick"
  timestamp: "{date}"
  results:
    passed: 5
    failed: 1
    skipped: 3
  gate_status: "FAIL"
  blocking_issues:
    - scenario_id: 3
      name: "Database Migration"
      error: "relation 'events' does not exist"
  fix_attempts: 1
  fix_history:
    - attempt: 1
      timestamp: "{date}"
      failed_scenarios: ["scenario-3"]
      fix_context: "docs/sprint-artifacts/uat-fix-context-1-1.md"
      result: "pending"
```
### 4.2 Output Gate Result
Output in parseable format for orchestration scripts:
```
UAT_GATE_RESULT: FAIL
CRITICAL_PASSED: 5/6
BLOCKING_ISSUES: [scenario-3]
FIX_CONTEXT: docs/sprint-artifacts/uat-fix-context-1-1.md
FIX_ATTEMPT: 1/2
```
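
An orchestration script could read this format back with a few lines of Python. The sketch below is illustrative only; the helper names `parse_gate_output` and `parse_blocking_issues` are hypothetical, and the parser assumes every signal is a single `KEY: value` line whose key uses only uppercase letters and underscores.

```python
import re

# Matches signal lines such as "UAT_GATE_RESULT: FAIL".
KEY_LINE = re.compile(r"^([A-Z_]+):\s*(.*)$")


def parse_gate_output(text):
    """Collect KEY: value signal lines into a dict, ignoring any other output."""
    fields = {}
    for line in text.splitlines():
        match = KEY_LINE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def parse_blocking_issues(value):
    """Turn "[scenario-2, scenario-3]" into ["scenario-2", "scenario-3"]."""
    inner = value.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]
```

Because lines that do not match the key pattern are skipped, the same parser can be pointed at output that also contains the human-readable console summary.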
### 4.3 Console Summary
Also output human-readable summary:
```
UAT Validation Results - Epic 1
================================
Scenarios Executed: 6/9
- Automatable: 6 (executed)
- Semi-automated: 2 (flagged for manual)
- Manual only: 1 (skipped)
Results:
✓ Scenario 1: Project Initialization - PASS
✓ Scenario 2: Configuration Setup - PASS
✗ Scenario 3: Database Migration - FAIL
Error: relation 'events' does not exist
✓ Scenario 4: Connection Validation - PASS
✓ Scenario 5: Worker Process Startup - PASS
✓ Scenario 6: Job Queue Testing - PASS
Gate Status: FAIL (5/6 passed)
Initiating self-healing fix loop (attempt 1/2)...
Fix context generated: docs/sprint-artifacts/uat-fix-context-1-1.md
```
---
## Self-Healing Integration
### Triggering Quick Dev
When gate fails and `self_heal_enabled: true`:
1. Generate fix context document
2. Invoke quick-dev workflow with fix context as input
3. After quick-dev completes, re-run UAT validate
4. Repeat until PASS or max_retries reached
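
The four steps above amount to a bounded retry loop. The following Python sketch shows the control flow only; `validate` and `fix` are hypothetical callables standing in for the UAT validate run and the quick-dev session, and the `"HALT"` return value assumes `on_max_retries` is left at `halt`.

```python
def self_heal_loop(validate, fix, max_retries=2):
    """Re-run validation after each fix until the gate passes or retries run out.

    validate() returns True when the gate passes.
    fix(attempt) generates the fix context and runs one fix session.
    Returns (outcome, fix_attempts_used).
    """
    attempts = 0
    while True:
        if validate():
            return "PASS", attempts
        if attempts >= max_retries:
            # Retry budget exhausted: stop and leave the failure for a human.
            return "HALT", attempts
        attempts += 1
        fix(attempts)
```

With `max_retries=2`, validation runs at most three times: once initially and once after each of the two fix attempts.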
### Orchestration Script Signal
For shell orchestration, output these signals:
```bash
# On PASS
echo "UAT_GATE_RESULT: PASS"
exit 0
# On FAIL with fix attempt
echo "UAT_GATE_RESULT: FAIL"
echo "UAT_FIX_REQUIRED: true"
echo "UAT_FIX_CONTEXT: {path}"
exit 1
# On FAIL after max retries
echo "UAT_GATE_RESULT: FAIL"
echo "UAT_MAX_RETRIES: true"
echo "UAT_HALT_CHAIN: true"
exit 2
```
---
## Configuration Reference
| Variable | Default | Description |
|----------|---------|-------------|
| `gate_mode` | `quick` | Which scenarios to evaluate (quick/full/skip) |
| `timeout_per_scenario` | `30` | Seconds before a scenario times out |
| `self_heal_enabled` | `true` | Trigger quick-dev on failure |
| `max_retries` | `2` | Fix attempts before halting |
| `on_max_retries` | `halt` | Action when max retries are exceeded (`halt`, `continue_with_warning`, or `notify_human`) |
---
## Error Handling
| Error | Action |
|-------|--------|
| UAT document not found | FAIL with clear error message |
| No automatable scenarios | WARN, gate passes (nothing to validate) |
| All scenarios manual | WARN, generate manual checklist, gate passes |
| Scenario command unclear | Skip scenario, log warning |
| Shell execution fails | Record as FAIL with error details |


@ -0,0 +1,112 @@
# UAT Fix Context - Epic {epic_id} (Attempt {attempt})
**Generated:** {timestamp}
**Epic:** {epic_name}
**Gate Result:** FAIL ({passed}/{total} scenarios passed)
---
## Summary
This document contains the context needed to fix UAT failures for Epic {epic_id}. Load this document and implement targeted fixes for each failure listed below.
**Failures to fix:** {failure_count}
**Fix attempt:** {attempt} of {max_retries}
---
## Failed Scenarios
{#each failed_scenario}
### Scenario {scenario_id}: {scenario_name}
**Command Executed:**
```bash
{command}
```
**Expected Result:**
{expected_result}
**Actual Result:**
```
{actual_output}
```
**Error Output:**
```
{stderr}
```
**Exit Code:** {exit_code}
**Related Story:** {story_id}
**Acceptance Criteria:**
{#each acceptance_criteria}
- {criterion}
{/each}
**Root Cause Hint:**
{root_cause_hint}
**Files Likely Involved:**
{#each likely_files}
- `{file_path}`
{/each}
---
{/each}
## Fix Instructions
Address the failures above in priority order. For each fix:
1. **Analyze** - Understand why the scenario failed
2. **Locate** - Find the relevant code files
3. **Fix** - Implement the minimum change to resolve the failure
4. **Verify** - Run the scenario command locally to confirm the fix
5. **Commit** - Use message format: `fix(epic-{epic_id}): {description}`
### Priority Order
{#each failed_scenario}
{priority}. **Scenario {scenario_id}**: {one_line_description}
{/each}
---
## Constraints
- Only fix the identified failures - do not refactor unrelated code
- Run the specific failing commands to verify each fix
- Run project tests after all fixes: `npm test`
- Commit with conventional format: `fix(epic-{epic_id}): {description}`
- If a fix requires changes that would break other scenarios, document the tradeoff
---
## Context Files
The following files may provide additional context:
| File | Purpose |
|------|---------|
| `{uat_doc_path}` | Full UAT document with all scenarios |
| `{story_files}` | Story files with complete acceptance criteria |
| `{architecture_doc}` | System architecture reference |
---
## After Fixing
Once all fixes are committed, the UAT validation will automatically re-run.
- **If all pass:** Epic continues to next phase
- **If failures remain:** Another fix context will be generated (attempt {next_attempt})
- **If max retries exceeded:** Chain halts for human intervention
---
*Generated by UAT Validate Workflow*
*BMAD Method - Epic Chain Self-Healing*


@ -0,0 +1,156 @@
# UAT Validate Workflow
name: uat-validate
description: "Execute User Acceptance Testing scenarios against a completed epic, validate implementations meet acceptance criteria, and trigger self-healing fix loops on failures"
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
date: system-generated
planning_artifacts: "{config_source}:planning_artifacts"
implementation_artifacts: "{config_source}:implementation_artifacts"
sprint_artifacts: "{config_source}:sprint_artifacts"
output_folder: "{sprint_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate"
instructions: "{installed_path}/instructions.md"
fix_context_template: "{installed_path}/uat-fix-context-template.md"
# Variables and inputs
variables:
  # Project context
  project_context: "**/project-context.md"
  project_name: "{config_source}:project_name"
  # UAT document locations
  uat_docs_location: "{planning_artifacts}/uat"
  uat_doc_pattern: "epic-{epic_id}-uat.md"
  # Story locations (for acceptance criteria reference)
  stories_location: "{implementation_artifacts}"
  # Metrics output
  metrics_folder: "{sprint_artifacts}/metrics"
  metrics_file: "{metrics_folder}/epic-{epic_id}-metrics.yaml"
  # Fix context output (for self-healing loop)
  fix_context_file: "{sprint_artifacts}/uat-fix-context-{epic_id}-{attempt}.md"
  # Gate configuration
  gate_mode: "quick"              # quick | full | skip
  timeout_per_scenario: 30        # seconds per scenario execution
  # Self-healing configuration
  self_heal_enabled: true
  max_retries: 2                  # maximum fix attempts before halting
  fix_workflow: "quick-dev"       # workflow to invoke for fixes
  on_max_retries: "halt"          # halt | continue_with_warning | notify_human
# Scenario classification patterns
scenario_patterns:
  automatable:
    description: "Scenarios that can be fully automated via shell execution"
    indicators:
      - "npx"
      - "npm run"
      - "curl"
      - "wget"
      - "--version"
      - "db status"
      - "db migrate"
      - "config validate"
      - "/health"
      - "test-queue"
    validation_method: "shell_execution"
  semi_automated:
    description: "Scenarios that can be executed but require manual verification"
    indicators:
      - "test-send"
      - "email"
      - "inbox"
      - "check your"
      - "verify in browser"
    validation_method: "execute_and_flag"
  manual_only:
    description: "Scenarios requiring full human interaction"
    indicators:
      - "Railway"
      - "dashboard"
      - "two terminal"
      - "side by side"
      - "browser"
      - "visual"
    validation_method: "skip_with_checklist"
# Input file patterns
input_file_patterns:
  uat_document:
    description: "UAT document for the epic being validated"
    pattern: "{uat_docs_location}/epic-{epic_id}-uat.md"
    load_strategy: "FULL_LOAD"
    required: true
  stories:
    description: "Story files for acceptance criteria reference"
    pattern: "{stories_location}/story-{epic_id}.*.md"
    load_strategy: "METADATA_ONLY"
    required: false
  epic_metrics:
    description: "Existing metrics file if re-validating"
    pattern: "{metrics_folder}/epic-{epic_id}-metrics.yaml"
    load_strategy: "FULL_LOAD"
    required: false
# Output files
outputs:
  gate_result:
    description: "Gate pass/fail result for script parsing"
    format: |
      UAT_GATE_RESULT: {PASS|FAIL}
      CRITICAL_PASSED: {n}/{total}
      BLOCKING_ISSUES: [{scenario_ids}]
      FIX_CONTEXT: {path_if_generated}
  metrics_update:
    description: "Updated metrics file with validation results"
    path: "{metrics_file}"
  fix_context:
    description: "Fix context document for quick-dev (only on failure)"
    path: "{fix_context_file}"
    condition: "gate_result == FAIL"
# Workflow phases
phases:
  - name: "load"
    description: "Load UAT document and classify scenarios"
    outputs:
      - scenario_list
      - classification_summary
  - name: "execute"
    description: "Execute automatable scenarios via shell"
    outputs:
      - execution_results
      - pass_count
      - fail_count
  - name: "evaluate"
    description: "Determine gate status and generate fix context if needed"
    outputs:
      - gate_result
      - fix_context (conditional)
  - name: "report"
    description: "Update metrics and output results"
    outputs:
      - metrics_file
      - gate_output
standalone: true
web_bundle: false