feat: add UAT validation workflow with self-healing fix loop

- Add uat-validator agent (Quinn) with triggers for validation, reporting, and fix context generation
- Add 5-validation/uat-validate workflow with scenario classification and shell execution
- Add SM agent trigger [UV] for uat-validate workflow
- Add architecture docs for UAT integration with epic-chain
- Support automatic quick-dev fix sessions when UAT gate fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 14:52:32 -06:00 · 2026-01-05 14:52:32 -06:00 · 2f9dc39c0b
parent 66186e1438
commit 2f9dc39c0b
8 changed files with 2050 additions and 0 deletions
--- a/docs/improvements/epic-chain-report-proposal.md
+++ b/docs/improvements/epic-chain-report-proposal.md
@@ -0,0 +1,349 @@
+# Epic Chain Execution Report Generator - Proposal
+
+## Overview
+
+This proposal describes how to automatically generate a comprehensive Epic Chain Execution Report at the end of each epic chain run, similar to the sample `epic-chain-execution-report.md`.
+
+---
+
+## 1. Report Generation Strategy
+
+### When to Generate
+
+The report should be generated as **Phase 5** of the epic-chain workflow, after all epics complete:
+
+```
+Epic 1 → Epic 2 → ... → Epic N → [Report Generation] → [Optional: UAT Gate]
+```
+
+### Data Sources
+
+The report aggregates data from multiple sources created during execution:
+
+| Source | Location | Data Extracted |
+|--------|----------|----------------|
+| Chain Plan | `{sprint_artifacts}/chain-plan.yaml` | Epic order, dependencies, total stories |
+| Execution Logs | `{sprint_artifacts}/epic-{id}-execution.md` | Per-epic timing, status, issues |
+| Story Files | `docs/stories/*.md` | Story count, completion status |
+| UAT Documents | `docs/uat/epic-{id}-uat.md` | UAT generation confirmation |
+| Git Log | `git log --oneline` | Commit count per epic |
+| Handoffs | `docs/handoffs/*.md` | Cross-epic context transfers |
+
+---
+
+## 2. Workflow Integration
+
+### Option A: Add Phase to Epic Chain (Recommended)
+
+Modify `epic-chain/workflow.yaml` to include a report generation step:
+
+```yaml
+# In workflow.yaml variables section
+variables:
+  # ... existing variables ...
+
+  # Report configuration
+  chain_report_file: "{sprint_artifacts}/chain-execution-report.md"
+  generate_report: true
+  report_detail_level: "full"  # summary | standard | full
+
+# Add step reference
+steps:
+  # ... existing steps ...
+  - step: generate-report
+    file: step-06-generate-report.md
+    when: "chain_complete"
+    outputs:
+      - "{chain_report_file}"
+```
+
+### Option B: Separate Workflow (Alternative)
+
+Create `epic-chain-report/workflow.yaml` triggered post-chain:
+
+```yaml
+name: epic-chain-report
+description: "Generate execution report from completed epic chain"
+trigger: "post-chain"
+
+input_file_patterns:
+  chain_plan:
+    path: "{sprint_artifacts}/chain-plan.yaml"
+    required: true
+  execution_logs:
+    pattern: "{sprint_artifacts}/epic-*-execution.md"
+    load_strategy: "FULL_LOAD"
+```
+
+---
+
+## 3. Report Template Structure
+
+### Proposed Template: `chain-report-template.md`
+
+```markdown
+# {project_name} - Epic Chain Execution Report
+
+## Executive Summary
+
+**Project:** {project_name}
+**Execution Method:** BMAD Epic Chain (automated AI-driven development)
+**Status:** {chain_status}
+
+| Metric | Value |
+|--------|-------|
+| Total Epics | {epic_count} |
+| Total Stories | {story_count} |
+| Start Time | {start_time} |
+| End Time | {end_time} |
+| Total Duration | {duration} |
+| Average per Story | {avg_story_time} |
+
+---
+
+## Timeline
+
+### Epic Execution Duration
+
+| Epic | Name | Stories | Duration | Status |
+|------|------|---------|----------|--------|
+{epic_timeline_rows}
+| **Total** | | **{story_count}** | **{duration}** | **{completion_pct}%** |
+
+---
+
+## Dependency Graph
+
+{dependency_graph_mermaid}
+
+### Explicit Dependencies
+
+| Epic | Depends On | Reason |
+|------|------------|--------|
+{dependency_table_rows}
+
+---
+
+## What Was Built
+
+{per_epic_summary}
+
+---
+
+## Issues Encountered
+
+{issues_section}
+
+---
+
+## Artifacts Generated
+
+| Artifact | Location | Description |
+|----------|----------|-------------|
+| Story Files | `docs/stories/` | {story_count} completed stories |
+| UAT Documents | `docs/uat/` | {epic_count} UAT test documents |
+| Epic Files | `docs/epics/` | {epic_count} epic definitions |
+| Handoffs | `docs/handoffs/` | Cross-epic context documents |
+| Chain Plan | `{chain_plan_file}` | Execution plan with dependencies |
+
+---
+
+## Metrics
+
+### Estimated Token Usage
+
+| Epic | Stories | Est. Calls | Est. Input | Est. Output | Est. Total |
+|------|---------|------------|------------|-------------|------------|
+{token_estimate_rows}
+
+### Cost Estimates
+
+| Model | Input Cost | Output Cost | Total |
+|-------|------------|-------------|-------|
+| Claude 3.5 Sonnet | ~${sonnet_input} | ~${sonnet_output} | ~${sonnet_total} |
+| Claude Opus | ~${opus_input} | ~${opus_output} | ~${opus_total} |
+
+---
+
+## UAT Validation Status
+
+| Epic | UAT Doc | Automatable | Auto-Passed | Manual Required | Status |
+|------|---------|-------------|-------------|-----------------|--------|
+{uat_status_rows}
+
+---
+
+## Next Steps
+
+1. **Review UAT Documents** - Review the {epic_count} UAT documents in `docs/uat/`
+2. **Execute UAT Validation** - Run `/uat-validator` for automated scenario testing
+3. **Manual Acceptance Testing** - Execute manual test scenarios
+4. **Code Review** - Review generated code for refinements
+5. **Deploy to Staging** - Deploy complete system to staging environment
+
+---
+
+*Report generated: {generation_timestamp}*
+*BMAD Method v{bmad_version}*
+```
+
+---
+
+## 4. Data Collection During Execution
+
+### Metrics to Track Per Epic
+
+Add the following to the `epic-execute` workflow to collect data for the report:
+
+```yaml
+# Proposed: epic-metrics.yaml (created per epic)
+epic_id: 1
+epic_name: "Foundation, CLI & Deployment"
+stories:
+  total: 7
+  completed: 7
+  failed: 0
+  skipped: 0
+timing:
+  start_time: "2026-01-02T13:40:00Z"
+  end_time: "2026-01-02T15:10:00Z"
+  duration_seconds: 5400
+  avg_story_seconds: 771
+issues:
+  - story: "1-3"
+    type: "signaling_mismatch"
+    description: "Completed but didn't output expected phrase"
+    resolution: "manual_status_update"
+dependencies:
+  requires: []
+  enables: ["2", "5"]
+artifacts:
+  stories_created: 7
+  uat_generated: true
+  commits: 7
+```
+
+### Collection Script Enhancement
+
+The orchestration script (`epic-chain.sh`) should:
+
+1. **Start timer** at chain initialization
+2. **Per epic**: Record start/end times, story counts, issues
+3. **Write metrics** to `{sprint_artifacts}/epic-{id}-metrics.yaml`
+4. **On completion**: Trigger report generation step
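+
+Steps 2 and 3 above could be sketched as follows. This is a minimal illustration, not code from `epic-chain.sh`: the `write_epic_metrics` helper, its argument order, and the `SPRINT_ARTIFACTS` variable are assumptions, and only a subset of the proposed fields is written.
+
+```bash
+#!/usr/bin/env bash
+# Sketch only: write_epic_metrics and SPRINT_ARTIFACTS are illustrative names.
+set -euo pipefail
+
+SPRINT_ARTIFACTS="${SPRINT_ARTIFACTS:-docs/sprint-artifacts}"
+
+# Write a subset of the proposed per-epic metrics.
+# Args: epic_id, start and end (Unix seconds), completed story count
+write_epic_metrics() {
+    local epic_id="$1" start="$2" end="$3" completed="$4"
+    local duration=$(( end - start ))
+    local avg=0
+
+    # Avoid dividing by zero when no story completed
+    if (( completed > 0 )); then
+        avg=$(( duration / completed ))
+    fi
+
+    mkdir -p "$SPRINT_ARTIFACTS"
+    cat > "$SPRINT_ARTIFACTS/epic-${epic_id}-metrics.yaml" <<EOF
+epic_id: ${epic_id}
+stories:
+  completed: ${completed}
+timing:
+  duration_seconds: ${duration}
+  avg_story_seconds: ${avg}
+EOF
+}
+```
+
+The orchestration script would capture `date +%s` before and after each epic and pass both values in, for example `write_epic_metrics 1 "$epic_start" "$(date +%s)" 7`.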
+
+---
+
+## 5. UAT Validation Integration
+
+### Gate Check Before Next Epic (Optional)
+
+```yaml
+# In epic-chain workflow
+chain_mode: "dependency-aware"
+uat_gate:
+  enabled: true
+  mode: "quick"  # quick | full | skip
+  blocking: false  # If true, stops chain on UAT failure
+
+# After each epic completes:
+# 1. Generate UAT doc (already in epic-execute)
+# 2. Run uat-quick validation
+# 3. Record results in metrics
+# 4. Continue or halt based on blocking setting
+```
+
+### Validation Flow
+
+```
+Epic Complete
+     │
+     ▼
+Generate UAT Doc
+     │
+     ▼
+Run UAT Quick ──────┐
+(automatable only)  │
+     │              │
+     ▼              ▼
+ PASS           FAIL
+   │              │
+   ▼              ▼
+Continue     blocking=true? ──► HALT CHAIN
+                │
+                ▼ (blocking=false)
+           Log Warning
+                │
+                ▼
+           Continue
+```
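+
+The branch logic in the diagram is small enough to sketch directly. The function name `uat_gate_decision` and its string arguments are assumptions made for illustration; the proposal does not define them.
+
+```bash
+# Sketch only: uat_gate_decision is an illustrative name.
+# Args: result (PASS|FAIL), blocking (true|false).
+# Prints "continue" or "halt" on stdout.
+uat_gate_decision() {
+    local result="$1" blocking="$2"
+
+    if [ "$result" = "PASS" ]; then
+        echo "continue"
+    elif [ "$blocking" = "true" ]; then
+        echo "halt"
+    else
+        # Non-blocking failure: warn on stderr, keep the chain running
+        echo "WARNING: UAT gate failed, continuing (blocking=false)" >&2
+        echo "continue"
+    fi
+}
+```
+
+Writing the warning to stderr keeps stdout clean, so a caller can compare the printed decision directly.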
+
+---
+
+## 6. Implementation Phases
+
+### Phase 1: Metrics Collection
+- [ ] Add timing instrumentation to `epic-execute.sh`
+- [ ] Create `epic-metrics.yaml` output per epic
+- [ ] Store in `{sprint_artifacts}/metrics/`
+
+### Phase 2: Report Generation
+- [ ] Create `step-06-generate-report.md` for epic-chain
+- [ ] Build `chain-report-template.md` template
+- [ ] Add report generation to workflow.yaml
+
+### Phase 3: UAT Integration
+- [ ] Create UAT Validator agent (see `uat-validator.agent.yaml`)
+- [ ] Add `uat-validate/workflow.yaml`
+- [ ] Integrate gate check into epic-chain
+
+### Phase 4: Visualization
+- [ ] Add Mermaid dependency graph generation
+- [ ] Add timeline visualization
+- [ ] Consider HTML report option
+
+---
+
+## 7. Report Generation Agent Action
+
+For the SM agent or a dedicated Report Generator, add this action:
+
+```yaml
+- trigger: CR or fuzzy match on chain-report
+  action: |
+    Generate Epic Chain Execution Report:
+    1. Load chain-plan.yaml for epic list and dependencies
+    2. For each epic, load epic-{id}-metrics.yaml
+    3. Aggregate timing, story counts, issues
+    4. Generate dependency graph (Mermaid format)
+    5. Calculate token/cost estimates
+    6. Load UAT validation results if available
+    7. Render template with collected data
+    8. Output to {sprint_artifacts}/chain-execution-report.md
+  description: "[CR] Generate comprehensive execution report for completed epic chain"
+```
+
+---
+
+## 8. Sample Output
+
+See `/epic-chain-execution-report.md` for a complete example of the target output format. Key sections:
+
+- Executive summary with totals
+- Timeline table with per-epic duration
+- Dependency graph (ASCII or Mermaid)
+- What was built (per epic)
+- Issues encountered
+- Artifacts generated
+- Token/cost estimates
+- Next steps
+
+---
+
+## Questions for Decision
+
+1. **Report timing**: Generate after each epic (incremental) or only at chain end?
+2. **UAT gate**: Should failed UAT block the chain or just warn?
+3. **Token tracking**: Actual counts (requires API integration) or estimates?
+4. **Report format**: Markdown only, or also HTML/PDF export?
+5. **Integration with SM**: Add to SM agent menu, or create dedicated reporter agent?
--- a/docs/improvements/epic-workflows-v1.md
+++ b/docs/improvements/epic-workflows-v1.md
@@ -0,0 +1,361 @@
+# Epic Workflows Improvement Plan v1
+
+**Date:** 2026-01-02
+**Workflows Reviewed:** epic-execute, epic-chain
+**Status:** Active
+
+---
+
+## Overview
+
+This document captures the review findings and improvement roadmap for the epic-execute and epic-chain workflows. These workflows automate story execution with context isolation between development and review phases.
+
+---
+
+## What's Working Well
+
+### 1. Context Isolation Architecture
+The decision to run dev and review in separate Claude contexts is the key innovation:
+- Prevents reviewer bias from seeing implementation struggles
+- Maximizes context window for each phase
+- Simulates real code review where reviewers see code "cold"
+- Uses git staging as the communication medium between phases
+
+### 2. Severity-Based Fix Policy
+The issue severity system (HIGH/MEDIUM/LOW) with threshold-based fixing is pragmatic:
+- Prevents over-engineering on minor issues
+- Ensures critical issues always get fixed
+- Documents low-severity for future cleanup sprints
+
+**Location:** `step-03-code-review.md:17-27`
+
+### 3. Structured Documentation Trail
+The Dev Agent Record and Code Review Record sections create an auditable history that helps with:
+- Understanding why decisions were made
+- Debugging issues later
+- Training/improving the workflow
+
+### 4. Chain Dependency Analysis
+Epic-chain's analysis phase detecting both explicit and implicit dependencies shows good foresight.
+
+**Location:** `instructions.md:57-88`
+
+### 5. Shell Script Quality
+- Clean argument parsing
+- Proper error handling with `set -e`
+- Good logging with timestamps
+- Flexible story discovery (multiple naming conventions)
+- Resume capability with `--start-from`
+
+---
+
+## Improvement Areas
+
+### HIGH Priority
+
+#### 1. Security: `--dangerously-skip-permissions` Flag
+
+**Location:** `epic-execute.sh:291-292`
+
+```bash
+result=$(claude --dangerously-skip-permissions -p "$dev_prompt" 2>&1) || true
+```
+
+**Problem:** This flag disables Claude Code's permission prompts, so the agent can run commands and edit files without confirmation, which is risky for production use.
+
+**Proposed Fix:**
+- Document the security implications clearly in README
+- Add a `--require-approval` mode that doesn't use this flag
+- Have the script detect and prompt for dangerous operations
+- Consider an environment variable for explicit opt-in: `BMAD_ALLOW_DANGEROUS=true`
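+
+One way the opt-in could look, as a rough sketch: the flag is added only when `BMAD_ALLOW_DANGEROUS=true` is set. The wrapper name `run_dev_phase` and the `CLAUDE_BIN` override are assumptions introduced here (the latter only so the command can be swapped out in tests); neither exists in `epic-execute.sh`.
+
+```bash
+# Sketch only: run_dev_phase is an illustrative wrapper, not existing code.
+# CLAUDE_BIN is an assumed override so the command can be replaced in tests.
+run_dev_phase() {
+    local dev_prompt="$1"
+    local -a claude_args=()
+
+    # Skip permission prompts only on explicit opt-in
+    if [ "${BMAD_ALLOW_DANGEROUS:-false}" = "true" ]; then
+        echo "WARNING: running with --dangerously-skip-permissions" >&2
+        claude_args+=(--dangerously-skip-permissions)
+    fi
+
+    # Appending the prompt last keeps the array non-empty in every case
+    claude_args+=(-p "$dev_prompt")
+
+    "${CLAUDE_BIN:-claude}" "${claude_args[@]}"
+}
+```
+
+With the variable unset, the command runs with the normal permission checks, which is the safer default.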
+
+---
+
+#### 2. Missing Test Execution Validation
+
+**Location:** `epic-execute.sh` (dev phase)
+
+**Problem:** The dev prompt says "Run tests and fix any failures" but the shell script doesn't verify tests actually passed. The completion signal (`IMPLEMENTATION COMPLETE`) is trusted without validation.
+
+**Proposed Fix:**
+```bash
+# After dev phase, before review
+execute_test_verification() {
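+    # Split the command string into an array so it runs without glob expansion
+    local -a test_cmd
+    read -r -a test_cmd <<< "${TEST_COMMAND:-npm test}"
+
+    log ">>> VERIFYING TESTS"
+
+    if ! "${test_cmd[@]}" 2>&1; then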
+        log_error "Tests failing after dev phase"
+        return 1
+    fi
+
+    log_success "Tests passing"
+    return 0
+}
+```
+
+---
+
+#### 3. Add Pre-flight Confirmation
+
+**Location:** `epic-execute.sh` (after story discovery)
+
+**Problem:** No validation step shows the user which stories will be executed before starting.
+
+**Proposed Fix:**
+```bash
+# After discovering stories, before execution
+display_execution_plan() {
+    echo ""
+    log "Execution Plan:"
+    for story in "${STORIES[@]}"; do
+        echo "  - $(basename "$story")"
+    done
+    echo ""
+
+    if [ "$AUTO_APPROVE" != true ]; then
+        read -p "Proceed with execution? (y/n) " -n 1 -r
+        echo
+        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
+            log "Execution cancelled by user"
+            exit 0
+        fi
+    fi
+}
+```
+
+---
+
+### MEDIUM Priority
+
+#### 4. Context Handoff is Placeholder
+
+**Location:** `epic-chain.sh:411-432`
+
+**Problem:** The current handoff only lists changed files:
+```bash
+$(git diff --name-only HEAD~${story_count} HEAD 2>/dev/null | head -20)
+```
+
+The documented template in `instructions.md:288-312` describes rich context (patterns, decisions, gotchas) but this isn't generated.
+
+**Proposed Fix:**
+```bash
+generate_rich_handoff() {
+    local epic_id="$1"
+    local next_epic="$2"
+    local handoff_file="$3"
+
+    local handoff_prompt="You are generating a context handoff document.
+
+## Task
+Create a handoff from Epic $epic_id to Epic $next_epic.
+
+## Recently Modified Files
+$(git diff --name-only HEAD~${story_count} HEAD 2>/dev/null)
+
+## Epic Content
+$(cat "${EPIC_FILES_LIST[$current_idx]}")
+
+## Generate a handoff document with:
+1. Patterns Established - coding conventions, architectural decisions
+2. Key Decisions - major technical choices with rationale
+3. Gotchas & Lessons Learned - issues encountered, workarounds
+4. Files to Reference - key files that establish patterns
+5. Test Patterns - testing conventions used
+
+Output as markdown."
+
+    claude -p "$handoff_prompt" > "$handoff_file"
+}
+```
+
+---
+
+#### 5. No Rollback Mechanism
+
+**Problem:** If review fails or execution gets interrupted mid-story, there's no easy way to roll back.
+
+**Proposed Fix:**
+```bash
+# At start of epic execution
+create_checkpoint() {
+    CHECKPOINT=$(git rev-parse HEAD)
+    echo "$CHECKPOINT" > "/tmp/bmad-checkpoint-$EPIC_ID"
+    log "Checkpoint created: $CHECKPOINT"
+}
+
+# On failure or user abort
+rollback_to_checkpoint() {
+    if [ -f "/tmp/bmad-checkpoint-$EPIC_ID" ]; then
+        local checkpoint=$(cat "/tmp/bmad-checkpoint-$EPIC_ID")
+        read -p "Rollback to checkpoint $checkpoint? (y/n) " -n 1 -r
+        echo
+        if [[ $REPLY =~ ^[Yy]$ ]]; then
+            git reset --hard "$checkpoint"
+            log_success "Rolled back to checkpoint"
+        fi
+    fi
+}
+```
+
+---
+
+#### 6. Wire Up Configuration File
+
+**Location:** `config/default-config.yaml` exists but isn't used
+
+**Problem:** The configuration documented in `workflow.md:104-122` isn't actually loaded by the shell script.
+
+**Proposed Fix:**
+```bash
+# Load configuration
+load_config() {
+    local config_file="$BMAD_DIR/_cfg/epic-execute.yaml"
+
+    if [ -f "$config_file" ]; then
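+        # Parse YAML (requires yq or similar).
+        # Booleans are read without `//` because that operator also replaces
+        # an explicit `false`; yq prints "null" for a missing key instead.
+        AUTO_COMMIT=$(yq '.auto_commit' "$config_file")
+        RUN_TESTS_BEFORE_REVIEW=$(yq '.run_tests_before_review' "$config_file")
+        REVIEW_MODE=$(yq '.review_mode // "standard"' "$config_file")
+        if [ "$AUTO_COMMIT" = "null" ]; then AUTO_COMMIT=true; fi
+        if [ "$RUN_TESTS_BEFORE_REVIEW" = "null" ]; then RUN_TESTS_BEFORE_REVIEW=true; fi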
+        log "Loaded config from $config_file"
+    else
+        # Defaults
+        AUTO_COMMIT=true
+        RUN_TESTS_BEFORE_REVIEW=true
+        REVIEW_MODE="standard"
+    fi
+}
+```
+
+---
+
+#### 7. Remove or Implement `--parallel` Flag
+
+**Location:** `epic-execute.sh:11, 93-96`
+
+**Problem:** The `--parallel` flag exists in argument parsing but isn't implemented.
+
+**Proposed Fix:** Either:
+- Remove the flag entirely until implemented
+- Add a clear error: `log_error "--parallel not yet implemented"`
+- Implement parallel execution for independent stories
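+
+The second option, failing fast with a clear error, might look like this. The `parse_args` function and the catch-all branch are assumptions for illustration; the real argument parser in `epic-execute.sh` is not shown in this document.
+
+```bash
+# Sketch only: parse_args is an illustrative name, not the script's parser.
+parse_args() {
+    while [ $# -gt 0 ]; do
+        case "$1" in
+            --parallel)
+                # Reject the flag instead of silently ignoring it
+                echo "ERROR: --parallel not yet implemented" >&2
+                return 2
+                ;;
+            *)
+                # Other options would be handled here
+                shift
+                ;;
+        esac
+    done
+}
+```
+
+Returning a non-zero status lets the caller stop before any story is executed.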
+
+---
+
+### LOW Priority
+
+#### 8. Prompt Duplication
+
+**Problem:** Prompts are duplicated between step files (documentation) and shell script (execution).
+
+**Proposed Fix:** Source prompts from step files:
+```bash
+build_dev_prompt() {
+    local story_file="$1"
+    local template="$WORKFLOW_DIR/steps/step-02-dev-story.md"
+
+    # Extract prompt template section
+    # Substitute variables
+    export story_id=$(basename "$story_file" .md)
+    export story_file_contents=$(cat "$story_file")
+
+    envsubst < "$template"
+}
+```
+
+---
+
+#### 9. Missing sprint-status.yaml Update
+
+**Location:** `workflow.md:73` mentions this but it's not implemented
+
+**Proposed Fix:** Add after successful completion:
+```bash
+update_sprint_status() {
+    local status_file="$PROJECT_ROOT/docs/sprints/sprint-status.yaml"
+
+    if [ -f "$status_file" ]; then
+        # Update epic status to completed
+        # This requires yq or similar YAML tool
+        yq -i ".epics.\"$EPIC_ID\".status = \"done\"" "$status_file"
+        yq -i ".epics.\"$EPIC_ID\".completed_at = \"$(date -Iseconds)\"" "$status_file"
+    fi
+}
+```
+
+---
+
+#### 10. Story Discovery Edge Cases
+
+**Location:** `epic-execute.sh:181-206`
+
+**Problem:**
+- Relies on consistent naming conventions
+- Content grep can produce false positives
+- No warning when stories are found in unexpected locations
+
+**Proposed Fix:** Add source tracking and validation:
+```bash
+# Track where each story was found
+declare -A STORY_SOURCES
+
+for story in "${STORIES[@]}"; do
+    source_dir=$(dirname "$story")
+    STORY_SOURCES["$story"]="$source_dir"
+done
+
+# Warn about unexpected locations
+for story in "${STORIES[@]}"; do
+    if [[ "${STORY_SOURCES[$story]}" != "$STORIES_DIR" ]]; then
+        log_warn "Story found in non-standard location: $story"
+    fi
+done
+```
+
+---
+
+## Implementation Roadmap
+
+### Phase 1: Critical Fixes
+- [ ] Add test verification step
+- [ ] Add pre-flight confirmation
+- [ ] Document `--dangerously-skip-permissions` risks
+
+### Phase 2: Reliability
+- [ ] Implement rollback mechanism
+- [ ] Wire up configuration file
+- [ ] Fix or remove `--parallel` flag
+
+### Phase 3: Quality of Life
+- [ ] Generate rich context handoffs
+- [ ] Source prompts from step files
+- [ ] Add sprint-status.yaml updates
+
+### Phase 4: Advanced Features
+- [ ] Implement parallel story execution
+- [ ] Add `--interactive` mode for step-by-step approval
+- [ ] Track execution metrics (time per story, fix rate)
+
+---
+
+## Ratings Summary
+
+| Aspect | Rating | Notes |
+|--------|--------|-------|
+| Architecture | Excellent | Context isolation is the right approach |
+| Documentation | Very Good | Clear workflow diagrams, step files |
+| Shell Scripts | Good | Well-structured, needs hardening |
+| Error Handling | Fair | Basic coverage, needs rollback |
+| Security | Needs Work | `--dangerously-skip-permissions` |
+| Completeness | Good | Some features documented but not implemented |
+
+---
+
+## References
+
+- `src/modules/bmm/workflows/4-implementation/epic-execute/`
+- `src/modules/bmm/workflows/4-implementation/epic-chain/`
+- `scripts/epic-execute.sh`
+- `scripts/epic-chain.sh`
--- a/docs/improvements/uat-integration-architecture.md
+++ b/docs/improvements/uat-integration-architecture.md
@@ -0,0 +1,670 @@
+# UAT Validation Integration Architecture
+
+## Overview
+
+This document describes how UAT validation integrates with the epic-chain workflow to provide automated quality gates, self-healing fix loops, and comprehensive validation reporting.
+
+---
+
+## Integration Points
+
+```
+┌──────────────────────────────────────────────────────────────────────────────────┐
+│                    EPIC CHAIN WITH UAT VALIDATION + SELF-HEALING                  │
+├──────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                   │
+│  ┌─────────────────────────────────────────────────────────────────────────────┐ │
+│  │                            PER EPIC LOOP                                     │ │
+│  │                                                                              │ │
+│  │  ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐       │ │
+│  │  │ Phase 1 │──►│ Phase 2 │──►│ Phase 3 │──►│ Phase 4 │──►│ Phase 5 │       │ │
+│  │  │  Dev    │   │ Review  │   │ Commit  │   │  UAT    │   │  Gate   │       │ │
+│  │  │         │   │         │   │         │   │  Gen    │   │ Check   │       │ │
+│  │  └─────────┘   └─────────┘   └─────────┘   └─────────┘   └────┬────┘       │ │
+│  │                                                                │            │ │
+│  └────────────────────────────────────────────────────────────────┼────────────┘ │
+│                                                                   │              │
+│                                                     ┌─────────────┴───────┐      │
+│                                                     │    GATE DECISION    │      │
+│                                                     └──────────┬──────────┘      │
+│                                                                │                 │
+│                              ┌──────────────────┬──────────────┴──────────┐      │
+│                              │                  │                         │      │
+│                              ▼                  ▼                         ▼      │
+│                           ┌──────┐          ┌──────┐               ┌──────────┐  │
+│                           │ PASS │          │ FAIL │               │ MAX      │  │
+│                           │      │          │      │               │ RETRIES  │  │
+│                           └──┬───┘          └──┬───┘               └────┬─────┘  │
+│                              │                 │                        │        │
+│                              │                 ▼                        ▼        │
+│                              │    ┌────────────────────────┐    ┌────────────┐   │
+│                              │    │      SELF-HEALING      │    │   HALT +   │   │
+│                              │    │                        │    │   NOTIFY   │   │
+│                              │    │  ┌──────────────────┐  │    └────────────┘   │
+│                              │    │  │  Quick Dev Fix   │  │                     │
+│                              │    │  │  (Barry Agent)   │  │                     │
+│                              │    │  │                  │  │                     │
+│                              │    │  │ • Load failures  │  │                     │
+│                              │    │  │ • Generate fix   │  │                     │
+│                              │    │  │ • Commit changes │  │                     │
+│                              │    │  └────────┬─────────┘  │                     │
+│                              │    │           │            │                     │
+│                              │    │           ▼            │                     │
+│                              │    │  ┌──────────────────┐  │                     │
+│                              │    │  │   Re-validate    │  │                     │
+│                              │    │  │   UAT Gate       │──┼──► Back to GATE     │
+│                              │    │  └──────────────────┘  │                     │
+│                              │    │                        │                     │
+│                              │    └────────────────────────┘                     │
+│                              │                                                   │
+│                              ▼                                                   │
+│                        ┌──────────┐                                              │
+│                        │  Next    │                                              │
+│                        │  Epic    │                                              │
+│                        └──────────┘                                              │
+│                                                                                   │
+│  ┌─────────────────────────────────────────────────────────────────────────────┐ │
+│  │                          CHAIN COMPLETION                                    │ │
+│  │                                                                              │ │
+│  │  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────────┐       │ │
+│  │  │  Aggregate      │──►│  Generate       │──►│  Final UAT         │       │ │
+│  │  │  Metrics        │   │  Chain Report   │   │  Summary           │       │ │
+│  │  └─────────────────┘   └─────────────────┘   └─────────────────────┘       │ │
+│  │                                                                              │ │
+│  └─────────────────────────────────────────────────────────────────────────────┘ │
+│                                                                                   │
+└──────────────────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Self-Healing Loop: UAT Failure → Quick Dev → Re-validate
+
+When UAT validation fails, the system automatically triggers a quick-dev session to fix the identified issues.
+
+### Flow Detail
+
+```
+UAT Gate Check
+     │
+     ├── PASS ──────────────────────────────► Continue to Next Epic
+     │
+     └── FAIL
+          │
+          ▼
+     ┌─────────────────────────────────────────────────────────────┐
+     │                    FAILURE ANALYSIS                          │
+     │                                                              │
+     │  1. Collect failed scenarios with:                          │
+     │     - Scenario ID and description                           │
+     │     - Expected vs actual output                             │
+     │     - Error messages / stack traces                         │
+     │     - Related story acceptance criteria                     │
+     │                                                              │
+     │  2. Generate fix context document:                          │
+     │     docs/sprint-artifacts/uat-fix-context-{epic}-{attempt}.md │
+     └─────────────────────────────────────────────────────────────┘
+          │
+          ▼
+     ┌─────────────────────────────────────────────────────────────┐
+     │                    QUICK DEV SESSION                         │
+     │                    (Barry - Quick Flow Solo Dev)             │
+     │                                                              │
+     │  Input: uat-fix-context-{epic}-{attempt}.md                 │
+     │                                                              │
+     │  Process:                                                    │
+     │  1. Load fix context (failed scenarios + error details)     │
+     │  2. Analyze root cause for each failure                     │
+     │  3. Implement targeted fixes                                │
+     │  4. Run self-check (step-04)                                │
+     │  5. Commit with message: "fix(epic-{id}): UAT fix #{n}"     │
+     │                                                              │
+     │  Output:                                                     │
+     │  - Code changes committed                                   │
+     │  - Fix summary in story dev record                          │
+     └─────────────────────────────────────────────────────────────┘
+          │
+          ▼
+     ┌─────────────────────────────────────────────────────────────┐
+     │                    RE-VALIDATE                               │
+     │                                                              │
+     │  Run UAT Gate Check again on same scenarios                 │
+     │                                                              │
+     │  Outcomes:                                                   │
+     │  - PASS → Continue to next epic                             │
+     │  - FAIL + attempts < max_retries → Loop back to Quick Dev   │
+     │  - FAIL + attempts >= max_retries → HALT chain              │
+     └─────────────────────────────────────────────────────────────┘
+```
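+
+The retry behaviour in the flow above can be sketched as a small loop. `run_uat_gate` and `run_quick_dev_fix` are hypothetical hooks that the orchestration script would have to provide, and `MAX_RETRIES` stands in for the `max_retries` setting; none of these names come from existing code.
+
+```bash
+# Sketch only: run_uat_gate and run_quick_dev_fix are hypothetical hooks
+# that the orchestration script would have to provide.
+MAX_RETRIES="${MAX_RETRIES:-2}"
+
+# Returns 0 when the gate passes, 1 when fix attempts are exhausted (halt).
+uat_gate_with_self_heal() {
+    local epic_id="$1"
+    local attempt=0
+
+    # Re-run the gate after every fix until it passes
+    until run_uat_gate "$epic_id"; do
+        if [ "$attempt" -ge "$MAX_RETRIES" ]; then
+            echo "UAT gate still failing after $attempt fix attempt(s); halting" >&2
+            return 1
+        fi
+        attempt=$(( attempt + 1 ))
+        run_quick_dev_fix "$epic_id" "$attempt"
+    done
+    return 0
+}
+```
+
+With `MAX_RETRIES=2` the gate runs at most three times: once initially and once after each of the two fix attempts.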
+
+### Configuration
+
+```yaml
+# In epic-chain config
+uat:
+  gate_enabled: true
+  gate_mode: quick              # quick | full | skip
+
+  # Self-healing configuration
+  self_heal:
+    enabled: true
+    max_retries: 2              # Maximum fix attempts per epic
+    fix_workflow: quick-dev     # Workflow to use for fixes
+    fix_agent: barry            # Agent to invoke
+
+    # What to include in fix context
+    include_in_context:
+      - failed_scenarios
+      - error_output
+      - related_stories
+      - acceptance_criteria
+      - recent_commits         # Last 3 commits for context
+
+    # Escalation
+    on_max_retries: halt        # halt | continue_with_warning | notify_human
+    notification_channel: null  # Optional: slack, email, etc.
+```
+
+### Fix Context Document Template
+
+Generated when UAT fails, consumed by Quick Dev:
+
+```markdown
+# UAT Fix Context - Epic {epic_id} (Attempt {n})
+
+## Failed Scenarios
+
+### Scenario {id}: {name}
+
+**Expected Result:**
+{expected}
+
+**Actual Result:**
+{actual}
+
+**Error Output:**
+~~~
+{stderr or error message}
+~~~
+
+**Related Story:** {story_id}
+**Acceptance Criteria:**
+- {criteria from story}
+
+---
+
+## Fix Instructions
+
+Address the following failures in priority order:
+
+1. **{scenario_id}**: {one-line description of what's broken}
+   - Root cause hint: {if determinable}
+   - Files likely involved: {if determinable}
+
+## Constraints
+
+- Only fix the identified failures
+- Do not refactor unrelated code
+- Run tests after each fix
+- Commit with message format: `fix(epic-{id}): {description}`
+```
+
+---
+
+## UAT Scenario Classification
+
+Based on the UAT sample document, scenarios fall into three categories:
+
+### Automatable (Execute via Shell)
+
+| Scenario Type | Example | Automation Method |
+|---------------|---------|-------------------|
+| CLI commands | `npx heimdall --version` | Shell execution, check exit code + output |
+| Build verification | `npm run build` | Shell execution, parse output for success |
+| API health checks | `curl /health` | HTTP request, validate JSON response |
+| Database status | `npx heimdall db status` | Shell execution, parse structured output |
+| Configuration validation | `npx heimdall config validate` | Shell execution, check for "valid" in output |
+
+### Semi-Automated (Execute + Manual Verify)
+
+| Scenario Type | Example | Approach |
+|---------------|---------|----------|
+| Email delivery | `npx heimdall test-send` | Execute command, log message ID, flag for inbox check |
+| File creation | `heimdall config init` | Execute, verify file exists, show contents for review |
+| Worker processes | `heimdall start` | Start, verify startup message, terminate after timeout |
+
+### Manual Only
+
+| Scenario Type | Example | Approach |
+|---------------|---------|----------|
+| External service setup | Railway deployment | Document steps, skip in automation |
+| Visual verification | UI appearance | Generate screenshots if possible, flag for review |
+| Multi-step human flows | Full onboarding journey | Provide checklist, require human sign-off |
+
+---
+
+## Gate Check Implementation
+
+### Quick Gate (Default)
+
+Runs only automatable scenarios from the "Minimum Requirements" section:
+
+```yaml
+# uat-gate-config.yaml
+gate_mode: quick
+timeout_per_scenario: 30  # seconds
+fail_threshold: 0         # any failure = gate fail
+
+scenarios_to_run:
+  - type: "cli_command"
+    match: '"Expected Results" sections with CLI commands'
+  - type: "health_check"
+    match: '"/health endpoint" scenarios'
+  - type: "validation"
+    match: '"validate" command scenarios'
+
+skip_scenarios:
+  - contains: "email inbox"
+  - contains: "Railway"
+  - contains: "browser"
+  - contains: "terminal window"
+```
+
+### Full Gate
+
+Runs all automatable scenarios and flags semi-automated ones for review:
+
+```yaml
+gate_mode: full
+include_semi_automated: true
+generate_manual_checklist: true
+```
+
+---
+
+## Data Flow
+
+### Per-Epic Metrics Collection
+
+```yaml
+# Written to: {sprint_artifacts}/metrics/epic-{id}-metrics.yaml
+
+epic_id: "1"
+epic_name: "Foundation, CLI & Deployment"
+
+execution:
+  start_time: "2026-01-02T13:40:00Z"
+  end_time: "2026-01-02T15:10:00Z"
+  duration_seconds: 5400
+
+stories:
+  total: 7
+  completed: 7
+  failed: 0
+  skipped: 0
+
+uat:
+  document_generated: true
+  document_path: "docs/uat/epic-1-uat.md"
+  scenarios:
+    total: 9
+    automatable: 6
+    semi_automated: 2
+    manual_only: 1
+
+validation:
+  gate_executed: true
+  gate_mode: "quick"
+  results:
+    passed: 6
+    failed: 0
+    skipped: 3
+  gate_status: "PASS"
+  blocking_issues: []
+
+  # Self-healing loop tracking
+  fix_attempts: 0
+  fix_history: []
+  # Example when fixes were needed:
+  # fix_attempts: 2
+  # fix_history:
+  #   - attempt: 1
+  #     failed_scenarios: ["scenario-3", "scenario-5"]
+  #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-1.md"
+  #     fix_commit: "abc123"
+  #     result: "partial"  # 1 of 2 fixed
+  #   - attempt: 2
+  #     failed_scenarios: ["scenario-5"]
+  #     fix_context: "docs/sprint-artifacts/uat-fix-context-1-2.md"
+  #     fix_commit: "def456"
+  #     result: "success"  # all fixed
+
+issues:
+  - type: "signaling_mismatch"
+    story: "1-3"
+    severity: "low"
+    resolved: true
+```
+
+### Chain Report Aggregation
+
+```yaml
+# Read from: {sprint_artifacts}/metrics/epic-*-metrics.yaml
+# Write to: {sprint_artifacts}/chain-execution-report.md
+
+chain:
+  total_epics: 8
+  total_stories: 58
+  total_duration_seconds: 63000
+
+  epics:
+    - id: "1"
+      stories: 7
+      duration: 5400
+      uat_gate: "PASS"
+    # ... etc
+
+  uat_summary:
+    total_scenarios: 72
+    automatable: 48
+    auto_passed: 45
+    auto_failed: 3
+    manual_pending: 24
+
+  gate_results:
+    passed: 7
+    failed: 1
+    blocked_chain: false
+```
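
The aggregation step can be sketched in Python. This is a minimal illustration under stated assumptions, not the workflow's implementation: `aggregate_chain` is a hypothetical helper, and it assumes each per-epic metrics file has already been parsed into a dict that uses the field names from the per-epic example.

```python
def aggregate_chain(epic_metrics: list[dict]) -> dict:
    """Roll per-epic metrics up into chain-level totals.

    Assumption: every dict carries stories.total,
    execution.duration_seconds and validation.gate_status.
    """
    gate_statuses = [m["validation"]["gate_status"] for m in epic_metrics]
    return {
        "total_epics": len(epic_metrics),
        "total_stories": sum(m["stories"]["total"] for m in epic_metrics),
        "total_duration_seconds": sum(
            m["execution"]["duration_seconds"] for m in epic_metrics
        ),
        "gate_results": {
            "passed": gate_statuses.count("PASS"),
            "failed": gate_statuses.count("FAIL"),
        },
    }
```

Working on parsed dicts keeps the sketch free of a YAML dependency; a real implementation would load each `epic-*-metrics.yaml` file first.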
+
+---
+
+## Workflow File Changes
+
+### Modified: `epic-chain/workflow.yaml`
+
+```yaml
+# Add to variables section:
+variables:
+  # ... existing ...
+
+  # UAT Gate Configuration
+  uat_gate_enabled: true
+  uat_gate_mode: "quick"      # quick | full | skip
+  uat_gate_blocking: false    # If true, halts chain on failure
+
+  # Report Configuration
+  generate_chain_report: true
+  chain_report_file: "{sprint_artifacts}/chain-execution-report.md"
+  metrics_folder: "{sprint_artifacts}/metrics"
+```
+
+### New: `step-05-uat-gate.md`
+
+```markdown
+# UAT Gate Check
+
+## Purpose
+Validate epic implementation against automatable UAT scenarios before proceeding.
+
+## Inputs
+- UAT document: `docs/uat/epic-{id}-uat.md`
+- Gate config: `{uat_gate_mode}`
+
+## Process
+1. Parse UAT document for test scenarios
+2. Identify automatable scenarios (CLI commands, API calls, file checks)
+3. Execute each in isolated shell
+4. Collect results with stdout/stderr evidence
+5. Determine gate status
+
+## Outputs
+- Gate result: PASS | FAIL
+- Metrics update: `{metrics_folder}/epic-{id}-metrics.yaml`
+
+## Exit Conditions
+- PASS: Continue to next epic
+- FAIL + blocking=false: Log warning, continue
+- FAIL + blocking=true: Halt chain, require intervention
+```
+
+### New: `step-06-generate-report.md`
+
+```markdown
+# Chain Report Generation
+
+## Purpose
+Generate comprehensive execution report after chain completion.
+
+## Inputs
+- All metrics files: `{metrics_folder}/epic-*-metrics.yaml`
+- Chain plan: `{chain_plan_file}`
+- Template: `chain-report-template.md`
+
+## Process
+1. Load all epic metrics
+2. Calculate aggregates (totals, averages, percentages)
+3. Build dependency graph visualization
+4. Compile issues list
+5. Generate token/cost estimates
+6. Render report template
+
+## Outputs
+- `{chain_report_file}` - Complete execution report
+```
+
+---
+
+## Agent Interaction Model
+
+### SM Agent (Orchestrator) - With Self-Healing
+
+```
+User: *epic-chain 1-8
+
+SM: Loading chain plan...
+    Found 8 epics, 58 stories
+    Dependencies analyzed: sequential with branches
+
+    Starting Epic 1 (Foundation)...
+    [Dev → Review → Commit → UAT Gen → Gate Check]
+    Epic 1: COMPLETE (7/7 stories, UAT PASS)
+
+    Starting Epic 2 (Event Ingestion)...
+    [Dev → Review → Commit → UAT Gen → Gate Check]
+
+    ⚠ UAT GATE FAILED - 2 scenarios failed
+
+    Initiating self-healing loop (attempt 1/2)...
+    Generating fix context → docs/sprint-artifacts/uat-fix-context-2-1.md
+
+    [Invoking Barry for quick-dev fix session]
+    Barry: Loading fix context...
+           Failure 1: API endpoint returning 404
+           Failure 2: Missing auth header validation
+           Implementing fixes...
+           ✓ Fixed route registration in server.ts
+           ✓ Added auth middleware check
+           Committed: fix(epic-2): UAT fix #1
+
+    Re-validating UAT gate...
+    ✓ All scenarios now passing
+
+    Epic 2: COMPLETE (5/5 stories, UAT PASS after 1 fix)
+
+    Starting Epic 3 (Workflow Engine)...
+    ...
+
+    Chain Complete.
+    Generating execution report...
+    Report: docs/sprint-artifacts/chain-execution-report.md
+
+    Summary:
+    - 8/8 epics completed
+    - 2 epics required self-healing fixes
+    - Total fix attempts: 3
+    - All UAT gates now passing
+```
+
+### UAT Validator Agent (Validation)
+
+```
+User: *uat-validate 1
+
+Quinn: Loading UAT document for Epic 1...
+       Found 9 scenarios:
+       - 6 automatable
+       - 2 semi-automated
+       - 1 manual only
+
+       Executing automatable scenarios...
+
+       ✓ Scenario 1: Project Initialization
+         Command: npx heimdall --version
+         Result: PASS (output: "1.0.0")
+
+       ✓ Scenario 3: Database Migration
+         Command: npx heimdall db migrate
+         Result: PASS (output contains "initialized successfully")
+
+       ⚠ Scenario 7: Test Email Sending
+         Command: npx heimdall test-send --to test@example.com
+         Result: SEMI-AUTO (executed, requires inbox verification)
+         Message ID: re_abc123
+
+       ○ Scenario 9: Railway Deployment
+         Result: SKIPPED (manual only)
+
+       Summary: 6/6 automatable PASSED
+       Recommendation: APPROVE (pending manual verification of 3 scenarios)
+```
+
+### Self-Healing Loop Example (Failure → Fix → Pass)
+
+```
+Quinn: UAT Gate Check for Epic 2...
+
+       ✓ Scenario 1: Event Ingestion Endpoint
+         Command: curl -X POST http://localhost:3000/api/v1/events
+         Result: PASS (201 Created)
+
+       ✗ Scenario 2: API Key Authentication
+         Command: curl -H "X-API-Key: test" http://localhost:3000/api/v1/events
+         Expected: 200 OK with auth header validated
+         Actual: 401 Unauthorized
+         Error: "Missing authentication middleware"
+
+       ✗ Scenario 3: Events Database Table
+         Command: npx heimdall db status
+         Expected: "events table: exists"
+         Actual: Exit code 1
+         Error: "relation 'events' does not exist"
+
+       UAT_GATE_RESULT: FAIL
+       CRITICAL_PASSED: 1/3
+       BLOCKING_ISSUES: [scenario-2, scenario-3]
+
+       Generating fix context for quick-dev...
+       Output: docs/sprint-artifacts/uat-fix-context-2-1.md
+
+---
+
+Barry: Loading fix context for Epic 2 (Attempt 1)...
+
+       Issue 1: API Key Authentication failing
+       - Root cause: Auth middleware not registered on route
+       - Fix: Add authMiddleware to route handler chain
+
+       Issue 2: Events table missing
+       - Root cause: Migration file exists but wasn't run
+       - Fix: Ensure migration runs in db:migrate command
+
+       Implementing fixes...
+       ✓ Updated packages/api/src/routes/events.ts
+       ✓ Updated packages/cli/src/commands/db.ts
+
+       Running self-check...
+       ✓ Tests passing
+
+       Committing: fix(epic-2): add auth middleware, fix migration order
+
+---
+
+Quinn: Re-validating UAT Gate for Epic 2...
+
+       ✓ Scenario 1: Event Ingestion Endpoint - PASS
+       ✓ Scenario 2: API Key Authentication - PASS (was FAIL)
+       ✓ Scenario 3: Events Database Table - PASS (was FAIL)
+
+       UAT_GATE_RESULT: PASS
+       CRITICAL_PASSED: 3/3
+
+       Epic 2 approved after 1 fix attempt.
+```
+
+---
+
+## Configuration Options
+
+### Default Configuration
+
+```yaml
+# {project-root}/_bmad/bmm/config.yaml additions
+
+epic_chain:
+  # UAT Settings
+  uat:
+    gate_enabled: true
+    gate_mode: quick          # quick | full | skip
+    gate_blocking: false      # Stop chain on failure?
+    timeout_seconds: 30       # Per-scenario timeout
+
+  # Report Settings
+  report:
+    enabled: true
+    format: markdown          # markdown | html | both
+    include_token_estimates: true
+    include_dependency_graph: true
+
+  # Metrics Settings
+  metrics:
+    enabled: true
+    per_story_timing: true
+    track_retries: true
+```
+
+### Per-Run Override
+
+```bash
+# Override gate settings for a specific run
+./bmad/scripts/epic-chain.sh 1-8 --uat-gate=full --uat-blocking=true
+```
+
+---
+
+## Summary
+
+This integration provides:
+
+1. **Automated Quality Gates** - Verify implementations meet acceptance criteria before proceeding
+2. **Self-Healing Fix Loop** - A failed UAT gate automatically triggers quick-dev to fix issues and re-validate
+3. **Comprehensive Reporting** - Generate detailed execution reports with metrics, timing, fix history, and issues
+4. **Flexible Configuration** - Adjust gate strictness, retry limits, and escalation behavior per project/run
+5. **Clear Traceability** - Every test scenario maps back to story acceptance criteria, and every fix links back to the failure it addresses
+6. **Graceful Degradation** - Semi-automated and manual scenarios are documented but do not block the chain
+
+### Agent Responsibilities
+
+| Agent | Role | Key Actions |
+|-------|------|-------------|
+| **SM (Bob)** | Chain Orchestrator | Runs epic-chain, coordinates phases, triggers fix loops |
+| **Quinn** | UAT Validator | Executes scenarios, generates fix context on failure |
+| **Barry** | Quick Dev Fixer | Receives fix context, implements targeted fixes, commits |
+
+### Self-Healing Flow
+
+```
+UAT Fail → Generate Fix Context → Quick Dev Fix → Re-validate → Pass/Retry/Halt
+```
+
+The maximum retry count (default: 2) prevents infinite loops. After max retries, the chain halts by default (`on_max_retries: halt`) and requires human intervention, so issues are surfaced rather than ignored.
--- a/src/modules/bmm/agents/sm.agent.yaml
+++ b/src/modules/bmm/agents/sm.agent.yaml
@ -53,3 +53,7 @@ agent:
    - trigger: EC or fuzzy match on epic-chain
      workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/epic-chain/workflow.yaml"
      description: "[EC] Analyze and execute multiple epics in sequence with dependency detection"
+
+    - trigger: UV or fuzzy match on uat-validate
+      workflow: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate/workflow.yaml"
+      description: "[UV] Validate epic against UAT scenarios with self-healing fix loop on failure"
--- a/src/modules/bmm/agents/uat-validator.agent.yaml
+++ b/src/modules/bmm/agents/uat-validator.agent.yaml
@ -0,0 +1,108 @@
+# UAT Validator Agent Definition
+# Mock/Proposal - integrates UAT validation into epic chain execution
+
+agent:
+  metadata:
+    id: "_bmad/bmm/agents/uat-validator.md"
+    name: Quinn
+    title: UAT Validator
+    icon: ✅
+    module: bmm
+    hasSidecar: false
+
+  persona:
+    role: User Acceptance Testing Specialist + Quality Gate Enforcer
+    identity: Meticulous QA professional with deep experience in end-to-end testing, user journey validation, and acceptance criteria verification. Expert at translating technical implementations into user-facing test scenarios and identifying gaps between requirements and reality.
+    communication_style: "Methodical and evidence-based. Every test has a clear purpose, every result documented with proof. Finds issues before users do."
+    principles: |
+      - UAT validates user value, not implementation details
+      - Acceptance criteria are the contract between dev and stakeholder
+      - Test execution is repeatable and traceable
+      - Issues categorized by business impact, not technical severity
+      - Automation where possible, human judgment where necessary
+
+  critical_actions:
+    - "Always load the UAT document for the epic being validated before any test execution"
+    - "Map each test scenario back to specific story acceptance criteria"
+    - "Execute automatable scenarios (CLI commands, API calls, health checks) directly via shell"
+    - "Document all test results with pass/fail status and evidence (output, screenshots, logs)"
+    - "Generate validation report with clear go/no-go recommendation"
+    - "If `**/project-context.md` exists, always treat it as the bible I plan and execute against"
+
+  # Knowledge sources for test execution
+  conversational_knowledge:
+    - name: "uat-automation-patterns"
+      description: "Patterns for automating common UAT scenario types"
+      path: "{project-root}/_bmad/bmm/data/uat-automation-patterns.yaml"
+
+  menu:
+    # Primary: Execute UAT validation for a completed epic
+    - trigger: UV or fuzzy match on uat-validate
+      workflow: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate/workflow.yaml"
+      description: "[UV] Execute UAT scenarios and validate epic against acceptance criteria (triggers self-healing on failure)"
+
+    # Review and summarize UAT results
+    - trigger: UR or fuzzy match on uat-report
+      exec: "{project-root}/_bmad/bmm/workflows/5-validation/uat-report/workflow.md"
+      description: "[UR] Generate UAT validation report with pass/fail summary and recommendations"
+
+    # Quick validation - just check automatable scenarios
+    - trigger: UQ or fuzzy match on uat-quick
+      action: |
+        Execute only the automatable UAT scenarios for the specified epic:
+        1. Load UAT document from docs/uat/epic-{id}-uat.md
+        2. Identify scenarios that can be automated (CLI commands, API endpoints, health checks)
+        3. Execute each automatable scenario in sequence
+        4. Document pass/fail for each with output evidence
+        5. Report summary: X of Y automatable scenarios passed
+        Skip scenarios requiring: manual UI interaction, external service verification, human judgment
+      description: "[UQ] Quick validation - execute only automatable UAT scenarios"
+
+    # Gate check - binary pass/fail for chain continuation
+    - trigger: UG or fuzzy match on uat-gate
+      action: |
+        Perform UAT gate check to determine if epic chain should continue:
+        1. Load UAT document and success criteria summary
+        2. Execute critical path scenarios (marked as required)
+        3. Check all "Minimum Requirements for Sign-off" items
+        4. Return: PASS (all critical passed) or FAIL (any critical failed)
+        Output format for script parsing:
+        UAT_GATE_RESULT: PASS|FAIL
+        CRITICAL_PASSED: X/Y
+        BLOCKING_ISSUES: [list if any]
+
+        On FAIL: Generate fix context document for quick-dev self-healing loop.
+      description: "[UG] UAT gate check - binary pass/fail for epic chain continuation"
+
+    # Fix context generation for self-healing loop
+    - trigger: UF or fuzzy match on uat-fix-context
+      action: |
+        Generate fix context document from failed UAT scenarios:
+        1. Load failed scenario results from last UAT gate check
+        2. For each failure:
+           - Extract scenario ID, name, expected vs actual
+           - Capture error output / stack traces
+           - Link to related story and acceptance criteria
+        3. Prioritize failures by severity (blocking first)
+        4. Generate root cause hints where determinable
+        5. Output to: docs/sprint-artifacts/uat-fix-context-{epic}-{attempt}.md
+
+        This document becomes the input for Barry's quick-dev fix session.
+      description: "[UF] Generate fix context document for quick-dev self-healing"
+
+    # Scenario generator from stories
+    - trigger: US or fuzzy match on uat-scenarios
+      action: |
+        Generate UAT test scenarios from completed story acceptance criteria:
+        1. Load all story files for the specified epic
+        2. Extract acceptance criteria from each story
+        3. Transform criteria into testable scenarios with:
+           - Clear preconditions
+           - Step-by-step actions
+           - Expected results
+           - Pass/fail criteria
+        4. Categorize as: automatable | semi-automated | manual-only
+        5. Output to docs/uat/epic-{id}-uat.md
+      description: "[US] Generate UAT scenarios from story acceptance criteria"
+
+  webskip: true
--- a/src/modules/bmm/workflows/5-validation/uat-validate/instructions.md
+++ b/src/modules/bmm/workflows/5-validation/uat-validate/instructions.md
@ -0,0 +1,290 @@
+# UAT Validate Workflow Instructions
+
+## Purpose
+
+Execute User Acceptance Testing scenarios against a completed epic to validate that implementations meet acceptance criteria. On failure, generate fix context for the self-healing quick-dev loop.
+
+## Workflow Overview
+
+```
+Load UAT Doc → Classify Scenarios → Execute Automatable → Evaluate Gate → Report/Fix
+```
+
+---
+
+## Phase 1: Load and Classify
+
+### 1.1 Load UAT Document
+
+Load the UAT document from `{uat_docs_location}/epic-{epic_id}-uat.md`.
+
+**Required sections to parse:**
+- Test Scenarios (numbered list with steps)
+- Success Criteria (checkbox items)
+- Prerequisites (environment requirements)
+
+### 1.2 Classify Each Scenario
+
+For each test scenario, classify based on indicators:
+
+| Classification | Indicators | Action |
+|----------------|------------|--------|
+| **Automatable** | `npx`, `npm run`, `curl`, `--version`, `/health`, `db status`, `config validate` | Execute via shell |
+| **Semi-automated** | `test-send`, `email`, `inbox`, `check your` | Execute + flag for manual verify |
+| **Manual only** | `Railway`, `dashboard`, `browser`, `two terminal`, `visual` | Skip, add to checklist |
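
The classification table can be read as a substring lookup. The Python sketch below is illustrative only: `classify_scenario` is a hypothetical name, and the precedence (manual, then semi-automated, then automatable) is an assumption. The source does not state a precedence, but its examples imply one, since `npx heimdall test-send` matches `npx` yet is treated as semi-automated.

```python
# Indicator lists copied from the classification table above.
# Dict order encodes the assumed precedence: most restrictive class first.
INDICATORS = {
    "manual_only": ["Railway", "dashboard", "browser", "two terminal", "visual"],
    "semi_automated": ["test-send", "email", "inbox", "check your"],
    "automatable": ["npx", "npm run", "curl", "--version", "/health",
                    "db status", "config validate"],
}


def classify_scenario(text: str) -> str:
    """Return the first class whose indicator appears in the scenario text."""
    lowered = text.lower()
    for label, indicators in INDICATORS.items():
        if any(indicator.lower() in lowered for indicator in indicators):
            return label
    # Assumption: a scenario with no recognized indicator needs a human.
    return "manual_only"
```

Plain substring matching is deliberately crude; a scenario that merely mentions "email" in passing would be flagged as semi-automated, so a real classifier would likely inspect only the command or steps.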
+
+### 1.3 Output Classification Summary
+
+```yaml
+scenarios:
+  total: 9
+  automatable: 6
+  semi_automated: 2
+  manual_only: 1
+
+automatable_scenarios:
+  - id: 1
+    name: "Project Initialization"
+    command: "npx heimdall --version"
+    expected: "displays a version number"
+  - id: 3
+    name: "Database Migration"
+    command: "npx heimdall db migrate"
+    expected: "success message"
+  # ...
+```
+
+---
+
+## Phase 2: Execute Scenarios
+
+### 2.1 Execute Each Automatable Scenario
+
+For each automatable scenario:
+
+1. **Extract command** from scenario steps (look for code blocks or CLI references)
+2. **Set timeout** based on `{timeout_per_scenario}` (default: 30s)
+3. **Execute command** via shell
+4. **Capture output** (stdout, stderr, exit code)
+5. **Evaluate result** against expected outcome
+
+### 2.2 Result Evaluation Rules
+
+| Condition | Result |
+|-----------|--------|
+| Exit code 0 + output matches expected | PASS |
+| Exit code 0 + output doesn't match | FAIL (unexpected output) |
+| Exit code non-zero | FAIL (command error) |
+| Timeout exceeded | FAIL (timeout) |
+| Command not found | FAIL (missing dependency) |
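
A minimal Python sketch of the evaluation rules above, under stated assumptions: `run_scenario` is a hypothetical helper, expected output is checked with a simple contains match, and the command is run as an argument list rather than through a shell (so pipes and redirects would not work here, unlike the shell execution the workflow describes).

```python
import shlex
import subprocess


def run_scenario(command: str, expected: str, timeout_s: float = 30.0) -> str:
    """Run one scenario command and map the outcome to the result table."""
    try:
        proc = subprocess.run(
            shlex.split(command),   # argv list, no shell involved
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "FAIL (timeout)"
    except FileNotFoundError:
        # The executable could not be found on PATH.
        return "FAIL (missing dependency)"
    if proc.returncode != 0:
        return "FAIL (command error)"
    if expected not in proc.stdout:
        return "FAIL (unexpected output)"
    return "PASS"
```

A call such as `run_scenario("npx heimdall --version", "1.0.0")` would then yield one of the five result strings from the table.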
+
+### 2.3 Expected Output Matching
+
+Use flexible matching:
+- **Contains match**: Expected text appears anywhere in output
+- **Regex match**: Pattern matches output
+- **Exit code only**: Just verify success (exit 0)
+
+Example matching rules:
+```yaml
+- scenario: "Database Migration"
+  command: "npx heimdall db migrate"
+  match_type: "contains"
+  expected: ["initialized successfully", "migration complete", "already up to date"]
+  # PASS if any of these appear in output
+```
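
The three matching modes could be implemented roughly as follows. This is a sketch, not the workflow's code: `output_matches` is a hypothetical function, and the `match_type` values `"regex"` and `"exit_code"` are assumed names (only `"contains"` appears in the rule above).

```python
import re


def output_matches(match_type: str, expected: list[str],
                   output: str, exit_code: int) -> bool:
    """Return True if the command result satisfies the matching rule.

    A non-zero exit code never matches, in line with the evaluation rules.
    For "contains" and "regex", any one entry in `expected` is enough.
    """
    if exit_code != 0:
        return False
    if match_type == "exit_code":
        return True
    if match_type == "contains":
        return any(text in output for text in expected)
    if match_type == "regex":
        return any(re.search(pattern, output) for pattern in expected)
    raise ValueError(f"unknown match_type: {match_type}")
```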
+
+### 2.4 Record Execution Results
+
+```yaml
+execution_results:
+  - scenario_id: 1
+    name: "Project Initialization"
+    command: "npx heimdall --version"
+    status: "PASS"
+    exit_code: 0
+    output: "1.0.0"
+    duration_ms: 1250
+
+  - scenario_id: 3
+    name: "Database Migration"
+    command: "npx heimdall db migrate"
+    status: "FAIL"
+    exit_code: 1
+    output: ""
+    stderr: "Error: relation 'events' does not exist"
+    duration_ms: 3400
+```
+
+---
+
+## Phase 3: Evaluate Gate
+
+### 3.1 Determine Gate Status
+
+**Gate Mode: Quick** (default)
+- Only evaluate automatable scenarios
+- All automatable must pass for GATE_PASS
+
+**Gate Mode: Full**
+- Evaluate automatable + semi-automated
+- Semi-automated failures are warnings, not blockers
+
+### 3.2 Gate Decision Logic
+
+```
+if all_automatable_passed:
+    GATE_RESULT = PASS
+else:
+    GATE_RESULT = FAIL
+    if self_heal_enabled and attempts < max_retries:
+        generate_fix_context()
+        trigger_quick_dev()      # gate is re-run after quick-dev completes
+    elif attempts >= max_retries:
+        halt_chain()             # default on_max_retries: halt
+```
+
+### 3.3 Generate Fix Context (On Failure)
+
+When gate fails, create fix context document at `{fix_context_file}`:
+
+**Include for each failure:**
+1. Scenario ID and name
+2. Command that was executed
+3. Expected result
+4. Actual result (stdout/stderr)
+5. Exit code
+6. Related story ID (if determinable)
+7. Acceptance criteria from story
+
+**Use template:** `{installed_path}/uat-fix-context-template.md`
+
+---
+
+## Phase 4: Report Results
+
+### 4.1 Update Metrics File
+
+Update or create `{metrics_file}` with validation results:
+
+```yaml
+validation:
+  gate_executed: true
+  gate_mode: "quick"
+  timestamp: "{date}"
+  results:
+    passed: 5
+    failed: 1
+    skipped: 3
+  gate_status: "FAIL"
+  blocking_issues:
+    - scenario_id: 3
+      name: "Database Migration"
+      error: "relation 'events' does not exist"
+
+  fix_attempts: 1
+  fix_history:
+    - attempt: 1
+      timestamp: "{date}"
+      failed_scenarios: ["scenario-3"]
+      fix_context: "docs/sprint-artifacts/uat-fix-context-1-1.md"
+      result: "pending"
+```
+
+### 4.2 Output Gate Result
+
+Output in parseable format for orchestration scripts:
+
+```
+UAT_GATE_RESULT: FAIL
+CRITICAL_PASSED: 5/6
+BLOCKING_ISSUES: [scenario-3]
+FIX_CONTEXT: docs/sprint-artifacts/uat-fix-context-1-1.md
+FIX_ATTEMPT: 1/2
+```
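
An orchestration script might read these signal lines as sketched below. The sketch assumes every signal has the form `UPPER_SNAKE_KEY: value` and that all other lines can be ignored; `parse_gate_output` is a hypothetical helper, not part of the workflow.

```python
import re

# Assumption: signal lines look like "UPPER_SNAKE_KEY: value".
SIGNAL_LINE = re.compile(r"^([A-Z][A-Z_]*):\s*(.*)$")


def parse_gate_output(text: str) -> dict[str, str]:
    """Collect signal lines into a dict, skipping human-readable lines."""
    signals = {}
    for line in text.splitlines():
        match = SIGNAL_LINE.match(line.strip())
        if match:
            signals[match.group(1)] = match.group(2).strip()
    return signals
```

Because the pattern requires an all-uppercase key, lines from the console summary such as `Gate Status: FAIL (5/6 passed)` are not mistaken for signals.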
+
+### 4.3 Console Summary
+
+Also output a human-readable summary:
+
+```
+UAT Validation Results - Epic 1
+================================
+
+Scenarios Executed: 6/9
+  - Automatable: 6 (executed)
+  - Semi-automated: 2 (flagged for manual)
+  - Manual only: 1 (skipped)
+
+Results:
+  ✓ Scenario 1: Project Initialization - PASS
+  ✓ Scenario 2: Configuration Setup - PASS
+  ✗ Scenario 3: Database Migration - FAIL
+    Error: relation 'events' does not exist
+  ✓ Scenario 4: Connection Validation - PASS
+  ✓ Scenario 5: Worker Process Startup - PASS
+  ✓ Scenario 6: Job Queue Testing - PASS
+
+Gate Status: FAIL (5/6 passed)
+Initiating self-healing fix loop (attempt 1/2)...
+Fix context generated: docs/sprint-artifacts/uat-fix-context-1-1.md
+```
+
+---
+
+## Self-Healing Integration
+
+### Triggering Quick Dev
+
+When gate fails and `self_heal_enabled: true`:
+
+1. Generate fix context document
+2. Invoke quick-dev workflow with fix context as input
+3. After quick-dev completes, re-run UAT validate
+4. Repeat until PASS or max_retries reached
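
The loop described in these steps can be sketched in Python. Everything here is illustrative: `self_heal`, `validate`, and `fix` are hypothetical names standing in for the UAT gate run and the quick-dev session, and the sketch assumes the default `on_max_retries: halt` behavior.

```python
from typing import Callable


def self_heal(validate: Callable[[], bool],
              fix: Callable[[int], None],
              max_retries: int = 2) -> tuple[str, int]:
    """Re-run validation after each fix until it passes or retries run out.

    Returns the final status ("PASS" or "HALT") and the number of fix
    attempts that were made.
    """
    attempts = 0
    while True:
        if validate():
            return "PASS", attempts
        if attempts >= max_retries:
            return "HALT", attempts
        attempts += 1
        fix(attempts)  # attempt number, e.g. for the fix context file name
```

With `max_retries=2`, the gate runs at most three times: the initial check plus one re-validation after each fix attempt.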
+
+### Orchestration Script Signal
+
+For shell orchestration, output these signals:
+
+```bash
+# On PASS
+echo "UAT_GATE_RESULT: PASS"
+exit 0
+
+# On FAIL with fix attempt
+echo "UAT_GATE_RESULT: FAIL"
+echo "UAT_FIX_REQUIRED: true"
+echo "UAT_FIX_CONTEXT: {path}"
+exit 1
+
+# On FAIL after max retries
+echo "UAT_GATE_RESULT: FAIL"
+echo "UAT_MAX_RETRIES: true"
+echo "UAT_HALT_CHAIN: true"
+exit 2
+```
+
+---
+
+## Configuration Reference
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `gate_mode` | `quick` | Which scenarios to evaluate (quick/full/skip) |
+| `timeout_per_scenario` | `30` | Seconds before scenario timeout |
+| `self_heal_enabled` | `true` | Trigger quick-dev on failure |
+| `max_retries` | `2` | Fix attempts before halting |
+| `on_max_retries` | `halt` | Action when max retries exceeded |
+
+---
+
+## Error Handling
+
+| Error | Action |
+|-------|--------|
+| UAT document not found | FAIL with clear error message |
+| No automatable scenarios | WARN, gate passes (nothing to validate) |
+| All scenarios manual | WARN, generate manual checklist, gate passes |
+| Scenario command unclear | Skip scenario, log warning |
+| Shell execution fails | Record as FAIL with error details |
--- a/src/modules/bmm/workflows/5-validation/uat-validate/uat-fix-context-template.md
+++ b/src/modules/bmm/workflows/5-validation/uat-validate/uat-fix-context-template.md
@ -0,0 +1,112 @@
+# UAT Fix Context - Epic {epic_id} (Attempt {attempt})
+
+**Generated:** {timestamp}
+**Epic:** {epic_name}
+**Gate Result:** FAIL ({passed}/{total} scenarios passed)
+
+---
+
+## Summary
+
+This document contains the context needed to fix UAT failures for Epic {epic_id}. Load this document and implement targeted fixes for each failure listed below.
+
+**Failures to fix:** {failure_count}
+**Fix attempt:** {attempt} of {max_retries}
+
+---
+
+## Failed Scenarios
+
+{#each failed_scenario}
+### Scenario {scenario_id}: {scenario_name}
+
+**Command Executed:**
+```bash
+{command}
+```
+
+**Expected Result:**
+{expected_result}
+
+**Actual Result:**
+```
+{actual_output}
+```
+
+**Error Output:**
+```
+{stderr}
+```
+
+**Exit Code:** {exit_code}
+
+**Related Story:** {story_id}
+
+**Acceptance Criteria:**
+{#each acceptance_criteria}
+- {criterion}
+{/each}
+
+**Root Cause Hint:**
+{root_cause_hint}
+
+**Files Likely Involved:**
+{#each likely_files}
+- `{file_path}`
+{/each}
+
+---
+{/each}
+
+## Fix Instructions
+
+Address the failures above in priority order. For each fix:
+
+1. **Analyze** - Understand why the scenario failed
+2. **Locate** - Find the relevant code files
+3. **Fix** - Implement the minimum change to resolve the failure
+4. **Verify** - Run the scenario command locally to confirm fix
+5. **Commit** - Use message format: `fix(epic-{epic_id}): {description}`
+
+### Priority Order
+
+{#each failed_scenario}
+{priority}. **Scenario {scenario_id}**: {one_line_description}
+{/each}
+
+---
+
+## Constraints
+
+- Only fix the identified failures - do not refactor unrelated code
+- Run the specific failing commands to verify each fix
+- Run project tests after all fixes: `npm test`
+- Commit with conventional format: `fix(epic-{epic_id}): {description}`
+- If a fix requires changes that would break other scenarios, document the tradeoff
+
+---
+
+## Context Files
+
+The following files may provide additional context:
+
+| File | Purpose |
+|------|---------|
+| `{uat_doc_path}` | Full UAT document with all scenarios |
+| `{story_files}` | Story files with complete acceptance criteria |
+| `{architecture_doc}` | System architecture reference |
+
+---
+
+## After Fixing
+
+Once all fixes are committed, the UAT validation will automatically re-run.
+
+- **If all pass:** Epic continues to next phase
+- **If failures remain:** Another fix context will be generated (attempt {next_attempt})
+- **If max retries exceeded:** Chain halts for human intervention
+
+---
+
+*Generated by UAT Validate Workflow*
+*BMAD Method - Epic Chain Self-Healing*
--- a/src/modules/bmm/workflows/5-validation/uat-validate/workflow.yaml
+++ b/src/modules/bmm/workflows/5-validation/uat-validate/workflow.yaml
@ -0,0 +1,156 @@
+# UAT Validate Workflow
+name: uat-validate
+description: "Execute User Acceptance Testing scenarios against a completed epic, validate implementations meet acceptance criteria, and trigger self-healing fix loops on failures"
+author: "BMad"
+
+# Critical variables from config
+config_source: "{project-root}/_bmad/bmm/config.yaml"
+user_name: "{config_source}:user_name"
+communication_language: "{config_source}:communication_language"
+date: system-generated
+planning_artifacts: "{config_source}:planning_artifacts"
+implementation_artifacts: "{config_source}:implementation_artifacts"
+sprint_artifacts: "{config_source}:sprint_artifacts"
+output_folder: "{sprint_artifacts}"
+
+# Workflow components
+installed_path: "{project-root}/_bmad/bmm/workflows/5-validation/uat-validate"
+instructions: "{installed_path}/instructions.md"
+fix_context_template: "{installed_path}/uat-fix-context-template.md"
+
+# Variables and inputs
+variables:
+  # Project context
+  project_context: "**/project-context.md"
+  project_name: "{config_source}:project_name"
+
+  # UAT document locations
+  uat_docs_location: "{planning_artifacts}/uat"
+  uat_doc_pattern: "epic-{epic_id}-uat.md"
+
+  # Story locations (for acceptance criteria reference)
+  stories_location: "{implementation_artifacts}"
+
+  # Metrics output
+  metrics_folder: "{sprint_artifacts}/metrics"
+  metrics_file: "{metrics_folder}/epic-{epic_id}-metrics.yaml"
+
+  # Fix context output (for self-healing loop)
+  fix_context_file: "{sprint_artifacts}/uat-fix-context-{epic_id}-{attempt}.md"
+
+  # Gate configuration
+  gate_mode: "quick" # quick | full | skip
+  timeout_per_scenario: 30 # seconds per scenario execution
+
+  # Self-healing configuration
+  self_heal_enabled: true
+  max_retries: 2 # maximum fix attempts before halting
+  fix_workflow: "quick-dev" # workflow to invoke for fixes
+  on_max_retries: "halt" # halt | continue_with_warning | notify_human
+
+# Scenario classification patterns
+scenario_patterns:
+  automatable:
+    description: "Scenarios that can be fully automated via shell execution"
+    indicators:
+      - "npx"
+      - "npm run"
+      - "curl"
+      - "wget"
+      - "--version"
+      - "db status"
+      - "db migrate"
+      - "config validate"
+      - "/health"
+      - "test-queue"
+    validation_method: "shell_execution"
+
+  semi_automated:
+    description: "Scenarios that can be executed but require manual verification"
+    indicators:
+      - "test-send"
+      - "email"
+      - "inbox"
+      - "check your"
+      - "verify in browser"
+    validation_method: "execute_and_flag"
+
+  manual_only:
+    description: "Scenarios requiring full human interaction"
+    indicators:
+      - "Railway"
+      - "dashboard"
+      - "two terminal"
+      - "side by side"
+      - "browser"
+      - "visual"
+    validation_method: "skip_with_checklist"
+
+# Input file patterns
+input_file_patterns:
+  uat_document:
+    description: "UAT document for the epic being validated"
+    pattern: "{uat_docs_location}/epic-{epic_id}-uat.md"
+    load_strategy: "FULL_LOAD"
+    required: true
+
+  stories:
+    description: "Story files for acceptance criteria reference"
+    pattern: "{stories_location}/story-{epic_id}.*.md"
+    load_strategy: "METADATA_ONLY"
+    required: false
+
+  epic_metrics:
+    description: "Existing metrics file if re-validating"
+    pattern: "{metrics_folder}/epic-{epic_id}-metrics.yaml"
+    load_strategy: "FULL_LOAD"
+    required: false
+
+# Output files
+outputs:
+  gate_result:
+    description: "Gate pass/fail result for script parsing"
+    format: |
+      UAT_GATE_RESULT: {PASS|FAIL}
+      CRITICAL_PASSED: {n}/{total}
+      BLOCKING_ISSUES: [{scenario_ids}]
+      FIX_CONTEXT: {path_if_generated}
+
+  metrics_update:
+    description: "Updated metrics file with validation results"
+    path: "{metrics_file}"
+
+  fix_context:
+    description: "Fix context document for quick-dev (only on failure)"
+    path: "{fix_context_file}"
+    condition: "gate_result == FAIL"
+
+# Workflow phases
+phases:
+  - name: "load"
+    description: "Load UAT document and classify scenarios"
+    outputs:
+      - scenario_list
+      - classification_summary
+
+  - name: "execute"
+    description: "Execute automatable scenarios via shell"
+    outputs:
+      - execution_results
+      - pass_count
+      - fail_count
+
+  - name: "evaluate"
+    description: "Determine gate status and generate fix context if needed"
+    outputs:
+      - gate_result
+      - fix_context (conditional)
+
+  - name: "report"
+    description: "Update metrics and output results"
+    outputs:
+      - metrics_file
+      - gate_output
+
+standalone: true
+web_bundle: false