From 921a5bef26f89f166a59c518d02f3f612a6bbdd7 Mon Sep 17 00:00:00 2001
From: Jonah Schulte <jonah@jonahschulte.com>
Date: Sun, 25 Jan 2026 21:23:49 -0500
Subject: [PATCH] docs: add GSDMAD architecture and multi-agent validation
 design

- GSDMAD-ARCHITECTURE.md: Merge best of BMAD + GSD
- Wave-based story execution for parallelization
- Multi-agent validation (builder, inspector, reviewer, fixer)
- Checkpoint-aware segmentation from GSD
- Agent tracking and resume capability
- Faster execution through smart parallelization (about 43% less wall-clock time in the Epic 17 example)
- Separation of concerns prevents conflict of interest
---
 GSDMAD-ARCHITECTURE.md                        | 495 ++++++++++++++++++
 .../MULTI-AGENT-ARCHITECTURE.md               | 291 ++++++++++
 2 files changed, 786 insertions(+)
 create mode 100644 GSDMAD-ARCHITECTURE.md
 create mode 100644 src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md

diff --git a/GSDMAD-ARCHITECTURE.md b/GSDMAD-ARCHITECTURE.md
new file mode 100644
index 00000000..aca9ee06
--- /dev/null
+++ b/GSDMAD-ARCHITECTURE.md
@@ -0,0 +1,495 @@
+# GSDMAD: Get Shit Done Method for Agile Development
+
+**Version:** 1.0.0
+**Date:** 2026-01-25
+**Philosophy:** Combine BMAD's comprehensive tracking with GSD's intelligent execution
+
+---
+
+## The Vision
+
+**BMAD** excels at structure, tracking, and hospital-grade quality standards.
+**GSD** excels at smart execution, parallelization, and getting shit done fast.
+
+**GSDMAD** takes the best of both:
+- BMAD's story tracking, sprint management, and quality gates
+- GSD's wave-based parallelization, checkpoint routing, and agent orchestration
+
+---
+
+## Core Principles
+
+1. **Comprehensive but not bureaucratic** - Track what matters, skip enterprise theater
+2. **Smart parallelization** - Run independent work concurrently, sequentially only when dependencies require it
+3. **Separation of concerns** - Different agents for implementation, validation, review
+4. **Checkpoint-aware routing** - Autonomous segments in subagents, decisions in main
+5. **Hospital-grade quality** - Lives may be at stake, quality >> speed
+
+---
+
+## Architecture Comparison
+
+### BMAD v1.x (Old Way)
+```
+batch-super-dev orchestrator:
+  ├─ Story 1: super-dev-pipeline (ONE agent, ALL steps)
+  │   └─ Step 1-11: init, gap, test, implement, validate, review, fix, complete
+  ├─ Story 2: super-dev-pipeline (ONE agent, ALL steps)
+  └─ Story 3: super-dev-pipeline (ONE agent, ALL steps)
+
+Problems:
+- Single agent validates its own work (conflict of interest)
+- No parallelization between stories
+- Agent can lie about completion
+- Sequential execution is slow
+```
+
+### GSD (Inspiration)
+```
+execute-phase orchestrator:
+  ├─ Wave 1: [Plan A, Plan B] in parallel
+  │   ├─ Agent for Plan A (segments if checkpoints)
+  │   └─ Agent for Plan B (segments if checkpoints)
+  ├─ Wave 2: [Plan C]
+  │   └─ Agent for Plan C
+  └─ Wave 3: [Plan D, Plan E] in parallel
+
+Strengths:
+- Wave-based parallelization
+- Checkpoint-aware segmentation
+- Agent tracking and resume
+- Lightweight orchestration
+```
+
+### GSDMAD (New Way)
+```
+batch-super-dev orchestrator:
+  ├─ Wave 1 (no dependencies): [17-1]
+  │   └─ Story 17-1:
+  │       ├─ Agent 1: Implement (steps 1-4)
+  │       ├─ Agent 2: Validate (steps 5-6) ← fresh context
+  │       ├─ Agent 3: Review (step 7) ← adversarial
+  │       └─ Agent 4: Fix (steps 8-9)
+  │
+  ├─ Wave 2 (depends on Wave 1): [17-2, 17-3, 17-4] in parallel
+  │   └─ Each story: (same multi-agent pattern)
+  │
+  ├─ Wave 3 (depends on Wave 2): [17-5]
+  │   └─ Story 17-5: (same multi-agent pattern)
+  │
+  └─ Wave 4 (depends on Wave 3): [17-9, 17-10] in parallel
+      └─ Each story: (same multi-agent pattern)
+
+Benefits:
+- Independent stories run in parallel (faster)
+- Each story uses multi-agent validation (honest)
+- Dependencies respected via waves
+- Agent tracking for resume capability
+```
+
+---
+
+## Wave-Based Story Execution
+
+### Dependency Analysis
+
+Before executing stories, analyze dependencies:
+
+```yaml
+stories:
+  17-1: # Space Model
+    depends_on: []
+    wave: 1
+
+  17-2: # Space Listing
+    depends_on: [17-1] # Needs Space model
+    wave: 2
+
+  17-3: # Space Photos
+    depends_on: [17-1] # Needs Space model
+    wave: 2
+
+  17-4: # Delete Space
+    depends_on: [17-1] # Needs Space model
+    wave: 2
+
+  17-5: # Agreement Model
+    depends_on: [17-1, 17-4] # Needs Space model + delete protection
+    wave: 3
+
+  17-9: # Expiration Alerts
+    depends_on: [17-5] # Needs Agreement model
+    wave: 4
+
+  17-10: # Occupant Portal
+    depends_on: [17-5] # Needs Agreement model
+    wave: 4
+```
+
+**Wave Execution:**
+- Wave 1: [17-1] (1 story)
+- Wave 2: [17-2, 17-3, 17-4] (3 stories in parallel)
+- Wave 3: [17-5] (1 story)
+- Wave 4: [17-9, 17-10] (2 stories in parallel)
+
+**Time Savings:**
+- Sequential: 7 stories × 60 min = 420 min (7 hours)
+- Wave-based: 4 waves × 60 min = 240 min (4 hours) ← about 43% less wall-clock time
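+
+The wave numbers in the dependency table above can be derived mechanically rather than assigned by hand. The following is a minimal sketch, not the actual `batch-super-dev` logic: `assign_waves` and `group_by_wave` are hypothetical helper names, and the rule "wave = 1 + the highest wave among a story's dependencies" is an assumption that happens to reproduce the table.
+
+```python
+from typing import Dict, List, Tuple
+
+
+def assign_waves(depends_on: Dict[str, List[str]]) -> Dict[str, int]:
+    """Assumed rule: wave = 1 + the highest wave among a story's dependencies."""
+    waves: Dict[str, int] = {}
+
+    def wave_of(story: str, path: Tuple[str, ...] = ()) -> int:
+        if story in path:
+            raise ValueError(f"dependency cycle at {story}")
+        if story not in waves:
+            deps = depends_on[story]
+            waves[story] = 1 + max(
+                (wave_of(dep, path + (story,)) for dep in deps), default=0
+            )
+        return waves[story]
+
+    for story in depends_on:
+        wave_of(story)
+    return waves
+
+
+def group_by_wave(waves: Dict[str, int]) -> List[List[str]]:
+    """Return the stories of each wave, ordered by wave number."""
+    grouped: Dict[int, List[str]] = {}
+    for story, wave in waves.items():
+        grouped.setdefault(wave, []).append(story)
+    return [grouped[w] for w in sorted(grouped)]
+
+
+# Dependency table copied from the YAML above.
+EPIC_17 = {
+    "17-1": [],
+    "17-2": ["17-1"],
+    "17-3": ["17-1"],
+    "17-4": ["17-1"],
+    "17-5": ["17-1", "17-4"],
+    "17-9": ["17-5"],
+    "17-10": ["17-5"],
+}
+
+for number, stories in enumerate(group_by_wave(assign_waves(EPIC_17)), start=1):
+    print(f"Wave {number}: {stories}")
+```
+
+A story whose dependencies are all satisfied early still lands in the earliest wave it can run in, so adding an unrelated story never lengthens the schedule.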
+
+---
+
+## Multi-Agent Story Pipeline
+
+Each story uses **4 agents** with separation of concerns:
+
+### Phase 1: Implementation (Agent 1 - Builder)
+```
+Steps: 1-4 (init, pre-gap, write-tests, implement)
+Role: Build the feature
+Output: Code + tests (unverified)
+Trust: LOW (assume agent will cut corners)
+```
+
+**Agent 1 Prompt:**
+```
+Implement story {{story_key}} following these steps:
+
+1. Init: Load story, detect greenfield vs brownfield
+2. Pre-Gap: Validate tasks, detect batchable patterns
+3. Write Tests: TDD approach, write tests first
+4. Implement: Write production code
+
+DO NOT:
+- Validate your own work (Agent 2 will do this)
+- Review your own code (Agent 3 will do this)
+- Update story checkboxes (Agent 4 will do this)
+- Commit changes (Agent 4 will do this)
+
+Just write the code and tests. Report what you built.
+```
+
+### Phase 2: Validation (Agent 2 - Inspector)
+```
+Steps: 5-6 (post-validation, quality-checks)
+Role: Independent verification
+Output: PASS/FAIL with evidence
+Trust: MEDIUM (no conflict of interest)
+```
+
+**Agent 2 Prompt:**
+```
+Validate story {{story_key}} implementation by Agent 1.
+
+You have NO KNOWLEDGE of what Agent 1 did. Verify:
+
+1. Files Exist:
+   - Check each file mentioned in story
+   - Verify file contains actual code (not TODO/stub)
+
+2. Tests Pass:
+   - Run test suite: npm test
+   - Verify tests actually run (not skipped)
+   - Check coverage meets 90% threshold
+
+3. Quality Checks:
+   - Run type-check: npm run type-check
+   - Run linter: npm run lint
+   - Run build: npm run build
+   - All must return zero errors
+
+4. Git Status:
+   - Check uncommitted files
+   - List files changed
+
+Output PASS or FAIL with specific evidence.
+If FAIL, list exactly what's missing/broken.
+```
+
+### Phase 3: Code Review (Agent 3 - Adversarial Reviewer)
+```
+Step: 7 (code-review)
+Role: Find problems (adversarial stance)
+Output: List of issues with severity
+Trust: HIGH (wants to find issues)
+```
+
+**Agent 3 Prompt:**
+```
+Adversarial code review of story {{story_key}}.
+
+Your GOAL is to find problems. Be critical. Look for:
+
+SECURITY:
+- SQL injection vulnerabilities
+- XSS vulnerabilities
+- Authentication bypasses
+- Authorization gaps
+- Hardcoded secrets
+
+PERFORMANCE:
+- N+1 queries
+- Missing indexes
+- Inefficient algorithms
+- Memory leaks
+
+BUGS:
+- Logic errors
+- Edge cases not handled
+- Off-by-one errors
+- Race conditions
+
+ARCHITECTURE:
+- Pattern violations
+- Tight coupling
+- Missing error handling
+
+Rate each issue:
+- CRITICAL: Security vulnerability or data loss
+- HIGH: Will cause production bugs
+- MEDIUM: Technical debt or maintainability
+- LOW: Nice-to-have improvements
+
+Output: List of issues with severity and specific code locations.
+```
+
+### Phase 4: Fix Issues (Agent 4 - Fixer)
+```
+Steps: 8-9 (review-analysis, fix-issues)
+Role: Fix critical/high issues
+Output: Fixed code
+Trust: MEDIUM (incentive to minimize work)
+```
+
+**Agent 4 Prompt:**
+```
+Fix issues from code review for story {{story_key}}.
+
+Code review found {{issue_count}} issues:
+{{review_issues}}
+
+Priority:
+1. Fix ALL CRITICAL issues (no exceptions)
+2. Fix ALL HIGH issues (must do)
+3. Fix MEDIUM issues if time allows (nice to have)
+4. Skip LOW issues (gold-plating)
+
+After fixing:
+- Re-run tests (must pass)
+- Update story checkboxes
+- Update sprint-status.yaml
+- Commit changes with message: "fix: {{story_key}} - address code review"
+```
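+
+The priority rules in the Agent 4 prompt amount to a small triage function. This is a sketch under assumptions: the issue shape (a dict with a `severity` key) and the `triage` helper with its `time_allows` flag are hypothetical, since the review output format is not specified here.
+
+```python
+from typing import Dict, List, Tuple
+
+# Severity labels from the Agent 3 prompt, most urgent first.
+ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
+
+Issue = Dict[str, str]
+
+
+def triage(issues: List[Issue], time_allows: bool = False) -> Tuple[List[Issue], List[Issue]]:
+    """Split review issues into (to_fix, skipped) following the priority list."""
+    to_fix: List[Issue] = []
+    skipped: List[Issue] = []
+    for issue in issues:
+        severity = issue["severity"].upper()
+        if severity not in ORDER:
+            # Refuse to guess: an unknown label must not be skipped silently.
+            raise ValueError(f"unknown severity: {issue['severity']}")
+        if severity in ("CRITICAL", "HIGH") or (severity == "MEDIUM" and time_allows):
+            to_fix.append(issue)
+        else:
+            skipped.append(issue)
+    # Work on the most severe issues first.
+    to_fix.sort(key=lambda issue: ORDER[issue["severity"].upper()])
+    return to_fix, skipped
+```
+
+Returning the skipped issues as well keeps them visible in the audit trail instead of dropping them.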
+
+### Phase 5: Final Verification (Main Orchestrator)
+```
+Steps: 10-11 (complete, summary)
+Role: Final quality gate
+Output: COMPLETE or FAILED
+Trust: HIGHEST (user-facing)
+```
+
+**Orchestrator Checks:**
+```bash
+# 1. Verify a recent commit mentions the story
+git log --oneline -5 | grep -q "{{story_key}}" \
+  || echo "FAIL: No commit found"
+
+# 2. Verify story checkboxes increased
+before=$(git show "HEAD~2:{{story_file}}" | grep -c "^- \[x\]")
+after=$(grep -c "^- \[x\]" "{{story_file}}")
+[ "$after" -gt "$before" ] || echo "FAIL: Checkboxes not updated"
+
+# 3. Verify sprint-status updated (match added diff lines only)
+git diff HEAD~2 -- "{{sprint_status}}" | grep -q "^+.*{{story_key}}: done" \
+  || echo "FAIL: Sprint status not updated"
+
+# 4. Verify tests passed (parse agent output)
+grep -q "PASS" agent_2_output.txt \
+  || echo "FAIL: No test evidence"
+```
+
+---
+
+## Checkpoint-Aware Segmentation
+
+Stories can have **checkpoints** for user interaction:
+
+```xml
+<step n="3" checkpoint="human-verify">
+  <output>Review the test plan before implementation</output>
+  <ask>Does this test strategy look correct? (yes/no)</ask>
+</step>
+```
+
+**Routing Rules:**
+
+1. **No checkpoints** → Full autonomous (Agent 1 does steps 1-4)
+2. **Verify checkpoints** → Segmented execution:
+   - Segment 1 (steps 1-2): Agent 1a
+   - Checkpoint: Main context (user verifies)
+   - Segment 2 (steps 3-4): Agent 1b (fresh agent)
+3. **Decision checkpoints** → Stay in main context (can't delegate decisions)
+
+This is borrowed directly from GSD's `execute-plan.md` segmentation logic.
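+
+The three routing rules can be sketched as a planning function. This is an illustration only, not GSD's actual segmentation code: `plan_segments` is a hypothetical name, `"human-verify"` is taken from the XML example above, and `"decision"` is an assumed label for decision checkpoints.
+
+```python
+from typing import List, Optional, Tuple
+
+# (step number, checkpoint type or None)
+Step = Tuple[int, Optional[str]]
+# (where it runs, step numbers)
+Segment = Tuple[str, List[int]]
+
+
+def plan_segments(steps: List[Step]) -> List[Segment]:
+    """Split builder steps into subagent segments around verify checkpoints."""
+    if any(checkpoint == "decision" for _, checkpoint in steps):
+        # Rule 3: decisions cannot be delegated, so everything stays in main.
+        return [("main", [number for number, _ in steps])]
+
+    plan: List[Segment] = []
+    current: List[int] = []
+    for number, checkpoint in steps:
+        if checkpoint == "human-verify":
+            if current:
+                plan.append(("subagent", current))  # close the running segment
+                current = []
+            plan.append(("checkpoint", [number]))  # user verifies in main context
+        current.append(number)
+    if current:
+        # Rule 1 (no checkpoints) ends up here with one segment for all steps.
+        plan.append(("subagent", current))
+    return plan
+
+
+# Steps 1-4 with the verify checkpoint on step 3, as in the XML example.
+print(plan_segments([(1, None), (2, None), (3, "human-verify"), (4, None)]))
+```
+
+Each `"subagent"` entry would be run by a fresh agent, which matches the Agent 1a / Agent 1b split described in rule 2.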
+
+---
+
+## Agent Tracking and Resume
+
+Track all spawned agents for resume capability:
+
+```jsonc
+// .bmad/agent-history.json
+{
+  "version": "1.0",
+  "max_entries": 50,
+  "entries": [
+    {
+      "agent_id": "a4868f1",
+      "story_key": "17-10",
+      "phase": "implementation",
+      "agent_type": "builder",
+      "timestamp": "2026-01-25T20:30:00Z",
+      "status": "spawned",
+      "completion_timestamp": null
+    }
+  ]
+}
+```
+
+**Resume Capability:**
+```bash
+# If the session was interrupted, list agents that never completed
+jq '.entries[] | select(.status == "spawned")' .bmad/agent-history.json
+
+# Resume via the Task tool (an orchestrator tool call, not a shell command):
+# Task(subagent_type="general-purpose", resume="a4868f1")
+```
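+
+Writing and querying the history file could look like the sketch below. The helper names `record_agent` and `incomplete_agents` are hypothetical, and the pruning policy (drop the oldest entries once `max_entries` is exceeded) is an assumption; only the file layout comes from the example above.
+
+```python
+import json
+from pathlib import Path
+from typing import Dict, List
+
+
+def record_agent(history_path: Path, entry: Dict) -> None:
+    """Append an entry, keeping at most max_entries (oldest dropped first)."""
+    if history_path.exists():
+        history = json.loads(history_path.read_text())
+    else:
+        history = {"version": "1.0", "max_entries": 50, "entries": []}
+    history["entries"].append(entry)
+    # Assumed pruning policy: keep only the newest max_entries records.
+    history["entries"] = history["entries"][-history["max_entries"]:]
+    history_path.parent.mkdir(parents=True, exist_ok=True)
+    history_path.write_text(json.dumps(history, indent=2))
+
+
+def incomplete_agents(history_path: Path) -> List[Dict]:
+    """Return entries still marked "spawned", i.e. candidates for resume."""
+    history = json.loads(history_path.read_text())
+    return [e for e in history["entries"] if e["status"] == "spawned"]
+```
+
+`incomplete_agents` returns the same records as the `jq` filter above, so either can feed the resume step.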
+
+---
+
+## Workflow Files
+
+### New: `batch-super-dev-v2.md`
+```yaml
+execution_mode: "wave_based" # wave_based | sequential
+
+# Story dependency analysis (auto-computed or manual)
+dependency_analysis:
+  enabled: true
+  method: "file_scan" # file_scan | manual | hybrid
+
+# Wave execution
+waves:
+  max_parallel_stories: 4 # Max stories per wave
+  agent_timeout: 3600 # 1 hour per agent
+
+# Multi-agent validation
+validation:
+  enabled: true
+  agents:
+    builder: {steps: [1,2,3,4]}
+    inspector: {steps: [5,6], fresh_context: true}
+    reviewer: {steps: [7], adversarial: true}
+    fixer: {steps: [8,9]}
+```
+
+### Enhanced: `super-dev-pipeline-v2.md`
+```yaml
+execution_mode: "multi_agent" # single_agent | multi_agent
+
+# Agent configuration
+agents:
+  builder:
+    steps: [1, 2, 3, 4]
+    description: "Implement story"
+
+  inspector:
+    steps: [5, 6]
+    description: "Validate implementation"
+    fresh_context: true
+
+  reviewer:
+    steps: [7]
+    description: "Adversarial code review"
+    fresh_context: true
+    adversarial: true
+
+  fixer:
+    steps: [8, 9]
+    description: "Fix review issues"
+```
+
+---
+
+## Implementation Phases
+
+### Phase 1: Multi-Agent Validation (Week 1)
+- Update `super-dev-pipeline` to support multi-agent mode
+- Create `agents/` directory with agent prompts
+- Add agent tracking infrastructure
+- Test on single story
+
+### Phase 2: Wave-Based Execution (Week 2)
+- Add dependency analysis to `batch-super-dev`
+- Implement wave grouping logic
+- Add parallel execution within waves
+- Test on Epic 17 (10 stories)
+
+### Phase 3: Checkpoint Segmentation (Week 3)
+- Add checkpoint detection to stories
+- Implement segment routing logic
+- Test with stories that need user input
+
+### Phase 4: Agent Resume (Week 4)
+- Add agent history tracking
+- Implement resume capability
+- Test interrupted session recovery
+
+---
+
+## Benefits Summary
+
+**From BMAD:**
+- ✅ Comprehensive story tracking
+- ✅ Sprint artifacts and status management
+- ✅ Gap analysis and reconciliation
+- ✅ Hospital-grade quality standards
+- ✅ Multi-tenant patterns
+
+**From GSD:**
+- ✅ Wave-based parallelization (about 43% less wall-clock time in the Epic 17 example)
+- ✅ Smart checkpoint routing
+- ✅ Agent tracking and resume
+- ✅ Lightweight orchestration
+- ✅ Separation of concerns
+
+**New in GSDMAD:**
+- ✅ Multi-agent validation (no conflict of interest)
+- ✅ Adversarial code review (finds real issues)
+- ✅ Independent verification (honest reporting)
+- ✅ Parallel story execution (faster completion)
+- ✅ Best of both worlds
+
+---
+
+## Migration Path
+
+1. **Keep BMAD v1.x** as fallback (`execution_mode: "single_agent"`)
+2. **Add GSDMAD** as opt-in (`execution_mode: "multi_agent"`)
+3. **Test both modes** on same epic, compare results
+4. **Make GSDMAD default** after validation
+5. **Deprecate v1.x** in 6 months
+
+---
+
+**Key Insight:** Trust but verify. Every agent's work is independently validated by a fresh agent with no conflict of interest. Stories run in parallel when possible, sequentially only when dependencies require it.
+
+---
+
+**Next Steps:**
+1. Create `super-dev-pipeline-v2/` directory
+2. Write agent prompt files
+3. Update `batch-super-dev` for wave execution
+4. Test on Epic 17 stories
+5. Measure time savings and quality improvements
diff --git a/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md b/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md
new file mode 100644
index 00000000..dc5e9a80
--- /dev/null
+++ b/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md
@@ -0,0 +1,291 @@
+# Super-Dev-Pipeline: Multi-Agent Architecture
+
+**Version:** 2.0.0
+**Date:** 2026-01-25
+**Author:** BMAD Method
+
+---
+
+## The Problem with Single-Agent Execution
+
+**Previous Architecture (v1.x):**
+```
+One Task Agent runs ALL 11 steps:
+├─ Step 1: Init
+├─ Step 2: Pre-Gap Analysis
+├─ Step 3: Write Tests
+├─ Step 4: Implement
+├─ Step 5: Post-Validation    ← Agent validates its OWN work
+├─ Step 6: Quality Checks
+├─ Step 7: Code Review         ← Agent reviews its OWN code
+├─ Step 8: Review Analysis
+├─ Step 9: Fix Issues
+├─ Step 10: Complete
+└─ Step 11: Summary
+```
+
+**Fatal Flaw:** The agent has a conflict of interest: it validates and reviews its own work. An agent that wants to finish can claim completion and skip steps, and nothing independent checks the claim.
+
+---
+
+## New Multi-Agent Architecture (v2.0)
+
+**Principle:** **Separation of Concerns with Independent Validation**
+
+Each phase has a DIFFERENT agent with fresh context:
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│ PHASE 1: IMPLEMENTATION (Agent 1 - "Builder")                  │
+├────────────────────────────────────────────────────────────────┤
+│ Step 1: Init                                                   │
+│ Step 2: Pre-Gap Analysis                                       │
+│ Step 3: Write Tests                                            │
+│ Step 4: Implement                                              │
+│                                                                 │
+│ Output: Code written, tests written, claims "done"            │
+│ ⚠️  DO NOT TRUST - needs external validation                   │
+└────────────────────────────────────────────────────────────────┘
+         ↓
+┌────────────────────────────────────────────────────────────────┐
+│ PHASE 2: VALIDATION (Agent 2 - "Inspector")                    │
+├────────────────────────────────────────────────────────────────┤
+│ Step 5: Post-Validation                                        │
+│   - Fresh context, no knowledge of Agent 1                    │
+│   - Verifies files actually exist                             │
+│   - Verifies tests actually run and pass                      │
+│   - Verifies checkboxes are checked in story file             │
+│   - Verifies sprint-status.yaml updated                       │
+│                                                                 │
+│ Step 6: Quality Checks                                         │
+│   - Run type-check, lint, build                               │
+│   - Verify ZERO errors                                         │
+│   - Check git status (uncommitted files?)                     │
+│                                                                 │
+│ Output: PASS/FAIL verdict (honest assessment)                 │
+│ ✅ Agent 2 has NO incentive to lie                             │
+└────────────────────────────────────────────────────────────────┘
+         ↓
+┌────────────────────────────────────────────────────────────────┐
+│ PHASE 3: CODE REVIEW (Agent 3 - "Adversarial Reviewer")        │
+├────────────────────────────────────────────────────────────────┤
+│ Step 7: Code Review (Multi-Agent)                             │
+│   - Fresh context, ADVERSARIAL stance                         │
+│   - Goal: Find problems, not rubber-stamp                     │
+│   - Spawns 2-6 review agents (based on complexity)            │
+│   - Each reviewer has specific focus area                     │
+│                                                                 │
+│ Output: List of issues (security, performance, bugs)          │
+│ ✅ Adversarial agents WANT to find problems                    │
+└────────────────────────────────────────────────────────────────┘
+         ↓
+┌────────────────────────────────────────────────────────────────┐
+│ PHASE 4: FIX ISSUES (Agent 4 - "Fixer")                        │
+├────────────────────────────────────────────────────────────────┤
+│ Step 8: Review Analysis                                        │
+│   - Categorize findings (MUST FIX, SHOULD FIX, NICE TO HAVE)  │
+│   - Filter out gold-plating                                    │
+│                                                                 │
+│ Step 9: Fix Issues                                             │
+│   - Implement MUST FIX items                                   │
+│   - Implement SHOULD FIX if time allows                        │
+│                                                                 │
+│ Output: Fixed code, re-run tests                              │
+└────────────────────────────────────────────────────────────────┘
+         ↓
+┌────────────────────────────────────────────────────────────────┐
+│ PHASE 5: COMPLETION (Main Orchestrator - Claude)               │
+├────────────────────────────────────────────────────────────────┤
+│ Step 10: Complete                                              │
+│   - Verify git commits exist                                   │
+│   - Verify tests pass                                          │
+│   - Verify story checkboxes checked                            │
+│   - Verify sprint-status updated                               │
+│   - REJECT if any verification fails                           │
+│                                                                 │
+│ Step 11: Summary                                               │
+│   - Generate audit trail                                       │
+│   - Report to user                                             │
+│                                                                 │
+│ ✅ Main orchestrator does FINAL verification                   │
+└────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Agent Responsibilities
+
+### Agent 1: Builder (Implementation)
+- **Role:** Implement the story according to requirements
+- **Trust Level:** LOW - assumes agent will cut corners
+- **Output:** Code + tests (unverified)
+- **Incentive:** Get done quickly → may lie about completion
+
+### Agent 2: Inspector (Validation)
+- **Role:** Independent verification of Agent 1's claims
+- **Trust Level:** MEDIUM - no conflict of interest
+- **Checks:**
+  - Do files actually exist?
+  - Do tests actually pass (run them myself)?
+  - Are checkboxes actually checked?
+  - Is sprint-status actually updated?
+- **Output:** PASS/FAIL with evidence
+- **Incentive:** Find truth → honest assessment
+
+### Agent 3: Adversarial Reviewer (Code Review)
+- **Role:** Find problems with the implementation
+- **Trust Level:** HIGH - WANTS to find issues
+- **Focus Areas:**
+  - Security vulnerabilities
+  - Performance problems
+  - Logic bugs
+  - Architecture violations
+- **Output:** List of issues with severity
+- **Incentive:** Find as many legitimate issues as possible
+
+### Agent 4: Fixer (Issue Resolution)
+- **Role:** Fix issues identified by Agent 3
+- **Trust Level:** MEDIUM - has incentive to minimize work
+- **Actions:**
+  - Implement MUST FIX issues
+  - Implement SHOULD FIX issues (if time)
+  - Skip NICE TO HAVE (gold-plating)
+- **Output:** Fixed code
+
+### Main Orchestrator: Claude (Final Verification)
+- **Role:** Final quality gate before marking story complete
+- **Trust Level:** HIGHEST - user-facing, no incentive to lie
+- **Checks:**
+  - Git log shows commits
+  - Test output shows passing tests
+  - Story file diff shows checked boxes
+  - Sprint-status diff shows update
+- **Output:** COMPLETE or FAILED (with specific reason)
+
+---
+
+## Implementation in workflow.yaml
+
+```yaml
+# New execution mode (v2.0)
+execution_mode: "multi_agent" # single_agent | multi_agent
+
+# Agent configuration
+agents:
+  builder:
+    steps: [1, 2, 3, 4]
+    subagent_type: "general-purpose"
+    description: "Implement story {{story_key}}"
+
+  inspector:
+    steps: [5, 6]
+    subagent_type: "general-purpose"
+    description: "Validate story {{story_key}} implementation"
+    fresh_context: true # No knowledge of builder agent
+
+  reviewer:
+    steps: [7]
+    subagent_type: "multi-agent-review" # Spawns multiple reviewers
+    description: "Adversarial review of story {{story_key}}"
+    fresh_context: true
+    adversarial: true
+
+  fixer:
+    steps: [8, 9]
+    subagent_type: "general-purpose"
+    description: "Fix issues in story {{story_key}}"
+```
+
+---
+
+## Verification Checklist (Step 10)
+
+**Main orchestrator MUST verify before marking complete:**
+
+```bash
+# 1. Check git commits
+git log --oneline -3 | grep -q "{{story_key}}" \
+  || echo "FAIL: no commit found"
+
+# 2. Check story checkboxes
+before_count=$(git show "HEAD~1:{{story_file}}" | grep -c "^- \[x\]")
+after_count=$(grep -c "^- \[x\]" "{{story_file}}")
+[ "$after_count" -gt "$before_count" ] || echo "FAIL: checkboxes not updated"
+
+# 3. Check sprint-status
+git diff HEAD~1 -- "{{sprint_status}}" | grep -q "{{story_key}}" \
+  || echo "FAIL: no status change"
+
+# 4. Check test results
+# Parse agent output for "PASS" or test count
+# FAIL if no test evidence
+```
+
+**If ANY check fails → Story NOT complete, report to user**
+
+---
+
+## Benefits of Multi-Agent Architecture
+
+1. **Separation of Concerns**
+   - Implementation separate from validation
+   - Review separate from fixing
+
+2. **No Conflict of Interest**
+   - Validators have no incentive to lie
+   - Reviewers WANT to find problems
+
+3. **Fresh Context Each Phase**
+   - Inspector doesn't know what Builder did
+   - Reviewer approaches code with fresh eyes
+
+4. **Honest Reporting**
+   - Each agent reports truthfully
+   - Main orchestrator verifies everything
+
+5. **Catches Lazy Agents**
+   - Can't lie about completion
+   - Can't skip validation
+   - Can't rubber-stamp reviews
+
+---
+
+## Migration from v1.x to v2.0
+
+**Backward Compatibility:**
+- Keep `execution_mode: "single_agent"` as fallback
+- Default to `execution_mode: "multi_agent"` for new workflows
+
+**Testing:**
+- Run both modes on same story
+- Compare results (multi-agent should catch more issues)
+
+**Rollout:**
+- Phase 1: Add multi-agent option
+- Phase 2: Make multi-agent default
+- Phase 3: Deprecate single-agent mode
+
+---
+
+## Future Enhancements (v2.1+)
+
+1. **Agent Reputation Tracking**
+   - Track which agents produce reliable results
+   - Penalize agents that consistently lie
+
+2. **Dynamic Agent Selection**
+   - Choose different review agents based on story type
+   - Security-focused reviewers for auth stories
+   - Performance reviewers for database stories
+
+3. **Parallel Validation**
+   - Run multiple validators simultaneously
+   - Require consensus (2/3 validators agree)
+
+4. **Agent Learning**
+   - Validators learn common failure patterns
+   - Reviewers learn project-specific issues
+
+---
+
+**Key Takeaway:** Trust but verify. Every agent's work is independently validated by a fresh agent with no conflict of interest.