From 921a5bef26f89f166a59c518d02f3f612a6bbdd7 Mon Sep 17 00:00:00 2001 From: Jonah Schulte Date: Sun, 25 Jan 2026 21:23:49 -0500 Subject: [PATCH] docs: add GSDMAD architecture and multi-agent validation design - GSDMAD-ARCHITECTURE.md: Merge best of BMAD + GSD - Wave-based story execution for parallelization - Multi-agent validation (builder, inspector, reviewer, fixer) - Checkpoint-aware segmentation from GSD - Agent tracking and resume capability - 57% faster execution through smart parallelization - Separation of concerns prevents conflict of interest --- GSDMAD-ARCHITECTURE.md | 495 ++++++++++++++++++ .../MULTI-AGENT-ARCHITECTURE.md | 291 ++++++++++ 2 files changed, 786 insertions(+) create mode 100644 GSDMAD-ARCHITECTURE.md create mode 100644 src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md diff --git a/GSDMAD-ARCHITECTURE.md b/GSDMAD-ARCHITECTURE.md new file mode 100644 index 00000000..aca9ee06 --- /dev/null +++ b/GSDMAD-ARCHITECTURE.md @@ -0,0 +1,495 @@ +# GSDMAD: Get Shit Done Method for Agile Development + +**Version:** 1.0.0 +**Date:** 2026-01-25 +**Philosophy:** Combine BMAD's comprehensive tracking with GSD's intelligent execution + +--- + +## The Vision + +**BMAD** excels at structure, tracking, and hospital-grade quality standards. +**GSD** excels at smart execution, parallelization, and getting shit done fast. + +**GSDMAD** takes the best of both: +- BMAD's story tracking, sprint management, and quality gates +- GSD's wave-based parallelization, checkpoint routing, and agent orchestration + +--- + +## Core Principles + +1. **Comprehensive but not bureaucratic** - Track what matters, skip enterprise theater +2. **Smart parallelization** - Run independent work concurrently, sequential only when needed +3. **Separation of concerns** - Different agents for implementation, validation, review +4. **Checkpoint-aware routing** - Autonomous segments in subagents, decisions in main +5. 
**Hospital-grade quality** - Lives may be at stake, quality >> speed + +--- + +## Architecture Comparison + +### BMAD v1.x (Old Way) +``` +batch-super-dev orchestrator: + ├─ Story 1: super-dev-pipeline (ONE agent, ALL steps) + │ └─ Step 1-11: init, gap, test, implement, validate, review, fix, complete + ├─ Story 2: super-dev-pipeline (ONE agent, ALL steps) + └─ Story 3: super-dev-pipeline (ONE agent, ALL steps) + +Problems: +- Single agent validates its own work (conflict of interest) +- No parallelization between stories +- Agent can lie about completion +- Sequential execution is slow +``` + +### GSD (Inspiration) +``` +execute-phase orchestrator: + ├─ Wave 1: [Plan A, Plan B] in parallel + │ ├─ Agent for Plan A (segments if checkpoints) + │ └─ Agent for Plan B (segments if checkpoints) + ├─ Wave 2: [Plan C] + │ └─ Agent for Plan C + └─ Wave 3: [Plan D, Plan E] in parallel + +Strengths: +- Wave-based parallelization +- Checkpoint-aware segmentation +- Agent tracking and resume +- Lightweight orchestration +``` + +### GSDMAD (New Way) +``` +batch-super-dev orchestrator: + ├─ Wave 1 (independent stories): [17-1, 17-3, 17-4] in parallel + │ ├─ Story 17-1: + │ │ ├─ Agent 1: Implement (steps 1-4) + │ │ ├─ Agent 2: Validate (steps 5-6) ← fresh context + │ │ ├─ Agent 3: Review (step 7) ← adversarial + │ │ └─ Agent 4: Fix (steps 8-9) + │ ├─ Story 17-3: (same multi-agent pattern) + │ └─ Story 17-4: (same multi-agent pattern) + │ + ├─ Wave 2 (depends on Wave 1): [17-5] + │ └─ Story 17-5: (same multi-agent pattern) + │ + └─ Wave 3: [17-9, 17-10] in parallel + ├─ Story 17-9: (same multi-agent pattern) + └─ Story 17-10: (same multi-agent pattern) + +Benefits: +- Independent stories run in parallel (faster) +- Each story uses multi-agent validation (honest) +- Dependencies respected via waves +- Agent tracking for resume capability +``` + +--- + +## Wave-Based Story Execution + +### Dependency Analysis + +Before executing stories, analyze dependencies: + +```yaml +stories: + 
17-1: # Space Model + depends_on: [] + wave: 1 + + 17-2: # Space Listing + depends_on: [17-1] # Needs Space model + wave: 2 + + 17-3: # Space Photos + depends_on: [17-1] # Needs Space model + wave: 2 + + 17-4: # Delete Space + depends_on: [17-1] # Needs Space model + wave: 2 + + 17-5: # Agreement Model + depends_on: [17-1, 17-4] # Needs Space model + delete protection + wave: 3 + + 17-9: # Expiration Alerts + depends_on: [17-5] # Needs Agreement model + wave: 4 + + 17-10: # Occupant Portal + depends_on: [17-5] # Needs Agreement model + wave: 4 +``` + +**Wave Execution:** +- Wave 1: [17-1] (1 story) +- Wave 2: [17-2, 17-3, 17-4] (3 stories in parallel) +- Wave 3: [17-5] (1 story) +- Wave 4: [17-9, 17-10] (2 stories in parallel) + +**Time Savings:** +- Sequential: 7 stories × 60 min = 420 min (7 hours) +- Wave-based: 4 waves × 60 min = 240 min (4 hours) ← ~43% faster + +--- + +## Multi-Agent Story Pipeline + +Each story uses **4 agents** with separation of concerns: + +### Phase 1: Implementation (Agent 1 - Builder) +``` +Steps: 1-4 (init, pre-gap, write-tests, implement) +Role: Build the feature +Output: Code + tests (unverified) +Trust: LOW (assume agent will cut corners) +``` + +**Agent 1 Prompt:** +``` +Implement story {{story_key}} following these steps: + +1. Init: Load story, detect greenfield vs brownfield +2. Pre-Gap: Validate tasks, detect batchable patterns +3. Write Tests: TDD approach, write tests first +4. Implement: Write production code + +DO NOT: +- Validate your own work (Agent 2 will do this) +- Review your own code (Agent 3 will do this) +- Update story checkboxes (Agent 4 will do this) +- Commit changes (Agent 4 will do this) + +Just write the code and tests. Report what you built. 
+``` + +### Phase 2: Validation (Agent 2 - Inspector) +``` +Steps: 5-6 (post-validation, quality-checks) +Role: Independent verification +Output: PASS/FAIL with evidence +Trust: MEDIUM (no conflict of interest) +``` + +**Agent 2 Prompt:** +``` +Validate story {{story_key}} implementation by Agent 1. + +You have NO KNOWLEDGE of what Agent 1 did. Verify: + +1. Files Exist: + - Check each file mentioned in story + - Verify file contains actual code (not TODO/stub) + +2. Tests Pass: + - Run test suite: npm test + - Verify tests actually run (not skipped) + - Check coverage meets 90% threshold + +3. Quality Checks: + - Run type-check: npm run type-check + - Run linter: npm run lint + - Run build: npm run build + - All must return zero errors + +4. Git Status: + - Check uncommitted files + - List files changed + +Output PASS or FAIL with specific evidence. +If FAIL, list exactly what's missing/broken. +``` + +### Phase 3: Code Review (Agent 3 - Adversarial Reviewer) +``` +Step: 7 (code-review) +Role: Find problems (adversarial stance) +Output: List of issues with severity +Trust: HIGH (wants to find issues) +``` + +**Agent 3 Prompt:** +``` +Adversarial code review of story {{story_key}}. + +Your GOAL is to find problems. Be critical. Look for: + +SECURITY: +- SQL injection vulnerabilities +- XSS vulnerabilities +- Authentication bypasses +- Authorization gaps +- Hardcoded secrets + +PERFORMANCE: +- N+1 queries +- Missing indexes +- Inefficient algorithms +- Memory leaks + +BUGS: +- Logic errors +- Edge cases not handled +- Off-by-one errors +- Race conditions + +ARCHITECTURE: +- Pattern violations +- Tight coupling +- Missing error handling + +Rate each issue: +- CRITICAL: Security vulnerability or data loss +- HIGH: Will cause production bugs +- MEDIUM: Technical debt or maintainability +- LOW: Nice-to-have improvements + +Output: List of issues with severity and specific code locations. 
+``` + +### Phase 4: Fix Issues (Agent 4 - Fixer) +``` +Steps: 8-9 (review-analysis, fix-issues) +Role: Fix critical/high issues +Output: Fixed code +Trust: MEDIUM (incentive to minimize work) +``` + +**Agent 4 Prompt:** +``` +Fix issues from code review for story {{story_key}}. + +Code review found {{issue_count}} issues: +{{review_issues}} + +Priority: +1. Fix ALL CRITICAL issues (no exceptions) +2. Fix ALL HIGH issues (must do) +3. Fix MEDIUM issues if time allows (nice to have) +4. Skip LOW issues (gold-plating) + +After fixing: +- Re-run tests (must pass) +- Update story checkboxes +- Update sprint-status.yaml +- Commit changes with message: "fix: {{story_key}} - address code review" +``` + +### Phase 5: Final Verification (Main Orchestrator) +``` +Steps: 10-11 (complete, summary) +Role: Final quality gate +Output: COMPLETE or FAILED +Trust: HIGHEST (user-facing) +``` + +**Orchestrator Checks:** +```bash +# 1. Verify git commits +git log --oneline -5 | grep "{{story_key}}" +[ $? -eq 0 ] || echo "FAIL: No commit found" + +# 2. Verify story checkboxes increased +before=$(git show HEAD~2:{{story_file}} | grep -c "^- \[x\]") +after=$(grep -c "^- \[x\]" {{story_file}}) +[ $after -gt $before ] || echo "FAIL: Checkboxes not updated" + +# 3. Verify sprint-status updated +git diff HEAD~2 {{sprint_status}} | grep "{{story_key}}: done" +[ $? -eq 0 ] || echo "FAIL: Sprint status not updated" + +# 4. Verify tests passed (parse agent output) +grep "PASS" agent_2_output.txt +[ $? -eq 0 ] || echo "FAIL: No test evidence" +``` + +--- + +## Checkpoint-Aware Segmentation + +Stories can have **checkpoints** for user interaction: + +```xml + + Review the test plan before implementation + Does this test strategy look correct? (yes/no) + +``` + +**Routing Rules:** + +1. **No checkpoints** → Full autonomous (Agent 1 does steps 1-4) +2. 
**Verify checkpoints** → Segmented execution: + - Segment 1 (steps 1-2): Agent 1a + - Checkpoint: Main context (user verifies) + - Segment 2 (steps 3-4): Agent 1b (fresh agent) +3. **Decision checkpoints** → Stay in main context (can't delegate decisions) + +This is borrowed directly from GSD's `execute-plan.md` segmentation logic. + +--- + +## Agent Tracking and Resume + +Track all spawned agents in `.bmad/agent-history.json` for resume capability: + +```json +{ + "version": "1.0", + "max_entries": 50, + "entries": [ + { + "agent_id": "a4868f1", + "story_key": "17-10", + "phase": "implementation", + "agent_type": "builder", + "timestamp": "2026-01-25T20:30:00Z", + "status": "spawned", + "completion_timestamp": null + } + ] +} +``` + +**Resume Capability:** +```bash +# If a session is interrupted, check for incomplete agents +jq '.entries[] | select(.status == "spawned")' .bmad/agent-history.json + +# Resume agent using the Task tool (an orchestrator tool call, not a shell command): +# Task(subagent_type="general-purpose", resume="a4868f1") +``` + +--- + +## Workflow Files + +### New: `batch-super-dev-v2.md` +```yaml +execution_mode: "wave_based" # wave_based | sequential + +# Story dependency analysis (auto-computed or manual) +dependency_analysis: + enabled: true + method: "file_scan" # file_scan | manual | hybrid + +# Wave execution +waves: + max_parallel_stories: 4 # Max stories per wave + agent_timeout: 3600 # 1 hour per agent + +# Multi-agent validation +validation: + enabled: true + agents: + builder: {steps: [1,2,3,4]} + inspector: {steps: [5,6], fresh_context: true} + reviewer: {steps: [7], adversarial: true} + fixer: {steps: [8,9]} +``` + +### Enhanced: `super-dev-pipeline-v2.md` +```yaml +execution_mode: "multi_agent" # single_agent | multi_agent + +# Agent configuration +agents: + builder: + steps: [1, 2, 3, 4] + description: "Implement story" + + inspector: + steps: [5, 6] + description: "Validate implementation" + fresh_context: true + + reviewer: + steps: [7] + description: "Adversarial code review" + 
fresh_context: true + adversarial: true + + fixer: + steps: [8, 9] + description: "Fix review issues" +``` + +--- + +## Implementation Phases + +### Phase 1: Multi-Agent Validation (Week 1) +- Update `super-dev-pipeline` to support multi-agent mode +- Create `agents/` directory with agent prompts +- Add agent tracking infrastructure +- Test on a single story + +### Phase 2: Wave-Based Execution (Week 2) +- Add dependency analysis to `batch-super-dev` +- Implement wave grouping logic +- Add parallel execution within waves +- Test on Epic 17 (10 stories) + +### Phase 3: Checkpoint Segmentation (Week 3) +- Add checkpoint detection to stories +- Implement segment routing logic +- Test with stories that need user input + +### Phase 4: Agent Resume (Week 4) +- Add agent history tracking +- Implement resume capability +- Test interrupted session recovery + +--- + +## Benefits Summary + +**From BMAD:** +- ✅ Comprehensive story tracking +- ✅ Sprint artifacts and status management +- ✅ Gap analysis and reconciliation +- ✅ Hospital-grade quality standards +- ✅ Multi-tenant patterns + +**From GSD:** +- ✅ Wave-based parallelization (faster than sequential execution) +- ✅ Smart checkpoint routing +- ✅ Agent tracking and resume +- ✅ Lightweight orchestration +- ✅ Separation of concerns + +**New in GSDMAD:** +- ✅ Multi-agent validation (no conflict of interest) +- ✅ Adversarial code review (finds real issues) +- ✅ Independent verification (honest reporting) +- ✅ Parallel story execution (faster completion) +- ✅ Best of both worlds + +--- + +## Migration Path + +1. **Keep BMAD v1.x** as fallback (`execution_mode: "single_agent"`) +2. **Add GSDMAD** as opt-in (`execution_mode: "multi_agent"`) +3. **Test both modes** on the same epic, compare results +4. **Make GSDMAD default** after validation +5. **Deprecate v1.x** in 6 months + +--- + +**Key Insight:** Trust but verify. Every agent's work is independently validated by a fresh agent with no conflict of interest. 
Stories run in parallel when possible and sequentially only when dependencies require it. + +--- + +**Next Steps:** +1. Create `super-dev-pipeline-v2/` directory +2. Write agent prompt files +3. Update `batch-super-dev` for wave execution +4. Test on Epic 17 stories +5. Measure time savings and quality improvements diff --git a/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md b/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md new file mode 100644 index 00000000..dc5e9a80 --- /dev/null +++ b/src/modules/bmm/workflows/4-implementation/super-dev-pipeline/MULTI-AGENT-ARCHITECTURE.md @@ -0,0 +1,291 @@ +# Super-Dev-Pipeline: Multi-Agent Architecture + +**Version:** 2.0.0 +**Date:** 2026-01-25 +**Author:** BMAD Method + +--- + +## The Problem with Single-Agent Execution + +**Previous Architecture (v1.x):** +``` +One Task Agent runs ALL 11 steps: +├─ Step 1: Init +├─ Step 2: Pre-Gap Analysis +├─ Step 3: Write Tests +├─ Step 4: Implement +├─ Step 5: Post-Validation ← Agent validates its OWN work +├─ Step 6: Quality Checks +├─ Step 7: Code Review ← Agent reviews its OWN code +├─ Step 8: Review Analysis +├─ Step 9: Fix Issues +├─ Step 10: Complete +└─ Step 11: Summary +``` + +**Fatal Flaw:** The agent has a conflict of interest: it validates and reviews its own work. When an agent cuts corners, it can falsely report completion and skip steps. 
+ +--- + +## New Multi-Agent Architecture (v2.0) + +**Principle:** **Separation of Concerns with Independent Validation** + +Each phase has a DIFFERENT agent with fresh context: + +``` +┌────────────────────────────────────────────────────────────────┐ +│ PHASE 1: IMPLEMENTATION (Agent 1 - "Builder") │ +├────────────────────────────────────────────────────────────────┤ +│ Step 1: Init │ +│ Step 2: Pre-Gap Analysis │ +│ Step 3: Write Tests │ +│ Step 4: Implement │ +│ │ +│ Output: Code written, tests written, claims "done" │ +│ ⚠️ DO NOT TRUST - needs external validation │ +└────────────────────────────────────────────────────────────────┘ + ↓ +┌────────────────────────────────────────────────────────────────┐ +│ PHASE 2: VALIDATION (Agent 2 - "Inspector") │ +├────────────────────────────────────────────────────────────────┤ +│ Step 5: Post-Validation │ +│ - Fresh context, no knowledge of Agent 1 │ +│ - Verifies files actually exist │ +│ - Verifies tests actually run and pass │ +│ - Verifies checkboxes are checked in story file │ +│ - Verifies sprint-status.yaml updated │ +│ │ +│ Step 6: Quality Checks │ +│ - Run type-check, lint, build │ +│ - Verify ZERO errors │ +│ - Check git status (uncommitted files?) 
│ +│ │ +│ Output: PASS/FAIL verdict (honest assessment) │ +│ ✅ Agent 2 has NO incentive to lie │ +└────────────────────────────────────────────────────────────────┘ + ↓ +┌────────────────────────────────────────────────────────────────┐ +│ PHASE 3: CODE REVIEW (Agent 3 - "Adversarial Reviewer") │ +├────────────────────────────────────────────────────────────────┤ +│ Step 7: Code Review (Multi-Agent) │ +│ - Fresh context, ADVERSARIAL stance │ +│ - Goal: Find problems, not rubber-stamp │ +│ - Spawns 2-6 review agents (based on complexity) │ +│ - Each reviewer has specific focus area │ +│ │ +│ Output: List of issues (security, performance, bugs) │ +│ ✅ Adversarial agents WANT to find problems │ +└────────────────────────────────────────────────────────────────┘ + ↓ +┌────────────────────────────────────────────────────────────────┐ +│ PHASE 4: FIX ISSUES (Agent 4 - "Fixer") │ +├────────────────────────────────────────────────────────────────┤ +│ Step 8: Review Analysis │ +│ - Categorize findings (MUST FIX, SHOULD FIX, NICE TO HAVE) │ +│ - Filter out gold-plating │ +│ │ +│ Step 9: Fix Issues │ +│ - Implement MUST FIX items │ +│ - Implement SHOULD FIX if time allows │ +│ │ +│ Output: Fixed code, re-run tests │ +└────────────────────────────────────────────────────────────────┘ + ↓ +┌────────────────────────────────────────────────────────────────┐ +│ PHASE 5: COMPLETION (Main Orchestrator - Claude) │ +├────────────────────────────────────────────────────────────────┤ +│ Step 10: Complete │ +│ - Verify git commits exist │ +│ - Verify tests pass │ +│ - Verify story checkboxes checked │ +│ - Verify sprint-status updated │ +│ - REJECT if any verification fails │ +│ │ +│ Step 11: Summary │ +│ - Generate audit trail │ +│ - Report to user │ +│ │ +│ ✅ Main orchestrator does FINAL verification │ +└────────────────────────────────────────────────────────────────┘ +``` + +--- + +## Agent Responsibilities + +### Agent 1: Builder (Implementation) +- **Role:** Implement the story 
according to requirements +- **Trust Level:** LOW - assumes agent will cut corners +- **Output:** Code + tests (unverified) +- **Incentive:** Get done quickly → may lie about completion + +### Agent 2: Inspector (Validation) +- **Role:** Independent verification of Agent 1's claims +- **Trust Level:** MEDIUM - no conflict of interest +- **Checks:** + - Do files actually exist? + - Do tests actually pass (run them myself)? + - Are checkboxes actually checked? + - Is sprint-status actually updated? +- **Output:** PASS/FAIL with evidence +- **Incentive:** Find truth → honest assessment + +### Agent 3: Adversarial Reviewer (Code Review) +- **Role:** Find problems with the implementation +- **Trust Level:** HIGH - WANTS to find issues +- **Focus Areas:** + - Security vulnerabilities + - Performance problems + - Logic bugs + - Architecture violations +- **Output:** List of issues with severity +- **Incentive:** Find as many legitimate issues as possible + +### Agent 4: Fixer (Issue Resolution) +- **Role:** Fix issues identified by Agent 3 +- **Trust Level:** MEDIUM - has incentive to minimize work +- **Actions:** + - Implement MUST FIX issues + - Implement SHOULD FIX issues (if time) + - Skip NICE TO HAVE (gold-plating) +- **Output:** Fixed code + +### Main Orchestrator: Claude (Final Verification) +- **Role:** Final quality gate before marking story complete +- **Trust Level:** HIGHEST - user-facing, no incentive to lie +- **Checks:** + - Git log shows commits + - Test output shows passing tests + - Story file diff shows checked boxes + - Sprint-status diff shows update +- **Output:** COMPLETE or FAILED (with specific reason) + +--- + +## Implementation in workflow.yaml + +```yaml +# New execution mode (v2.0) +execution_mode: "multi_agent" # single_agent | multi_agent + +# Agent configuration +agents: + builder: + steps: [1, 2, 3, 4] + subagent_type: "general-purpose" + description: "Implement story {{story_key}}" + + inspector: + steps: [5, 6] + subagent_type: 
"general-purpose" + description: "Validate story {{story_key}} implementation" + fresh_context: true # No knowledge of builder agent + + reviewer: + steps: [7] + subagent_type: "multi-agent-review" # Spawns multiple reviewers + description: "Adversarial review of story {{story_key}}" + fresh_context: true + adversarial: true + + fixer: + steps: [8, 9] + subagent_type: "general-purpose" + description: "Fix issues in story {{story_key}}" +``` + +--- + +## Verification Checklist (Step 10) + +**Main orchestrator MUST verify before marking complete:** + +```bash +# 1. Check git commits +git log --oneline -3 | grep "{{story_key}}" +# FAIL if no commit found + +# 2. Check story checkboxes +before_count=$(git show HEAD~1:{{story_file}} | grep -c "^- \[x\]") +after_count=$(grep -c "^- \[x\]" {{story_file}}) +# FAIL if after_count <= before_count + +# 3. Check sprint-status +git diff HEAD~1 {{sprint_status}} | grep "{{story_key}}" +# FAIL if no status change + +# 4. Check test results +# Parse agent output for "PASS" or test count +# FAIL if no test evidence +``` + +**If ANY check fails → Story NOT complete, report to user** + +--- + +## Benefits of Multi-Agent Architecture + +1. **Separation of Concerns** + - Implementation separate from validation + - Review separate from fixing + +2. **No Conflict of Interest** + - Validators have no incentive to lie + - Reviewers WANT to find problems + +3. **Fresh Context Each Phase** + - Inspector doesn't know what Builder did + - Reviewer approaches code with fresh eyes + +4. **Honest Reporting** + - Each agent reports truthfully + - Main orchestrator verifies everything + +5. 
**Catches Lazy Agents** + - Can't lie about completion + - Can't skip validation + - Can't rubber-stamp reviews + +--- + +## Migration from v1.x to v2.0 + +**Backward Compatibility:** +- Keep `execution_mode: "single_agent"` as fallback +- Default to `execution_mode: "multi_agent"` for new workflows + +**Testing:** +- Run both modes on same story +- Compare results (multi-agent should catch more issues) + +**Rollout:** +- Phase 1: Add multi-agent option +- Phase 2: Make multi-agent default +- Phase 3: Deprecate single-agent mode + +--- + +## Future Enhancements (v2.1+) + +1. **Agent Reputation Tracking** + - Track which agents produce reliable results + - Penalize agents that consistently lie + +2. **Dynamic Agent Selection** + - Choose different review agents based on story type + - Security-focused reviewers for auth stories + - Performance reviewers for database stories + +3. **Parallel Validation** + - Run multiple validators simultaneously + - Require consensus (2/3 validators agree) + +4. **Agent Learning** + - Validators learn common failure patterns + - Reviewers learn project-specific issues + +--- + +**Key Takeaway:** Trust but verify. Every agent's work is independently validated by a fresh agent with no conflict of interest.
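
The "Parallel Validation" enhancement listed above (require 2/3 of validators to agree) could be prototyped along these lines. This is a minimal sketch and not part of the workflow: the function name `consensus_verdict`, the plain `"PASS"`/`"FAIL"` string verdicts, and the choice to treat an empty validator list as a failure are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def consensus_verdict(verdicts, threshold=Fraction(2, 3)):
    """Combine validator verdicts into one PASS/FAIL result.

    Hypothetical helper: returns "PASS" only when the share of validators
    reporting PASS is at least `threshold` (2/3 by default).
    """
    if not verdicts:
        # Assumption: no validator evidence is treated as a failure.
        return "FAIL"
    counts = Counter(v.upper() for v in verdicts)
    # Fraction avoids floating-point surprises when comparing against 2/3.
    pass_share = Fraction(counts["PASS"], len(verdicts))
    return "PASS" if pass_share >= threshold else "FAIL"


print(consensus_verdict(["PASS", "PASS", "FAIL"]))  # 2 of 3 agree
print(consensus_verdict(["PASS", "FAIL", "FAIL"]))  # only 1 of 3 agrees
```

With three validators, two PASS verdicts meet the threshold and a single PASS does not. Failing on missing evidence mirrors the Step 10 rule that a story is rejected when no test evidence is found.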