Super-Dev-Pipeline v2.0 - Multi-Agent Architecture
Version: 2.0.0
Architecture: GSDMAD (GSD + BMAD)
Philosophy: Trust but verify, separation of concerns
Overview
This workflow implements a story using 4 independent agents (Builder, Inspector, Reviewer, Fixer) plus a mandatory Reconciler, with external validation at each phase.
Key Innovation: Each agent has a single responsibility and a fresh context. No agent validates its own work.
Execution Flow
┌─────────────────────────────────────────────────────────────┐
│ Main Orchestrator (Claude) │
│ - Loads story │
│ - Spawns agents sequentially │
│ - Verifies each phase │
│ - Final quality gate │
└─────────────────────────────────────────────────────────────┘
│
├──> Phase 1: Builder (Steps 1-4)
│ - Load story, analyze gaps
│ - Write tests (TDD)
│ - Implement code
│ - Report what was built (NO VALIDATION)
│
├──> Phase 2: Inspector (Steps 5-6)
│ - Fresh context, no Builder knowledge
│ - Verify files exist
│ - Run tests independently
│ - Run quality checks
│ - PASS or FAIL verdict
│
├──> Phase 3: Reviewer (Step 7)
│ - Fresh context, adversarial stance
│ - Find security vulnerabilities
│ - Find performance problems
│ - Find logic bugs
│ - Report issues with severity
│
├──> Phase 4: Fixer (Steps 8-9)
│ - Fix CRITICAL issues (all)
│ - Fix HIGH issues (all)
│ - Fix MEDIUM issues (if time)
│ - Skip LOW issues (gold-plating)
│ - Commit code changes
│
├──> Phase 5: Reconciler (Step 10) 🚨 MANDATORY
│ - Read git commit to see what was built
│ - Check off completed tasks in story file
│ - Fill Dev Agent Record with details
│ - VERIFY updates with bash commands
│ - BLOCKER: Exit 1 if verification fails
│
└──> Final Verification (Main)
- Check git commits exist
- Check story checkboxes updated (count > 0)
- Check Dev Agent Record filled
- Check sprint-status updated
- Check tests passed
- Mark COMPLETE or FAILED
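The flow above can be sketched as a small sequential runner. This is a minimal illustration, not the pipeline's actual implementation: `runPipeline`, the `spawn` callback, and the `{ ok, output }` result shape are hypothetical names. In practice each phase would be a `Task` call.

```javascript
// Hypothetical sketch: run phases in order and stop at the first failure.
// `spawn` stands in for whatever launches an agent (for example a Task call).
const PHASES = ["builder", "inspector", "reviewer", "fixer", "reconciler"];

function runPipeline(phases, spawn) {
  const completed = [];
  for (const phase of phases) {
    // Assumed result shape: { ok: boolean, output: string }
    const result = spawn(phase);
    if (!result.ok) {
      // A failed phase halts the pipeline; later phases never run.
      return { status: "FAILED", failedPhase: phase, completed };
    }
    completed.push(phase);
  }
  // Success here only means every agent finished; the orchestrator
  // still has to run Final Verification before marking COMPLETE.
  return { status: "READY_FOR_FINAL_VERIFICATION", completed };
}

// Example with a stub spawner in which the Inspector fails.
const demo = runPipeline(PHASES, (phase) => ({ ok: phase !== "inspector", output: "" }));
console.log(demo);
```

Returning a status instead of throwing keeps the halt decision in the orchestrator, which matches the rule that only the main orchestrator marks a story COMPLETE or FAILED.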
Agent Spawning Instructions
Phase 1: Spawn Builder
Task({
subagent_type: "general-purpose",
description: "Implement story {{story_key}}",
prompt: `
You are the BUILDER agent for story {{story_key}}.
Load and execute: {agents_path}/builder.md
Story file: {{story_file}}
Complete Steps 1-4:
1. Init - Load story
2. Pre-Gap - Analyze what exists
3. Write Tests - TDD approach
4. Implement - Write production code
DO NOT:
- Validate your work
- Review your code
- Update checkboxes
- Commit changes
Just build it and report what you created.
`
});
Wait for Builder to complete. Store agent_id in agent-history.json.
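The prompts in this section use `{{story_key}}`-style placeholders. A minimal sketch of filling them before spawning an agent might look like this. `renderPrompt` is a hypothetical helper, the example values are illustrative only, and single-brace variables such as `{agents_path}` are assumed to be resolved by a separate configuration step that this sketch does not cover.

```javascript
// Hypothetical helper: fill {{name}} placeholders in a prompt template.
// Throws on a missing value so an agent never receives an unfilled placeholder.
function renderPrompt(template, vars) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return String(vars[name]);
  });
}

// Example values only; the story file path is made up for illustration.
const builderPrompt = renderPrompt(
  "You are the BUILDER agent for story {{story_key}}.\nStory file: {{story_file}}",
  { story_key: "17-10", story_file: "stories/17-10.md" }
);
console.log(builderPrompt);
```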
Phase 2: Spawn Inspector
Task({
subagent_type: "general-purpose",
description: "Validate story {{story_key}} implementation",
prompt: `
You are the INSPECTOR agent for story {{story_key}}.
Load and execute: {agents_path}/inspector.md
Story file: {{story_file}}
You have NO KNOWLEDGE of what the Builder did.
Complete Steps 5-6:
5. Post-Validation - Verify files exist and have content
6. Quality Checks - Run type-check, lint, build, tests
Run all checks yourself. Don't trust Builder claims.
Output: PASS or FAIL verdict with evidence.
`
});
Wait for Inspector to complete. If FAIL, halt pipeline.
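One way to turn the Inspector's output into a halt decision is sketched below. The Inspector's exact output format is not specified in this document, so `parseVerdict` is a hypothetical helper that assumes the words PASS and FAIL appear as standalone tokens. It fails closed: anything ambiguous counts as FAIL.

```javascript
// Hypothetical sketch: derive a verdict from free-form Inspector output.
// Fail-closed policy: only an unambiguous PASS lets the pipeline continue.
function parseVerdict(output) {
  const hasFail = /\bFAIL\b/.test(output);
  const hasPass = /\bPASS\b/.test(output);
  return hasPass && !hasFail ? "PASS" : "FAIL";
}

console.log(parseVerdict("Verdict: PASS\nAll checks ran"));
console.log(parseVerdict("Build PASS, lint FAIL"));
console.log(parseVerdict(""));
```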
Phase 3: Spawn Reviewer
Task({
subagent_type: "bmad_bmm_multi-agent-review",
description: "Adversarial review of story {{story_key}}",
prompt: `
You are the ADVERSARIAL REVIEWER for story {{story_key}}.
Load and execute: {agents_path}/reviewer.md
Story file: {{story_file}}
Complexity: {{complexity_level}}
Your goal is to FIND PROBLEMS.
Complete Step 7:
7. Code Review - Find security, performance, logic issues
Be critical. Look for flaws.
Output: List of issues with severity ratings.
`
});
Wait for Reviewer to complete. Parse issues by severity.
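Parsing issues by severity could look like the sketch below. The `[SEVERITY] description` line format is an assumption made for illustration, since the Reviewer's actual output format is not specified in this document. `groupBySeverity` and the sample findings are hypothetical.

```javascript
// Hypothetical sketch: bucket review findings by severity.
// Assumes one finding per line in the form "[SEVERITY] description".
const SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];
const ISSUE_LINE = new RegExp("^\\s*\\[(" + SEVERITIES.join("|") + ")\\]\\s*(.+)$");

function groupBySeverity(reviewText) {
  const groups = {};
  for (const severity of SEVERITIES) groups[severity] = [];
  for (const line of reviewText.split("\n")) {
    const match = line.match(ISSUE_LINE);
    if (match) groups[match[1]].push(match[2].trim());
    // Lines that do not match the assumed format are ignored.
  }
  return groups;
}

// Sample findings, invented for the example.
const findings = groupBySeverity(
  "[CRITICAL] SQL injection in login handler\n" +
  "Some commentary that is not an issue line\n" +
  "[HIGH] N+1 query in list endpoint\n" +
  "[LOW] Rename variable"
);
console.log(findings);
```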
Phase 4: Spawn Fixer
Task({
subagent_type: "general-purpose",
description: "Fix issues in story {{story_key}}",
prompt: `
You are the FIXER agent for story {{story_key}}.
Load and execute: {agents_path}/fixer.md
Story file: {{story_file}}
Review issues: {{review_findings}}
Complete Steps 8-9:
8. Review Analysis - Categorize issues, filter gold-plating
9. Fix Issues - Fix CRITICAL/HIGH, consider MEDIUM, skip LOW
After fixing:
- Update story checkboxes
- Update sprint-status.yaml
- Commit with descriptive message
Output: Fix summary with git commit hash.
`
});
Wait for Fixer to complete.
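The Fixer's triage rule (fix CRITICAL and HIGH, consider MEDIUM, skip LOW) can be sketched as a pure function. `planFixes`, the `hasTimeForMedium` flag, and the sample issues are hypothetical; how "if time" is actually decided is not specified in this document.

```javascript
// Hypothetical sketch of the Fixer's triage policy.
// `groups` maps each severity to a list of issue descriptions.
function planFixes(groups, hasTimeForMedium) {
  const toFix = groups.CRITICAL.concat(groups.HIGH); // always fixed
  const skipped = groups.LOW.slice();                // treated as gold-plating
  if (hasTimeForMedium) {
    toFix.push(...groups.MEDIUM);
  } else {
    skipped.push(...groups.MEDIUM);
  }
  return { toFix, skipped };
}

// Sample issues, invented for the example.
const plan = planFixes(
  {
    CRITICAL: ["SQL injection"],
    HIGH: ["N+1 query"],
    MEDIUM: ["Missing index"],
    LOW: ["Rename variable"],
  },
  false
);
console.log(plan);
```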
Phase 5: Spawn Reconciler (MANDATORY)
🚨 THIS PHASE IS MANDATORY. ALWAYS RUN. CANNOT BE SKIPPED. 🚨
Task({
subagent_type: "general-purpose",
description: "Reconcile story {{story_key}}",
prompt: `
You are the RECONCILER agent for story {{story_key}}.
Load and execute: {agents_path}/reconciler.md
Story file: {{story_file}}
Story key: {{story_key}}
Complete Step 10 - Story Reconciliation:
Your ONLY job:
1. Read git commit to see what was built
2. Check off completed tasks in story file (Edit tool)
3. Fill Dev Agent Record with files/dates/notes
4. Verify updates worked (bash grep commands)
5. Exit 1 if verification fails
DO NOT:
- Write code
- Fix bugs
- Run tests
- Do anything except update the story file
This is the LAST step. The story cannot be marked complete
without your verification passing.
Output: Reconciliation summary with checked task count.
`
});
Wait for Reconciler to complete. Verification MUST pass.
If Reconciler verification fails (exit 1):
- DO NOT proceed
- DO NOT mark story complete
- Fix the reconciliation immediately
- Re-run Reconciler until it passes
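The re-run rule above can be sketched as a bounded retry loop. The attempt limit is an assumption added here to avoid looping forever; the document itself only says to re-run until verification passes. `reconcileUntilVerified` and `runReconciler` are hypothetical names, with `runReconciler` assumed to return the Reconciler's exit code.

```javascript
// Hypothetical sketch: re-run the Reconciler until it exits 0,
// giving up after `maxAttempts` (an assumed safety limit).
function reconcileUntilVerified(runReconciler, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const exitCode = runReconciler(attempt);
    if (exitCode === 0) {
      return { verified: true, attempts: attempt };
    }
  }
  // Never mark the story complete in this state.
  return { verified: false, attempts: maxAttempts };
}

// Stub Reconciler that fails twice, then passes.
const outcome = reconcileUntilVerified((attempt) => (attempt < 3 ? 1 : 0), 5);
console.log(outcome);
```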
Final Verification (Main Orchestrator)
🚨 CRITICAL: This verification is MANDATORY. DO NOT skip. 🚨
After all agents complete (including Reconciler), YOU (the main orchestrator) must:
- Use the Bash tool to run these commands
- Read the output to see if verification passed
- If verification fails, use Edit and Bash tools to fix it NOW
- Do not proceed until verification passes
COMMAND TO RUN WITH BASH TOOL:
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🔍 FINAL VERIFICATION (MANDATORY)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# 1. Check git commits exist (only the 3 most recent commits are searched)
echo "Checking git commits..."
if ! git log --oneline -3 | grep "{{story_key}}"; then
  echo "❌ FAILED: No commit found for {{story_key}}"
  echo "The Fixer agent did not commit changes."
  exit 1
fi
echo "✅ Git commit found"
# 2. Check story file has checked tasks (ABSOLUTE BLOCKER)
echo "Checking story file updates..."
# grep -c prints 0 but exits non-zero when nothing matches. If the file is
# missing it prints nothing, so default to 0 instead of passing by accident.
CHECKED_COUNT=$(grep -c '^- \[x\]' "{{story_file}}" 2>/dev/null || true)
CHECKED_COUNT=${CHECKED_COUNT:-0}
echo "Checked tasks: $CHECKED_COUNT"
if [ "$CHECKED_COUNT" -eq 0 ]; then
  echo ""
  echo "❌ BLOCKER: Story file has ZERO checked tasks"
  echo ""
  echo "This means the Fixer agent did NOT update the story file."
  echo "The story CANNOT be marked complete without checked tasks."
  echo ""
  echo "You must:"
  echo " 1. Read the git commit to see what was built"
  echo " 2. Read the story Tasks section"
  echo " 3. Use Edit tool to check off completed tasks"
  echo " 4. Fill in Dev Agent Record"
  echo " 5. Verify with grep"
  echo " 6. Re-run this verification"
  echo ""
  exit 1
fi
echo "✅ Story file has $CHECKED_COUNT checked tasks"
# 3. Check Dev Agent Record filled
echo "Checking Dev Agent Record..."
# Look for "Agent Model" within 20 lines after the record heading.
RECORD_FILLED=$(grep -A 20 "^### Dev Agent Record" "{{story_file}}" | grep -c "Agent Model")
if [ "$RECORD_FILLED" -eq 0 ]; then
  echo "❌ BLOCKER: Dev Agent Record NOT filled"
  echo "The Fixer agent did not document what was built."
  exit 1
fi
echo "✅ Dev Agent Record filled"
# 4. Check sprint-status updated
echo "Checking sprint-status..."
if ! git diff HEAD~1 -- "{{sprint_status}}" | grep "{{story_key}}"; then
  echo "❌ FAILED: Sprint status not updated for {{story_key}}"
  exit 1
fi
echo "✅ Sprint status updated"
# 5. Check test evidence (optional - may have test failures)
echo "Checking test evidence..."
if [ -f "inspector_output.txt" ]; then
  grep -E "PASS|tests.*passing" inspector_output.txt && echo "✅ Tests passing"
fi
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "✅ STORY COMPLETE - All verifications passed"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
IF VERIFICATION FAILS:
- DO NOT mark story as "done"
- DO NOT proceed to next story
- FIX the failure immediately
- Re-run verification until it passes
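The story-file checks (steps 2 and 3 of the script) can also be expressed as pure functions, which makes the verification logic easy to unit test without a git repository. This is a sketch under assumptions: `countCheckedTasks` and `devAgentRecordFilled` are hypothetical names, and the sample story text is invented. The patterns mirror the `grep` calls in the script.

```javascript
// Hypothetical sketch mirroring: grep -c '^- \[x\]' story_file
function countCheckedTasks(storyText) {
  return storyText.split("\n").filter((line) => /^- \[x\]/.test(line)).length;
}

// Hypothetical sketch mirroring:
//   grep -A 20 "^### Dev Agent Record" story_file | grep -c "Agent Model"
// Only the first matching heading is considered here.
function devAgentRecordFilled(storyText) {
  const lines = storyText.split("\n");
  const start = lines.findIndex((line) => line.startsWith("### Dev Agent Record"));
  if (start === -1) return false;
  // The heading line plus the 20 lines after it.
  return lines.slice(start, start + 21).some((line) => line.includes("Agent Model"));
}

// Sample story text, invented for the example.
const story = [
  "## Tasks",
  "- [x] Add login endpoint",
  "- [ ] Add rate limiting",
  "### Dev Agent Record",
  "Agent Model: (example value)",
].join("\n");

console.log(countCheckedTasks(story), devAgentRecordFilled(story));
```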
Benefits Over Single-Agent
Separation of Concerns
- Builder doesn't validate own work
- Inspector has no incentive to lie
- Reviewer approaches with fresh eyes
- Fixer can't skip issues
Fresh Context Each Phase
- Each agent starts at 0% context usage
- No accumulated fatigue
- No degraded quality
- Honest reporting
Adversarial Review
- Reviewer WANTS to find issues
- Not defensive about the code
- More thorough than self-review
Honest Verification
- Inspector runs tests independently
- Main orchestrator verifies everything
- Can't fake completion
Complexity Routing
MICRO stories:
- Skip Reviewer (low risk)
- 3 agents: Builder → Inspector → Fixer
STANDARD stories:
- Full pipeline
- 4 agents: Builder → Inspector → Reviewer → Fixer
COMPLEX stories:
- Enhanced review (6 reviewers instead of 4)
- Full pipeline + extra scrutiny
- 4 agents: Builder → Inspector → Reviewer (enhanced) → Fixer
In every route, the mandatory Reconciler (Phase 5) runs after the Fixer.
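The routing rules can be sketched as a lookup from complexity level to phase list. `routeFor` and the returned shape are hypothetical. The Reconciler is included in every route because Phase 5 is marked mandatory, and the reviewer counts (4 standard, 6 enhanced) come from the COMPLEX entry above.

```javascript
// Hypothetical sketch: choose phases and reviewer count by story complexity.
function routeFor(complexity) {
  switch (complexity) {
    case "MICRO":
      // Low risk: the Reviewer is skipped entirely.
      return { phases: ["builder", "inspector", "fixer", "reconciler"], reviewers: 0 };
    case "STANDARD":
      return { phases: ["builder", "inspector", "reviewer", "fixer", "reconciler"], reviewers: 4 };
    case "COMPLEX":
      // Same phases as STANDARD, with enhanced review.
      return { phases: ["builder", "inspector", "reviewer", "fixer", "reconciler"], reviewers: 6 };
    default:
      // Refuse to guess a route for an unknown level.
      throw new Error(`Unknown complexity level: ${complexity}`);
  }
}

console.log(routeFor("MICRO"));
console.log(routeFor("COMPLEX"));
```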
Agent Tracking
Track all agents in agent-history.json:
{
"version": "1.0",
"max_entries": 50,
"entries": [
{
"agent_id": "abc123",
"story_key": "17-10",
"phase": "builder",
"steps": [1,2,3,4],
"timestamp": "2026-01-25T21:00:00Z",
"status": "completed",
"completion_timestamp": "2026-01-25T21:15:00Z"
},
{
"agent_id": "def456",
"story_key": "17-10",
"phase": "inspector",
"steps": [5,6],
"timestamp": "2026-01-25T21:16:00Z",
"status": "completed",
"completion_timestamp": "2026-01-25T21:20:00Z"
}
]
}
Benefits:
- Resume interrupted sessions
- Track agent performance
- Debug failed pipelines
- Audit trail
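Appending to the history while respecting `max_entries` might look like the sketch below. `recordAgent` is a hypothetical helper, and dropping the oldest entries first is an assumption: the document does not say how the limit is enforced. Reading and writing agent-history.json is left out to keep the sketch self-contained.

```javascript
// Hypothetical sketch: append an entry and keep only the newest `max_entries`.
// Returns a new history object; the input is not mutated.
function recordAgent(history, entry) {
  const entries = history.entries.concat([entry]);
  const overflow = Math.max(0, entries.length - history.max_entries);
  return { ...history, entries: entries.slice(overflow) };
}

// Example using the fields shown in the JSON above.
let history = { version: "1.0", max_entries: 50, entries: [] };
history = recordAgent(history, {
  agent_id: "abc123",
  story_key: "17-10",
  phase: "builder",
  steps: [1, 2, 3, 4],
  timestamp: "2026-01-25T21:00:00Z",
  status: "completed",
  completion_timestamp: "2026-01-25T21:15:00Z",
});
console.log(JSON.stringify(history, null, 2));
```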
Error Handling
If Builder fails:
- Don't spawn Inspector
- Report failure to user
- Option to resume or retry
If Inspector fails:
- Don't spawn Reviewer
- Report specific failures
- Resume Builder to fix issues
If Reviewer finds CRITICAL issues:
- Must spawn Fixer (not optional)
- Cannot mark story complete until fixed
If Fixer fails:
- Report unfixed issues
- Cannot mark story complete
- Manual intervention required
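The error-handling rules above amount to a small decision table, sketched below. `nextAction`, the `outcome` shape, and the action names are hypothetical labels chosen for this example.

```javascript
// Hypothetical sketch: map a phase outcome to the orchestrator's next action.
// `outcome` is assumed to look like { failed: boolean, criticalIssues?: number }.
function nextAction(phase, outcome) {
  if (phase === "builder" && outcome.failed) {
    return "report-failure-offer-retry"; // do not spawn Inspector
  }
  if (phase === "inspector" && outcome.failed) {
    return "resume-builder"; // do not spawn Reviewer
  }
  if (phase === "reviewer" && outcome.criticalIssues > 0) {
    return "spawn-fixer-required"; // Fixer is not optional here
  }
  if (phase === "fixer" && outcome.failed) {
    return "manual-intervention"; // story cannot be marked complete
  }
  return "continue";
}

console.log(nextAction("inspector", { failed: true }));
console.log(nextAction("reviewer", { failed: false, criticalIssues: 2 }));
```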
Comparison: v1.x vs v2.0
| Aspect | v1.x (Single-Agent) | v2.0 (Multi-Agent) |
|---|---|---|
| Agents | 1 | 4 |
| Validation | Self (conflict of interest) | Independent (no conflict) |
| Code Review | Self-review | Adversarial (fresh eyes) |
| Honesty | Low (can lie) | High (verified) |
| Context | Degrades over 11 steps | Fresh each phase |
| Catches Issues | Low | High |
| Completion Accuracy | ~60% (agents lie) | ~95% (verified) |
Migration from v1.x
Backward Compatibility:
execution_mode: "single_agent" # Use v1.x
execution_mode: "multi_agent" # Use v2.0 (new)
Gradual Rollout:
- Week 1: Test v2.0 on 3-5 stories
- Week 2: Make v2.0 default for new stories
- Week 3: Migrate existing stories to v2.0
- Week 4: Deprecate v1.x
Hospital-Grade Standards
⚕️ Lives May Be at Stake
- Independent validation catches errors
- Adversarial review finds security flaws
- Multiple checkpoints prevent shortcuts
- Final verification prevents false completion
QUALITY >> SPEED
Key Takeaway: Don't trust a single agent to build, validate, review, and commit its own work. Use independent agents with fresh context at each phase.