BMAD-METHOD

Commit Graph


Author

SHA1

Message

Date

Jonah Schulte

a268b4c1bc

feat: upgrade story-full-pipeline to v4.0 with 6 major enhancements

Upgrade from v3.2.0 to v4.0.0 with improvements inspired by CooperBench research
(Stanford/SAP 2026) on agent coordination failures.

Enhancement 1: Resume Builder (v3.2+)
- Phase 3 RESUMES Builder agent with review findings
- Builder already has full codebase context (50-70% token savings)
- More efficient than spawning fresh Fixer agent

Enhancement 2: Inspector Code Citations (v4.0)
- Inspector must map EVERY task to file:line citations
- Example: "Create component" → "src/Component.tsx:45-67"
- No more "trust me, it works" - requires proof
- Returns structured JSON with code evidence per task
- Prevents vague communication (CooperBench finding)
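
A minimal sketch of how `file:line` citations such as "src/Component.tsx:45-67" might be checked. The `Citation` type and the `parse_citation` and `uncited_tasks` helpers are hypothetical names used for illustration; the commit does not show the Inspector's actual JSON schema, so the task-to-citations mapping here is an assumption.

```python
import re
from dataclasses import dataclass

# Assumed citation shape: "path:start" or "path:start-end" (not confirmed by the commit).
_CITATION_RE = re.compile(r"^(?P<path>[^:]+):(?P<start>\d+)(?:-(?P<end>\d+))?$")


@dataclass(frozen=True)
class Citation:
    path: str
    start: int
    end: int


def parse_citation(text: str) -> Citation:
    """Parse 'path:start' or 'path:start-end' into a Citation, or raise ValueError."""
    match = _CITATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a file:line citation: {text!r}")
    start = int(match.group("start"))
    end = int(match.group("end") or start)
    if start < 1 or end < start:
        raise ValueError(f"invalid line range in citation: {text!r}")
    return Citation(match.group("path"), start, end)


def uncited_tasks(evidence: dict[str, list[str]]) -> list[str]:
    """Return the tasks that have no citations or at least one malformed citation."""
    missing = []
    for task, citations in evidence.items():
        try:
            parsed = [parse_citation(c) for c in citations]
        except ValueError:
            parsed = []
        if not parsed:
            missing.append(task)
    return missing
```

With this sketch, a task mapped to an empty list or to a string like "it works" would be reported as uncited, which mirrors the "requires proof" rule above. Paths that themselves contain a colon are not handled.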

Enhancement 3: Remove Hospital-Grade Framing (v4.0)
- Dropped psychological appeal language
- Kept rigorous verification gates and bash checks
- Focus on concrete, measurable verification
- Replaced with patterns/verification.md + patterns/tdd.md

Enhancement 4: Micro Stories Get Security Scan (v4.0)
- No longer skip ALL review for micro stories
- Micro now gets 2 reviewers: Security + Architect
- Lightweight but still catches critical vulnerabilities

Enhancement 5: Test Quality Agent + Coverage Gate (v4.0)
- New Test Quality Agent validates:
  - Edge cases covered (null, empty, invalid)
  - Error conditions tested
  - Meaningful assertions (not just "doesn't crash")
  - No flaky tests (random data, timing)
- Automated Coverage Gate enforces 80% threshold
- Builder must fix test gaps before proceeding
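
The 80% threshold can be illustrated with a small sketch. The commit does not say which coverage metric is used or whether exactly 80% passes; this example assumes line coverage and an inclusive threshold, and `coverage_gate` is a hypothetical helper name.

```python
COVERAGE_THRESHOLD = 80.0  # percent, taken from the commit message


def coverage_gate(covered_lines: int, total_lines: int,
                  threshold: float = COVERAGE_THRESHOLD) -> tuple[bool, float]:
    """Return (passed, percent). Assumes line coverage and an inclusive threshold."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= covered_lines <= total_lines:
        raise ValueError("covered_lines must be between 0 and total_lines")
    percent = 100.0 * covered_lines / total_lines
    return percent >= threshold, percent
```

A failing result would be the signal for the Builder to fix test gaps before the pipeline proceeds.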

Enhancement 6: Playbook Learning System (v4.0)
- Phase 0: Query playbooks before implementation
- Builder gets relevant patterns/gotchas upfront
- Phase 6: Reflection agent extracts learnings
- Auto-generates playbook updates for future agents
- Bootstrap mode: auto-initializes playbooks if missing
- Continuous improvement through reflection

Pipeline: Phase 0 (Playbooks) → Phase 1 (Builder) → Phase 2 (Inspector +
Test Quality + Reviewers parallel) → Phase 2.5 (Coverage Gate) → Phase 3
(Resume Builder) → Phase 4 (Inspector recheck) → Phase 5 (Reconciliation) →
Phase 6 (Reflection)
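
The phase order above could be represented as plain data, roughly as follows. This is only a sketch: the `PHASES` list and `next_phase` helper are hypothetical, and the real ordering lives in workflow.yaml, whose format is not shown in this commit.

```python
from typing import Optional

# Phase ids and labels copied from the pipeline line in the commit message.
PHASES = [
    ("0", "Playbooks"),
    ("1", "Builder"),
    ("2", "Inspector + Test Quality + Reviewers (parallel)"),
    ("2.5", "Coverage Gate"),
    ("3", "Resume Builder"),
    ("4", "Inspector recheck"),
    ("5", "Reconciliation"),
    ("6", "Reflection"),
]


def next_phase(current: str) -> Optional[str]:
    """Return the id of the phase after `current`, or None after the last phase."""
    ids = [phase_id for phase_id, _ in PHASES]
    index = ids.index(current)  # raises ValueError for an unknown phase id
    return ids[index + 1] if index + 1 < len(ids) else None
```

Keeping the ids as strings avoids treating "2.5" as a number that needs float comparison.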

Files Modified:
- workflow.yaml: v4.0 config with playbooks + quality_gates
- workflow.md: Complete v4.0 documentation with all phases
- agents/builder.md: Playbook awareness + structured JSON
- agents/inspector.md: Code citation requirements + evidence format
- agents/reviewer.md: Remove hospital-grade reference
- agents/architect-integration-reviewer.md: Remove hospital-grade reference
- agents/fixer.md: Remove hospital-grade reference
- README.md: v4.0 documentation + CooperBench analysis

Files Created:
- agents/test-quality.md: Test quality validation agent
- agents/reflection.md: Playbook learning agent
- ../templates/implementation-playbook-template.md: Simple playbook structure

Design Philosophy:
The workflow avoids CooperBench's "curse of coordination" by using:
- Sequential implementation (ONE writer, no merge conflicts)
- Parallel verification (safe read-only validation)
- Context reuse (no expectation failures)
- Evidence-based communication (file:line citations)
- Clear role separation (no overlapping responsibilities)

2026-01-28 13:28:37 -05:00

Jonah Schulte

9fbaca3384

feat(pipeline): add architect/integration reviewer for runtime verification

- Adds third reviewer to catch routing, pattern, and integration issues
- Verifies routes actually load (not just compile)
- Checks migrations applied, dependencies installed
- Compares new code against existing project patterns
- Framework-agnostic approach works on any project

Complexity routing updated:
- micro: 2 reviewers (security, architect)
- standard: 3 reviewers (security, logic, architect)
- complex: 4 reviewers (security, logic, architect, quality)
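
The routing table above amounts to a lookup from complexity level to reviewer set. A minimal sketch, assuming the levels and reviewer names exactly as listed; `REVIEWERS_BY_COMPLEXITY` and `reviewers_for` are hypothetical names, not identifiers from the repository.

```python
# Reviewer sets per complexity level, as listed in the commit message.
REVIEWERS_BY_COMPLEXITY = {
    "micro": ("security", "architect"),
    "standard": ("security", "logic", "architect"),
    "complex": ("security", "logic", "architect", "quality"),
}


def reviewers_for(complexity: str) -> tuple[str, ...]:
    """Return the reviewers for a complexity level, rejecting unknown levels."""
    try:
        return REVIEWERS_BY_COMPLEXITY[complexity]
    except KeyError:
        raise ValueError(f"unknown complexity level: {complexity!r}") from None
```

Failing loudly on an unknown level is a design choice of this sketch, so that a typo cannot silently skip review.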

Version: 3.1.0 → 3.2.0

2026-01-28 09:36:05 -05:00

2 Commits