Story Pipeline v2.0
Single-session step-file architecture for implementing user stories with roughly 50-65% token savings over the legacy pipeline, depending on mode.
Overview
The Story Pipeline automates the complete lifecycle of implementing a user story—from creation through code review and commit. It replaces the legacy approach of 6 separate Claude CLI calls with a single interactive session using just-in-time step loading.
The Problem It Solves
Legacy Pipeline (v1.0):
bmad build 1-4
└─> claude -p "Stage 1: Create story..." # ~12K tokens
└─> claude -p "Stage 2: Validate story..." # ~12K tokens
└─> claude -p "Stage 3: ATDD tests..." # ~12K tokens
└─> claude -p "Stage 4: Implement..." # ~12K tokens
└─> claude -p "Stage 5: Code review..." # ~12K tokens
└─> claude -p "Stage 6: Complete..." # ~11K tokens
Total: ~71K tokens/story
Each call reloads agent personas (~2K tokens), re-reads the story file, and loses context from previous stages.
Story Pipeline v2.0:
bmad build 1-4
└─> Single Claude session
├─> Load step-01-init.md (~200 lines)
├─> Role switch: SM
├─> Load step-02-create-story.md
├─> Load step-03-validate-story.md
├─> Role switch: TEA
├─> Load step-04-atdd.md
├─> Role switch: DEV
├─> Load step-05-implement.md
├─> Load step-06-code-review.md
├─> Role switch: SM
├─> Load step-07-complete.md
└─> Load step-08-summary.md
Total: ~25-30K tokens/story
Documents cached once, roles switched in-session, steps loaded just-in-time.
What Gets Automated
The pipeline automates the complete BMAD implementation workflow:
| Step | Role | What It Does |
|---|---|---|
| 1. Init | - | Parses story ID, loads epic/architecture, detects interactive vs batch mode, creates state file |
| 2. Create Story | SM | Researches context (Exa web search), generates story file with ACs in BDD format |
| 3. Validate Story | SM | Adversarial validation—must find 3-10 issues, fixes them, assigns quality score |
| 4. ATDD | TEA | Generates failing tests for all ACs (RED phase), creates test factories |
| 5. Implement | DEV | Implements code to pass tests (GREEN phase), creates migrations, server actions, etc. |
| 6. Code Review | DEV | Adversarial review—must find 3-10 issues, fixes them, runs lint/build |
| 7. Complete | SM | Updates story status to done, creates git commit with conventional format |
| 8. Summary | - | Generates audit trail, updates pipeline state, outputs metrics |
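The Role column determines where the session has to switch persona. As a minimal sketch, assuming a hypothetical `role_switches` helper and a step list transcribed by hand from the table (neither is part of the pipeline's own code), the switch points can be derived like this:

```python
from typing import Optional

# (step number, step name, role) transcribed from the table above.
# None marks steps that run without a dedicated agent role.
STEPS: list[tuple[int, str, Optional[str]]] = [
    (1, "init", None),
    (2, "create-story", "SM"),
    (3, "validate-story", "SM"),
    (4, "atdd", "TEA"),
    (5, "implement", "DEV"),
    (6, "code-review", "DEV"),
    (7, "complete", "SM"),
    (8, "summary", None),
]


def role_switches(steps):
    """Return (step number, role) for every step that needs a new role."""
    switches = []
    current = None
    for number, _name, role in steps:
        # Only switch when the step names a role that differs from the active one.
        if role is not None and role != current:
            switches.append((number, role))
            current = role
    return switches


print(role_switches(STEPS))
# [(2, 'SM'), (4, 'TEA'), (5, 'DEV'), (7, 'SM')]
```

The four switches match the "Role switch" entries in the v2.0 session outline: SM before step 2, TEA before step 4, DEV before step 5, and SM again before step 7.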
Quality Gates
Steps 2 through 7 each end with a quality gate that must pass before the pipeline proceeds. The scored and test-based gates are:
- Validation: Score ≥ 80/100, all issues addressed
- ATDD: Tests exist for all ACs, tests fail (RED phase confirmed)
- Implementation: Lint clean, build passes, migration tests pass
- Code Review: Score ≥ 7/10, all critical issues fixed
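The two scored gates can be read as simple predicates over a step's results. The sketch below is illustrative only: `StepResult`, `validation_gate`, and `code_review_gate` are hypothetical names, and only the thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical container for the numbers a gate inspects."""
    score: float = 0.0
    issues_found: int = 0
    issues_fixed: int = 0
    critical_open: int = 0


VALIDATION_MIN_SCORE = 80   # out of 100
CODE_REVIEW_MIN_SCORE = 7   # out of 10


def validation_gate(result: StepResult) -> bool:
    # Score >= 80/100 and every issue that was found has been addressed.
    return (result.score >= VALIDATION_MIN_SCORE
            and result.issues_fixed >= result.issues_found)


def code_review_gate(result: StepResult) -> bool:
    # Score >= 7/10 and no critical issue left open.
    return result.score >= CODE_REVIEW_MIN_SCORE and result.critical_open == 0


print(validation_gate(StepResult(score=92, issues_found=6, issues_fixed=6)))  # True
print(code_review_gate(StepResult(score=6.5)))                                # False
```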
Token Efficiency
| Mode | Token Usage | Savings vs Legacy |
|---|---|---|
| Interactive (human-in-loop) | ~25K | 65% |
| Batch (YOLO) | ~30K | 58% |
| Batch + fresh review context | ~35K | 51% |
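The savings column follows from the ~71K legacy total. A short worked check, using a hypothetical `savings_percent` helper and the approximate token figures from the table:

```python
LEGACY_TOKENS = 71_000  # approximate legacy total per story (v1.0)


def savings_percent(tokens_used: int, baseline: int = LEGACY_TOKENS) -> int:
    """Percentage saved relative to the baseline, rounded to the nearest integer."""
    return round(100 * (1 - tokens_used / baseline))


for mode, used in [
    ("interactive", 25_000),
    ("batch", 30_000),
    ("batch + fresh review context", 35_000),
]:
    print(f"{mode}: {savings_percent(used)}%")
# interactive: 65%
# batch: 58%
# batch + fresh review context: 51%
```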
Where Savings Come From
| Waste in Legacy | Tokens Saved |
|---|---|
| Agent persona reload (6×) | ~12K |
| Story file re-reads (5×) | ~10K |
| Architecture re-reads | ~8K |
| Context loss between calls | ~16K |
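As a consistency check, the four rows add up to the gap between the legacy total and the interactive-mode figure. The dictionary keys below are illustrative labels, not identifiers from the pipeline:

```python
LEGACY_TOKENS = 71_000

# Approximate savings per source of waste, taken from the table above.
SAVINGS = {
    "persona_reload": 12_000,        # 6 reloads at ~2K tokens each
    "story_rereads": 10_000,
    "architecture_rereads": 8_000,
    "context_loss": 16_000,
}

total_saved = sum(SAVINGS.values())
remaining = LEGACY_TOKENS - total_saved

print(total_saved)  # 46000
print(remaining)    # 25000, the interactive-mode estimate
```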
Usage
Prerequisites
- BMAD module installed (_bmad/ directory exists)
- Epic file with story definition (docs/epics.md)
- Architecture document (docs/architecture.md)
Interactive Mode (Recommended)
Human-in-the-loop with approval at each step:
# Using the bmad CLI
bmad build 1-4
# Or invoke workflow directly
claude -p "Load and execute: _bmad/bmm/workflows/4-implementation/story-pipeline/workflow.md
Story: 1-4"
At each step, you'll see a menu:
## MENU
[C] Continue to next step
[R] Review/revise current step
[H] Halt and checkpoint
Batch Mode (YOLO)
Unattended execution for trusted stories:
bmad build 1-4 --batch
# Or use batch runner directly
./_bmad/bmm/workflows/4-implementation/story-pipeline/batch-runner.sh 1-4
Batch mode:
- Skips all approval prompts
- Fails fast on errors
- Creates checkpoint on failure for resume
Resume from Checkpoint
If execution stops (context exhaustion, error, manual halt):
bmad build 1-4 --resume
# The pipeline reads state from:
# _bmad-output/implementation-artifacts/pipeline-state-{story-id}.yaml
Resume automatically:
- Skips completed steps
- Restores cached context
- Continues from lastStep + 1
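The resume rule can be sketched as a small function over the parsed state file. This is an assumption-laden illustration: `next_step` is a hypothetical name, the state is assumed to be already loaded into a dict (YAML parsing is omitted), and only the `lastStep` and `stepsCompleted` keys come from the state file format.

```python
from typing import Optional

TOTAL_STEPS = 8  # steps 1 through 8 of the pipeline


def next_step(state: dict, total_steps: int = TOTAL_STEPS) -> Optional[int]:
    """Return the step number to resume from, or None if nothing is left."""
    candidate = state.get("lastStep", 0) + 1
    completed = set(state.get("stepsCompleted", []))
    # Skip any step already recorded as completed.
    while candidate in completed:
        candidate += 1
    return candidate if candidate <= total_steps else None


state = {"stepsCompleted": [1, 2, 3], "lastStep": 3}
print(next_step(state))  # 4
```

With an empty state the function returns 1, which corresponds to a fresh run rather than a resume.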
Directory Structure
story-pipeline/
├── workflow.yaml # Configuration, agent mapping, quality gates
├── workflow.md # Interactive mode orchestration
├── batch-runner.sh # Batch mode runner script
├── steps/
│ ├── step-01-init.md # Initialize, load context
│ ├── step-01b-resume.md # Resume from checkpoint
│ ├── step-02-create-story.md
│ ├── step-03-validate-story.md
│ ├── step-04-atdd.md
│ ├── step-05-implement.md
│ ├── step-06-code-review.md
│ ├── step-07-complete.md
│ └── step-08-summary.md
├── checklists/
│ ├── story-creation.md # What makes a good story
│ ├── story-validation.md # Validation criteria
│ ├── atdd.md # Test generation rules
│ ├── implementation.md # Coding standards
│ └── code-review.md # Review criteria
└── templates/
├── pipeline-state.yaml # State file template
└── audit-trail.yaml # Audit log template
Configuration
workflow.yaml
name: story-pipeline
version: "2.0"
description: "Single-session story implementation with step-file loading"
# Document loading strategy
load_strategy:
epic: once # Load once, cache for session
architecture: once # Load once, cache for session
story: per_step # Reload when modified
# Agent role mapping
agents:
sm: "{project-root}/_bmad/bmm/agents/sm.md"
tea: "{project-root}/_bmad/bmm/agents/tea.md"
dev: "{project-root}/_bmad/bmm/agents/dev.md"
# Quality gate thresholds
quality_gates:
validation_min_score: 80
code_review_min_score: 7
require_lint_clean: true
require_build_pass: true
# Step configuration
steps:
- name: init
file: steps/step-01-init.md
- name: create-story
file: steps/step-02-create-story.md
agent: sm
# ... etc
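The load_strategy values describe a caching policy. A minimal sketch of that policy, assuming a hypothetical `DocumentCache` class and an injected `read` function; for simplicity it re-reads `per_step` documents on every call rather than detecting modification:

```python
from typing import Callable, Dict


class DocumentCache:
    """Illustrative cache: 'once' documents are read a single time per session."""

    def __init__(self, strategy: Dict[str, str], read: Callable[[str], str]):
        self._strategy = strategy
        self._read = read
        self._cache: Dict[str, str] = {}

    def load(self, name: str, path: str) -> str:
        cache_it = self._strategy.get(name) == "once"
        if cache_it and name in self._cache:
            return self._cache[name]   # served from the session cache
        text = self._read(path)        # 'per_step' documents always land here
        if cache_it:
            self._cache[name] = text
        return text


def read_file(path: str) -> str:
    with open(path, encoding="utf-8") as handle:
        return handle.read()


cache = DocumentCache(
    {"epic": "once", "architecture": "once", "story": "per_step"},
    read_file,
)
```

Calling `cache.load("epic", "docs/epics.md")` twice reads the file once, while repeated `cache.load("story", ...)` calls read it each time.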
Pipeline State File
Created at _bmad-output/implementation-artifacts/pipeline-state-{story-id}.yaml:
story_id: "1-4"
epic_num: 1
story_num: 4
mode: "interactive"
status: "in_progress"
stepsCompleted: [1, 2, 3]
lastStep: 3
currentStep: 4
cached_context:
epic_loaded: true
epic_path: "docs/epics.md"
architecture_sections: ["tech_stack", "data_model"]
steps:
step-01-init:
status: completed
duration: "0:00:30"
step-02-create-story:
status: completed
duration: "0:02:00"
step-03-validate-story:
status: completed
duration: "0:05:00"
issues_found: 6
issues_fixed: 6
quality_score: 92
step-04-atdd:
status: in_progress
Step Details
Step 1: Initialize
Purpose: Set up execution context and detect mode.
Actions:
- Parse story ID (e.g., "1-4" → epic 1, story 4)
- Load and cache epic document
- Load relevant architecture sections
- Check for existing state file (resume vs fresh)
- Detect mode (interactive/batch) from CLI flags
- Create initial state file
Output: pipeline-state-{story-id}.yaml
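The ID parsing and state file naming can be sketched as follows. The function names are hypothetical, and the sketch assumes story IDs always have the form `<epic>-<story>` with integer parts, which is the only form shown here:

```python
import re
from pathlib import PurePosixPath

STATE_DIR = PurePosixPath("_bmad-output/implementation-artifacts")


def parse_story_id(story_id: str) -> tuple[int, int]:
    """Split an ID such as '1-4' into (epic number, story number)."""
    match = re.fullmatch(r"(\d+)-(\d+)", story_id)
    if match is None:
        raise ValueError(f"expected '<epic>-<story>', got {story_id!r}")
    return int(match.group(1)), int(match.group(2))


def state_file_path(story_id: str) -> PurePosixPath:
    """Build the state file location for a story."""
    return STATE_DIR / f"pipeline-state-{story_id}.yaml"


print(parse_story_id("1-4"))   # (1, 4)
print(state_file_path("1-4"))  # _bmad-output/implementation-artifacts/pipeline-state-1-4.yaml
```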
Step 2: Create Story (SM Role)
Purpose: Generate complete story file from epic definition.
Actions:
- Switch to Scrum Master (SM) role
- Read story definition from epic
- Research context via Exa web search (best practices, patterns)
- Generate story file with:
- User story format (As a... I want... So that...)
- Background context
- Acceptance criteria in BDD format (Given/When/Then)
- Test scenarios for each AC
- Technical notes
- Save to _bmad-output/implementation-artifacts/story-{id}.md
Quality Gate: Story file exists with all required sections.
Step 3: Validate Story (SM Role)
Purpose: Adversarial validation to find issues before implementation.
Actions:
- Load story-validation checklist
- Review story against criteria:
- ACs are testable and specific
- No ambiguous requirements
- Technical feasibility confirmed
- Dependencies identified
- Edge cases covered
- Must find 3-10 issues (never "looks good")
- Fix all identified issues
- Assign quality score (0-100)
- Append validation report to story file
Quality Gate: Score ≥ 80, all issues addressed.
Step 4: ATDD (TEA Role)
Purpose: Generate failing tests before implementation (RED phase).
Actions:
- Switch to Test Engineering Architect (TEA) role
- Load atdd checklist
- For each acceptance criterion:
- Generate integration test
- Define test data factories
- Specify expected behaviors
- Create test files in src/tests/
- Update factories.ts with new fixtures
- Verify tests FAIL (RED phase)
- Create ATDD checklist document
Quality Gate: Tests exist for all ACs, tests fail (not pass).
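The RED-phase gate amounts to two checks per acceptance criterion: at least one test exists, and none of its tests pass yet. A sketch under stated assumptions: `red_phase_problems` is a hypothetical helper, and test outcomes are assumed to be already collected as booleans per AC (how they are collected from the test runner is not covered here).

```python
def red_phase_problems(acceptance_criteria, test_results):
    """Return a list of gate violations; an empty list means RED is confirmed.

    test_results maps an AC id to a list of booleans (True = test passed).
    """
    problems = []
    for ac in acceptance_criteria:
        outcomes = test_results.get(ac, [])
        if not outcomes:
            problems.append(f"{ac}: no tests")
        elif any(outcomes):
            # A passing test before implementation is suspicious.
            problems.append(f"{ac}: test passed before implementation")
    return problems


results = {"AC1": [False, False], "AC2": [True], "AC3": []}
print(red_phase_problems(["AC1", "AC2", "AC3"], results))
# ['AC2: test passed before implementation', 'AC3: no tests']
```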
Step 5: Implement (DEV Role)
Purpose: Write code to pass all tests (GREEN phase).
Actions:
- Switch to Developer (DEV) role
- Load implementation checklist
- Create required files:
- Database migrations
- Server actions (using Result type)
- Library functions
- Types
- Follow project patterns:
- Multi-tenant RLS policies
- snake_case for DB columns
- Result type (never throw)
- Run lint and fix issues
- Run build and fix issues
- Run migration tests
Quality Gate: Lint clean, build passes, migration tests pass.
Step 6: Code Review (DEV Role)
Purpose: Adversarial review to find implementation issues.
Actions:
- Load code-review checklist
- Review all created/modified files:
- Security (XSS, injection, auth)
- Error handling
- Architecture compliance
- Code quality
- Test coverage
- Must find 3-10 issues (never "looks good")
- Fix all identified issues
- Re-run lint and build
- Assign quality score (0-10)
- Generate review report
Quality Gate: Score ≥ 7/10, all critical issues fixed.
Step 7: Complete (SM Role)
Purpose: Finalize story and create git commit.
Actions:
- Switch back to SM role
- Update story file status to "done"
- Stage all story files
- Create conventional commit:
  feat(epic-{n}): complete story {id}

  {Summary of changes}

  🤖 Generated with Claude Code

  Co-Authored-By: Claude <noreply@anthropic.com>
- Update pipeline state
Quality Gate: Commit created successfully.
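Filling in the commit template could look like the sketch below. `commit_message` is a hypothetical helper, and the blank-line layout (header, body, footer) is an assumption based on the usual conventional-commit shape rather than something the template spells out:

```python
def commit_message(epic: int, story_id: str, summary: str) -> str:
    """Render the step 7 commit message for a completed story."""
    lines = [
        f"feat(epic-{epic}): complete story {story_id}",
        "",
        summary,
        "",
        "🤖 Generated with Claude Code",
        "",
        "Co-Authored-By: Claude <noreply@anthropic.com>",
    ]
    return "\n".join(lines)


print(commit_message(1, "1-4", "Add the story's migrations, server actions, and tests"))
```

Keeping the Co-Authored-By line in the final paragraph lets git treat it as a trailer.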
Step 8: Summary
Purpose: Generate audit trail and final metrics.
Actions:
- Calculate total duration
- Compile deliverables list
- Aggregate quality scores
- Generate execution summary in state file
- Output final status
Output: Complete pipeline state with summary section.
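Total duration can be computed from the per-step `duration` fields in the state file. A minimal sketch, assuming durations always use the `H:MM:SS` form shown in the state file example and that steps without a `duration` key (still in progress) are skipped; the function names are hypothetical:

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an 'H:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_duration(steps: dict) -> str:
    """Sum the durations of all steps that recorded one."""
    total = sum(
        (parse_duration(s["duration"]) for s in steps.values() if "duration" in s),
        timedelta(),
    )
    return str(total)


steps = {
    "step-01-init": {"status": "completed", "duration": "0:00:30"},
    "step-02-create-story": {"status": "completed", "duration": "0:02:00"},
    "step-03-validate-story": {"status": "completed", "duration": "0:05:00"},
    "step-04-atdd": {"status": "in_progress"},
}
print(total_duration(steps))  # 0:07:30
```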
Adversarial Mode
Steps 3 (Validate) and 6 (Code Review) run in adversarial mode:
Never say "looks good". You MUST find 3-10 real issues.
This ensures:
- Stories are thoroughly vetted before implementation
- Code quality issues are caught before commit
- The pipeline doesn't rubber-stamp work
Example issues found in real usage:
- Missing rate limiting (security)
- XSS vulnerability in user input (security)
- Missing audit logging (architecture)
- Unclear acceptance criteria (story quality)
- Function naming mismatches (code quality)
Artifacts Generated
After a complete pipeline run:
_bmad-output/implementation-artifacts/
├── story-{id}.md # Story file with ACs, validation report
├── pipeline-state-{id}.yaml # Execution state and summary
├── atdd-checklist-{id}.md # Test requirements checklist
└── code-review-{id}.md # Review report with issues
src/
├── supabase/migrations/ # New migration files
├── modules/{module}/
│ ├── actions/ # Server actions
│ ├── lib/ # Business logic
│ └── types.ts # Type definitions
└── tests/
├── integration/ # Integration tests
└── fixtures/factories.ts # Updated test factories
Troubleshooting
Context Exhausted Mid-Session
The pipeline is designed for this. When context runs out:
- Claude session ends
- State file preserves progress
- Run bmad build {id} --resume
- Pipeline continues from the step after the last completed one
Step Fails Quality Gate
If a step fails its quality gate:
- Pipeline halts at that step
- State file shows status: failed
- Fix issues manually or adjust thresholds
- Run bmad build {id} --resume
Tests Don't Fail in ATDD
If tests pass during ATDD (step 4), something is wrong:
- Tests might be testing the wrong thing
- Implementation might already exist
- Mocks might be returning success incorrectly
The pipeline will warn and ask for confirmation before proceeding.
Best Practices
- Start with Interactive Mode - Use batch only for well-understood stories
- Review at Checkpoints - Don't blindly continue; verify each step's output
- Keep Stories Small - Large stories may exhaust context before completion
- Commit Frequently - The pipeline commits at step 7, but you can checkpoint earlier
- Trust the Adversarial Mode - If it finds issues, they're usually real
Comparison with Legacy
| Feature | Legacy (v1.0) | Story Pipeline (v2.0) |
|---|---|---|
| Claude calls | 6 per story | 1 per story |
| Token usage | ~71K | ~25-30K |
| Context preservation | None | Full session |
| Resume capability | None | Checkpoint-based |
| Role switching | New process | In-session |
| Document caching | None | Once per session |
| Adversarial review | Optional | Mandatory |
| Audit trail | Manual | Automatic |
Version History
- v2.0 (2024-12) - Step-file architecture, single-session, checkpoint/resume
- v1.0 (2024-11) - Legacy 6-call pipeline