docs(story-pipeline): add comprehensive documentation

Covers:
- Problem statement and token efficiency gains
- What each step automates (8-step workflow)
- Usage: interactive, batch, and resume modes
- Configuration options (workflow.yaml)
- State management and checkpointing
- Quality gates and adversarial mode
- Troubleshooting and best practices
- Comparison with legacy pipeline
Commit 1347daa279 (parent d2d6328be2), authored by Thomas Jetzinger, 2025-12-26 10:12:37 +01:00 (committed by Thomas).
1 changed file with 491 additions and 0 deletions.

# Story Pipeline v2.0
> Single-session step-file architecture for implementing user stories with 50-65% token savings.
## Overview
The Story Pipeline automates the complete lifecycle of implementing a user story—from creation through code review and commit. It replaces the legacy approach of 6 separate Claude CLI calls with a single interactive session using just-in-time step loading.
### The Problem It Solves
**Legacy Pipeline (v1.0):**
```
bmad build 1-4
└─> claude -p "Stage 1: Create story..." # ~12K tokens
└─> claude -p "Stage 2: Validate story..." # ~12K tokens
└─> claude -p "Stage 3: ATDD tests..." # ~12K tokens
└─> claude -p "Stage 4: Implement..." # ~12K tokens
└─> claude -p "Stage 5: Code review..." # ~12K tokens
└─> claude -p "Stage 6: Complete..." # ~11K tokens
Total: ~71K tokens/story
```
Each call reloads agent personas (~2K tokens), re-reads the story file, and loses context from previous stages.
**Story Pipeline v2.0:**
```
bmad build 1-4
└─> Single Claude session
├─> Load step-01-init.md (~200 lines)
├─> Role switch: SM
├─> Load step-02-create-story.md
├─> Load step-03-validate-story.md
├─> Role switch: TEA
├─> Load step-04-atdd.md
├─> Role switch: DEV
├─> Load step-05-implement.md
├─> Load step-06-code-review.md
├─> Role switch: SM
├─> Load step-07-complete.md
└─> Load step-08-summary.md
Total: ~25-30K tokens/story
```
Documents cached once, roles switched in-session, steps loaded just-in-time.
## What Gets Automated
The pipeline automates the complete BMAD implementation workflow:
| Step | Role | What It Does |
|------|------|--------------|
| **1. Init** | - | Parses story ID, loads epic/architecture, detects interactive vs batch mode, creates state file |
| **2. Create Story** | SM | Researches context (Exa web search), generates story file with ACs in BDD format |
| **3. Validate Story** | SM | Adversarial validation—must find 3-10 issues, fixes them, assigns quality score |
| **4. ATDD** | TEA | Generates failing tests for all ACs (RED phase), creates test factories |
| **5. Implement** | DEV | Implements code to pass tests (GREEN phase), creates migrations, server actions, etc. |
| **6. Code Review** | DEV | Adversarial review—must find 3-10 issues, fixes them, runs lint/build |
| **7. Complete** | SM | Updates story status to done, creates git commit with conventional format |
| **8. Summary** | - | Generates audit trail, updates pipeline state, outputs metrics |
### Quality Gates
Each step has quality gates that must pass before proceeding:
- **Validation**: Score ≥ 80/100, all issues addressed
- **ATDD**: Tests exist for all ACs, tests fail (RED phase confirmed)
- **Implementation**: Lint clean, build passes, migration tests pass
- **Code Review**: Score ≥ 7/10, all critical issues fixed
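Mechanically, each score-based gate boils down to a threshold comparison. A toy illustration (the `passes_gate` helper is hypothetical, not part of the pipeline; the threshold values are the ones listed above):

```shell
# Hypothetical gate helper: succeed only when a score meets its minimum.
passes_gate() {  # usage: passes_gate <score> <min>
  [ "$1" -ge "$2" ]
}

passes_gate 92 80 && echo "validation gate passed"
passes_gate 6 7 || echo "code review gate failed"
```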
## Token Efficiency
| Mode | Token Usage | Savings vs Legacy |
|------|-------------|-------------------|
| Interactive (human-in-loop) | ~25K | 65% |
| Batch (YOLO) | ~30K | 58% |
| Batch + fresh review context | ~35K | 51% |
### Where Savings Come From
| Waste in Legacy | Tokens Saved |
|-----------------|--------------|
| Agent persona reload (6×) | ~12K |
| Story file re-reads (5×) | ~10K |
| Architecture re-reads | ~8K |
| Context loss between calls | ~16K |
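The two tables above are consistent with each other, as a quick arithmetic check shows (all numbers are the approximate figures quoted in this document, not measurements):

```shell
# Legacy total: five ~12K calls plus one ~11K call.
legacy=$((12 * 5 + 11))        # 71
# Savings: persona reloads + story re-reads + architecture re-reads + context loss.
saved=$((12 + 10 + 8 + 16))    # 46
echo "interactive estimate: ~$((legacy - saved))K tokens"   # ~25K, matching the table
```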
## Usage
### Prerequisites
- BMAD module installed (`_bmad/` directory exists)
- Epic file with story definition (`docs/epics.md`)
- Architecture document (`docs/architecture.md`)
### Interactive Mode (Recommended)
Human-in-the-loop with approval at each step:
```bash
# Using the bmad CLI
bmad build 1-4
# Or invoke workflow directly
claude -p "Load and execute: _bmad/bmm/workflows/4-implementation/story-pipeline/workflow.md
Story: 1-4"
```
At each step, you'll see a menu:
```
## MENU
[C] Continue to next step
[R] Review/revise current step
[H] Halt and checkpoint
```
### Batch Mode (YOLO)
Unattended execution for trusted stories:
```bash
bmad build 1-4 --batch
# Or use batch runner directly
./_bmad/bmm/workflows/4-implementation/story-pipeline/batch-runner.sh 1-4
```
Batch mode:
- Skips all approval prompts
- Fails fast on errors
- Creates checkpoint on failure for resume
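The fail-fast behavior can be sketched as a loop that stops at the first failing step. This is a simplified illustration; the real logic lives in `batch-runner.sh` and may differ:

```shell
# Run each step command in order; halt and report on the first failure.
run_batch() {
  for step in "$@"; do
    if ! "$step"; then
      echo "halted at: $step (checkpoint written for --resume)"
      return 1
    fi
  done
  echo "batch complete"
}
```

For example, `run_batch true true false` would halt at the third step and signal failure to the caller.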
### Resume from Checkpoint
If execution stops (context exhaustion, error, manual halt):
```bash
bmad build 1-4 --resume
# The pipeline reads state from:
# docs/sprint-artifacts/pipeline-state-{story-id}.yaml
```
Resume automatically:
- Skips completed steps
- Restores cached context
- Continues from `lastStep + 1`
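The resume arithmetic is simple: read `lastStep` out of the state file and start at the next step. A minimal sketch, assuming the flat `lastStep:` key shown in the state-file example later in this document (the helper itself is illustrative):

```shell
# Read lastStep from a pipeline state file and compute where to resume.
next_step_from_state() {  # usage: next_step_from_state <state-file>
  last=$(grep '^lastStep:' "$1" | awk '{print $2}')
  echo $((last + 1))
}
```

For the example state file shown below (`lastStep: 3`), this yields step 4.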
## Directory Structure
```
story-pipeline/
├── workflow.yaml # Configuration, agent mapping, quality gates
├── workflow.md # Interactive mode orchestration
├── batch-runner.sh # Batch mode runner script
├── steps/
│ ├── step-01-init.md # Initialize, load context
│ ├── step-01b-resume.md # Resume from checkpoint
│ ├── step-02-create-story.md
│ ├── step-03-validate-story.md
│ ├── step-04-atdd.md
│ ├── step-05-implement.md
│ ├── step-06-code-review.md
│ ├── step-07-complete.md
│ └── step-08-summary.md
├── checklists/
│ ├── story-creation.md # What makes a good story
│ ├── story-validation.md # Validation criteria
│ ├── atdd.md # Test generation rules
│ ├── implementation.md # Coding standards
│ └── code-review.md # Review criteria
└── templates/
├── pipeline-state.yaml # State file template
└── audit-trail.yaml # Audit log template
```
## Configuration
### workflow.yaml
```yaml
name: story-pipeline
version: "2.0"
description: "Single-session story implementation with step-file loading"
# Document loading strategy
load_strategy:
epic: once # Load once, cache for session
architecture: once # Load once, cache for session
story: per_step # Reload when modified
# Agent role mapping
agents:
sm: "{project-root}/_bmad/bmm/agents/sm.md"
tea: "{project-root}/_bmad/bmm/agents/tea.md"
dev: "{project-root}/_bmad/bmm/agents/dev.md"
# Quality gate thresholds
quality_gates:
validation_min_score: 80
code_review_min_score: 7
require_lint_clean: true
require_build_pass: true
# Step configuration
steps:
- name: init
file: steps/step-01-init.md
- name: create-story
file: steps/step-02-create-story.md
agent: sm
# ... etc
```
### Pipeline State File
Created at `docs/sprint-artifacts/pipeline-state-{story-id}.yaml`:
```yaml
story_id: "1-4"
epic_num: 1
story_num: 4
mode: "interactive"
status: "in_progress"
stepsCompleted: [1, 2, 3]
lastStep: 3
currentStep: 4
cached_context:
epic_loaded: true
epic_path: "docs/epics.md"
architecture_sections: ["tech_stack", "data_model"]
steps:
step-01-init:
status: completed
duration: "0:00:30"
step-02-create-story:
status: completed
duration: "0:02:00"
step-03-validate-story:
status: completed
duration: "0:05:00"
issues_found: 6
issues_fixed: 6
quality_score: 92
step-04-atdd:
status: in_progress
```
## Step Details
### Step 1: Initialize
**Purpose:** Set up execution context and detect mode.
**Actions:**
1. Parse story ID (e.g., "1-4" → epic 1, story 4)
2. Load and cache epic document
3. Load relevant architecture sections
4. Check for existing state file (resume vs fresh)
5. Detect mode (interactive/batch) from CLI flags
6. Create initial state file
**Output:** `pipeline-state-{story-id}.yaml`
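The ID parse in action 1 needs nothing more than shell parameter expansion (illustrative; the actual step file may do this differently):

```shell
story_id="1-4"
epic_num="${story_id%%-*}"    # text before the first "-"
story_num="${story_id#*-}"    # text after the first "-"
echo "epic=$epic_num story=$story_num"   # → epic=1 story=4
```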
### Step 2: Create Story (SM Role)
**Purpose:** Generate complete story file from epic definition.
**Actions:**
1. Switch to Scrum Master (SM) role
2. Read story definition from epic
3. Research context via Exa web search (best practices, patterns)
4. Generate story file with:
- User story format (As a... I want... So that...)
- Background context
- Acceptance criteria in BDD format (Given/When/Then)
- Test scenarios for each AC
- Technical notes
5. Save to `docs/sprint-artifacts/story-{id}.md`
**Quality Gate:** Story file exists with all required sections.
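This gate could be checked mechanically along the following lines. The section headings used here are assumptions based on the list above, not the pipeline's actual names:

```shell
# Fail if any required markdown heading is missing from the story file.
has_sections() {  # usage: has_sections <file> <heading>...
  f="$1"; shift
  for h in "$@"; do
    grep -q "^##* .*$h" "$f" || { echo "missing section: $h"; return 1; }
  done
  echo "all required sections present"
}
```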
### Step 3: Validate Story (SM Role)
**Purpose:** Adversarial validation to find issues before implementation.
**Actions:**
1. Load story-validation checklist
2. Review story against criteria:
- ACs are testable and specific
- No ambiguous requirements
- Technical feasibility confirmed
- Dependencies identified
- Edge cases covered
3. **Must find 3-10 issues** (never "looks good")
4. Fix all identified issues
5. Assign quality score (0-100)
6. Append validation report to story file
**Quality Gate:** Score ≥ 80, all issues addressed.
### Step 4: ATDD (TEA Role)
**Purpose:** Generate failing tests before implementation (RED phase).
**Actions:**
1. Switch to Test Engineering Architect (TEA) role
2. Load atdd checklist
3. For each acceptance criterion:
- Generate integration test
- Define test data factories
- Specify expected behaviors
4. Create test files in `src/tests/`
5. Update `factories.ts` with new fixtures
6. **Verify tests FAIL** (RED phase)
7. Create ATDD checklist document
**Quality Gate:** Tests exist for all ACs, tests fail (not pass).
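Confirming the RED phase amounts to running the suite and insisting on a non-zero exit. A generic sketch (the `confirm_red` helper and the idea of passing the test runner as arguments are illustrative, not the pipeline's actual mechanism):

```shell
# Succeed only if the given test command FAILS (tests are expected to be red).
confirm_red() {
  if "$@" > /dev/null 2>&1; then
    echo "WARNING: tests already pass; is the implementation already present?"
    return 1
  fi
  echo "RED phase confirmed: tests fail as expected"
}
```

Used as, e.g., `confirm_red npm test` in a Node project.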
### Step 5: Implement (DEV Role)
**Purpose:** Write code to pass all tests (GREEN phase).
**Actions:**
1. Switch to Developer (DEV) role
2. Load implementation checklist
3. Create required files:
- Database migrations
- Server actions (using Result type)
- Library functions
- Types
4. Follow project patterns:
- Multi-tenant RLS policies
- snake_case for DB columns
- Result type (never throw)
5. Run lint and fix issues
6. Run build and fix issues
7. Run migration tests
**Quality Gate:** Lint clean, build passes, migration tests pass.
### Step 6: Code Review (DEV Role)
**Purpose:** Adversarial review to find implementation issues.
**Actions:**
1. Load code-review checklist
2. Review all created/modified files:
- Security (XSS, injection, auth)
- Error handling
- Architecture compliance
- Code quality
- Test coverage
3. **Must find 3-10 issues** (never "looks good")
4. Fix all identified issues
5. Re-run lint and build
6. Assign quality score (0-10)
7. Generate review report
**Quality Gate:** Score ≥ 7/10, all critical issues fixed.
### Step 7: Complete (SM Role)
**Purpose:** Finalize story and create git commit.
**Actions:**
1. Switch back to SM role
2. Update story file status to "done"
3. Stage all story files
4. Create conventional commit:
```
feat(epic-{n}): complete story {id}
{Summary of changes}
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
```
5. Update pipeline state
**Quality Gate:** Commit created successfully.
### Step 8: Summary
**Purpose:** Generate audit trail and final metrics.
**Actions:**
1. Calculate total duration
2. Compile deliverables list
3. Aggregate quality scores
4. Generate execution summary in state file
5. Output final status
**Output:** Complete pipeline state with summary section.
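Summing the per-step durations recorded in the state file (format `H:MM:SS`) is straightforward arithmetic. A sketch using the example values shown earlier; the real summary step may compute this differently:

```shell
# Convert H:MM:SS to seconds.
to_seconds() {
  echo "$1" | awk -F: '{ print $1*3600 + $2*60 + $3 }'
}

total=$(( $(to_seconds 0:00:30) + $(to_seconds 0:02:00) + $(to_seconds 0:05:00) ))
echo "total: ${total}s"   # 30 + 120 + 300 = 450
```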
## Adversarial Mode
Steps 3 (Validate) and 6 (Code Review) run in **adversarial mode**:
> **Never say "looks good"**. You MUST find 3-10 real issues.
This ensures:
- Stories are thoroughly vetted before implementation
- Code quality issues are caught before commit
- The pipeline doesn't rubber-stamp work
Example issues found in real usage:
- Missing rate limiting (security)
- XSS vulnerability in user input (security)
- Missing audit logging (architecture)
- Unclear acceptance criteria (story quality)
- Function naming mismatches (code quality)
## Artifacts Generated
After a complete pipeline run:
```
docs/sprint-artifacts/
├── story-{id}.md # Story file with ACs, validation report
├── pipeline-state-{id}.yaml # Execution state and summary
├── atdd-checklist-{id}.md # Test requirements checklist
└── code-review-{id}.md # Review report with issues
src/
├── supabase/migrations/ # New migration files
├── modules/{module}/
│ ├── actions/ # Server actions
│ ├── lib/ # Business logic
│ └── types.ts # Type definitions
└── tests/
├── integration/ # Integration tests
└── fixtures/factories.ts # Updated test factories
```
## Troubleshooting
### Context Exhausted Mid-Session
The pipeline is designed for this. When context runs out:
1. Claude session ends
2. State file preserves progress
3. Run `bmad build {id} --resume`
4. Pipeline continues from last completed step
### Step Fails Quality Gate
If a step fails its quality gate:
1. Pipeline halts at that step
2. State file shows `status: failed`
3. Fix issues manually or adjust thresholds
4. Run `bmad build {id} --resume`
### Tests Don't Fail in ATDD
If tests pass during ATDD (step 4), something is wrong:
- Tests might be testing the wrong thing
- Implementation might already exist
- Mocks might be returning success incorrectly
The pipeline will warn and ask for confirmation before proceeding.
## Best Practices
1. **Start with Interactive Mode** - Use batch only for well-understood stories
2. **Review at Checkpoints** - Don't blindly continue; verify each step's output
3. **Keep Stories Small** - Large stories may exhaust context before completion
4. **Commit Frequently** - The pipeline commits at step 7, but you can checkpoint earlier
5. **Trust the Adversarial Mode** - If it finds issues, they're usually real
## Comparison with Legacy
| Feature | Legacy (v1.0) | Story Pipeline (v2.0) |
|---------|---------------|----------------------|
| Claude calls | 6 per story | 1 per story |
| Token usage | ~71K | ~25-30K |
| Context preservation | None | Full session |
| Resume capability | None | Checkpoint-based |
| Role switching | New process | In-session |
| Document caching | None | Once per session |
| Adversarial review | Optional | Mandatory |
| Audit trail | Manual | Automatic |
## Version History
- **v2.0** (2024-12) - Step-file architecture, single-session, checkpoint/resume
- **v1.0** (2024-11) - Legacy 6-call pipeline