Story Pipeline v2.0

Single-session step-file architecture for implementing user stories with roughly 50-65% token savings compared to the legacy pipeline.

Overview

The Story Pipeline automates the complete lifecycle of implementing a user story—from creation through code review and commit. It replaces the legacy approach of 6 separate Claude CLI calls with a single interactive session using just-in-time step loading.

The Problem It Solves

Legacy Pipeline (v1.0):

bmad build 1-4
  └─> claude -p "Stage 1: Create story..."     # ~12K tokens
  └─> claude -p "Stage 2: Validate story..."   # ~12K tokens
  └─> claude -p "Stage 3: ATDD tests..."       # ~12K tokens
  └─> claude -p "Stage 4: Implement..."        # ~12K tokens
  └─> claude -p "Stage 5: Code review..."      # ~12K tokens
  └─> claude -p "Stage 6: Complete..."         # ~11K tokens
                                        Total: ~71K tokens/story

Each call reloads agent personas (~2K tokens), re-reads the story file, and loses context from previous stages.

Story Pipeline v2.0:

bmad build 1-4
  └─> Single Claude session
        ├─> Load step-01-init.md (~200 lines)
        ├─> Role switch: SM
        ├─> Load step-02-create-story.md
        ├─> Load step-03-validate-story.md
        ├─> Role switch: TEA
        ├─> Load step-04-atdd.md
        ├─> Role switch: DEV
        ├─> Load step-05-implement.md
        ├─> Load step-06-code-review.md
        ├─> Role switch: SM
        ├─> Load step-07-complete.md
        └─> Load step-08-summary.md
                                        Total: ~25-30K tokens/story

Documents are cached once, roles switch in-session, and steps load just-in-time.

What Gets Automated

The pipeline automates the complete BMAD implementation workflow:

Step	Role	What It Does
1. Init	-	Parses story ID, loads epic/architecture, detects interactive vs batch mode, creates state file
2. Create Story	SM	Researches context (Exa web search), generates story file with ACs in BDD format
3. Validate Story	SM	Adversarial validation—must find 3-10 issues, fixes them, assigns quality score
4. ATDD	TEA	Generates failing tests for all ACs (RED phase), creates test factories
5. Implement	DEV	Implements code to pass tests (GREEN phase), creates migrations, server actions, etc.
6. Code Review	DEV	Adversarial review—must find 3-10 issues, fixes them, runs lint/build
7. Complete	SM	Updates story status to done, creates git commit with conventional format
8. Summary	-	Generates audit trail, updates pipeline state, outputs metrics
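
The Role column above implies the in-session role switches shown in the v2.0 diagram. As a minimal sketch (the `STEPS` list and `role_switches` helper are hypothetical names used only for illustration, not part of the pipeline), the switches can be derived from the table like this:

```python
from typing import Optional

# (step name, role) pairs copied from the table; None marks steps with no role.
STEPS: list[tuple[str, Optional[str]]] = [
    ("init", None),
    ("create-story", "SM"),
    ("validate-story", "SM"),
    ("atdd", "TEA"),
    ("implement", "DEV"),
    ("code-review", "DEV"),
    ("complete", "SM"),
    ("summary", None),
]


def role_switches(steps: list[tuple[str, Optional[str]]]) -> list[tuple[str, str]]:
    """Return (step, role) for every step at which the active role changes."""
    switches = []
    current = None
    for name, role in steps:
        if role is not None and role != current:
            switches.append((name, role))
            current = role
    return switches


print(role_switches(STEPS))
```

This yields four switches (SM, TEA, DEV, SM), one before each of steps 2, 4, 5, and 7.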

Quality Gates

Each step has quality gates that must pass before proceeding:

Validation: Score ≥ 80/100, all issues addressed
ATDD: Tests exist for all ACs, tests fail (RED phase confirmed)
Implementation: Lint clean, build passes, migration tests pass
Code Review: Score ≥ 7/10, all critical issues fixed
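
The gates above are simple threshold checks. The sketch below shows one way they could be expressed; the function and field names are illustrative assumptions, and only the threshold values (80/100 and 7/10) come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # Values mirror the gates listed above; the field names are illustrative.
    validation_min_score: int = 80   # out of 100
    code_review_min_score: int = 7   # out of 10


DEFAULTS = GateThresholds()


def validation_gate(score: int, issues_found: int, issues_fixed: int,
                    t: GateThresholds = DEFAULTS) -> bool:
    """Pass when the score meets the minimum and every found issue is addressed."""
    return score >= t.validation_min_score and issues_fixed >= issues_found


def implementation_gate(lint_clean: bool, build_passes: bool,
                        migration_tests_pass: bool) -> bool:
    """Pass only when lint, build, and migration tests all succeed."""
    return lint_clean and build_passes and migration_tests_pass


def code_review_gate(score: int, open_critical_issues: int,
                     t: GateThresholds = DEFAULTS) -> bool:
    """Pass when the score meets the minimum and no critical issue is left open."""
    return score >= t.code_review_min_score and open_critical_issues == 0
```

For example, `validation_gate(92, 6, 6)` passes, while `code_review_gate(6, 0)` does not.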

Token Efficiency

Mode	Token Usage	Savings vs Legacy
Interactive (human-in-loop)	~25K	65%
Batch (YOLO)	~30K	58%
Batch + fresh review context	~35K	51%

Where Savings Come From

Waste in Legacy	Tokens Saved
Agent persona reload (6×)	~12K
Story file re-reads (5×)	~10K
Architecture re-reads	~8K
Context loss between calls	~16K
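
The percentages and the breakdown can be checked with a few lines of arithmetic. This is only a worked example using the approximate figures from the two tables above; the variable names are made up for illustration.

```python
LEGACY_TOKENS = 71_000  # ~71K tokens/story for the six-call pipeline

MODE_TOKENS = {
    "interactive": 25_000,
    "batch": 30_000,
    "batch_fresh_review": 35_000,
}

SAVED_BY_SOURCE = {
    "persona_reload": 12_000,
    "story_rereads": 10_000,
    "architecture_rereads": 8_000,
    "context_loss": 16_000,
}


def savings_percent(tokens: int, baseline: int = LEGACY_TOKENS) -> int:
    """Share of baseline tokens saved, rounded to the nearest whole percent."""
    return round(100 * (baseline - tokens) / baseline)


for mode, tokens in MODE_TOKENS.items():
    print(f"{mode}: {savings_percent(tokens)}%")

# The per-source savings add up to the gap between legacy and interactive mode.
print(sum(SAVED_BY_SOURCE.values()), LEGACY_TOKENS - MODE_TOKENS["interactive"])
```

The four sources of waste sum to ~46K tokens, which is exactly the difference between the legacy total (~71K) and interactive mode (~25K).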

Usage

Prerequisites

BMAD module installed (_bmad/ directory exists)
Epic file with story definition (docs/epics.md)
Architecture document (docs/architecture.md)

Interactive Mode (Recommended)

Human-in-the-loop with approval at each step:

# Using the bmad CLI
bmad build 1-4

# Or invoke workflow directly
claude -p "Load and execute: _bmad/bmm/workflows/4-implementation/story-pipeline/workflow.md
Story: 1-4"

At each step, you'll see a menu:

## MENU
[C] Continue to next step
[R] Review/revise current step
[H] Halt and checkpoint

Batch Mode (YOLO)

Unattended execution for trusted stories:

bmad build 1-4 --batch

# Or use batch runner directly
./_bmad/bmm/workflows/4-implementation/story-pipeline/batch-runner.sh 1-4

Batch mode:

Auto-approves all prompts (steps are still executed, not skipped)
Fails fast on errors
Creates checkpoint on failure for resume

Resume from Checkpoint

If execution stops (context exhaustion, error, manual halt):

bmad build 1-4 --resume

# The pipeline reads state from:
# _bmad-output/implementation-artifacts/pipeline-state-{story-id}.yaml

Resume automatically:

Skips completed steps
Restores cached context
Continues from lastStep + 1
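
The resume rule can be sketched as a small function over the state file's fields. This is an assumption about how the lookup might work, not the pipeline's actual code; `next_step` is a hypothetical helper, while `stepsCompleted` and `lastStep` are the field names used in the state file.

```python
from typing import Optional


def next_step(state: dict, total_steps: int = 8) -> Optional[int]:
    """Return the step number to resume at, or None if every step is done."""
    completed = set(state.get("stepsCompleted", []))
    candidate = state.get("lastStep", 0) + 1
    # Skip anything already recorded as completed.
    while candidate in completed:
        candidate += 1
    return candidate if candidate <= total_steps else None


state = {"stepsCompleted": [1, 2, 3], "lastStep": 3, "currentStep": 4}
print(next_step(state))
```

With steps 1-3 completed, the run resumes at step 4; a state with no recorded progress starts at step 1.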

Directory Structure

story-pipeline/
├── workflow.yaml          # Configuration, agent mapping, quality gates
├── workflow.md            # Interactive mode orchestration
├── batch-runner.sh        # Batch mode runner script
├── steps/
│   ├── step-01-init.md        # Initialize, load context
│   ├── step-01b-resume.md     # Resume from checkpoint
│   ├── step-02-create-story.md
│   ├── step-03-validate-story.md
│   ├── step-04-atdd.md
│   ├── step-05-implement.md
│   ├── step-06-code-review.md
│   ├── step-07-complete.md
│   └── step-08-summary.md
├── checklists/
│   ├── story-creation.md      # What makes a good story
│   ├── story-validation.md    # Validation criteria
│   ├── atdd.md                # Test generation rules
│   ├── implementation.md      # Coding standards
│   └── code-review.md         # Review criteria
└── templates/
    ├── pipeline-state.yaml    # State file template
    └── audit-trail.yaml       # Audit log template

Configuration

workflow.yaml

name: story-pipeline
version: "2.0"
description: "Single-session story implementation with step-file loading"

# Document loading strategy
load_strategy:
  epic: once          # Load once, cache for session
  architecture: once  # Load once, cache for session
  story: per_step     # Reload when modified

# Agent role mapping
agents:
  sm: "{project-root}/_bmad/bmm/agents/sm.md"
  tea: "{project-root}/_bmad/bmm/agents/tea.md"
  dev: "{project-root}/_bmad/bmm/agents/dev.md"

# Quality gate thresholds
quality_gates:
  validation_min_score: 80
  code_review_min_score: 7
  require_lint_clean: true
  require_build_pass: true

# Step configuration
steps:
  - name: init
    file: steps/step-01-init.md
  - name: create-story
    file: steps/step-02-create-story.md
    agent: sm
  # ... etc
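
To illustrate what `load_strategy` means in practice, here is a minimal sketch of a loader that caches documents marked `once` and re-reads documents marked `per_step`. The `DocumentLoader` class and its `read` callback are hypothetical; only the strategy values come from the configuration above.

```python
from typing import Callable

# Mirrors the load_strategy block in workflow.yaml.
LOAD_STRATEGY = {"epic": "once", "architecture": "once", "story": "per_step"}


class DocumentLoader:
    """Caches documents marked "once"; re-reads everything else on each call."""

    def __init__(self, read: Callable[[str], str], strategy: dict[str, str]):
        self._read = read
        self._strategy = strategy
        self._cache: dict[str, str] = {}

    def load(self, kind: str, path: str) -> str:
        if self._strategy.get(kind) == "once":
            if kind not in self._cache:
                self._cache[kind] = self._read(path)
            return self._cache[kind]
        # per_step (or unknown kind): always read fresh so edits are picked up.
        return self._read(path)


reads: list[str] = []


def fake_read(path: str) -> str:
    """Stand-in for a real file read; records which paths were requested."""
    reads.append(path)
    return f"contents of {path}"


loader = DocumentLoader(fake_read, LOAD_STRATEGY)
loader.load("epic", "docs/epics.md")
loader.load("epic", "docs/epics.md")   # served from the cache
loader.load("story", "story-1-4.md")
loader.load("story", "story-1-4.md")   # read again
print(reads)
```

The epic is read once and the story twice, which is the behaviour the `once` and `per_step` comments describe.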

Pipeline State File

Created at _bmad-output/implementation-artifacts/pipeline-state-{story-id}.yaml:

story_id: "1-4"
epic_num: 1
story_num: 4
mode: "interactive"
status: "in_progress"
stepsCompleted: [1, 2, 3]
lastStep: 3
currentStep: 4

cached_context:
  epic_loaded: true
  epic_path: "docs/epics.md"
  architecture_sections: ["tech_stack", "data_model"]

steps:
  step-01-init:
    status: completed
    duration: "0:00:30"
  step-02-create-story:
    status: completed
    duration: "0:02:00"
  step-03-validate-story:
    status: completed
    duration: "0:05:00"
    issues_found: 6
    issues_fixed: 6
    quality_score: 92
  step-04-atdd:
    status: in_progress

Step Details

Step 1: Initialize

Purpose: Set up execution context and detect mode.

Actions:

Parse story ID (e.g., "1-4" → epic 1, story 4)
Load and cache epic document
Load relevant architecture sections
Check for existing state file (resume vs fresh)
Detect mode (interactive/batch) from CLI flags
Create initial state file

Output: pipeline-state-{story-id}.yaml
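
Parsing the story ID is the first action in this step. A minimal sketch, assuming IDs always have the form `<epic>-<story>` with two integers (the helper names are hypothetical):

```python
import re

_STORY_ID = re.compile(r"^(\d+)-(\d+)$")


def parse_story_id(story_id: str) -> tuple[int, int]:
    """Split an ID such as "1-4" into (epic_num, story_num)."""
    match = _STORY_ID.match(story_id.strip())
    if match is None:
        raise ValueError(f"expected '<epic>-<story>', got {story_id!r}")
    return int(match.group(1)), int(match.group(2))


def state_file_path(story_id: str) -> str:
    """Build the state file location documented in this README."""
    return f"_bmad-output/implementation-artifacts/pipeline-state-{story_id}.yaml"


print(parse_story_id("1-4"))
print(state_file_path("1-4"))
```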

Step 2: Create Story (SM Role)

Purpose: Generate complete story file from epic definition.

Actions:

Switch to Scrum Master (SM) role
Read story definition from epic
Research context via Exa web search (best practices, patterns)
Generate story file with:
  - User story format (As a... I want... So that...)
  - Background context
  - Acceptance criteria in BDD format (Given/When/Then)
  - Test scenarios for each AC
  - Technical notes
Save to _bmad-output/implementation-artifacts/story-{id}.md

Quality Gate: Story file exists with all required sections.
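
A check for this gate might look like the sketch below. The heading titles in `REQUIRED_SECTIONS` are assumptions derived from the list above; the real story template may name its sections differently.

```python
# Hypothetical heading titles, derived from the section list in this step.
REQUIRED_SECTIONS = [
    "User Story",
    "Background",
    "Acceptance Criteria",
    "Test Scenarios",
    "Technical Notes",
]


def missing_sections(story_text: str, required: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return the required titles that do not appear as a markdown heading."""
    headings = {
        line.lstrip("#").strip().lower()
        for line in story_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in required if name.lower() not in headings]


story = """# Story 1-4
## User Story
## Background
## Acceptance Criteria
## Technical Notes
"""
print(missing_sections(story))
```

An empty result means the gate passes; here the sample story is missing its test scenarios.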

Step 3: Validate Story (SM Role)

Purpose: Adversarial validation to find issues before implementation.

Actions:

Load story-validation checklist
Review story against criteria:
  - ACs are testable and specific
  - No ambiguous requirements
  - Technical feasibility confirmed
  - Dependencies identified
  - Edge cases covered
Must find 3-10 issues (never "looks good")
Fix all identified issues
Assign quality score (0-100)
Append validation report to story file

Quality Gate: Score ≥ 80, all issues addressed.

Step 4: ATDD (TEA Role)

Purpose: Generate failing tests before implementation (RED phase).

Actions:

Switch to Test Engineering Architect (TEA) role
Load atdd checklist
For each acceptance criterion:
  - Generate integration test
  - Define test data factories
  - Specify expected behaviors
Create test files in src/tests/
Update factories.ts with new fixtures
Verify tests FAIL (RED phase)
Create ATDD checklist document

Quality Gate: Tests exist for all ACs, tests fail (not pass).
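
The RED-phase gate has two parts: every AC needs at least one test, and none of those tests may pass yet. A sketch under the assumption that test results are available as a mapping from AC id to per-test pass/fail flags (the `atdd_gate` function and this data shape are hypothetical):

```python
def atdd_gate(
    acceptance_criteria: list[str],
    test_results: dict[str, dict[str, bool]],
) -> tuple[bool, list[str]]:
    """Check RED-phase results.

    test_results maps AC id -> {test name: passed?}.
    Returns (ok, problems); ok is True only when problems is empty.
    """
    problems = []
    for ac in acceptance_criteria:
        tests = test_results.get(ac, {})
        if not tests:
            problems.append(f"{ac}: no tests")
        for name, passed in tests.items():
            if passed:
                problems.append(f"{ac}: {name} passed before implementation")
    return (not problems, problems)


ok, problems = atdd_gate(
    ["AC1", "AC2", "AC3"],
    {"AC1": {"rejects_empty_name": False}, "AC2": {"lists_items": True}},
)
print(ok, problems)
```

In this example the gate fails twice over: AC2 has a test that already passes, and AC3 has no test at all.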

Step 5: Implement (DEV Role)

Purpose: Write code to pass all tests (GREEN phase).

Actions:

Switch to Developer (DEV) role
Load implementation checklist
Create required files:
  - Database migrations
  - Server actions (using Result type)
  - Library functions
  - Types
Follow project patterns:
  - Multi-tenant RLS policies
  - snake_case for DB columns
  - Result type (never throw)
Run lint and fix issues
Run build and fix issues
Run migration tests

Quality Gate: Lint clean, build passes, migration tests pass.
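
The "Result type (never throw)" pattern means an action reports failure as a return value instead of raising. The project's own type is not shown in this README, so the following is only a generic sketch of the idea, with hypothetical names (`Ok`, `Err`, `create_widget`):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


def create_widget(name: str) -> Union[Ok[dict], Err[str]]:
    """Hypothetical action: reports failure as a value rather than raising."""
    cleaned = name.strip()
    if not cleaned:
        return Err("name must not be empty")
    return Ok({"name": cleaned})


result = create_widget("  ")
if isinstance(result, Err):
    print("failed:", result.error)
else:
    print("created:", result.value)
```

Callers branch on the returned variant, so error handling is visible at every call site.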

Step 6: Code Review (DEV Role)

Purpose: Adversarial review to find implementation issues.

Actions:

Load code-review checklist
Review all created/modified files:
  - Security (XSS, injection, auth)
  - Error handling
  - Architecture compliance
  - Code quality
  - Test coverage
Must find 3-10 issues (never "looks good")
Fix all identified issues
Re-run lint and build
Assign quality score (0-10)
Generate review report

Quality Gate: Score ≥ 7/10, all critical issues fixed.

Step 7: Complete (SM Role)

Purpose: Finalize story and create git commit.

Actions:

Switch back to SM role
Update story file status to "done"
Stage all story files

Create conventional commit:

feat(epic-{n}): complete story {id}

{Summary of changes}

🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>

Update pipeline state

Quality Gate: Commit created successfully.
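
Filling in the commit template above is plain string formatting. A minimal sketch, where `commit_message` is a hypothetical helper and the trailer lines are copied from the template:

```python
def commit_message(epic_num: int, story_id: str, summary: str) -> str:
    """Fill in the commit template for a completed story."""
    return (
        f"feat(epic-{epic_num}): complete story {story_id}\n"
        "\n"
        f"{summary.strip()}\n"
        "\n"
        "🤖 Generated with Claude Code\n"
        "Co-Authored-By: Claude <noreply@anthropic.com>\n"
    )


print(commit_message(1, "1-4", "Add the story's migrations, actions, and tests."))
```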

Step 8: Summary

Purpose: Generate audit trail and final metrics.

Actions:

Calculate total duration
Compile deliverables list
Aggregate quality scores
Generate execution summary in state file
Output final status

Output: Complete pipeline state with summary section.
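
Total duration can be computed by summing the per-step `duration` values from the state file. The sketch below assumes durations are always written as `H:MM:SS`, as in the state file example; the helper names are hypothetical.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an "H:MM:SS" string as used in the state file example."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_duration(steps: dict[str, dict]) -> timedelta:
    """Sum the durations of all steps that recorded one."""
    return sum(
        (parse_duration(s["duration"]) for s in steps.values() if "duration" in s),
        timedelta(),
    )


steps = {
    "step-01-init": {"status": "completed", "duration": "0:00:30"},
    "step-02-create-story": {"status": "completed", "duration": "0:02:00"},
    "step-03-validate-story": {"status": "completed", "duration": "0:05:00"},
    "step-04-atdd": {"status": "in_progress"},
}
print(total_duration(steps))  # 0:07:30
```

Steps without a recorded duration, such as one still in progress, are skipped.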

Adversarial Mode

Steps 3 (Validate) and 6 (Code Review) run in adversarial mode:

Never say "looks good". You MUST find 3-10 real issues.

This ensures:

Stories are thoroughly vetted before implementation
Code quality issues are caught before commit
The pipeline doesn't rubber-stamp work

Example issues found in real usage:

Missing rate limiting (security)
XSS vulnerability in user input (security)
Missing audit logging (architecture)
Unclear acceptance criteria (story quality)
Function naming mismatches (code quality)
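
The 3-10 rule can be enforced mechanically on a review's output. This is a sketch of one possible check, not the pipeline's own logic; the function name and the choice to count only distinct, non-empty issues are assumptions.

```python
MIN_ISSUES, MAX_ISSUES = 3, 10


def is_valid_adversarial_report(issues: list[str]) -> bool:
    """True when the report lists between 3 and 10 distinct, non-empty issues."""
    distinct = {issue.strip().lower() for issue in issues if issue.strip()}
    return MIN_ISSUES <= len(distinct) <= MAX_ISSUES


print(is_valid_adversarial_report([]))  # an empty "looks good" report is rejected
print(is_valid_adversarial_report([
    "Missing rate limiting",
    "XSS vulnerability in user input",
    "Missing audit logging",
]))
```

Counting distinct entries prevents the same issue from being repeated just to reach the minimum.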

Artifacts Generated

After a complete pipeline run:

_bmad-output/implementation-artifacts/
├── story-{id}.md              # Story file with ACs, validation report
├── pipeline-state-{id}.yaml   # Execution state and summary
├── atdd-checklist-{id}.md     # Test requirements checklist
└── code-review-{id}.md        # Review report with issues

src/
├── supabase/migrations/       # New migration files
├── modules/{module}/
│   ├── actions/               # Server actions
│   ├── lib/                   # Business logic
│   └── types.ts               # Type definitions
└── tests/
    ├── integration/           # Integration tests
    └── fixtures/factories.ts  # Updated test factories

Troubleshooting

Context Exhausted Mid-Session

The pipeline is designed for this. When context runs out:

Claude session ends
State file preserves progress
Run bmad build {id} --resume
Pipeline continues from last completed step

Step Fails Quality Gate

If a step fails its quality gate:

Pipeline halts at that step
State file shows status: failed
Fix issues manually or adjust thresholds (quality_gates in workflow.yaml)
Run bmad build {id} --resume

Tests Don't Fail in ATDD

If tests pass during ATDD (step 4), something is wrong:

Tests might be testing the wrong thing
Implementation might already exist
Mocks might be returning success incorrectly

The pipeline will warn and ask for confirmation before proceeding.

Best Practices

Start with Interactive Mode - Use batch only for well-understood stories
Review at Checkpoints - Don't blindly continue; verify each step's output
Keep Stories Small - Large stories may exhaust context before completion
Commit Frequently - The pipeline commits at step 7, but you can checkpoint earlier
Trust the Adversarial Mode - If it finds issues, they're usually real

Comparison with Legacy

Feature	Legacy (v1.0)	Story Pipeline (v2.0)
Claude calls	6 per story	1 per story
Token usage	~71K	~25-30K
Context preservation	None	Full session
Resume capability	None	Checkpoint-based
Role switching	New process	In-session
Document caching	None	Once per session
Adversarial review	Optional	Mandatory
Audit trail	Manual	Automatic

Version History

v2.0 (2024-12) - Step-file architecture, single-session, checkpoint/resume
v1.0 (2024-11) - Legacy 6-call pipeline
