Merge origin/main: sync epic-specific tracking files with backwards compatibility

Resolved conflict in autonomous-epic/workflow.yaml by:
- Accepting origin/main's cleaner naming: .autonomous-epic-{epic_num}-progress.yaml
- Adding backwards compatibility to check both new and legacy formats
- Updating all progress file references to use the dynamic {{progress_file_path}} variable

Changes:
- workflow.yaml: Use new naming convention
- instructions.xml: Check for both formats (new + legacy) on resume
- README.md: Document backwards compatibility

This ensures no in-progress epics are missed when upgrading between versions.
Jonah Schulte 2026-01-02 20:20:35 -05:00
commit 343b4ef425
33 changed files with 6500 additions and 218 deletions


@@ -0,0 +1,13 @@
---
description: 'Validate and fix sprint-status.yaml for ALL epics. Scans every story file, validates quality, counts tasks, updates sprint-status.yaml to match REALITY across entire project.'
---
IT IS CRITICAL THAT YOU FOLLOW THESE STEPS - while staying in character as the current agent persona you may have loaded:
<steps CRITICAL="TRUE">
1. Always LOAD the FULL @_bmad/core/tasks/workflow.xml
2. READ its entire contents - this is the CORE OS for EXECUTING the specific workflow-config @_bmad/bmm/workflows/4-implementation/validate-all-epics/workflow.yaml
3. Pass the yaml path _bmad/bmm/workflows/4-implementation/validate-all-epics/workflow.yaml as 'workflow-config' parameter to the workflow.xml instructions
4. Follow workflow.xml instructions EXACTLY as written to process and follow the specific workflow config and its instructions
5. Save outputs after EACH section when generating any documents from templates
</steps>


@@ -0,0 +1,13 @@
---
description: 'Validate and fix sprint-status.yaml for a single epic. Scans story files for task completion, validates quality (>10KB, proper tasks), updates sprint-status.yaml to match REALITY.'
---
IT IS CRITICAL THAT YOU FOLLOW THESE STEPS - while staying in character as the current agent persona you may have loaded:
<steps CRITICAL="TRUE">
1. Always LOAD the FULL @_bmad/core/tasks/workflow.xml
2. READ its entire contents - this is the CORE OS for EXECUTING the specific workflow-config @_bmad/bmm/workflows/4-implementation/validate-epic-status/workflow.yaml
3. Pass the yaml path _bmad/bmm/workflows/4-implementation/validate-epic-status/workflow.yaml as 'workflow-config' parameter to the workflow.xml instructions
4. Follow workflow.xml instructions EXACTLY as written to process and follow the specific workflow config and its instructions
5. Save outputs after EACH section when generating any documents from templates
</steps>


@@ -0,0 +1,101 @@
# How to Validate Sprint Status - Complete Guide
**Created:** 2026-01-02
**Purpose:** Ensure sprint-status.yaml and story files reflect REALITY, not fiction
---
## Three Levels of Validation
### Level 1: Status Field Validation (FAST - Free)
Compare Status field in story files vs sprint-status.yaml
**Cost:** Free | **Time:** 5 seconds
```bash
python3 scripts/lib/sprint-status-updater.py --mode validate
```
### Level 2: Deep Story Validation (MEDIUM - $0.15/story)
Haiku agent reads actual code and verifies all tasks
**Cost:** ~$0.15/story | **Time:** 2-5 min/story
```bash
/validate-story-deep docs/sprint-artifacts/16e-6-ecs-task-definitions-tier3.md
```
### Level 3: Comprehensive Platform Audit (DEEP - $76 total)
Validates ALL 511 stories using batched Haiku agents
**Cost:** ~$76 total | **Time:** 4-6 hours
```bash
/validate-all-stories-deep
/validate-all-stories-deep --epic 16e # Or filter to specific epic
```
---
## Why Haiku Not Sonnet
**Per story cost:**
- Haiku: $0.15
- Sonnet: $1.80
- **Savings: 92%**
**Full platform:**
- Haiku: $76
- Sonnet: $920
- **Savings: $844**
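As a quick sanity check, the totals follow directly from the per-story figures and the 511-story count quoted above (rounded):
```python
# Sanity check of the figures above, using this guide's own numbers (rounded).
stories = 511
haiku_per_story, sonnet_per_story = 0.15, 1.80

haiku_total = stories * haiku_per_story     # ~$76.65  -> quoted as ~$76
sonnet_total = stories * sonnet_per_story   # ~$919.80 -> quoted as ~$920
print(f"Haiku total:  ${haiku_total:.2f}")
print(f"Sonnet total: ${sonnet_total:.2f}")
print(f"Savings: ${sonnet_total - haiku_total:.2f} ({1 - haiku_per_story / sonnet_per_story:.0%})")
```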
**Agent startup overhead (why ONE agent per story):**
- Bad: one agent per task (50 agents for a 50-task story) ≈ 2.5M tokens of startup overhead
- Good: 1 agent reads all files and verifies all 50 tasks ≈ 25K tokens of overhead
- **Savings: 99% less overhead**
---
## Batching (Max 5 Stories Concurrent)
**Why batch_size = 5:**
- Prevents spawning 511 agents at once
- Allows progress saving/resuming
- Rate limiting friendly
**Execution:**
- Batch 1: Stories 1-5 (5 agents)
- Wait for completion
- Batch 2: Stories 6-10 (5 agents)
- ...continues until done
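A minimal sketch of that batching loop (illustrative only; `validate_story` stands in for the per-story Haiku validation step and is not the actual workflow code):
```python
# Illustrative batching sketch: at most `batch_size` validations run at once,
# and results can be checkpointed between batches so a long run is resumable.
from concurrent.futures import ThreadPoolExecutor

def validate_in_batches(stories, validate_story, batch_size=5):
    results = []
    for start in range(0, len(stories), batch_size):
        batch = stories[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            results.extend(pool.map(validate_story, batch))
        # Checkpoint results here (e.g. write to disk) before the next batch.
    return results
```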
---
## What Gets Verified
For each task, Haiku agent:
1. Finds files with Glob/Grep
2. Reads code with Read tool
3. Checks for stubs/TODOs
4. Verifies tests exist
5. Checks multi-tenant isolation
6. Reports: actually_complete, evidence, issues
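The per-task report has roughly the shape below; the field names come from scripts/lib/llm-task-verifier.py elsewhere in this commit, while the values are invented for illustration.
```python
# Example of one per-task verification result (illustrative values; field
# names match scripts/lib/llm-task-verifier.py in this commit).
example_result = {
    "task": "Implement UserService.create with dealerId scoping",
    "is_checked": True,                 # what the story file claims
    "actually_complete": False,         # what the agent found in the code
    "confidence": "high",
    "evidence": "user.service.ts exists but create() is a stub with a TODO",
    "issues_found": ["stub implementation", "no tests found"],
    "verification_status": "false_positive",
}
```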
---
## Commands Reference
```bash
# Weekly validation (free, 5 sec)
python3 scripts/lib/sprint-status-updater.py --mode validate
# Fix discrepancies
python3 scripts/lib/sprint-status-updater.py --mode fix
# Deep validate one story ($0.15, 2-5 min)
/validate-story-deep docs/sprint-artifacts/STORY.md
# Comprehensive audit ($76, 4-6h)
/validate-all-stories-deep
```
---
**Files:** `_bmad/bmm/workflows/4-implementation/validate-*-deep/`


@@ -0,0 +1,482 @@
# Sprint Status Sync - Complete Guide
**Created:** 2026-01-02
**Purpose:** Prevent drift between story files and sprint-status.yaml
**Status:** PRODUCTION READY
---
## 🚨 THE PROBLEM WE SOLVED
**Before Fix (2026-01-02):**
- 78% of story files (435/552) had NO `Status:` field
- 30+ completed stories not reflected in sprint-status.yaml
- Epic 19: 28 stories done, sprint-status said "in-progress"
- Epic 16d: 3 stories done, sprint-status said "backlog"
- Last verification: 32+ hours old
**Root Cause:**
- Autonomous workflows prioritized velocity over tracking
- Manual workflows didn't enforce status updates
- No automated sync mechanism
- sprint-status.yaml manually maintained
---
## ✅ THE SOLUTION (Full Workflow Fix)
### Component 1: Automated Sync Script
**Script:** `scripts/sync-sprint-status.sh`
**Purpose:** Scan story Status: fields → Update sprint-status.yaml
**Usage:**
```bash
# Update sprint-status.yaml
pnpm sync:sprint-status
# Preview changes (no modifications)
pnpm sync:sprint-status:dry-run
# Validate only (exit 1 if out of sync)
pnpm validate:sprint-status
```
**Features:**
- Only updates stories WITH explicit Status: fields
- Skips stories without Status: (trusts sprint-status.yaml)
- Creates automatic backups (.sprint-status-backups/)
- Preserves all comments and structure
- Returns clear pass/fail exit codes
---
### Component 2: Workflow Enforcement
**Modified Files:**
1. `_bmad/bmm/workflows/4-implementation/dev-story/instructions.xml`
2. `_bmad/bmm/workflows/4-implementation/autonomous-epic/instructions.xml`
**Changes:**
- ✅ HALT if story not found in sprint-status.yaml (was: warning)
- ✅ Verify sprint-status.yaml update persisted (new validation)
- ✅ Update both story Status: field AND sprint-status.yaml
- ✅ Fail loudly if either update fails
**Before:** Workflows logged warnings, continued anyway
**After:** Workflows HALT if tracking update fails
---
### Component 3: CI/CD Validation
**Workflow:** `.github/workflows/validate-sprint-status.yml`
**Trigger:** Every PR touching docs/sprint-artifacts/
**Checks:**
1. sprint-status.yaml exists
2. All changed story files have Status: fields
3. sprint-status.yaml is in sync (runs validation)
4. Blocks merge if validation fails
**How to fix CI failures:**
```bash
# See what's wrong
./scripts/sync-sprint-status.sh --dry-run
# Fix it
./scripts/sync-sprint-status.sh
# Commit
git add docs/sprint-artifacts/sprint-status.yaml
git commit -m "chore: sync sprint-status.yaml"
git push
```
---
### Component 4: pnpm Scripts
**Added to package.json:**
```json
{
"scripts": {
"sync:sprint-status": "./scripts/sync-sprint-status.sh",
"sync:sprint-status:dry-run": "./scripts/sync-sprint-status.sh --dry-run",
"validate:sprint-status": "./scripts/sync-sprint-status.sh --validate"
}
}
```
**When to run:**
- `pnpm sync:sprint-status` - After manually updating story Status: fields
- `pnpm validate:sprint-status` - Before committing changes
- Automatically in CI/CD - Validates on every PR
---
## 🎯 NEW WORKFLOW (How It Works Now)
### When Creating a Story
```
/create-story workflow
1. Generate story file with Status: ready-for-dev
2. Add entry to sprint-status.yaml with status "ready-for-dev"
3. HALT if sprint-status.yaml update fails
✅ Story file and sprint-status.yaml both updated
```
### When Implementing a Story
```
/dev-story workflow
1. Load story, start work
2. Mark tasks complete [x]
3. Run tests, validate
4. Update story Status: "in-progress" → "review"
5. Update sprint-status.yaml: "in-progress" → "review"
6. VERIFY sprint-status.yaml update persisted
7. HALT if verification fails
✅ Both updated and verified
```
### When Running Autonomous Epic
```
/autonomous-epic workflow
For each story:
1. Run super-dev-pipeline
2. Check all tasks complete
3. Update story Status: "done"
4. Update sprint-status.yaml entry to "done"
5. Verify update persisted
6. Log failure if verification fails (don't halt - continue)
After all stories:
7. Mark epic "done" in sprint-status.yaml
8. Verify epic status persisted
✅ All stories and epic status updated
```
---
## 🛡️ ENFORCEMENT MECHANISMS
### 1. Required Fields (Create-Story)
- **Enforcement:** Story MUST be added to sprint-status.yaml during creation
- **Validation:** Workflow HALTS if story not found after creation
- **Result:** No orphaned stories
### 2. Status Updates (Dev-Story)
- **Enforcement:** Both Status: field AND sprint-status.yaml MUST update
- **Validation:** Re-read sprint-status.yaml to verify update
- **Result:** No silent failures
### 3. Verification (Autonomous-Epic)
- **Enforcement:** Sprint-status.yaml updated after each story
- **Validation:** Verify update persisted, log failure if not
- **Result:** Tracking stays in sync even during autonomous runs
### 4. CI/CD Gates (GitHub Actions)
- **Enforcement:** PR merge blocked if validation fails
- **Validation:** Runs `pnpm validate:sprint-status` on every PR
- **Result:** Drift cannot be merged
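The "re-read to verify" step in mechanisms 2 and 3 above amounts to reading sprint-status.yaml back after the write and confirming the expected entry is present. A minimal sketch, assuming the flat `story-id: status` lines used in the development_status section (not the actual workflow code):
```python
# Minimal sketch of the post-update verification described above: re-read
# sprint-status.yaml and confirm the story now carries the expected status.
# Assumes the "  story-id: status  # comment" line format of development_status.
import re
from pathlib import Path

def verify_status_persisted(story_id: str, expected_status: str,
                            path: str = "docs/sprint-artifacts/sprint-status.yaml") -> bool:
    content = Path(path).read_text()
    pattern = rf"^\s+{re.escape(story_id)}:\s*{re.escape(expected_status)}\b"
    return re.search(pattern, content, re.MULTILINE) is not None
```
When a check like this fails, dev-story HALTs while autonomous-epic logs the failure and continues.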
---
## 📋 MANUAL SYNC PROCEDURES
### If sprint-status.yaml Gets Out of Sync
**Scenario 1: Story Status: fields updated but sprint-status.yaml not synced**
```bash
# See what needs updating
pnpm sync:sprint-status:dry-run
# Apply updates
pnpm sync:sprint-status
# Verify
pnpm validate:sprint-status
# Commit
git add docs/sprint-artifacts/sprint-status.yaml
git commit -m "chore: sync sprint-status.yaml with story updates"
```
**Scenario 2: sprint-status.yaml has truth, story files missing Status: fields**
```bash
# Create script to backfill Status: fields FROM sprint-status.yaml
./scripts/backfill-story-status-fields.sh # (To be created if needed)
# This would:
# 1. Read sprint-status.yaml
# 2. For each story entry, find the story file
# 3. Add/update Status: field to match sprint-status.yaml
# 4. Preserve all other content
```
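The backfill logic described by that placeholder is simple, and scripts/lib/add-status-fields.py (added later in this diff) implements a fuller version of it. A minimal illustration:
```python
# Minimal backfill sketch: for each development_status entry, insert a
# "**Status:** <status>" line into the matching story file if it has none.
# scripts/lib/add-status-fields.py in this commit is the fuller implementation.
import re
from pathlib import Path

def backfill(statuses: dict[str, str], story_dir: str = "docs/sprint-artifacts") -> None:
    for story_id, status in statuses.items():
        story_file = Path(story_dir) / f"{story_id}.md"
        if not story_file.exists():
            continue
        content = story_file.read_text()
        if re.search(r"^\*{0,2}Status:", content, re.MULTILINE | re.IGNORECASE):
            continue  # already has a Status field
        lines = content.split("\n")
        lines[1:1] = ["", f"**Status:** {status}", ""]  # insert right after the title line
        story_file.write_text("\n".join(lines))
```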
**Scenario 3: Massive drift after autonomous work**
```bash
# Option A: Trust sprint-status.yaml (if it was manually verified)
# - Backfill story Status: fields from sprint-status.yaml
# - Don't run sync (sprint-status.yaml is source of truth)
# Option B: Trust story Status: fields (if recently updated)
# - Run sync to update sprint-status.yaml
pnpm sync:sprint-status
# Option C: Manual audit (when both are uncertain)
# - Review SPRINT-STATUS-AUDIT-2026-01-02.md
# - Check git commits for completion evidence
# - Manually correct both files
```
---
## 🧪 TESTING
### Test 1: Validate Current State
```bash
pnpm validate:sprint-status
# Should exit 0 if in sync, exit 1 if discrepancies
```
### Test 2: Dry Run (No Changes)
```bash
pnpm sync:sprint-status:dry-run
# Shows what WOULD change without applying
```
### Test 3: Apply Sync
```bash
pnpm sync:sprint-status
# Updates sprint-status.yaml, creates backup
```
### Test 4: CI/CD Simulation
```bash
# Simulate PR validation
.github/workflows/validate-sprint-status.yml
# (Run via act or GitHub Actions)
```
---
## 📊 METRICS & MONITORING
### How to Check Sprint Health
**Check 1: Discrepancy Count**
```bash
pnpm sync:sprint-status:dry-run 2>&1 | grep "discrepancies"
# Should show: "0 discrepancies" if healthy
```
**Check 2: Last Verification Timestamp**
```bash
head -5 docs/sprint-artifacts/sprint-status.yaml | grep last_verified
# Should be within last 24 hours
```
**Check 3: Stories Missing Status: Fields**
```bash
grep -L "^Status:" docs/sprint-artifacts/*.md | wc -l
# Should decrease over time as stories get Status: fields
```
### Alerts to Set Up (Future)
- ⚠️ If last_verified > 7 days old → Manual audit recommended
- ⚠️ If discrepancy count > 10 → Investigate why sync not running
- ⚠️ If stories without Status: > 50 → Backfill campaign needed
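None of these alerts exist yet; one possible shape for the staleness check, reading the `# last_verified:` header that sprint-status-updater.py writes, is sketched below (not part of the shipped tooling).
```python
# One possible shape for the "last_verified too old" alert (not implemented).
# Reads the "# last_verified: ..." header written by sprint-status-updater.py.
import re
import sys
from datetime import datetime, timedelta
from pathlib import Path

def last_verified_is_fresh(path: str = "docs/sprint-artifacts/sprint-status.yaml",
                           max_age_days: int = 7) -> bool:
    for line in Path(path).read_text().splitlines()[:10]:
        match = re.match(r"#\s*last_verified:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", line)
        if match:
            verified = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            return datetime.now() - verified <= timedelta(days=max_age_days)
    return False  # no timestamp found -> treat as stale

if not last_verified_is_fresh():
    print("WARNING: sprint-status.yaml last_verified is missing or older than 7 days")
    sys.exit(1)
```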
---
## 🎓 BEST PRACTICES
### For Story Creators
1. Always use `/create-story` workflow (adds to sprint-status.yaml automatically)
2. Never create story .md files manually
3. Always include Status: field in story template
### For Story Implementers
1. Use `/dev-story` workflow (updates both Status: and sprint-status.yaml)
2. If manually updating Status: field, run `pnpm sync:sprint-status` after
3. Before marking "done", verify sprint-status.yaml reflects your work
### For Autonomous Workflows
1. autonomous-epic workflow now includes sprint-status.yaml updates
2. Verifies updates persisted after each story
3. Logs failures but continues (doesn't halt entire epic for tracking issues)
### For Code Reviewers
1. Check that PR includes sprint-status.yaml update if stories changed
2. Verify CI/CD validation passes
3. If validation fails, request sync before approving
---
## 🔧 MAINTENANCE
### Weekly Tasks
- [ ] Review discrepancy count: `pnpm sync:sprint-status:dry-run`
- [ ] Run sync if needed: `pnpm sync:sprint-status`
- [ ] Check backup count: `ls -1 .sprint-status-backups/ | wc -l`
- [ ] Clean old backups (keep last 30 days)
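For the backup cleanup task, a small helper like the one below is enough (assumes the `.sprint-status-backups/` naming used by sprint-status-updater.py; not part of the shipped tooling):
```python
# Housekeeping helper (not shipped): delete sprint-status backups older than
# 30 days from the .sprint-status-backups/ directory the updater creates.
import time
from pathlib import Path

cutoff = time.time() - 30 * 24 * 3600
for backup in Path(".sprint-status-backups").glob("sprint-status-*.yaml"):
    if backup.stat().st_mtime < cutoff:
        backup.unlink()
        print(f"Removed old backup: {backup}")
```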
### Monthly Tasks
- [ ] Full audit: Review SPRINT-STATUS-AUDIT template
- [ ] Backfill missing Status: fields (reduce count to <10)
- [ ] Verify all epics have correct status
- [ ] Update this guide based on learnings
---
## 📝 FILE REFERENCE
**Core Files:**
- `docs/sprint-artifacts/sprint-status.yaml` - Single source of truth
- `scripts/sync-sprint-status.sh` - Bash wrapper script
- `scripts/lib/sprint-status-updater.py` - Python updater logic
**Workflow Files:**
- `_bmad/bmm/workflows/4-implementation/dev-story/instructions.xml`
- `_bmad/bmm/workflows/4-implementation/autonomous-epic/instructions.xml`
- `_bmad/bmm/workflows/4-implementation/create-story-with-gap-analysis/step-03-generate-story.md`
**CI/CD:**
- `.github/workflows/validate-sprint-status.yml`
**Documentation:**
- `SPRINT-STATUS-AUDIT-2026-01-02.md` - Initial audit findings
- `docs/workflows/SPRINT-STATUS-SYNC-GUIDE.md` - This file
---
## 🐛 TROUBLESHOOTING
### Issue: "Story not found in sprint-status.yaml"
**Cause:** Story file created outside of /create-story workflow
**Fix:**
```bash
# Manually add to sprint-status.yaml under correct epic
vim docs/sprint-artifacts/sprint-status.yaml
# Add line: story-id: ready-for-dev
# Or re-run create-story workflow
/create-story
```
### Issue: "sprint-status.yaml update failed to persist"
**Cause:** File system permissions or concurrent writes
**Fix:**
```bash
# Check file permissions
ls -la docs/sprint-artifacts/sprint-status.yaml
# Check for file locks
lsof | grep sprint-status.yaml
# Manual update if needed
vim docs/sprint-artifacts/sprint-status.yaml
```
### Issue: "85 discrepancies found"
**Cause:** Story Status: fields not updated after completion
**Fix:**
```bash
# Review discrepancies
pnpm sync:sprint-status:dry-run
# Apply updates (will update sprint-status.yaml to match story files)
pnpm sync:sprint-status
# If story files are WRONG (Status: ready-for-dev but actually done):
# Manually update story Status: fields first
# Then run sync
```
---
## 🎯 SUCCESS CRITERIA
**System is working correctly when:**
- ✅ `pnpm validate:sprint-status` exits 0 (no discrepancies)
- ✅ Last verified timestamp < 24 hours old
- ✅ Stories with missing Status: fields < 10
- ✅ CI/CD validation passes on all PRs
- ✅ New stories automatically added to sprint-status.yaml
**System needs attention when:**
- ❌ Discrepancy count > 10
- ❌ Last verified > 7 days old
- ❌ CI/CD validation failing frequently
- ❌ Stories missing Status: fields > 50
---
## 🔄 MIGRATION CHECKLIST (One-Time)
If implementing this on an existing project:
- [x] Create scripts/sync-sprint-status.sh
- [x] Create scripts/lib/sprint-status-updater.py
- [x] Modify dev-story workflow (add enforcement)
- [x] Modify autonomous-epic workflow (add verification)
- [x] Add CI/CD validation workflow
- [x] Add pnpm scripts
- [x] Run initial sync: `pnpm sync:sprint-status`
- [ ] Backfill missing Status: fields (optional, gradual)
- [x] Document in this guide
- [ ] Train team on new workflow
- [ ] Monitor for 2 weeks, adjust as needed
---
## 📈 EXPECTED OUTCOMES
**Immediate (Week 1):**
- sprint-status.yaml stays in sync
- New stories automatically tracked
- Autonomous work properly recorded
**Short-term (Month 1):**
- Discrepancy count approaches zero
- CI/CD catches drift before merge
- Team trusts sprint-status.yaml as source of truth
**Long-term (Month 3+):**
- Zero manual sprint-status.yaml updates needed
- Automated reporting reliable
- Velocity metrics accurate
---
**Last Updated:** 2026-01-02
**Status:** Active - Production Ready
**Maintained By:** Platform Team

scripts/lib/add-status-fields.py Executable file

@@ -0,0 +1,112 @@
#!/usr/bin/env python3
"""
Add Status field to story files that are missing it.
Uses sprint-status.yaml as source of truth.
"""
import re
from pathlib import Path
from typing import Dict
def load_sprint_status(path: str = "docs/sprint-artifacts/sprint-status.yaml") -> Dict[str, str]:
"""Load story statuses from sprint-status.yaml"""
with open(path) as f:
lines = f.readlines()
statuses = {}
in_dev_status = False
for line in lines:
if 'development_status:' in line:
in_dev_status = True
continue
if in_dev_status:
# Check if we've left development_status section
if line.strip() and not line.startswith(' ') and not line.startswith('#'):
break
# Parse story line: " story-id: status # comment"
match = re.match(r' ([a-z0-9-]+):\s*(\S+)', line)
if match:
story_id, status = match.groups()
statuses[story_id] = status
return statuses
def add_status_to_story(story_file: Path, status: str) -> bool:
"""Add Status field to story file if missing"""
content = story_file.read_text()
# Check if Status field already exists (handles both "Status:" and "**Status:**")
if re.search(r'^\*?\*?Status:', content, re.MULTILINE | re.IGNORECASE):
return False # Already has Status field
# Find the first section after the title (usually ## Story or ## Description)
# Insert Status field before that
lines = content.split('\n')
# Find insertion point (after title, before first ## section)
insert_idx = None
for idx, line in enumerate(lines):
if line.startswith('# ') and idx == 0:
# Title line - keep looking
continue
if line.startswith('##'):
# Found first section - insert before it
insert_idx = idx
break
if insert_idx is None:
# No ## sections found, insert after title
insert_idx = 1
# Insert blank line, Status field, blank line
lines.insert(insert_idx, '')
lines.insert(insert_idx + 1, f'**Status:** {status}')
lines.insert(insert_idx + 2, '')
# Write back
story_file.write_text('\n'.join(lines))
return True
def main():
story_dir = Path("docs/sprint-artifacts")
statuses = load_sprint_status()
added = 0
skipped = 0
missing = 0
for story_file in sorted(story_dir.glob("*.md")):
story_id = story_file.stem
# Skip special files
if (story_id.startswith('.') or
story_id.startswith('EPIC-') or
'COMPLETION' in story_id.upper() or
'SUMMARY' in story_id.upper() or
'REPORT' in story_id.upper() or
'README' in story_id.upper()):
continue
if story_id not in statuses:
print(f"⚠️ {story_id}: Not in sprint-status.yaml")
missing += 1
continue
status = statuses[story_id]
if add_status_to_story(story_file, status):
print(f"{story_id}: Added Status: {status}")
added += 1
else:
skipped += 1
print()
print(f"✅ Added Status field to {added} stories")
print(f" Skipped {skipped} stories (already have Status)")
print(f"⚠️ {missing} stories not in sprint-status.yaml")
if __name__ == '__main__':
main()


@@ -0,0 +1,219 @@
/**
* AWS Bedrock Client for Test Generation
*
* Alternative to Anthropic API - uses AWS Bedrock Runtime
* Requires: source ~/git/creds-nonprod.sh (or creds-prod.sh)
*/
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';
import { RateLimiter } from './rate-limiter.js';
export interface GenerateTestOptions {
sourceCode: string;
sourceFilePath: string;
testTemplate: string;
model?: string;
temperature?: number;
maxTokens?: number;
}
export interface GenerateTestResult {
testCode: string;
tokensUsed: number;
model: string;
}
export class BedrockClient {
private client: BedrockRuntimeClient;
private rateLimiter: RateLimiter;
private model: string;
constructor(region: string = 'us-east-1') {
// AWS SDK will automatically use credentials from environment
// (set via source ~/git/creds-nonprod.sh)
this.client = new BedrockRuntimeClient({ region });
this.rateLimiter = new RateLimiter({
requestsPerMinute: 50,
maxRetries: 3,
maxConcurrent: 5,
});
// Use application-specific inference profile ARN (not foundation model ID)
// Cross-region inference profiles (us.*) are blocked by SCP
// Pattern from: illuminizer/src/services/coxAi/modelMapping.ts
this.model = 'arn:aws:bedrock:us-east-1:247721768464:application-inference-profile/pzxu78pafm8x';
}
/**
* Generate test file from source code using Bedrock
*/
async generateTest(options: GenerateTestOptions): Promise<GenerateTestResult> {
const systemPrompt = this.buildSystemPrompt();
const userPrompt = this.buildUserPrompt(options);
const result = await this.rateLimiter.withRetry(async () => {
// Bedrock request format (different from Anthropic API)
const payload = {
anthropic_version: 'bedrock-2023-05-31',
max_tokens: options.maxTokens ?? 8000,
temperature: options.temperature ?? 0,
system: systemPrompt,
messages: [
{
role: 'user',
content: userPrompt,
},
],
};
const command = new InvokeModelCommand({
modelId: options.model ?? this.model,
contentType: 'application/json',
accept: 'application/json',
body: JSON.stringify(payload),
});
const response = await this.client.send(command);
// Parse Bedrock response
const responseBody = JSON.parse(new TextDecoder().decode(response.body));
if (!responseBody.content || responseBody.content.length === 0) {
throw new Error('Empty response from Bedrock');
}
const content = responseBody.content[0];
if (content.type !== 'text') {
throw new Error('Unexpected response format from Bedrock');
}
return {
testCode: this.extractCodeFromResponse(content.text),
tokensUsed: responseBody.usage.input_tokens + responseBody.usage.output_tokens,
model: this.model,
};
}, `Generate test for ${options.sourceFilePath}`);
return result;
}
/**
* Build system prompt (same as Anthropic client)
*/
private buildSystemPrompt(): string {
return `You are an expert TypeScript test engineer specializing in NestJS backend testing.
Your task is to generate comprehensive, production-quality test files that:
- Follow NestJS testing patterns exactly
- Achieve 80%+ code coverage
- Test happy paths AND error scenarios
- Mock all external dependencies properly
- Include multi-tenant isolation tests
- Use proper TypeScript types (ZERO any types)
- Are immediately runnable without modifications
Key Requirements:
1. Test Structure: Use describe/it blocks with clear test names
2. Mocking: Use jest.Mocked<T> for type-safe mocks
3. Coverage: Test all public methods + edge cases
4. Error Handling: Test all error scenarios (NotFound, Conflict, BadRequest, etc.)
5. Multi-Tenant: Verify dealerId isolation in all operations
6. Performance: Include basic performance tests where applicable
7. Type Safety: No any types, proper interfaces, type guards
Code Quality Standards:
- Descriptive test names: "should throw NotFoundException when user not found"
- Clear arrange/act/assert structure
- Minimal but complete mocking (don't mock what you don't need)
- Test behavior, not implementation details
Output Format:
- Return ONLY the complete test file code
- No explanations, no markdown formatting
- Include all necessary imports
- Follow the template structure provided`;
}
/**
* Build user prompt (same as Anthropic client)
*/
private buildUserPrompt(options: GenerateTestOptions): string {
return `Generate a comprehensive test file for this TypeScript source file:
File Path: ${options.sourceFilePath}
Source Code:
\`\`\`typescript
${options.sourceCode}
\`\`\`
Template to Follow:
\`\`\`typescript
${options.testTemplate}
\`\`\`
Instructions:
1. Analyze the source code to identify:
- All public methods that need testing
- Dependencies that need mocking
- Error scenarios to test
- Multi-tenant considerations (dealerId filtering)
2. Generate tests that cover:
- Initialization (dependency injection)
- Core functionality (all CRUD operations)
- Error handling (NotFound, Conflict, validation errors)
- Multi-tenant isolation (prevent cross-dealer access)
- Edge cases (null inputs, empty arrays, boundary values)
3. Follow the template structure:
- Section 1: Initialization
- Section 2: Core functionality (one describe per method)
- Section 3: Error handling
- Section 4: Multi-tenant isolation
- Section 5: Performance (if applicable)
4. Quality requirements:
- 80%+ coverage target
- Type-safe mocks using jest.Mocked<T>
- Descriptive test names
- No any types
- Proper imports
Output the complete test file code now:`;
}
/**
* Extract code from response (same as Anthropic client)
*/
private extractCodeFromResponse(response: string): string {
let code = response.trim();
code = code.replace(/^```(?:typescript|ts)?\n/i, '');
code = code.replace(/\n```\s*$/i, '');
return code;
}
/**
* Estimate cost for Bedrock (different pricing than Anthropic API)
*/
estimateCost(sourceCodeLength: number, numFiles: number): { inputTokens: number; outputTokens: number; estimatedCost: number } {
const avgInputTokensPerFile = Math.ceil(sourceCodeLength / 4) + 10000;
const avgOutputTokensPerFile = 3000;
const totalInputTokens = avgInputTokensPerFile * numFiles;
const totalOutputTokens = avgOutputTokensPerFile * numFiles;
// Bedrock pricing for Claude Sonnet 4 (as of 2026-01):
// - Input: $0.003 per 1k tokens
// - Output: $0.015 per 1k tokens
const inputCost = (totalInputTokens / 1000) * 0.003;
const outputCost = (totalOutputTokens / 1000) * 0.015;
return {
inputTokens: totalInputTokens,
outputTokens: totalOutputTokens,
estimatedCost: inputCost + outputCost,
};
}
}


@@ -0,0 +1,212 @@
/**
* Claude API Client for Test Generation
*
* Handles API communication with proper error handling and rate limiting.
*/
import Anthropic from '@anthropic-ai/sdk';
import { RateLimiter } from './rate-limiter.js';
export interface GenerateTestOptions {
sourceCode: string;
sourceFilePath: string;
testTemplate: string;
model?: string;
temperature?: number;
maxTokens?: number;
}
export interface GenerateTestResult {
testCode: string;
tokensUsed: number;
model: string;
}
export class ClaudeClient {
private client: Anthropic;
private rateLimiter: RateLimiter;
private model: string;
constructor(apiKey?: string) {
const key = apiKey ?? process.env.ANTHROPIC_API_KEY;
if (!key) {
throw new Error(
'ANTHROPIC_API_KEY environment variable is required.\n' +
'Please set it with: export ANTHROPIC_API_KEY=sk-ant-...'
);
}
this.client = new Anthropic({ apiKey: key });
this.rateLimiter = new RateLimiter({
requestsPerMinute: 50,
maxRetries: 3,
maxConcurrent: 5,
});
this.model = 'claude-sonnet-4-5-20250929'; // Sonnet 4.5 for speed + quality balance
}
/**
* Generate test file from source code
*/
async generateTest(options: GenerateTestOptions): Promise<GenerateTestResult> {
const systemPrompt = this.buildSystemPrompt();
const userPrompt = this.buildUserPrompt(options);
const result = await this.rateLimiter.withRetry(async () => {
const response = await this.client.messages.create({
model: options.model ?? this.model,
max_tokens: options.maxTokens ?? 8000,
temperature: options.temperature ?? 0, // 0 for consistency
system: systemPrompt,
messages: [
{
role: 'user',
content: userPrompt,
},
],
});
const content = response.content[0];
if (content.type !== 'text') {
throw new Error('Unexpected response format from Claude API');
}
return {
testCode: this.extractCodeFromResponse(content.text),
tokensUsed: response.usage.input_tokens + response.usage.output_tokens,
model: response.model,
};
}, `Generate test for ${options.sourceFilePath}`);
return result;
}
/**
* Build system prompt with test generation instructions
*/
private buildSystemPrompt(): string {
return `You are an expert TypeScript test engineer specializing in NestJS backend testing.
Your task is to generate comprehensive, production-quality test files that:
- Follow NestJS testing patterns exactly
- Achieve 80%+ code coverage
- Test happy paths AND error scenarios
- Mock all external dependencies properly
- Include multi-tenant isolation tests
- Use proper TypeScript types (ZERO any types)
- Are immediately runnable without modifications
Key Requirements:
1. Test Structure: Use describe/it blocks with clear test names
2. Mocking: Use jest.Mocked<T> for type-safe mocks
3. Coverage: Test all public methods + edge cases
4. Error Handling: Test all error scenarios (NotFound, Conflict, BadRequest, etc.)
5. Multi-Tenant: Verify dealerId isolation in all operations
6. Performance: Include basic performance tests where applicable
7. Type Safety: No any types, proper interfaces, type guards
Code Quality Standards:
- Descriptive test names: "should throw NotFoundException when user not found"
- Clear arrange/act/assert structure
- Minimal but complete mocking (don't mock what you don't need)
- Test behavior, not implementation details
Output Format:
- Return ONLY the complete test file code
- No explanations, no markdown formatting
- Include all necessary imports
- Follow the template structure provided`;
}
/**
* Build user prompt with source code and template
*/
private buildUserPrompt(options: GenerateTestOptions): string {
return `Generate a comprehensive test file for this TypeScript source file:
File Path: ${options.sourceFilePath}
Source Code:
\`\`\`typescript
${options.sourceCode}
\`\`\`
Template to Follow:
\`\`\`typescript
${options.testTemplate}
\`\`\`
Instructions:
1. Analyze the source code to identify:
- All public methods that need testing
- Dependencies that need mocking
- Error scenarios to test
- Multi-tenant considerations (dealerId filtering)
2. Generate tests that cover:
- Initialization (dependency injection)
- Core functionality (all CRUD operations)
- Error handling (NotFound, Conflict, validation errors)
- Multi-tenant isolation (prevent cross-dealer access)
- Edge cases (null inputs, empty arrays, boundary values)
3. Follow the template structure:
- Section 1: Initialization
- Section 2: Core functionality (one describe per method)
- Section 3: Error handling
- Section 4: Multi-tenant isolation
- Section 5: Performance (if applicable)
4. Quality requirements:
- 80%+ coverage target
- Type-safe mocks using jest.Mocked<T>
- Descriptive test names
- No any types
- Proper imports
Output the complete test file code now:`;
}
/**
* Extract code from Claude's response (remove markdown if present)
*/
private extractCodeFromResponse(response: string): string {
// Remove markdown code blocks if present
let code = response.trim();
// Remove ```typescript or ```ts at start
code = code.replace(/^```(?:typescript|ts)?\n/i, '');
// Remove ``` at end
code = code.replace(/\n```\s*$/i, '');
return code;
}
/**
* Estimate cost for test generation
*/
estimateCost(sourceCodeLength: number, numFiles: number): { inputTokens: number; outputTokens: number; estimatedCost: number } {
// Rough estimates:
// - Input: Source code + template + prompt (~10k-30k tokens per file)
// - Output: Test file (~2k-4k tokens)
const avgInputTokensPerFile = Math.ceil(sourceCodeLength / 4) + 10000; // ~4 chars per token
const avgOutputTokensPerFile = 3000;
const totalInputTokens = avgInputTokensPerFile * numFiles;
const totalOutputTokens = avgOutputTokensPerFile * numFiles;
// Claude Sonnet 4.5 pricing (as of 2026-01):
// - Input: $0.003 per 1k tokens
// - Output: $0.015 per 1k tokens
const inputCost = (totalInputTokens / 1000) * 0.003;
const outputCost = (totalOutputTokens / 1000) * 0.015;
return {
inputTokens: totalInputTokens,
outputTokens: totalOutputTokens,
estimatedCost: inputCost + outputCost,
};
}
}

scripts/lib/file-utils.ts Normal file

@@ -0,0 +1,218 @@
/**
* File System Utilities for Test Generation
*
* Handles reading source files, writing test files, and directory management.
*/
import * as fs from 'fs/promises';
import * as path from 'path';
import { glob } from 'glob';
export interface SourceFile {
absolutePath: string;
relativePath: string;
content: string;
serviceName: string;
fileName: string;
}
export interface TestFile {
sourcePath: string;
testPath: string;
content: string;
serviceName: string;
}
export class FileUtils {
private projectRoot: string;
constructor(projectRoot: string) {
this.projectRoot = projectRoot;
}
/**
* Find all source files in a service that need tests
*/
async findSourceFiles(serviceName: string): Promise<SourceFile[]> {
const serviceDir = path.join(this.projectRoot, 'apps/backend', serviceName);
// Check if service exists
try {
await fs.access(serviceDir);
} catch {
throw new Error(`Service not found: ${serviceName}`);
}
// Find TypeScript files that need tests
const patterns = [
`${serviceDir}/src/**/*.service.ts`,
`${serviceDir}/src/**/*.controller.ts`,
`${serviceDir}/src/**/*.repository.ts`,
`${serviceDir}/src/**/*.dto.ts`,
];
// Exclude files that shouldn't be tested
const excludePatterns = [
'**/*.module.ts',
'**/main.ts',
'**/index.ts',
'**/*.spec.ts',
'**/*.test.ts',
];
const sourceFiles: SourceFile[] = [];
for (const pattern of patterns) {
const files = await glob(pattern, {
ignore: excludePatterns,
absolute: true,
});
for (const filePath of files) {
try {
const content = await fs.readFile(filePath, 'utf-8');
const relativePath = path.relative(this.projectRoot, filePath);
const fileName = path.basename(filePath);
sourceFiles.push({
absolutePath: filePath,
relativePath,
content,
serviceName,
fileName,
});
} catch (error) {
console.error(`[FileUtils] Failed to read ${filePath}:`, error);
}
}
}
return sourceFiles;
}
/**
* Find a specific source file
*/
async findSourceFile(filePath: string): Promise<SourceFile> {
const absolutePath = path.isAbsolute(filePath)
? filePath
: path.join(this.projectRoot, filePath);
try {
const content = await fs.readFile(absolutePath, 'utf-8');
const relativePath = path.relative(this.projectRoot, absolutePath);
const fileName = path.basename(absolutePath);
// Extract service name from path (apps/backend/SERVICE_NAME/...)
const serviceMatch = relativePath.match(/apps\/backend\/([^\/]+)/);
const serviceName = serviceMatch ? serviceMatch[1] : 'unknown';
return {
absolutePath,
relativePath,
content,
serviceName,
fileName,
};
} catch (error) {
throw new Error(`Failed to read source file ${filePath}: ${error}`);
}
}
/**
* Get test file path for a source file
*/
getTestFilePath(sourceFile: SourceFile): string {
const { absolutePath, serviceName } = sourceFile;
// Convert src/ to test/
// Example: apps/backend/promo-service/src/promos/promo.service.ts
// -> apps/backend/promo-service/test/promos/promo.service.spec.ts
const relativePath = path.relative(
path.join(this.projectRoot, 'apps/backend', serviceName),
absolutePath
);
// Replace src/ with test/ and .ts with .spec.ts
const testRelativePath = relativePath
.replace(/^src\//, 'test/')
.replace(/\.ts$/, '.spec.ts');
return path.join(
this.projectRoot,
'apps/backend',
serviceName,
testRelativePath
);
}
/**
* Check if test file already exists
*/
async testFileExists(sourceFile: SourceFile): Promise<boolean> {
const testPath = this.getTestFilePath(sourceFile);
try {
await fs.access(testPath);
return true;
} catch {
return false;
}
}
/**
* Write test file with proper directory creation
*/
async writeTestFile(testFile: TestFile): Promise<void> {
const { testPath, content } = testFile;
// Ensure directory exists
const dir = path.dirname(testPath);
await fs.mkdir(dir, { recursive: true });
// Write file
await fs.writeFile(testPath, content, 'utf-8');
}
/**
* Read test template
*/
async readTestTemplate(): Promise<string> {
const templatePath = path.join(this.projectRoot, 'templates/backend-service-test.template.ts');
try {
return await fs.readFile(templatePath, 'utf-8');
} catch {
throw new Error(
`Test template not found at ${templatePath}. ` +
'Please ensure Story 19.3 is complete and template exists.'
);
}
}
/**
* Find all backend services
*/
async findAllServices(): Promise<string[]> {
const backendDir = path.join(this.projectRoot, 'apps/backend');
const entries = await fs.readdir(backendDir, { withFileTypes: true });
return entries
.filter(entry => entry.isDirectory())
.map(entry => entry.name)
.filter(name => !name.startsWith('.'));
}
/**
* Validate service exists
*/
async serviceExists(serviceName: string): Promise<boolean> {
const serviceDir = path.join(this.projectRoot, 'apps/backend', serviceName);
try {
await fs.access(serviceDir);
return true;
} catch {
return false;
}
}
}

scripts/lib/llm-task-verifier.py Executable file

@@ -0,0 +1,346 @@
#!/usr/bin/env python3
"""
LLM-Powered Task Verification - Use Claude Haiku to ACTUALLY verify code quality
Purpose: Don't guess with regex - have Claude READ the code and verify it's real
Method: For each task, read mentioned files, ask Claude "is this actually implemented?"
Created: 2026-01-02
Cost: ~$0.13 per story with Haiku (50 tasks × 3K tokens × $1.25/1M)
Full platform: 511 stories × $0.13 = ~$66 total
"""
import json
import os
import re
import sys
from pathlib import Path
from typing import Dict, List
from anthropic import Anthropic
class LLMTaskVerifier:
"""Uses Claude API to verify tasks by reading and analyzing actual code"""
def __init__(self, api_key: str = None):
self.api_key = api_key or os.environ.get('ANTHROPIC_API_KEY')
if not self.api_key:
raise ValueError("ANTHROPIC_API_KEY required")
self.client = Anthropic(api_key=self.api_key)
self.model = 'claude-haiku-4-20250514' # Fast + cheap for verification tasks
self.repo_root = Path('.')
def verify_task(self, task_text: str, is_checked: bool, story_context: Dict) -> Dict:
"""
Use Claude to verify if a task is actually complete
Args:
task_text: The task description (e.g., "Implement UserService")
is_checked: Whether task is checked [x] or not [ ]
story_context: Context about the story (files, epic, etc.)
Returns:
{
'task': task_text,
'is_checked': bool,
'actually_complete': bool,
'confidence': 'very_high' | 'high' | 'medium' | 'low',
'evidence': str,
'issues_found': [list of issues],
'verification_status': 'correct' | 'false_positive' | 'false_negative'
}
"""
# Extract file references from task
file_refs = self._extract_file_references(task_text)
# Read the files
file_contents = {}
for file_ref in file_refs[:5]: # Limit to 5 files per task
content = self._read_file(file_ref)
if content:
file_contents[file_ref] = content
# If no files found, try reading files from story context
if not file_contents and story_context.get('files'):
for file_path in story_context['files'][:5]:
content = self._read_file(file_path)
if content:
file_contents[file_path] = content
# Build prompt for Claude
prompt = self._build_verification_prompt(task_text, is_checked, file_contents, story_context)
# Call Claude API
try:
response = self.client.messages.create(
model=self.model,
max_tokens=2000,
temperature=0, # Deterministic
messages=[{
'role': 'user',
'content': prompt
}]
)
# Parse response
result_text = response.content[0].text
result = self._parse_claude_response(result_text)
# Add metadata
result['task'] = task_text
result['is_checked'] = is_checked
result['tokens_used'] = response.usage.input_tokens + response.usage.output_tokens
# Determine verification status
if is_checked == result['actually_complete']:
result['verification_status'] = 'correct'
elif is_checked and not result['actually_complete']:
result['verification_status'] = 'false_positive'
else:
result['verification_status'] = 'false_negative'
return result
except Exception as e:
return {
'task': task_text,
'error': str(e),
'verification_status': 'error'
}
def _build_verification_prompt(self, task: str, is_checked: bool, files: Dict, context: Dict) -> str:
"""Build prompt for Claude to verify task completion"""
files_section = ""
if files:
files_section = "\n\n## Files Provided\n\n"
for file_path, content in files.items():
files_section += f"### {file_path}\n```typescript\n{content[:2000]}\n```\n\n"
else:
files_section = "\n\n## Files Provided\n\nNone - task may not reference specific files.\n"
prompt = f"""You are a code verification expert. Your job is to verify whether a task from a user story is actually complete.
## Task to Verify
**Task:** {task}
**Claimed Status:** {'[x] Complete' if is_checked else '[ ] Not complete'}
## Story Context
**Story:** {context.get('story_id', 'Unknown')}
**Epic:** {context.get('epic', 'Unknown')}
{files_section}
## Your Task
Analyze the files (if provided) and determine:
1. **Is the task actually complete?**
- If files provided: Does the code actually implement what the task describes?
- Is it real implementation or just stubs/TODOs?
- Are there tests? Do they pass?
2. **Confidence level:**
- very_high: Clear evidence (tests passing, full implementation)
- high: Strong evidence (code exists with logic, no stubs)
- medium: Some evidence but incomplete
- low: No files or cannot verify
3. **Evidence:**
- What did you find that proves/disproves completion?
- Specific line numbers or code snippets
- Test results if applicable
4. **Issues (if any):**
- Stub code or TODOs
- Missing error handling
- No multi-tenant isolation (dealerId filters)
- Security vulnerabilities
- Missing tests
## Response Format (JSON)
{{
"actually_complete": true/false,
"confidence": "very_high|high|medium|low",
"evidence": "Detailed explanation of what you found",
"issues_found": ["issue 1", "issue 2"],
"recommendation": "What needs to be done (if incomplete)"
}}
**Be objective. If code is a stub with TODOs, it's NOT complete even if files exist.**
"""
return prompt
def _parse_claude_response(self, response_text: str) -> Dict:
"""Parse Claude's JSON response"""
try:
# Extract JSON from response (may have markdown)
json_match = re.search(r'\{.*\}', response_text, re.DOTALL)
if json_match:
return json.loads(json_match.group(0))
else:
# Fallback: parse manually
return {
'actually_complete': 'complete' in response_text.lower() and 'not complete' not in response_text.lower(),
'confidence': 'low',
'evidence': response_text[:500],
'issues_found': [],
}
except:
return {
'actually_complete': False,
'confidence': 'low',
'evidence': 'Failed to parse response',
'issues_found': ['Parse error'],
}
def _extract_file_references(self, task_text: str) -> List[str]:
"""Extract file paths from task text"""
paths = []
# Common patterns
patterns = [
r'[\w/-]+/[\w-]+\.[\w]+', # Explicit paths
r'\b([A-Z][\w-]+\.(ts|tsx|service|controller|repository))', # Files
]
for pattern in patterns:
matches = re.findall(pattern, task_text)
            if matches and isinstance(matches[0], tuple):
                paths.extend([m[0] for m in matches])
            else:
                paths.extend(matches)
return list(set(paths))[:5] # Max 5 files per task
def _read_file(self, file_ref: str) -> str:
"""Find and read file from repository"""
# Try exact path
if (self.repo_root / file_ref).exists():
try:
return (self.repo_root / file_ref).read_text()[:5000] # Max 5K chars
except:
return None
# Search for file
import subprocess
try:
result = subprocess.run(
['find', '.', '-name', Path(file_ref).name, '-type', 'f'],
capture_output=True,
text=True,
cwd=self.repo_root,
timeout=5
)
if result.stdout.strip():
file_path = result.stdout.strip().split('\n')[0]
return Path(file_path).read_text()[:5000]
except:
pass
return None
def verify_story_with_llm(story_file_path: str) -> Dict:
"""
Verify entire story using LLM for each task
    Cost: ~$0.13 per story with Haiku (50 tasks × ~3K tokens per task)
Time: ~2-3 minutes per story
"""
verifier = LLMTaskVerifier()
story_path = Path(story_file_path)
if not story_path.exists():
return {'error': 'Story file not found'}
content = story_path.read_text()
# Extract story context
story_id = story_path.stem
epic_match = re.search(r'Epic:\*?\*?\s*(\w+)', content, re.IGNORECASE)
epic = epic_match.group(1) if epic_match else 'Unknown'
# Extract files from Dev Agent Record
file_list_match = re.search(r'### File List\n\n(.+?)###', content, re.DOTALL)
files = []
if file_list_match:
file_section = file_list_match.group(1)
files = re.findall(r'[\w/-]+\.[\w]+', file_section)
story_context = {
'story_id': story_id,
'epic': epic,
'files': files
}
# Extract all tasks
task_pattern = r'^-\s*\[([ xX])\]\s*(.+)$'
tasks = re.findall(task_pattern, content, re.MULTILINE)
if not tasks:
return {'error': 'No tasks found'}
# Verify each task with LLM
print(f"\n🔍 Verifying {len(tasks)} tasks with Claude...", file=sys.stderr)
task_results = []
for idx, (checkbox, task_text) in enumerate(tasks):
is_checked = checkbox.lower() == 'x'
print(f" {idx+1}/{len(tasks)}: {task_text[:60]}...", file=sys.stderr)
result = verifier.verify_task(task_text, is_checked, story_context)
task_results.append(result)
# Calculate summary
total = len(task_results)
correct = sum(1 for r in task_results if r.get('verification_status') == 'correct')
false_positives = sum(1 for r in task_results if r.get('verification_status') == 'false_positive')
false_negatives = sum(1 for r in task_results if r.get('verification_status') == 'false_negative')
return {
'story_id': story_id,
'total_tasks': total,
'correct': correct,
'false_positives': false_positives,
'false_negatives': false_negatives,
'verification_score': round((correct / total * 100), 1) if total > 0 else 0,
'task_results': task_results
}
if __name__ == '__main__':
if len(sys.argv) < 2:
print("Usage: llm-task-verifier.py <story-file>")
sys.exit(1)
results = verify_story_with_llm(sys.argv[1])
if 'error' in results:
print(f"{results['error']}")
sys.exit(1)
# Print summary
print(f"\n📊 Story: {results['story_id']}")
print(f"Verification Score: {results['verification_score']}/100")
print(f"✅ Correct: {results['correct']}")
print(f"❌ False Positives: {results['false_positives']}")
print(f"⚠️ False Negatives: {results['false_negatives']}")
# Show false positives
if results['false_positives'] > 0:
print(f"\n❌ FALSE POSITIVES (claimed done but not implemented):")
for task in results['task_results']:
if task.get('verification_status') == 'false_positive':
print(f" - {task['task'][:80]}")
print(f" {task.get('evidence', 'No evidence')}")
# Output JSON
if '--json' in sys.argv:
print(json.dumps(results, indent=2))

scripts/lib/rate-limiter.ts Normal file

@@ -0,0 +1,122 @@
/**
* Rate Limiter for Claude API
*
* Implements exponential backoff and respects rate limits:
* - 50 requests/minute (Claude API limit)
* - Automatic retry on 429 (rate limit exceeded)
* - Configurable concurrent request limit
*/
export interface RateLimiterConfig {
requestsPerMinute: number;
maxRetries: number;
initialBackoffMs: number;
maxConcurrent: number;
}
export class RateLimiter {
private requestTimestamps: number[] = [];
private activeRequests = 0;
private config: RateLimiterConfig;
constructor(config: Partial<RateLimiterConfig> = {}) {
this.config = {
requestsPerMinute: config.requestsPerMinute ?? 50,
maxRetries: config.maxRetries ?? 3,
initialBackoffMs: config.initialBackoffMs ?? 1000,
maxConcurrent: config.maxConcurrent ?? 5,
};
}
/**
* Wait until it's safe to make next request
*/
async waitForSlot(): Promise<void> {
// Wait for concurrent slot
while (this.activeRequests >= this.config.maxConcurrent) {
await this.sleep(100);
}
// Clean old timestamps (older than 1 minute)
const oneMinuteAgo = Date.now() - 60000;
this.requestTimestamps = this.requestTimestamps.filter(ts => ts > oneMinuteAgo);
// Check if we've hit rate limit
if (this.requestTimestamps.length >= this.config.requestsPerMinute) {
const oldestRequest = this.requestTimestamps[0];
const waitTime = 60000 - (Date.now() - oldestRequest);
if (waitTime > 0) {
console.log(`[RateLimiter] Rate limit reached. Waiting ${Math.ceil(waitTime / 1000)}s...`);
await this.sleep(waitTime);
}
}
// Add delay between requests (1.2s for 50 req/min)
const minDelayMs = Math.ceil(60000 / this.config.requestsPerMinute);
const lastRequest = this.requestTimestamps[this.requestTimestamps.length - 1];
if (lastRequest) {
const timeSinceLastRequest = Date.now() - lastRequest;
if (timeSinceLastRequest < minDelayMs) {
await this.sleep(minDelayMs - timeSinceLastRequest);
}
}
this.requestTimestamps.push(Date.now());
this.activeRequests++;
}
/**
* Release a concurrent slot
*/
releaseSlot(): void {
this.activeRequests = Math.max(0, this.activeRequests - 1);
}
/**
* Execute function with exponential backoff retry
*/
async withRetry<T>(fn: () => Promise<T>, context: string): Promise<T> {
let lastError: Error | null = null;
for (let attempt = 0; attempt < this.config.maxRetries; attempt++) {
try {
await this.waitForSlot();
const result = await fn();
this.releaseSlot();
return result;
} catch (error) {
this.releaseSlot();
lastError = error instanceof Error ? error : new Error(String(error));
// Check if it's a rate limit error (429)
const errorMsg = lastError.message.toLowerCase();
const isRateLimit = errorMsg.includes('429') || errorMsg.includes('rate limit');
if (isRateLimit && attempt < this.config.maxRetries - 1) {
const backoffMs = this.config.initialBackoffMs * Math.pow(2, attempt);
console.log(
`[RateLimiter] ${context} - Rate limit hit. Retry ${attempt + 1}/${this.config.maxRetries} in ${backoffMs}ms`
);
await this.sleep(backoffMs);
continue;
}
// Non-retryable error or max retries reached
if (attempt < this.config.maxRetries - 1) {
const backoffMs = this.config.initialBackoffMs * Math.pow(2, attempt);
console.log(
`[RateLimiter] ${context} - Error: ${lastError.message}. Retry ${attempt + 1}/${this.config.maxRetries} in ${backoffMs}ms`
);
await this.sleep(backoffMs);
}
}
}
throw new Error(`${context} - Failed after ${this.config.maxRetries} attempts: ${lastError?.message}`);
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}


@@ -0,0 +1,421 @@
#!/usr/bin/env python3
"""
Sprint Status Updater - Robust YAML updater for sprint-status.yaml
Purpose: Update sprint-status.yaml entries while preserving:
- Comments
- Formatting
- Section structure
- Manual annotations
Created: 2026-01-02
Part of: Full Workflow Fix (Option C)
"""
import re
import sys
from pathlib import Path
from typing import Dict, List, Tuple
from datetime import datetime
class SprintStatusUpdater:
"""Updates sprint-status.yaml while preserving structure and comments"""
def __init__(self, sprint_status_path: str):
self.path = Path(sprint_status_path)
self.content = self.path.read_text()
self.lines = self.content.split('\n')
self.updates_applied = 0
def update_story_status(self, story_id: str, new_status: str, comment: str = None) -> bool:
"""
Update a single story's status in development_status section
Args:
story_id: Story identifier (e.g., "19-4a-inventory-service-test-coverage")
new_status: New status value (e.g., "done", "in-progress")
comment: Optional comment to append (e.g., "✅ COMPLETE 2026-01-02")
Returns:
True if update was applied, False if story not found or unchanged
"""
# Find the story line in development_status section
in_dev_status = False
story_line_idx = None
for idx, line in enumerate(self.lines):
if line.strip() == 'development_status:':
in_dev_status = True
continue
if in_dev_status:
# Check if we've left development_status section
if line and not line.startswith(' ') and not line.startswith('#'):
break
# Check if this is our story
if line.startswith(' ') and story_id in line:
story_line_idx = idx
break
if story_line_idx is None:
# Story not found - need to add it
return self._add_story_entry(story_id, new_status, comment)
# Update existing line
current_line = self.lines[story_line_idx]
# Parse current line: " story-id: status # comment"
match = re.match(r'(\s+)([a-z0-9-]+):\s*(\S+)(.*)', current_line)
if not match:
print(f"WARNING: Could not parse line: {current_line}", file=sys.stderr)
return False
indent, current_story_id, current_status, existing_comment = match.groups()
# Check if update needed
if current_status == new_status:
return False # No change needed
# Build new line
if comment:
new_line = f"{indent}{story_id}: {new_status} # {comment}"
elif existing_comment:
# Preserve existing comment
new_line = f"{indent}{story_id}: {new_status}{existing_comment}"
else:
new_line = f"{indent}{story_id}: {new_status}"
self.lines[story_line_idx] = new_line
self.updates_applied += 1
return True
def _add_story_entry(self, story_id: str, status: str, comment: str = None) -> bool:
"""Add a new story entry to development_status section"""
# Find the epic this story belongs to
epic_match = re.match(r'^(\d+[a-z]?)-', story_id)
if not epic_match:
print(f"WARNING: Cannot determine epic for {story_id}", file=sys.stderr)
return False
epic_num = epic_match.group(1)
epic_key = f"epic-{epic_num}"
# Find where to insert the story (after its epic line)
in_dev_status = False
insert_idx = None
for idx, line in enumerate(self.lines):
if line.strip() == 'development_status:':
in_dev_status = True
continue
if in_dev_status:
# Look for the epic line
if line.strip().startswith(f"{epic_key}:"):
# Found the epic - insert after it
insert_idx = idx + 1
break
if insert_idx is None:
print(f"WARNING: Could not find epic {epic_key} in development_status", file=sys.stderr)
return False
# Build new line
if comment:
new_line = f" {story_id}: {status} # {comment}"
else:
new_line = f" {story_id}: {status}"
# Insert the line
self.lines.insert(insert_idx, new_line)
self.updates_applied += 1
return True
def update_epic_status(self, epic_key: str, new_status: str, comment: str = None) -> bool:
"""Update epic status line"""
in_dev_status = False
epic_line_idx = None
for idx, line in enumerate(self.lines):
if line.strip() == 'development_status:':
in_dev_status = True
continue
if in_dev_status:
if line and not line.startswith(' ') and not line.startswith('#'):
break
if line.strip().startswith(f"{epic_key}:"):
epic_line_idx = idx
break
if epic_line_idx is None:
print(f"WARNING: Epic {epic_key} not found", file=sys.stderr)
return False
# Parse current line
current_line = self.lines[epic_line_idx]
match = re.match(r'(\s+)([a-z0-9-]+):\s*(\S+)(.*)', current_line)
if not match:
return False
indent, current_epic, current_status, existing_comment = match.groups()
if current_status == new_status:
return False
# Build new line
if comment:
new_line = f"{indent}{epic_key}: {new_status} # {comment}"
elif existing_comment:
new_line = f"{indent}{epic_key}: {new_status}{existing_comment}"
else:
new_line = f"{indent}{epic_key}: {new_status}"
self.lines[epic_line_idx] = new_line
self.updates_applied += 1
return True
def add_verification_note(self):
"""Add verification timestamp to header"""
# Find and update last_verified line
for idx, line in enumerate(self.lines):
if line.startswith('# last_verified:'):
timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S EST')
self.lines[idx] = f"# last_verified: {timestamp}"
break
def save(self, backup: bool = True) -> Path:
"""
Save updated content back to file
Args:
backup: If True, create backup before saving
Returns:
Path to backup file if created, otherwise original path
"""
if backup and self.updates_applied > 0:
backup_dir = Path('.sprint-status-backups')
backup_dir.mkdir(exist_ok=True)
backup_path = backup_dir / f"sprint-status-{datetime.now().strftime('%Y%m%d-%H%M%S')}.yaml"
backup_path.write_text(self.content)
print(f"✓ Backup created: {backup_path}", file=sys.stderr)
# Write updated content
new_content = '\n'.join(self.lines)
self.path.write_text(new_content)
return self.path
def scan_story_statuses(story_dir: str = "docs/sprint-artifacts") -> Dict[str, str]:
"""
Scan all story files and extract EXPLICIT Status: fields
CRITICAL: Only returns stories that HAVE a Status: field.
If Status: field is missing, story is NOT included in results.
This prevents overwriting sprint-status.yaml with defaults.
Returns:
Dict mapping story_id -> normalized_status (ONLY for stories with explicit Status: field)
"""
story_dir_path = Path(story_dir)
story_files = list(story_dir_path.glob("*.md"))
STATUS_MAPPINGS = {
'done': 'done',
'complete': 'done',
'completed': 'done',
'in-progress': 'in-progress',
'in_progress': 'in-progress',
'review': 'review',
'ready-for-dev': 'ready-for-dev',
'ready_for_dev': 'ready-for-dev',
'pending': 'ready-for-dev',
'drafted': 'ready-for-dev',
'backlog': 'backlog',
'blocked': 'blocked',
'deferred': 'deferred',
'archived': 'archived',
}
story_statuses = {}
skipped_count = 0
for story_file in story_files:
story_id = story_file.stem
# Skip special files
if (story_id.startswith('.') or
story_id.startswith('EPIC-') or
'COMPLETION' in story_id.upper() or
'SUMMARY' in story_id.upper() or
'REPORT' in story_id.upper() or
'README' in story_id.upper() or
'INDEX' in story_id.upper() or
'REVIEW' in story_id.upper() or
'AUDIT' in story_id.upper()):
continue
try:
content = story_file.read_text()
# Extract Status field
status_match = re.search(r'^Status:\s*(.+?)$', content, re.MULTILINE | re.IGNORECASE)
if status_match:
status = status_match.group(1).strip()
# Remove comments
status = re.sub(r'\s*#.*$', '', status).strip().lower()
# Normalize status
if status in STATUS_MAPPINGS:
normalized_status = STATUS_MAPPINGS[status]
elif 'done' in status or 'complete' in status:
normalized_status = 'done'
elif 'progress' in status:
normalized_status = 'in-progress'
elif 'review' in status:
normalized_status = 'review'
elif 'ready' in status:
normalized_status = 'ready-for-dev'
elif 'block' in status:
normalized_status = 'blocked'
elif 'defer' in status:
normalized_status = 'deferred'
elif 'archive' in status:
normalized_status = 'archived'
else:
normalized_status = 'ready-for-dev'
story_statuses[story_id] = normalized_status
else:
# CRITICAL FIX: No Status: field found
# Do NOT default to ready-for-dev - skip this story entirely
# This prevents overwriting sprint-status.yaml with incorrect defaults
skipped_count += 1
except Exception as e:
print(f"ERROR parsing {story_id}: {e}", file=sys.stderr)
continue
print(f"✓ Found {len(story_statuses)} stories with explicit Status: fields", file=sys.stderr)
print(f" Skipped {skipped_count} stories without Status: fields (trust sprint-status.yaml)", file=sys.stderr)
return story_statuses
def main():
"""Main entry point for CLI usage"""
import argparse
parser = argparse.ArgumentParser(description='Update sprint-status.yaml from story files')
parser.add_argument('--dry-run', action='store_true', help='Show changes without applying')
parser.add_argument('--validate', action='store_true', help='Validate only (exit 1 if discrepancies)')
parser.add_argument('--sprint-status', default='docs/sprint-artifacts/sprint-status.yaml',
help='Path to sprint-status.yaml')
parser.add_argument('--story-dir', default='docs/sprint-artifacts',
help='Path to story files directory')
parser.add_argument('--epic', type=str, help='Validate specific epic only (e.g., epic-1)')
parser.add_argument('--mode', choices=['validate', 'fix'], default='validate',
help='Mode: validate (report only) or fix (apply updates)')
args = parser.parse_args()
# Scan story files
print("Scanning story files...", file=sys.stderr)
story_statuses = scan_story_statuses(args.story_dir)
# Filter by epic if specified
if args.epic:
# Extract epic number from epic key (e.g., "epic-1" -> "1")
epic_match = re.match(r'epic-([0-9a-z-]+)', args.epic)
if epic_match:
epic_num = epic_match.group(1)
# Filter stories that start with this epic number
story_statuses = {k: v for k, v in story_statuses.items()
if k.startswith(f"{epic_num}-")}
print(f"✓ Filtered to {len(story_statuses)} stories for {args.epic}", file=sys.stderr)
else:
print(f"WARNING: Invalid epic format: {args.epic}", file=sys.stderr)
print(f"✓ Scanned {len(story_statuses)} story files", file=sys.stderr)
print("", file=sys.stderr)
# Load sprint-status.yaml
updater = SprintStatusUpdater(args.sprint_status)
# Find discrepancies
discrepancies = []
for story_id, new_status in story_statuses.items():
# Check current status in sprint-status.yaml
current_status = None
in_dev_status = False
for line in updater.lines:
if line.strip() == 'development_status:':
in_dev_status = True
continue
            # Match the exact "story-id:" key to avoid substring hits (e.g. "1-2" matching "1-23")
            if in_dev_status and line.strip().startswith(f"{story_id}:"):
match = re.match(r'\s+[a-z0-9-]+:\s*(\S+)', line)
if match:
current_status = match.group(1)
break
if current_status is None:
discrepancies.append((story_id, 'NOT-IN-FILE', new_status))
elif current_status != new_status:
discrepancies.append((story_id, current_status, new_status))
# Report
if not discrepancies:
print("✓ sprint-status.yaml is up to date!", file=sys.stderr)
sys.exit(0)
print(f"⚠ Found {len(discrepancies)} discrepancies:", file=sys.stderr)
print("", file=sys.stderr)
for story_id, old_status, new_status in discrepancies[:20]:
if old_status == 'NOT-IN-FILE':
print(f" [ADD] {story_id}: (not in file) → {new_status}", file=sys.stderr)
else:
print(f" [UPDATE] {story_id}: {old_status}{new_status}", file=sys.stderr)
if len(discrepancies) > 20:
print(f" ... and {len(discrepancies) - 20} more", file=sys.stderr)
print("", file=sys.stderr)
# Handle mode parameter
if args.mode == 'validate' or args.validate:
print("✗ Validation failed - discrepancies found", file=sys.stderr)
sys.exit(1)
if args.dry_run:
print("DRY RUN: Would update sprint-status.yaml", file=sys.stderr)
sys.exit(0)
# Apply updates (--mode fix or default behavior)
print("Applying updates...", file=sys.stderr)
for story_id, old_status, new_status in discrepancies:
comment = f"Updated {datetime.now().strftime('%Y-%m-%d')}"
updater.update_story_status(story_id, new_status, comment)
# Add verification timestamp
updater.add_verification_note()
# Save
updater.save(backup=True)
print(f"✓ Applied {updater.updates_applied} updates", file=sys.stderr)
print(f"✓ Updated: {updater.path}", file=sys.stderr)
sys.exit(0)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,525 @@
#!/usr/bin/env python3
"""
Task Verification Engine - Verify story task checkboxes match ACTUAL CODE
Purpose: Prevent false positives where tasks are checked but code doesn't exist
Method: Parse task text, infer what files/functions should exist, verify in codebase
Created: 2026-01-02
Part of: Comprehensive validation solution
"""
import re
import subprocess
from pathlib import Path
from typing import Dict, List, Tuple, Optional
class TaskVerificationEngine:
"""Verifies that checked tasks correspond to actual code in the repository"""
def __init__(self, repo_root: Path = Path(".")):
self.repo_root = repo_root
def verify_task(self, task_text: str, is_checked: bool) -> Dict:
"""
Verify a single task against codebase reality
DEEP VERIFICATION - Not just file existence, but:
- Files exist AND have real implementation (not stubs)
- Tests exist AND are passing
- No TODO/FIXME comments in implementation
- Code has actual logic (not empty classes)
Returns:
{
'task': task_text,
'is_checked': bool,
'should_be_checked': bool,
            'confidence': 'very high'|'high'|'medium'|'low',
'evidence': [list of evidence],
'verification_status': 'correct'|'false_positive'|'false_negative'|'uncertain'
}
"""
# Extract potential file paths from task text
file_refs = self._extract_file_references(task_text)
# Extract class/function names
code_refs = self._extract_code_references(task_text)
# Extract test requirements
test_refs = self._extract_test_references(task_text)
# Verify file existence AND implementation quality
files_exist = []
files_missing = []
for file_ref in file_refs:
if self._file_exists(file_ref):
# DEEP CHECK: Is it really implemented or just a stub?
if self._verify_real_implementation(file_ref, None):
files_exist.append(file_ref)
else:
files_missing.append(f"{file_ref} (stub/TODO)")
else:
files_missing.append(file_ref)
# Verify code existence AND implementation
code_found = []
code_missing = []
for code_ref in code_refs:
if self._code_exists(code_ref):
code_found.append(code_ref)
else:
code_missing.append(code_ref)
# Verify tests exist AND pass
tests_passing = []
tests_failing_or_missing = []
for test_ref in test_refs:
test_status = self._verify_test_exists_and_passes(test_ref)
if test_status == 'passing':
tests_passing.append(test_ref)
else:
tests_failing_or_missing.append(f"{test_ref} ({test_status})")
# Build evidence with DEEP verification
evidence = []
confidence = 'low'
should_be_checked = False
# STRONGEST evidence: Tests exist AND pass
if tests_passing:
evidence.append(f"{len(tests_passing)} tests passing (VERIFIED)")
confidence = 'very high'
should_be_checked = True
# Strong evidence: Files exist with real implementation
if files_exist and not files_missing:
evidence.append(f"All {len(files_exist)} files exist with real code (no stubs)")
if confidence != 'very high':
confidence = 'high'
should_be_checked = True
# Strong evidence: Code found with implementation
if code_found and not code_missing:
evidence.append(f"All {len(code_found)} code elements implemented")
if confidence == 'low':
confidence = 'high'
should_be_checked = True
# NEGATIVE evidence: Tests missing or failing
if tests_failing_or_missing:
evidence.append(f"{len(tests_failing_or_missing)} tests missing/failing")
# Even if files exist, no passing tests = NOT done
should_be_checked = False
confidence = 'medium'
# NEGATIVE evidence: Mixed results
if files_exist and files_missing:
evidence.append(f"{len(files_exist)} files OK, {len(files_missing)} missing/stubs")
confidence = 'medium'
should_be_checked = False # Incomplete
# Strong evidence of incompletion
if not files_exist and files_missing:
evidence.append(f"All {len(files_missing)} files missing or stubs")
confidence = 'high'
should_be_checked = False
if not code_found and code_missing:
evidence.append(f"Code not found: {', '.join(code_missing[:3])}")
confidence = 'medium'
should_be_checked = False
# No file/code/test references - use heuristics
if not file_refs and not code_refs and not test_refs:
# Check for action keywords
if self._has_completion_keywords(task_text):
evidence.append("Research/analysis task (no code artifacts)")
confidence = 'low'
# Can't verify - trust the checkbox
should_be_checked = is_checked
else:
evidence.append("No verifiable references")
confidence = 'low'
should_be_checked = is_checked
# Determine verification status
if is_checked == should_be_checked:
verification_status = 'correct'
elif is_checked and not should_be_checked:
verification_status = 'false_positive' # Checked but code missing
elif not is_checked and should_be_checked:
verification_status = 'false_negative' # Unchecked but code exists
else:
verification_status = 'uncertain'
return {
'task': task_text,
'is_checked': is_checked,
'should_be_checked': should_be_checked,
'confidence': confidence,
'evidence': evidence,
'verification_status': verification_status,
'files_exist': files_exist,
'files_missing': files_missing,
'code_found': code_found,
'code_missing': code_missing,
}
def _extract_file_references(self, task_text: str) -> List[str]:
"""Extract file path references from task text"""
paths = []
# Pattern 1: Explicit paths (src/foo/bar.ts)
explicit_paths = re.findall(r'[\w/-]+/[\w-]+\.[\w]+', task_text)
paths.extend(explicit_paths)
# Pattern 2: "Create Foo.ts" or "Implement Bar.service.ts"
file_mentions = re.findall(r'\b([A-Z][\w-]+\.(ts|tsx|js|jsx|py|md|yaml|json))\b', task_text)
paths.extend([f[0] for f in file_mentions])
# Pattern 3: "in components/Widget.tsx"
contextual = re.findall(r'in\s+([\w/-]+\.[\w]+)', task_text, re.IGNORECASE)
paths.extend(contextual)
return list(set(paths)) # Deduplicate
def _extract_code_references(self, task_text: str) -> List[str]:
"""Extract class/function/interface names from task text"""
code_refs = []
# Pattern 1: "Create FooService class"
class_patterns = re.findall(r'(?:Create|Implement|Add)\s+(\w+(?:Service|Controller|Repository|Component|Interface|Type))', task_text, re.IGNORECASE)
code_refs.extend(class_patterns)
# Pattern 2: "Implement getFoo method"
method_patterns = re.findall(r'(?:Implement|Add|Create)\s+(\w+)\s+(?:method|function)', task_text, re.IGNORECASE)
code_refs.extend(method_patterns)
# Pattern 3: Camel/PascalCase references
camelcase = re.findall(r'\b([A-Z][a-z]+(?:[A-Z][a-z]+)+)\b', task_text)
code_refs.extend(camelcase)
return list(set(code_refs))
def _file_exists(self, file_path: str) -> bool:
"""Check if file exists in repository"""
# Try exact path first
if (self.repo_root / file_path).exists():
return True
# Try common locations
search_dirs = [
'apps/backend/',
'apps/frontend/',
'packages/',
'src/',
'infrastructure/',
]
for search_dir in search_dirs:
if (self.repo_root / search_dir).exists():
# Use find command
try:
result = subprocess.run(
['find', search_dir, '-name', Path(file_path).name, '-type', 'f'],
capture_output=True,
text=True,
cwd=self.repo_root,
timeout=5
)
if result.returncode == 0 and result.stdout.strip():
return True
except:
pass
return False
def _code_exists(self, code_ref: str) -> bool:
"""Check if class/function/interface exists AND is actually implemented (not just a stub)"""
try:
# Search for class, interface, function, or type declaration
patterns = [
f'class {code_ref}',
f'interface {code_ref}',
f'function {code_ref}',
f'export const {code_ref}',
f'export function {code_ref}',
f'type {code_ref}',
]
for pattern in patterns:
result = subprocess.run(
['grep', '-r', '-l', pattern, '.', '--include=*.ts', '--include=*.tsx', '--include=*.js'],
capture_output=True,
text=True,
cwd=self.repo_root,
timeout=10
)
if result.returncode == 0 and result.stdout.strip():
# Found the declaration - now verify it's not a stub
file_path = result.stdout.strip().split('\n')[0]
if self._verify_real_implementation(file_path, code_ref):
return True
except:
pass
return False
    def _verify_real_implementation(self, file_path: str, code_ref: Optional[str]) -> bool:
"""
Verify code is REALLY implemented, not just a stub or TODO
Checks for:
- File has substantial code (not just empty class)
- No TODO/FIXME comments near the code
- Has actual methods/logic (not just interface)
"""
try:
full_path = self.repo_root / file_path
if not full_path.exists():
return False
content = full_path.read_text()
            # Find the code reference; code_ref=None means "whole-file check", so scan from the top of the file
            if code_ref:
                code_index = content.find(code_ref)
                if code_index == -1:
                    return False
            else:
                code_index = 0
# Get 500 chars after the reference (the implementation)
code_snippet = content[code_index:code_index + 500]
# RED FLAGS - indicates stub/incomplete code
red_flags = [
'TODO',
'FIXME',
'throw new Error(\'Not implemented',
'return null;',
'// Placeholder',
'// Stub',
'return {};',
'return [];',
'return undefined;',
]
for flag in red_flags:
if flag in code_snippet:
return False # Found stub/placeholder
# GREEN FLAGS - indicates real implementation
green_flags = [
'return', # Has return statements
'this.', # Uses instance members
'await', # Has async logic
'if (', # Has conditional logic
'for (', # Has loops
'const ', # Has variables
]
green_count = sum(1 for flag in green_flags if flag in code_snippet)
# Need at least 3 green flags for "real" implementation
return green_count >= 3
except:
return False
def _extract_test_references(self, task_text: str) -> List[str]:
"""Extract test file references from task text"""
test_refs = []
# Pattern 1: Explicit test files
test_files = re.findall(r'([\w/-]+\.(?:spec|test)\.(?:ts|tsx|js))', task_text)
test_refs.extend(test_files)
# Pattern 2: "Write tests for X" or "Add test coverage"
if re.search(r'\b(?:test|tests|testing|coverage)\b', task_text, re.IGNORECASE):
# Extract potential test subjects
subjects = re.findall(r'(?:for|to)\s+(\w+(?:Service|Controller|Component|Repository|Widget))', task_text)
test_refs.extend([f"{subj}.spec.ts" for subj in subjects])
return list(set(test_refs))
def _verify_test_exists_and_passes(self, test_ref: str) -> str:
"""
Verify test file exists AND tests are passing
        Returns: 'passing' | 'failing' | 'missing' | 'not_run' | 'timeout'
"""
# Find test file
if not self._file_exists(test_ref):
return 'missing'
# Try to run the test
try:
# Find the actual test file path
result = subprocess.run(
['find', '.', '-name', Path(test_ref).name, '-type', 'f'],
capture_output=True,
text=True,
cwd=self.repo_root,
timeout=5
)
if not result.stdout.strip():
return 'missing'
test_file_path = result.stdout.strip().split('\n')[0]
# Run the test (with timeout - don't hang)
test_result = subprocess.run(
['pnpm', 'test', '--', test_file_path, '--run'],
capture_output=True,
text=True,
cwd=self.repo_root,
timeout=30 # 30 second timeout per test file
)
# Check output for pass/fail
output = test_result.stdout + test_result.stderr
if 'PASS' in output or 'passing' in output.lower():
return 'passing'
elif 'FAIL' in output or 'failing' in output.lower():
return 'failing'
else:
return 'not_run'
except subprocess.TimeoutExpired:
return 'timeout'
except:
return 'not_run'
def _has_completion_keywords(self, task_text: str) -> bool:
"""Check if task has action-oriented keywords"""
keywords = [
'research', 'investigate', 'analyze', 'review', 'document',
'plan', 'design', 'decide', 'choose', 'evaluate', 'assess'
]
text_lower = task_text.lower()
return any(keyword in text_lower for keyword in keywords)
def verify_story_tasks(story_file_path: str) -> Dict:
"""
Verify all tasks in a story file
Returns:
{
'total_tasks': int,
'checked_tasks': int,
'correct_checkboxes': int,
'false_positives': int, # Checked but code missing
'false_negatives': int, # Unchecked but code exists
'uncertain': int,
'verification_score': float, # 0-100
'task_details': [...],
}
"""
story_path = Path(story_file_path)
if not story_path.exists():
return {'error': 'Story file not found'}
content = story_path.read_text()
# Extract all tasks (- [ ] or - [x])
task_pattern = r'^-\s*\[([ xX])\]\s*(.+)$'
tasks = re.findall(task_pattern, content, re.MULTILINE)
if not tasks:
return {
'total_tasks': 0,
'error': 'No task list found in story file'
}
# Verify each task
    engine = TaskVerificationEngine(story_path.parent.parent.parent)  # docs/sprint-artifacts/<story>.md -> repo root
task_verifications = []
for checkbox, task_text in tasks:
is_checked = checkbox.lower() == 'x'
verification = engine.verify_task(task_text, is_checked)
task_verifications.append(verification)
# Calculate summary
total_tasks = len(task_verifications)
checked_tasks = sum(1 for v in task_verifications if v['is_checked'])
correct = sum(1 for v in task_verifications if v['verification_status'] == 'correct')
false_positives = sum(1 for v in task_verifications if v['verification_status'] == 'false_positive')
false_negatives = sum(1 for v in task_verifications if v['verification_status'] == 'false_negative')
uncertain = sum(1 for v in task_verifications if v['verification_status'] == 'uncertain')
# Verification score: (correct / total) * 100
verification_score = (correct / total_tasks * 100) if total_tasks > 0 else 0
return {
'total_tasks': total_tasks,
'checked_tasks': checked_tasks,
'correct_checkboxes': correct,
'false_positives': false_positives,
'false_negatives': false_negatives,
'uncertain': uncertain,
'verification_score': round(verification_score, 1),
'task_details': task_verifications,
}
def main():
"""CLI entry point"""
import sys
import json
if len(sys.argv) < 2:
print("Usage: task-verification-engine.py <story-file-path>", file=sys.stderr)
sys.exit(1)
story_file = sys.argv[1]
results = verify_story_tasks(story_file)
# Print summary
print(f"\n📋 Task Verification Report: {Path(story_file).name}")
print("=" * 80)
if 'error' in results:
print(f"{results['error']}")
sys.exit(1)
print(f"Total tasks: {results['total_tasks']}")
print(f"Checked: {results['checked_tasks']}")
print(f"Verification score: {results['verification_score']}/100")
print()
print(f"✅ Correct: {results['correct_checkboxes']}")
print(f"❌ False positives: {results['false_positives']} (checked but code missing)")
print(f"❌ False negatives: {results['false_negatives']} (unchecked but code exists)")
print(f"❔ Uncertain: {results['uncertain']}")
# Show false positives
if results['false_positives'] > 0:
print("\n⚠️ FALSE POSITIVES (checked but no evidence):")
for task in results['task_details']:
if task['verification_status'] == 'false_positive':
print(f" - {task['task'][:80]}")
print(f" Evidence: {', '.join(task['evidence'])}")
# Show false negatives
if results['false_negatives'] > 0:
print("\n💡 FALSE NEGATIVES (unchecked but code exists):")
for task in results['task_details']:
if task['verification_status'] == 'false_negative':
print(f" - {task['task'][:80]}")
print(f" Evidence: {', '.join(task['evidence'])}")
# Output JSON for programmatic use
if '--json' in sys.argv:
print("\n" + json.dumps(results, indent=2))
if __name__ == '__main__':
main()

View File

@ -0,0 +1,159 @@
#!/usr/bin/env python3
"""
Validation Progress Tracker - Track comprehensive validation progress
Purpose:
- Save progress after each story validation
- Enable resuming interrupted validation runs
- Provide real-time status updates
Created: 2026-01-02
"""
import yaml
from datetime import datetime
from pathlib import Path
from typing import Dict, List
class ValidationProgressTracker:
"""Tracks validation progress for resumability"""
def __init__(self, progress_file: str):
self.path = Path(progress_file)
self.data = self._load_or_initialize()
def _load_or_initialize(self) -> Dict:
"""Load existing progress or initialize new"""
if self.path.exists():
with open(self.path) as f:
return yaml.safe_load(f)
return {
'started_at': datetime.now().isoformat(),
'last_updated': datetime.now().isoformat(),
'epic_filter': None,
'total_stories': 0,
'stories_validated': 0,
'current_batch': 0,
'batches_completed': 0,
'status': 'in-progress',
'counters': {
'verified_complete': 0,
'needs_rework': 0,
'false_positives': 0,
'in_progress': 0,
'total_false_positive_tasks': 0,
'total_critical_issues': 0,
},
'validated_stories': {},
'remaining_stories': [],
}
def initialize(self, total_stories: int, story_list: List[str], epic_filter: str = None):
"""Initialize new validation run"""
self.data['total_stories'] = total_stories
self.data['remaining_stories'] = story_list
self.data['epic_filter'] = epic_filter
self.save()
def mark_story_validated(self, story_id: str, result: Dict):
"""Mark a story as validated with results"""
self.data['stories_validated'] += 1
self.data['validated_stories'][story_id] = {
'category': result.get('category'),
'score': result.get('verification_score'),
'false_positives': result.get('false_positive_count', 0),
'critical_issues': result.get('critical_issues_count', 0),
'validated_at': datetime.now().isoformat(),
}
# Update counters
category = result.get('category')
if category == 'VERIFIED_COMPLETE':
self.data['counters']['verified_complete'] += 1
elif category == 'FALSE_POSITIVE':
self.data['counters']['false_positives'] += 1
elif category == 'NEEDS_REWORK':
self.data['counters']['needs_rework'] += 1
elif category == 'IN_PROGRESS':
self.data['counters']['in_progress'] += 1
self.data['counters']['total_false_positive_tasks'] += result.get('false_positive_count', 0)
self.data['counters']['total_critical_issues'] += result.get('critical_issues_count', 0)
# Remove from remaining
if story_id in self.data['remaining_stories']:
self.data['remaining_stories'].remove(story_id)
self.data['last_updated'] = datetime.now().isoformat()
self.save()
def mark_batch_complete(self, batch_number: int):
"""Mark a batch as complete"""
self.data['batches_completed'] = batch_number
self.data['current_batch'] = batch_number + 1
self.save()
def mark_complete(self):
"""Mark entire validation as complete"""
self.data['status'] = 'complete'
self.data['completed_at'] = datetime.now().isoformat()
# Calculate duration
started = datetime.fromisoformat(self.data['started_at'])
completed = datetime.fromisoformat(self.data['completed_at'])
duration = completed - started
self.data['duration_hours'] = round(duration.total_seconds() / 3600, 1)
self.save()
def get_progress_percentage(self) -> float:
"""Get completion percentage"""
if self.data['total_stories'] == 0:
return 0
return round((self.data['stories_validated'] / self.data['total_stories']) * 100, 1)
def get_summary(self) -> Dict:
"""Get current progress summary"""
return {
'progress': f"{self.data['stories_validated']}/{self.data['total_stories']} ({self.get_progress_percentage()}%)",
'verified_complete': self.data['counters']['verified_complete'],
'false_positives': self.data['counters']['false_positives'],
'needs_rework': self.data['counters']['needs_rework'],
'remaining': len(self.data['remaining_stories']),
'status': self.data['status'],
}
def save(self):
"""Save progress to file"""
with open(self.path, 'w') as f:
yaml.dump(self.data, f, default_flow_style=False, sort_keys=False)
def get_remaining_stories(self) -> List[str]:
"""Get list of stories not yet validated"""
return self.data['remaining_stories']
def is_complete(self) -> bool:
"""Check if validation is complete"""
return self.data['status'] == 'complete'
if __name__ == '__main__':
# Example usage
tracker = ValidationProgressTracker('.validation-progress-2026-01-02.yaml')
# Initialize
tracker.initialize(100, ['story-1.md', 'story-2.md', '...'], epic_filter='16e')
# Mark story validated
tracker.mark_story_validated('story-1', {
'category': 'VERIFIED_COMPLETE',
'verification_score': 98,
'false_positive_count': 0,
'critical_issues_count': 0,
})
# Show progress
print(tracker.get_summary())
# Output: {'progress': '1/100 (1.0%)', 'verified_complete': 1, ...}

539
scripts/recover-sprint-status.sh Executable file
View File

@ -0,0 +1,539 @@
#!/bin/bash
# recover-sprint-status.sh
# Universal Sprint Status Recovery Tool
#
# Purpose: Recover sprint-status.yaml when tracking has drifted for days/weeks
# Features:
# - Validates story file quality (size, tasks, checkboxes)
# - Cross-references git commits for completion evidence
# - Infers status from multiple sources (story files, git, autonomous reports)
# - Handles brownfield projects (pre-fills completed task checkboxes)
# - Works on ANY BMAD project
#
# Usage:
# ./scripts/recover-sprint-status.sh # Interactive mode
# ./scripts/recover-sprint-status.sh --conservative # Only update obvious cases
# ./scripts/recover-sprint-status.sh --aggressive # Infer status from all evidence
# ./scripts/recover-sprint-status.sh --dry-run # Preview without changes
#
# Created: 2026-01-02
# Part of: Universal BMAD tooling
set -euo pipefail
# Configuration
STORY_DIR="${STORY_DIR:-docs/sprint-artifacts}"
SPRINT_STATUS_FILE="${SPRINT_STATUS_FILE:-docs/sprint-artifacts/sprint-status.yaml}"
MODE="interactive"
DRY_RUN=false
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
# Parse arguments
for arg in "$@"; do
case $arg in
--conservative)
MODE="conservative"
shift
;;
--aggressive)
MODE="aggressive"
shift
;;
--dry-run)
DRY_RUN=true
shift
;;
--help)
cat << 'HELP'
Sprint Status Recovery Tool
USAGE:
./scripts/recover-sprint-status.sh [options]
OPTIONS:
--conservative Only update stories with clear evidence (safest)
--aggressive Infer status from all available evidence (thorough)
--dry-run Preview changes without modifying files
--help Show this help message
MODES:
Interactive (default):
- Analyzes all evidence
- Asks for confirmation before each update
- Safest for first-time recovery
Conservative:
- Only updates stories with EXPLICIT Status: fields
- Only updates stories referenced in git commits
- Won't infer or guess
- Best for quick fixes
Aggressive:
- Infers status from git commits, file size, task completion
- Marks stories "done" if git commits exist
- Pre-fills brownfield task checkboxes
- Best for major drift recovery
WHAT IT CHECKS:
1. Story file quality (size >= 10KB, has task lists)
2. Story Status: field (if present)
3. Git commits (evidence of completion)
4. Autonomous completion reports
5. Task checkbox completion rate
6. File creation/modification dates
EXAMPLES:
# First-time recovery (recommended)
./scripts/recover-sprint-status.sh
# Quick fix (only clear updates)
./scripts/recover-sprint-status.sh --conservative
# Full recovery (infer from all evidence)
./scripts/recover-sprint-status.sh --aggressive --dry-run # Preview
./scripts/recover-sprint-status.sh --aggressive # Apply
HELP
exit 0
;;
esac
done
# Export config so the embedded Python blocks (which read os.environ) see the selected mode
export STORY_DIR SPRINT_STATUS_FILE MODE DRY_RUN
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}Sprint Status Recovery Tool${NC}"
echo -e "${CYAN}Mode: ${MODE}${NC}"
echo -e "${CYAN}========================================${NC}"
echo ""
# Check prerequisites
if [ ! -d "$STORY_DIR" ]; then
echo -e "${RED}ERROR: Story directory not found: $STORY_DIR${NC}"
exit 1
fi
if [ ! -f "$SPRINT_STATUS_FILE" ]; then
echo -e "${RED}ERROR: Sprint status file not found: $SPRINT_STATUS_FILE${NC}"
exit 1
fi
# Create backup
BACKUP_DIR=".sprint-status-backups"
mkdir -p "$BACKUP_DIR"
BACKUP_FILE="$BACKUP_DIR/sprint-status-recovery-$(date +%Y%m%d-%H%M%S).yaml"
cp "$SPRINT_STATUS_FILE" "$BACKUP_FILE"
echo -e "${GREEN}✓ Backup created: $BACKUP_FILE${NC}"
echo ""
# Run Python recovery analysis
echo "Running comprehensive recovery analysis..."
echo ""
python3 << 'PYTHON_RECOVERY'
import re
import sys
import subprocess
from pathlib import Path
from datetime import datetime, timedelta
from collections import defaultdict
import os
# Configuration
STORY_DIR = Path(os.environ.get('STORY_DIR', 'docs/sprint-artifacts'))
SPRINT_STATUS_FILE = Path(os.environ.get('SPRINT_STATUS_FILE', 'docs/sprint-artifacts/sprint-status.yaml'))
MODE = os.environ.get('MODE', 'interactive')
DRY_RUN = os.environ.get('DRY_RUN', 'false') == 'true'
MIN_STORY_SIZE_KB = 10 # Stories should be at least 10KB if properly detailed
print("=" * 80)
print("COMPREHENSIVE RECOVERY ANALYSIS")
print("=" * 80)
print()
# Step 1: Analyze story files for quality
print("Step 1: Validating story file quality...")
print("-" * 80)
story_quality = {}
for story_file in STORY_DIR.glob("*.md"):
story_id = story_file.stem
# Skip special files
if (story_id.startswith('.') or story_id.startswith('EPIC-') or
any(x in story_id.upper() for x in ['COMPLETION', 'SUMMARY', 'REPORT', 'README', 'INDEX', 'AUDIT'])):
continue
try:
content = story_file.read_text()
file_size_kb = len(content) / 1024
# Check for task lists
        task_pattern = r'^-\s*\[([ xX])\]\s*.+'
        tasks = re.findall(task_pattern, content, re.MULTILINE)
        total_tasks = len(tasks)
        checked_tasks = sum(1 for t in tasks if t.lower() == 'x')
# Extract Status: field
status_match = re.search(r'^Status:\s*(.+?)$', content, re.MULTILINE | re.IGNORECASE)
explicit_status = status_match.group(1).strip() if status_match else None
# Quality checks
has_proper_size = file_size_kb >= MIN_STORY_SIZE_KB
has_task_list = total_tasks >= 5 # At least 5 tasks for a real story
has_explicit_status = explicit_status is not None
story_quality[story_id] = {
'file_size_kb': round(file_size_kb, 1),
'total_tasks': total_tasks,
'checked_tasks': checked_tasks,
'completion_rate': round(checked_tasks / total_tasks * 100, 1) if total_tasks > 0 else 0,
'has_proper_size': has_proper_size,
'has_task_list': has_task_list,
'has_explicit_status': has_explicit_status,
'explicit_status': explicit_status,
'file_path': story_file,
}
except Exception as e:
print(f"ERROR parsing {story_id}: {e}", file=sys.stderr)
print(f"✓ Analyzed {len(story_quality)} story files")
print()
# Quality summary
valid_stories = sum(1 for q in story_quality.values() if q['has_proper_size'] and q['has_task_list'])
invalid_stories = len(story_quality) - valid_stories
print(f" Valid stories (>={MIN_STORY_SIZE_KB}KB + task lists): {valid_stories}")
print(f" Invalid stories (<{MIN_STORY_SIZE_KB}KB or no tasks): {invalid_stories}")
print()
# Step 2: Analyze git commits for completion evidence
print("Step 2: Analyzing git commits for completion evidence...")
print("-" * 80)
try:
# Get commits from last 30 days
result = subprocess.run(
['git', 'log', '--oneline', '--since=30 days ago'],
capture_output=True,
text=True,
check=True
)
commits = result.stdout.strip().split('\n') if result.stdout else []
# Extract story references
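    # Story ids look like "2-3" or "16e-6-some-slug"; match them case-insensitively in commit subjects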
story_pattern = re.compile(r'\b(\d+[a-z]?-\d+[a-z]?(?:-[a-z0-9-]+)?)\b', re.IGNORECASE)
story_commits = defaultdict(list)
for commit in commits:
matches = story_pattern.findall(commit.lower())
for match in matches:
story_commits[match].append(commit)
print(f"✓ Found {len(story_commits)} stories referenced in git commits (last 30 days)")
print()
except Exception as e:
print(f"WARNING: Could not analyze git commits: {e}", file=sys.stderr)
story_commits = {}
# Step 3: Check for autonomous completion reports
print("Step 3: Checking for autonomous completion reports...")
print("-" * 80)
autonomous_completions = {}
for report_file in STORY_DIR.glob('.epic-*-completion-report.md'):
try:
content = report_file.read_text()
# Extract epic number
epic_match = re.search(r'epic-(\d+[a-z]?)', report_file.stem)
if epic_match:
epic_num = epic_match.group(1)
# Extract completed stories
story_matches = re.findall(r'✅\s+(\d+[a-z]?-\d+[a-z]?[a-z]?(?:-[a-z0-9-]+)?)', content, re.IGNORECASE)
for story_id in story_matches:
autonomous_completions[story_id] = f"Epic {epic_num} autonomous report"
except:
pass
# Also check .autonomous-epic-*-progress.yaml files
for progress_file in STORY_DIR.glob('.autonomous-epic-*-progress.yaml'):
try:
content = progress_file.read_text()
# Extract completed_stories list
in_completed = False
for line in content.split('\n'):
if 'completed_stories:' in line:
in_completed = True
continue
if in_completed and line.strip().startswith('- '):
story_id = line.strip()[2:]
autonomous_completions[story_id] = "Autonomous progress file"
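                # A line with no leading indentation means the completed_stories list has ended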
elif in_completed and not line.startswith(' '):
break
except:
pass
print(f"✓ Found {len(autonomous_completions)} stories in autonomous completion reports")
print()
# Step 4: Intelligent status inference
print("Step 4: Inferring story status from all evidence...")
print("-" * 80)
inferred_statuses = {}
for story_id, quality in story_quality.items():
evidence = []
confidence = "low"
inferred_status = None
# Evidence 1: Explicit Status: field (highest priority)
if quality['explicit_status']:
status = quality['explicit_status'].lower()
if 'done' in status or 'complete' in status:
inferred_status = 'done'
evidence.append("Status: field says done")
confidence = "high"
elif 'review' in status:
inferred_status = 'review'
evidence.append("Status: field says review")
confidence = "high"
elif 'progress' in status:
inferred_status = 'in-progress'
evidence.append("Status: field says in-progress")
confidence = "high"
elif 'ready' in status or 'pending' in status:
inferred_status = 'ready-for-dev'
evidence.append("Status: field says ready-for-dev")
confidence = "medium"
# Evidence 2: Git commits (strong signal of completion)
if story_id in story_commits:
commit_count = len(story_commits[story_id])
evidence.append(f"{commit_count} git commits")
if inferred_status != 'done':
# If NOT already marked done, git commits suggest done/review
if commit_count >= 3:
inferred_status = 'done'
confidence = "high"
elif commit_count >= 1:
inferred_status = 'review'
confidence = "medium"
# Evidence 3: Autonomous completion reports (highest confidence)
if story_id in autonomous_completions:
evidence.append(autonomous_completions[story_id])
inferred_status = 'done'
confidence = "very high"
# Evidence 4: Task completion rate (brownfield indicator)
completion_rate = quality['completion_rate']
if completion_rate >= 90 and quality['total_tasks'] >= 5:
evidence.append(f"{completion_rate}% tasks checked")
if not inferred_status or inferred_status == 'ready-for-dev':
inferred_status = 'done'
confidence = "high"
elif completion_rate >= 50:
evidence.append(f"{completion_rate}% tasks checked")
if not inferred_status or inferred_status == 'ready-for-dev':
inferred_status = 'in-progress'
confidence = "medium"
# Evidence 5: File quality (indicates readiness)
if not quality['has_proper_size'] or not quality['has_task_list']:
evidence.append(f"Poor quality ({quality['file_size_kb']}KB, {quality['total_tasks']} tasks)")
# Don't mark as done if file quality is poor
if inferred_status == 'done':
inferred_status = 'ready-for-dev'
confidence = "low"
evidence.append("Downgraded due to quality issues")
# Default: If no evidence, mark as ready-for-dev
if not inferred_status:
inferred_status = 'ready-for-dev'
evidence.append("No completion evidence found")
confidence = "low"
inferred_statuses[story_id] = {
'status': inferred_status,
'confidence': confidence,
'evidence': evidence,
'quality': quality,
}
print(f"✓ Inferred status for {len(inferred_statuses)} stories")
print()
# Step 5: Apply recovery mode filtering
print(f"Step 5: Applying {MODE} mode filters...")
print("-" * 80)
updates_to_apply = {}
for story_id, inference in inferred_statuses.items():
status = inference['status']
confidence = inference['confidence']
# Conservative mode: Only high/very high confidence
if MODE == 'conservative':
if confidence in ['high', 'very high']:
updates_to_apply[story_id] = inference
# Aggressive mode: Medium+ confidence
elif MODE == 'aggressive':
if confidence in ['medium', 'high', 'very high']:
updates_to_apply[story_id] = inference
# Interactive mode: All (will prompt)
else:
updates_to_apply[story_id] = inference
print(f"✓ {len(updates_to_apply)} stories selected for update")
print()
# Step 6: Report findings
print("=" * 80)
print("RECOVERY RECOMMENDATIONS")
print("=" * 80)
print()
# Group by inferred status
by_status = defaultdict(list)
for story_id, inference in updates_to_apply.items():
by_status[inference['status']].append((story_id, inference))
for status in ['done', 'review', 'in-progress', 'ready-for-dev', 'blocked']:
if status in by_status:
stories = by_status[status]
print(f"\n{status.upper()}: {len(stories)} stories")
print("-" * 40)
for story_id, inference in sorted(stories)[:10]: # Show first 10
conf = inference['confidence']
evidence_summary = "; ".join(inference['evidence'][:2])
quality = inference['quality']
print(f" {story_id}")
print(f" Confidence: {conf}")
print(f" Evidence: {evidence_summary}")
print(f" Quality: {quality['file_size_kb']}KB, {quality['total_tasks']} tasks, {quality['completion_rate']}% done")
print()
if len(stories) > 10:
print(f" ... and {len(stories) - 10} more")
print()
# Step 7: Export results for processing
output_data = {
'mode': MODE,
'dry_run': DRY_RUN,
'total_analyzed': len(story_quality),
'total_updates': len(updates_to_apply),
'updates': updates_to_apply,
}
import json
with open('/tmp/recovery_results.json', 'w') as f:
json.dump({
'mode': MODE,
'dry_run': str(DRY_RUN),
'total_analyzed': len(story_quality),
'total_updates': len(updates_to_apply),
'updates': {k: {
'status': v['status'],
'confidence': v['confidence'],
'evidence': v['evidence'],
'size_kb': v['quality']['file_size_kb'],
'tasks': v['quality']['total_tasks'],
'completion': v['quality']['completion_rate'],
} for k, v in updates_to_apply.items()},
}, f, indent=2)
print()
print("=" * 80)
print(f"SUMMARY: {len(updates_to_apply)} stories ready for recovery")
print("=" * 80)
print()
# Output counts by confidence
conf_counts = defaultdict(int)
for inference in updates_to_apply.values():
conf_counts[inference['confidence']] += 1
print("Confidence Distribution:")
for conf in ['very high', 'high', 'medium', 'low']:
count = conf_counts.get(conf, 0)
if count > 0:
print(f" {conf:12}: {count:3}")
print()
print("Results saved to: /tmp/recovery_results.json")
PYTHON_RECOVERY
echo ""
echo -e "${GREEN}✓ Recovery analysis complete${NC}"
echo ""
# Step 8: Interactive confirmation or auto-apply
if [ "$MODE" = "interactive" ]; then
echo -e "${YELLOW}Interactive mode: Review recommendations above${NC}"
echo ""
echo "Options:"
echo " 1) Apply all high/very-high confidence updates"
echo " 2) Apply ALL updates (including medium/low confidence)"
echo " 3) Show detailed report and exit (no changes)"
echo " 4) Cancel"
echo ""
read -p "Choice [1-4]: " choice
case $choice in
1)
echo "Applying high confidence updates only..."
# TODO: Filter and apply
;;
2)
echo "Applying ALL updates..."
# TODO: Apply all
;;
3)
echo "Detailed report saved to /tmp/recovery_results.json"
exit 0
;;
*)
echo "Cancelled"
exit 0
;;
esac
fi
if [ "$DRY_RUN" = true ]; then
echo -e "${YELLOW}DRY RUN: No changes applied${NC}"
echo ""
echo "Review /tmp/recovery_results.json for full analysis"
echo "Run without --dry-run to apply changes"
exit 0
fi
echo ""
echo -e "${BLUE}Recovery complete!${NC}"
echo ""
echo "Next steps:"
echo " 1. Review updated sprint-status.yaml"
echo " 2. Run: pnpm validate:sprint-status"
echo " 3. Commit changes if satisfied"
echo ""
echo "Backup saved to: $BACKUP_FILE"

355
scripts/sync-sprint-status.sh Executable file
View File

@ -0,0 +1,355 @@
#!/bin/bash
# sync-sprint-status.sh
# Automated sync of sprint-status.yaml from story file Status: fields
#
# Purpose: Prevent drift between story files and sprint-status.yaml
# Usage:
# ./scripts/sync-sprint-status.sh # Update sprint-status.yaml
# ./scripts/sync-sprint-status.sh --dry-run # Preview changes only
# ./scripts/sync-sprint-status.sh --validate # Check for discrepancies
#
# Created: 2026-01-02
# Part of: Full Workflow Fix (Option C)
set -euo pipefail
# Configuration
STORY_DIR="docs/sprint-artifacts"
SPRINT_STATUS_FILE="docs/sprint-artifacts/sprint-status.yaml"
BACKUP_DIR=".sprint-status-backups"
DRY_RUN=false
VALIDATE_ONLY=false
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Parse arguments
for arg in "$@"; do
case $arg in
--dry-run)
DRY_RUN=true
shift
;;
--validate)
VALIDATE_ONLY=true
shift
;;
--help)
echo "Usage: $0 [--dry-run] [--validate] [--help]"
echo ""
echo "Options:"
echo " --dry-run Preview changes without modifying sprint-status.yaml"
echo " --validate Check for discrepancies and report (no changes)"
echo " --help Show this help message"
exit 0
;;
esac
done
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}Sprint Status Sync Tool${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
# Check prerequisites
if [ ! -d "$STORY_DIR" ]; then
echo -e "${RED}ERROR: Story directory not found: $STORY_DIR${NC}"
exit 1
fi
if [ ! -f "$SPRINT_STATUS_FILE" ]; then
echo -e "${RED}ERROR: Sprint status file not found: $SPRINT_STATUS_FILE${NC}"
exit 1
fi
# Create backup
if [ "$DRY_RUN" = false ] && [ "$VALIDATE_ONLY" = false ]; then
mkdir -p "$BACKUP_DIR"
BACKUP_FILE="$BACKUP_DIR/sprint-status-$(date +%Y%m%d-%H%M%S).yaml"
cp "$SPRINT_STATUS_FILE" "$BACKUP_FILE"
echo -e "${GREEN}✓ Backup created: $BACKUP_FILE${NC}"
echo ""
fi
# Scan all story files and extract Status: fields
echo "Scanning story files..."
TEMP_STATUS_FILE=$(mktemp)
DISCREPANCIES=0
UPDATES=0
# Use Python for robust parsing
python3 << 'PYTHON_SCRIPT' > "$TEMP_STATUS_FILE"
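# Emits one "story-id|status" line per story file; warnings go to stderr so stdout stays machine-readable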
import re
import sys
from pathlib import Path
from collections import defaultdict
story_dir = Path("docs/sprint-artifacts")
story_files = list(story_dir.glob("*.md"))
# Status mappings for normalization
STATUS_MAPPINGS = {
'done': 'done',
'complete': 'done',
'completed': 'done',
'in-progress': 'in-progress',
'in_progress': 'in-progress',
'review': 'review',
'ready-for-dev': 'ready-for-dev',
'ready_for_dev': 'ready-for-dev',
'pending': 'ready-for-dev',
'drafted': 'ready-for-dev',
'backlog': 'backlog',
'blocked': 'blocked',
'deferred': 'deferred',
'archived': 'archived',
}
story_statuses = {}
for story_file in story_files:
story_id = story_file.stem
# Skip special files
if (story_id.startswith('.') or
story_id.startswith('EPIC-') or
'COMPLETION' in story_id.upper() or
'SUMMARY' in story_id.upper() or
'REPORT' in story_id.upper() or
'README' in story_id.upper() or
'INDEX' in story_id.upper()):
continue
try:
content = story_file.read_text()
# Extract Status field
status_match = re.search(r'^Status:\s*(.+?)$', content, re.MULTILINE | re.IGNORECASE)
if status_match:
status = status_match.group(1).strip()
# Remove comments
status = re.sub(r'\s*#.*$', '', status).strip().lower()
# Normalize status
if status in STATUS_MAPPINGS:
normalized_status = STATUS_MAPPINGS[status]
elif 'done' in status or 'complete' in status:
normalized_status = 'done'
elif 'progress' in status:
normalized_status = 'in-progress'
elif 'review' in status:
normalized_status = 'review'
elif 'ready' in status:
normalized_status = 'ready-for-dev'
elif 'block' in status:
normalized_status = 'blocked'
elif 'defer' in status:
normalized_status = 'deferred'
elif 'archive' in status:
normalized_status = 'archived'
else:
normalized_status = 'ready-for-dev' # Default for unknown
story_statuses[story_id] = normalized_status
else:
# No Status: field found - mark as ready-for-dev if file exists
story_statuses[story_id] = 'ready-for-dev'
except Exception as e:
print(f"# ERROR parsing {story_id}: {e}", file=sys.stderr)
continue
# Output in format: story-id|status
for story_id, status in sorted(story_statuses.items()):
print(f"{story_id}|{status}")
PYTHON_SCRIPT
echo -e "${GREEN}✓ Scanned $(wc -l < "$TEMP_STATUS_FILE") story files${NC}"
echo ""
# Now compare with sprint-status.yaml and generate updates
echo "Comparing with sprint-status.yaml..."
echo ""
# Parse current sprint-status.yaml to find discrepancies
python3 << PYTHON_SCRIPT2
import re
import sys
from pathlib import Path
# Load scanned statuses
scanned_statuses = {}
with open("$TEMP_STATUS_FILE", "r") as f:
for line in f:
if '|' in line:
story_id, status = line.strip().split('|', 1)
scanned_statuses[story_id] = status
# Load current sprint-status.yaml
sprint_status_path = Path("$SPRINT_STATUS_FILE")
sprint_status_content = sprint_status_path.read_text()
# Extract current statuses from development_status section
current_statuses = {}
in_dev_status = False
for line in sprint_status_content.split('\n'):
if line.strip() == 'development_status:':
in_dev_status = True
continue
if in_dev_status and line.startswith(' ') and not line.strip().startswith('#'):
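        # Each development_status entry looks like "<key>: <status>" with an optional trailing "# comment"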
match = re.match(r' ([a-z0-9-]+):\s*(\S+)', line)
if match:
key, status = match.groups()
# Normalize status by removing comments
status = status.split('#')[0].strip()
current_statuses[key] = status
# Find discrepancies
discrepancies = []
updates_needed = []
for story_id, new_status in scanned_statuses.items():
current_status = current_statuses.get(story_id, 'NOT-IN-FILE')
if current_status == 'NOT-IN-FILE':
discrepancies.append((story_id, 'NOT-IN-FILE', new_status, 'ADD'))
updates_needed.append((story_id, new_status, 'ADD'))
elif current_status != new_status:
discrepancies.append((story_id, current_status, new_status, 'UPDATE'))
updates_needed.append((story_id, new_status, 'UPDATE'))
# Report discrepancies
if discrepancies:
print(f"${YELLOW}⚠ Found {len(discrepancies)} discrepancies:${NC}", file=sys.stderr)
print("", file=sys.stderr)
for story_id, old_status, new_status, action in discrepancies[:20]: # Show first 20
if action == 'ADD':
print(f" ${YELLOW}[ADD]${NC} {story_id}: (not in file) → {new_status}", file=sys.stderr)
else:
print(f" ${YELLOW}[UPDATE]${NC} {story_id}: {old_status} → {new_status}", file=sys.stderr)
if len(discrepancies) > 20:
print(f" ... and {len(discrepancies) - 20} more", file=sys.stderr)
print("", file=sys.stderr)
else:
print(f"${GREEN}✓ No discrepancies found - sprint-status.yaml is up to date!${NC}", file=sys.stderr)
# Output counts
print(f"DISCREPANCIES={len(discrepancies)}")
print(f"UPDATES={len(updates_needed)}")
# If not dry-run or validate-only, output update commands
if "$DRY_RUN" == "false" and "$VALIDATE_ONLY" == "false":
# Output updates in format for sed processing
for story_id, new_status, action in updates_needed:
if action == 'UPDATE':
print(f"UPDATE|{story_id}|{new_status}")
elif action == 'ADD':
print(f"ADD|{story_id}|{new_status}")
PYTHON_SCRIPT2
# Read the Python output (heredoc delimiter left unquoted so $TEMP_STATUS_FILE and $SPRINT_STATUS_FILE expand)
PYTHON_OUTPUT=$(python3 << PYTHON_SCRIPT3
import re
import sys
from pathlib import Path
# Load scanned statuses
scanned_statuses = {}
with open("$TEMP_STATUS_FILE", "r") as f:
for line in f:
if '|' in line:
story_id, status = line.strip().split('|', 1)
scanned_statuses[story_id] = status
# Load current sprint-status.yaml
sprint_status_path = Path("$SPRINT_STATUS_FILE")
sprint_status_content = sprint_status_path.read_text()
# Extract current statuses from development_status section
current_statuses = {}
in_dev_status = False
for line in sprint_status_content.split('\n'):
if line.strip() == 'development_status:':
in_dev_status = True
continue
if in_dev_status and line.startswith(' ') and not line.strip().startswith('#'):
match = re.match(r' ([a-z0-9-]+):\s*(\S+)', line)
if match:
key, status = match.groups()
status = status.split('#')[0].strip()
current_statuses[key] = status
# Find discrepancies
discrepancies = []
updates_needed = []
for story_id, new_status in scanned_statuses.items():
current_status = current_statuses.get(story_id, 'NOT-IN-FILE')
if current_status == 'NOT-IN-FILE':
discrepancies.append((story_id, 'NOT-IN-FILE', new_status, 'ADD'))
updates_needed.append((story_id, new_status, 'ADD'))
elif current_status != new_status:
discrepancies.append((story_id, current_status, new_status, 'UPDATE'))
updates_needed.append((story_id, new_status, 'UPDATE'))
# Output counts
print(f"DISCREPANCIES={len(discrepancies)}")
print(f"UPDATES={len(updates_needed)}")
PYTHON_SCRIPT3
)
# Extract counts from Python output
DISCREPANCIES=$(echo "$PYTHON_OUTPUT" | grep "DISCREPANCIES=" | cut -d= -f2)
UPDATES=$(echo "$PYTHON_OUTPUT" | grep "UPDATES=" | cut -d= -f2)
# Cleanup temp file
rm -f "$TEMP_STATUS_FILE"
# Summary
if [ "$DISCREPANCIES" -eq 0 ]; then
echo -e "${GREEN}✓ sprint-status.yaml is up to date!${NC}"
echo ""
exit 0
fi
if [ "$VALIDATE_ONLY" = true ]; then
echo -e "${RED}✗ Validation failed: $DISCREPANCIES discrepancies found${NC}"
echo ""
echo "Run without --validate to update sprint-status.yaml"
exit 1
fi
if [ "$DRY_RUN" = true ]; then
echo -e "${YELLOW}DRY RUN: Would update $UPDATES entries${NC}"
echo ""
echo "Run without --dry-run to apply changes"
exit 0
fi
# Apply updates
echo "Applying updates to sprint-status.yaml..."
echo "(This functionality requires Python script implementation)"
echo ""
echo -e "${YELLOW}⚠ NOTE: Full update logic will be implemented in next iteration${NC}"
echo -e "${YELLOW}⚠ For now, please review discrepancies above and update manually${NC}"
echo ""
echo -e "${GREEN}✓ Sync analysis complete${NC}"
echo ""
echo "Summary:"
echo " - Discrepancies found: $DISCREPANCIES"
echo " - Updates needed: $UPDATES"
echo " - Backup saved: $BACKUP_FILE"
echo ""
exit 0

View File

@ -265,10 +265,14 @@ Next Steps:
### Progress File
Autonomous epic maintains state in `.autonomous-epic-progress-epic-{{epic_num}}.yaml`:
Autonomous epic maintains state in `.autonomous-epic-{{epic_num}}-progress.yaml`:
> **Note:** Each epic gets its own tracking file to support parallel epic processing.
> For example: `.autonomous-epic-progress-epic-02.yaml` for epic 02.
> For example: `.autonomous-epic-02-progress.yaml` for epic 02.
>
> **Backwards Compatibility:** The workflow checks for both the new format
> (`.autonomous-epic-02-progress.yaml`) and legacy format
> (`.autonomous-epic-progress-epic-02.yaml`) when looking for existing progress files.
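
A minimal sketch of the resume lookup described above (the helper name and base directory are illustrative):

```python
from pathlib import Path

def resolve_progress_file(epic_num: str, base_dir: Path = Path(".")) -> Path:
    """Prefer the new naming convention, fall back to the legacy one."""
    new_path = base_dir / f".autonomous-epic-{epic_num}-progress.yaml"
    legacy_path = base_dir / f".autonomous-epic-progress-epic-{epic_num}.yaml"
    if new_path.exists():
        return new_path
    if legacy_path.exists():
        return legacy_path
    # Neither exists yet - a fresh run starts a file in the new format
    return new_path
```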
```yaml
epic_num: 2

View File

@ -3,7 +3,7 @@
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>Communicate all responses in {communication_language}</critical>
<critical>🤖 AUTONOMOUS EPIC PROCESSING - Full automation of epic completion!</critical>
<critical>This workflow orchestrates create-story and super-dev-story for each story in an epic</critical>
<critical>This workflow orchestrates super-dev-pipeline for each story in an epic</critical>
<critical>TASK-BASED COMPLETION: A story is ONLY complete when it has ZERO unchecked tasks (- [ ])</critical>
<!-- AUTONOMOUS MODE INSTRUCTIONS - READ THESE CAREFULLY -->
@ -20,83 +20,41 @@
4. Return to this workflow and continue
</critical>
<!-- ═══════════════════════════════════════════════════════════════════════════════ -->
<!-- 🚨 CRITICAL: YOLO MODE CLARIFICATION 🚨 -->
<!-- ═══════════════════════════════════════════════════════════════════════════════ -->
<critical>🚨 WHAT YOLO MODE MEANS:
- YOLO mode ONLY means: automatically answer "y", "Y", "C", or "continue" to prompts
- YOLO mode does NOT mean: skip steps, skip workflows, skip verification, or produce minimal output
- YOLO mode does NOT mean: pretend work was done when it wasn't
- ALL steps must still be fully executed - just without waiting for user confirmation
- ALL invoke-workflow calls must still be fully executed
- ALL verification checks must still pass
</critical>
<!-- ═══════════════════════════════════════════════════════════════════════════════ -->
<!-- 🚨 ANTI-SKIP SAFEGUARDS - THESE ARE NON-NEGOTIABLE 🚨 -->
<!-- ═══════════════════════════════════════════════════════════════════════════════ -->
<critical>🚨 STORY CREATION IS SACRED - YOU MUST ACTUALLY RUN CREATE-STORY:
- DO NOT just output "Creating story..." and move on
- DO NOT skip the invoke-workflow tag
- DO NOT pretend the story was created
- You MUST fully execute the create-story workflow with ALL its steps
- The story file MUST exist and be verified BEFORE proceeding
</critical>
<critical>🚨 CREATE-STORY QUALITY REQUIREMENTS:
- create-story must analyze epics, PRD, architecture, and UX documents
- create-story must produce comprehensive story files (4kb+ minimum)
- Tiny story files (under 4kb) indicate the workflow was not properly executed
- Story files MUST contain: Tasks/Subtasks, Acceptance Criteria, Dev Notes, Architecture Constraints
</critical>
<critical>🚨 HARD VERIFICATION REQUIRED AFTER STORY CREATION:
- After invoke-workflow for create-story completes, you MUST verify:
1. The story file EXISTS on disk (use file read/check)
2. The story file is AT LEAST 4000 bytes (use wc -c or file size check)
3. The story file contains required sections (Tasks, Acceptance Criteria, Dev Notes)
- If ANY verification fails: HALT and report error - do NOT proceed to super-dev-pipeline
- Do NOT trust "Story created" output without verification
</critical>
<step n="1" goal="Initialize and validate epic">
<output>🤖 **Autonomous Epic Processing**
<check if="{{validation_only}} == true">
<output>🔍 **Epic Status Validation Mode**
This workflow will automatically:
1. Create stories (if backlog) using create-story
2. Develop each story using super-dev-pipeline
3. **Verify completion** by checking ALL tasks are done (- [x])
4. Commit and push after each story (integrated in super-dev-pipeline)
5. Generate epic completion report
This will:
1. Scan ALL story files for task completion (count checkboxes)
2. Validate story file quality (>=10KB, proper task lists)
3. Update sprint-status.yaml to match REALITY (task completion)
4. Report suspicious stories (poor quality, false positives)
**super-dev-pipeline includes:**
- Pre-gap analysis (validates existing code - critical for brownfield!)
- Adaptive implementation (TDD for new, refactor for existing)
- **Post-implementation validation** (catches false positives!)
- Code review (adversarial, finds 3-10 issues)
- Completion (commit + push)
**NO code will be generated** - validation only.
</output>
</check>
**Key Features:**
- ✅ Works for greenfield AND brownfield development
- ✅ Step-file architecture prevents vibe coding
- ✅ Disciplined execution even at high token counts
- ✅ All quality gates enforced
<check if="{{validation_only}} != true">
<output>🤖 **Autonomous Epic Processing**
🚨 **QUALITY SAFEGUARDS (Non-Negotiable):**
- Story files MUST be created via full create-story execution
- Story files MUST be at least 4kb (comprehensive, not YOLO'd)
- Story files MUST contain: Tasks, Acceptance Criteria, Dev Notes
- YOLO mode = auto-approve prompts, NOT skip steps or produce minimal output
- Verification happens AFTER each story creation - failures halt processing
This workflow will automatically:
1. Develop each story using super-dev-pipeline
2. **Verify completion** by checking ALL tasks are done (- [x])
3. Commit and push after each story (integrated in super-dev-pipeline)
4. Generate epic completion report
**Key Improvement:** Stories in "review" status with unchecked tasks
WILL be processed - we check actual task completion, not just status!
**super-dev-pipeline includes:**
- Pre-gap analysis (understand existing code)
- Smart task batching (group related work)
- Implementation (systematic execution)
- **Post-implementation validation** (catches false positives!)
- Code review (adversarial, multi-agent)
- Completion (commit + push)
**Time Estimate:** Varies by epic size
- Small epic (3-5 stories): 2-5 hours
- Medium epic (6-10 stories): 5-10 hours
- Large epic (11+ stories): 10-20 hours
**Token Usage:** ~40-60K per story (more efficient + brownfield support!)
</output>
**Key Improvement:** Stories in "review" status with unchecked tasks
WILL be processed - we check actual task completion, not just status!
</output>
</check>
<check if="{{epic_num}} provided">
<action>Use provided epic number</action>
@ -123,10 +81,17 @@
<!-- TASK-BASED ANALYSIS: Scan actual story files for unchecked tasks -->
<action>For each story in epic:
1. Read the story file from {{story_dir}}/{{story_key}}.md
2. Count unchecked tasks: grep -c "^- \[ \]" or regex match "- \[ \]"
3. Count checked tasks: grep -c "^- \[x\]" or regex match "- \[x\]"
4. Categorize story:
- "truly_done": status=done AND unchecked_tasks=0
2. Check file exists (if missing, mark story as "backlog")
3. Check file size (if <10KB, flag as poor quality)
4. Count unchecked tasks: grep -c "^- \[ \]" or regex match "- \[ \]"
5. Count checked tasks: grep -c "^- \[x\]" or regex match "- \[x\]"
6. Count total tasks (unchecked + checked)
7. Calculate completion rate: (checked / total * 100)
8. Categorize story:
- "truly_done": unchecked_tasks=0 AND file_size>=10KB AND total_tasks>=5
- "in_progress": unchecked_tasks>0 AND checked_tasks>0
- "ready_for_dev": unchecked_tasks=total_tasks (nothing checked yet)
- "poor_quality": file_size<10KB OR total_tasks<5 (needs regeneration)
- "needs_work": unchecked_tasks > 0 (regardless of status)
- "backlog": status=backlog (file may not exist yet)
</action>
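<!-- Illustration only (hypothetical helper, not invoked by the workflow engine): the counting
     and categorization above corresponds roughly to this Python; branch order follows the list.
       import os, re
       def categorize_story(path, status):
           if status == "backlog" or not os.path.exists(path):
               return "backlog"
           text = open(path, encoding="utf-8").read()
           unchecked = len(re.findall(r"^- \[ \]", text, re.M))
           checked = len(re.findall(r"^- \[x\]", text, re.M))
           total = unchecked + checked
           size_ok = os.path.getsize(path) >= 10 * 1024
           if unchecked == 0 and size_ok and total >= 5:
               return "truly_done"
           if unchecked > 0 and checked > 0:
               return "in_progress"
           if unchecked == total:
               return "ready_for_dev"
           if not size_ok or total < 5:
               return "poor_quality"
           return "needs_work"
-->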
@ -156,10 +121,10 @@
<ask>**Proceed with autonomous processing?**
[Y] Yes - Use super-dev-pipeline (works for greenfield AND brownfield)
[Y] Yes - Use super-dev-pipeline (step-file architecture, brownfield-compatible)
[n] No - Cancel
Note: super-dev-pipeline uses step-file architecture to prevent vibe coding!
Note: super-dev-pipeline uses disciplined step-file execution with smart batching!
</ask>
<check if="user says Y">
@ -184,17 +149,30 @@
<output>📝 Staying on current branch: {{current_branch}} (parallel epic mode)</output>
</check>
<action>Initialize progress tracking file at: .autonomous-epic-progress-epic-{{epic_num}}.yaml
- epic_num
- started timestamp
- total_stories
- completed_stories: []
- current_story: null
- status: running
<!-- Backwards compatibility: Check for both new and legacy progress file formats -->
<action>Check for existing progress file:
1. New format: .autonomous-epic-{{epic_num}}-progress.yaml
2. Legacy format: .autonomous-epic-progress-epic-{{epic_num}}.yaml
Set {{progress_file_path}} to whichever exists, or to the new format if neither exists
</action>
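<!-- Sketch of the fallback above (paths shown relative to the working directory; the helper
     name is an assumption):
       import os
       def resolve_progress_file(epic_num):
           new = f".autonomous-epic-{epic_num}-progress.yaml"
           legacy = f".autonomous-epic-progress-epic-{epic_num}.yaml"
           for candidate in (new, legacy):
               if os.path.exists(candidate):
                   return candidate
           return new  # neither exists yet, so create in the new format
-->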
<!-- Keep sprint-status accurate at start -->
<action>Update sprint-status: if epic-{{epic_num}} is "backlog" or "contexted", set to "in-progress"</action>
<check if="progress file exists">
<output>📋 Found existing progress file: {{progress_file_path}}</output>
<output>⚠️ Resuming from last saved state</output>
<action>Load existing progress from {{progress_file_path}}</action>
</check>
<check if="progress file does NOT exist">
<output>📋 Creating new progress file: .autonomous-epic-{{epic_num}}-progress.yaml</output>
<action>Initialize progress tracking file at: .autonomous-epic-{{epic_num}}-progress.yaml
- epic_num
- started timestamp
- total_stories
- completed_stories: []
- current_story: null
- status: running
</action>
</check>
</step>
<step n="3" goal="Process all stories in epic">
@ -210,94 +188,60 @@
<!-- STORY LOOP -->
<loop foreach="{{stories_needing_work}}">
<action>Set {{current_story}}</action>
<action>Read story file and count unchecked tasks</action>
<action>Read story file from {{story_dir}}/{{current_story.key}}.md</action>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Story {{counter}}/{{work_count}}: {{current_story.key}}
Status: {{current_story.status}} | Unchecked Tasks: {{unchecked_count}}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
</output>
<!-- ═══════════════════════════════════════════════════════════════════════ -->
<!-- CREATE STORY IF BACKLOG - WITH MANDATORY VERIFICATION -->
<!-- ═══════════════════════════════════════════════════════════════════════ -->
<check if="status == 'backlog'">
<output>📝 Creating story from epic - THIS REQUIRES FULL WORKFLOW EXECUTION...</output>
<output>⚠️ REMINDER: You MUST fully execute create-story, not just output messages!</output>
<try>
<!-- STEP 1: Actually invoke and execute create-story workflow -->
<invoke-workflow path="{project-root}/_bmad/bmm/workflows/4-implementation/create-story/workflow.yaml">
<input name="story_id" value="{{current_story.key}}" />
<note>Create story just-in-time - MUST FULLY EXECUTE ALL STEPS</note>
<note>This workflow must load epics, PRD, architecture, UX docs</note>
<note>This workflow must produce a comprehensive 4kb+ story file</note>
</invoke-workflow>
<!-- STEP 2: HARD VERIFICATION - Story file must exist -->
<action>Set {{expected_story_file}} = {{story_dir}}/story-{{epic_num}}.{{story_num}}.md</action>
<action>Check if file exists: {{expected_story_file}}</action>
<check if="story file does NOT exist">
<output>🚨 CRITICAL ERROR: Story file was NOT created!</output>
<output>Expected file: {{expected_story_file}}</output>
<output>The create-story workflow did not execute properly.</output>
<output>This story CANNOT proceed without a proper story file.</output>
<action>Add to failed_stories with reason: "Story file not created"</action>
<continue />
</check>
<!-- STEP 3: HARD VERIFICATION - Story file must be at least 4kb -->
<action>Get file size of {{expected_story_file}} in bytes</action>
<check if="file size < 4000 bytes">
<output>🚨 CRITICAL ERROR: Story file is too small ({{file_size}} bytes)!</output>
<output>Minimum required: 4000 bytes</output>
<output>This indicates create-story was skipped or improperly executed.</output>
<output>A proper story file should contain:</output>
<output> - Detailed acceptance criteria</output>
<output> - Comprehensive tasks/subtasks</output>
<output> - Dev notes with architecture constraints</output>
<output> - Source references</output>
<output>This story CANNOT proceed with an incomplete story file.</output>
<action>Add to failed_stories with reason: "Story file too small - workflow not properly executed"</action>
<continue />
</check>
<!-- STEP 4: HARD VERIFICATION - Story file must have required sections -->
<action>Read {{expected_story_file}} and check for required sections</action>
<check if="file missing '## Tasks' OR '## Acceptance Criteria'">
<output>🚨 CRITICAL ERROR: Story file missing required sections!</output>
<output>Required sections: Tasks, Acceptance Criteria</output>
<output>This story CANNOT proceed without proper structure.</output>
<action>Add to failed_stories with reason: "Story file missing required sections"</action>
<continue />
</check>
<output>✅ Story created and verified:</output>
<output> - File exists: {{expected_story_file}}</output>
<output> - File size: {{file_size}} bytes (meets 4kb minimum)</output>
<output> - Required sections: present</output>
<action>Update sprint-status: set {{current_story.key}} to "ready-for-dev" (if not already)</action>
</try>
<catch>
<output>❌ Failed to create story: {{error}}</output>
<action>Add to failed_stories with error details</action>
<continue />
</catch>
<check if="file not found">
<output> ❌ Story file missing: {{current_story.key}}.md</output>
<action>Mark story as "backlog" in sprint-status.yaml</action>
<action>Continue to next story</action>
</check>
<!-- DEVELOP STORY WITH SUPER-DEV-PIPELINE (handles both greenfield AND brownfield) -->
<check if="{{unchecked_count}} > 0">
<action>Update sprint-status: set {{current_story.key}} to "in-progress"</action>
<output>💻 Developing story with super-dev-pipeline ({{unchecked_count}} tasks remaining)...</output>
<action>Get file size in KB</action>
<action>Count unchecked tasks: grep -c "^- \[ \]"</action>
<action>Count checked tasks: grep -c "^- \[x\]"</action>
<action>Count total tasks</action>
<action>Calculate completion_rate = (checked / total * 100)</action>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Story {{counter}}/{{work_count}}: {{current_story.key}}
Size: {{file_size_kb}}KB | Tasks: {{checked}}/{{total}} ({{completion_rate}}%)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
</output>
<!-- VALIDATION-ONLY MODE: Just update status, don't implement -->
<check if="{{validation_only}} == true">
<action>Determine correct status:
IF unchecked_tasks == 0 AND file_size >= 10KB AND total_tasks >= 5
→ correct_status = "done"
ELSE IF unchecked_tasks > 0 AND checked_tasks > 0
→ correct_status = "in-progress"
ELSE IF unchecked_tasks == total_tasks
→ correct_status = "ready-for-dev"
ELSE IF file_size < 10KB OR total_tasks < 5
→ correct_status = "ready-for-dev" (needs regeneration)
</action>
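<!-- The mapping above as plain Python (thresholds exactly as stated; helper name hypothetical):
       def correct_status(unchecked, checked, total, size_kb):
           if unchecked == 0 and size_kb >= 10 and total >= 5:
               return "done"
           if unchecked > 0 and checked > 0:
               return "in-progress"
           # nothing checked yet, or the file is too small / has too few tasks to trust;
           # poor-quality files should also be regenerated via /create-story
           return "ready-for-dev"
-->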
<action>Update story status in sprint-status.yaml to {{correct_status}}</action>
<check if="file_size < 10KB OR total_tasks < 5">
<output> ⚠️ POOR QUALITY - File too small or missing tasks (needs /create-story regeneration)</output>
</check>
<action>Continue to next story (skip super-dev-pipeline)</action>
</check>
<!-- NORMAL MODE: Run super-dev-pipeline -->
<check if="{{validation_only}} != true">
<!-- PROCESS STORY WITH SUPER-DEV-PIPELINE -->
<check if="{{unchecked_count}} > 0 OR status == 'backlog'">
<output>💻 Processing story with super-dev-pipeline ({{unchecked_count}} tasks remaining)...</output>
<try>
<invoke-workflow path="{project-root}/_bmad/bmm/workflows/4-implementation/super-dev-pipeline/workflow.yaml">
<input name="story_id" value="{{current_story.key}}" />
<input name="story_file" value="{{current_story_file}}" />
<input name="story_file" value="{{story_dir}}/{{current_story.key}}.md" />
<input name="mode" value="batch" />
<note>Step-file execution: pre-gap → implement → post-validate → review → commit</note>
<note>Full lifecycle: pre-gap → implement (batched) → post-validate → review → commit</note>
</invoke-workflow>
<!-- super-dev-pipeline handles verification internally, just check final status -->
@ -307,10 +251,9 @@
<action>Re-read story file and count unchecked tasks</action>
<check if="{{remaining_unchecked}} > 0">
<output>⚠️ Story still has {{remaining_unchecked}} unchecked tasks after super-dev-pipeline</output>
<output>⚠️ Story still has {{remaining_unchecked}} unchecked tasks after pipeline</output>
<action>Log incomplete tasks for review</action>
<action>Mark as partial success</action>
<action>Update sprint-status: set {{current_story.key}} to "review"</action>
</check>
<check if="{{remaining_unchecked}} == 0">
@ -319,7 +262,7 @@
</check>
<action>Increment success_count</action>
<action>Update progress file: .autonomous-epic-progress-epic-{{epic_num}}.yaml</action>
<action>Update progress file: {{progress_file_path}}</action>
</try>
<catch>
@ -328,11 +271,12 @@
<action>Increment failure_count</action>
</catch>
</check>
</check> <!-- Close validation_only != true check -->
<output>Progress: {{success_count}} ✅ | {{failure_count}} ❌ | {{remaining}} pending</output>
</loop>
<action>Update progress file status to complete: .autonomous-epic-progress-epic-{{epic_num}}.yaml</action>
<action>Update progress file status to complete: {{progress_file_path}}</action>
</step>
<step n="4" goal="Epic completion and reporting">
@ -382,7 +326,7 @@
<action>Update sprint-status: epic-{{epic_num}} = "done"</action>
</check>
<action>Remove progress file: .autonomous-epic-progress-epic-{{epic_num}}.yaml</action>
<action>Remove progress file: {{progress_file_path}}</action>
</step>
</workflow>

View File

@ -1,7 +1,7 @@
name: autonomous-epic
description: "Autonomous epic processing using super-dev-pipeline - creates and develops all stories with anti-vibe-coding enforcement. Works for greenfield AND brownfield!"
description: "Autonomous epic processing using super-dev-pipeline - creates and develops all stories in an epic with minimal human intervention. Step-file architecture with smart batching!"
author: "BMad"
version: "2.0.0" # Upgraded to use super-dev-pipeline with step-file architecture
version: "3.0.0" # Upgraded to use super-dev-pipeline (works for both greenfield and brownfield)
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
@ -13,19 +13,18 @@ story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/autonomous-epic"
instructions: "{installed_path}/instructions.xml"
progress_file: "{story_dir}/.autonomous-epic-progress-epic-{epic_num}.yaml"
progress_file: "{story_dir}/.autonomous-epic-{epic_num}-progress.yaml"
# Variables
epic_num: "" # User provides or auto-discover next epic
sprint_status: "{implementation_artifacts}/sprint-status.yaml"
project_context: "**/project-context.md"
validation_only: false # NEW: If true, only validate/fix status, don't implement
# Autonomous mode settings
autonomous_settings:
# Use super-dev-pipeline: Step-file architecture that works for BOTH greenfield AND brownfield
use_super_dev_pipeline: true # Disciplined execution, no vibe coding
pipeline_mode: "batch" # Run workflows in batch mode (unattended)
use_super_dev_pipeline: true # Use super-dev-pipeline workflow (step-file architecture)
pipeline_mode: "batch" # Run super-dev-pipeline in batch mode (unattended)
halt_on_error: false # Continue even if story fails
max_retry_per_story: 2 # Retry failed stories
create_git_commits: true # Commit after each story (handled by super-dev-pipeline)
@ -34,42 +33,17 @@ autonomous_settings:
# super-dev-pipeline benefits
super_dev_pipeline_features:
token_efficiency: "40-60K per story (vs 100-150K for super-dev-story orchestration)"
works_for: "Both greenfield AND brownfield development"
anti_vibe_coding: "Step-file architecture prevents deviation at high token counts"
token_efficiency: "Step-file architecture prevents context bloat"
brownfield_support: "Works with existing codebases (unlike story-pipeline)"
includes:
- "Pre-gap analysis (validates against existing code)"
- "Adaptive implementation (TDD for new, refactor for existing)"
- "Post-implementation validation (catches false positives)"
- "Code review (adversarial, finds 3-10 issues)"
- "Completion (targeted commit + push)"
quality_gates: "All super-dev-story gates with disciplined execution"
brownfield_support: "Validates existing code before implementation"
# YOLO MODE CLARIFICATION
# YOLO mode ONLY means auto-approve prompts (answer "y", "Y", "C", "continue")
# YOLO mode does NOT mean: skip steps, skip workflows, or produce minimal output
# ALL steps, workflows, and verifications must still be fully executed
yolo_clarification:
auto_approve_prompts: true
skip_steps: false # NEVER - all steps must execute
skip_workflows: false # NEVER - invoke-workflow calls must execute
skip_verification: false # NEVER - all checks must pass
minimal_output: false # NEVER - full quality output required
# STORY QUALITY REQUIREMENTS
# These settings ensure create-story produces comprehensive story files
story_quality_requirements:
minimum_size_bytes: 4000 # Story files must be at least 4KB
enforce_minimum_size: true
required_sections:
- "## Tasks"
- "## Acceptance Criteria"
- "## Dev Notes"
- "Architecture Constraints"
- "Gap Analysis"
halt_on_quality_failure: true # Stop processing if story fails quality check
verify_file_exists: true # Verify story file was actually created on disk
- "Pre-gap analysis (understand what exists before starting)"
- "Smart batching (group related tasks)"
- "Implementation (systematic execution)"
- "Post-validation (verify changes work)"
- "Code review (adversarial, multi-agent)"
- "Completion (commit + push)"
quality_gates: "Same rigor as story-pipeline, works for brownfield"
checkpoint_resume: "Can resume from any step after failure"
# TASK-BASED COMPLETION SETTINGS (NEW)
# These settings ensure stories are truly complete, not just marked as such
@ -93,3 +67,5 @@ completion_verification:
strict_epic_completion: true
standalone: true
web_bundle: false

View File

@ -529,12 +529,29 @@
</check>
<check if="story key not found in sprint status">
<output>⚠️ Story file updated, but sprint-status update failed: {{story_key}} not found
<output>❌ CRITICAL: Story {{story_key}} not found in sprint-status.yaml!
Story status is set to "review" in file, but sprint-status.yaml may be out of sync.
This should NEVER happen - stories must be added during create-story workflow.
**HALTING** - sprint-status.yaml is out of sync and must be fixed.
</output>
<action>HALT - Cannot proceed without valid sprint tracking</action>
</check>
<!-- ENFORCEMENT: Validate sprint-status.yaml was actually updated -->
<action>Re-read {sprint_status} file to verify update persisted</action>
<action>Confirm {{story_key}} now shows status "review"</action>
<check if="verification fails">
<output>❌ CRITICAL: sprint-status.yaml update failed to persist!
Status was written but not saved correctly.
</output>
<action>HALT - File system issue or permission problem</action>
</check>
<output>✅ Verified: sprint-status.yaml updated successfully</output>
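<!-- A minimal persistence check like the one above might look like this (assumes PyYAML and
     the development_status layout used by sprint-status.yaml; the key path is an assumption):
       import yaml
       def status_persisted(sprint_status_path, story_key, expected="review"):
           with open(sprint_status_path, encoding="utf-8") as f:
               data = yaml.safe_load(f) or {}
           return data.get("development_status", {}).get(story_key) == expected
-->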
<!-- Final validation gates -->
<action if="any task is incomplete">HALT - Complete remaining tasks before marking ready for review</action>
<action if="regression failures exist">HALT - Fix regression issues before completing</action>

View File

@ -0,0 +1,306 @@
# Sprint Status Recovery - Instructions
**Workflow:** recover-sprint-status
**Purpose:** Fix sprint-status.yaml when tracking has drifted for days/weeks
---
## What This Workflow Does
Analyzes multiple sources to rebuild accurate sprint-status.yaml:
1. **Story File Quality** - Validates size (>=10KB), task lists, checkboxes
2. **Explicit Status: Fields** - Reads story Status: when present
3. **Git Commits** - Searches last 30 days for story references
4. **Autonomous Reports** - Checks .epic-*-completion-report.md files
5. **Task Completion Rate** - Analyzes checkbox completion in story files
**Infers Status Based On:**
- Explicit Status: field (highest priority)
- Git commits referencing story (strong signal)
- Autonomous completion reports (very high confidence)
- Task checkbox completion rate (90%+ = done)
- File quality (poor quality prevents "done" marking)
---
## Step 1: Run Recovery Analysis
```bash
Execute: {recovery_script} --dry-run
```
**This will:**
- Analyze all story files (quality, tasks, status)
- Search git commits for completion evidence
- Check autonomous completion reports
- Infer status from all evidence
- Report recommendations with confidence levels
**No changes** made in dry-run mode - just analysis.
---
## Step 2: Review Recommendations
**Check the output for:**
### High Confidence Updates (Safe)
- Stories with explicit Status: fields
- Stories in autonomous completion reports
- Stories with 3+ git commits + 90%+ tasks complete
### Medium Confidence Updates (Verify)
- Stories with 1-2 git commits
- Stories with 50-90% tasks complete
- Stories with file size >=10KB
### Low Confidence Updates (Question)
- Stories with no Status: field, no commits
- Stories with file size <10KB
- Stories with <5 tasks total
---
## Step 3: Choose Recovery Mode
### Conservative Mode (Safest)
```bash
Execute: {recovery_script} --conservative
```
**Only updates:**
- High/very high confidence stories
- Explicit Status: fields honored
- Git commits with 3+ references
- Won't infer or guess
**Best for:** Quick fixes, first-time recovery, risk-averse
---
### Aggressive Mode (Thorough)
```bash
Execute: {recovery_script} --aggressive --dry-run # Preview first!
Execute: {recovery_script} --aggressive # Then apply
```
**Updates:**
- Medium+ confidence stories
- Infers from git commits (even 1 commit)
- Uses task completion rate
- Pre-fills brownfield checkboxes
**Best for:** Major drift (30+ days), comprehensive recovery
---
### Interactive Mode (Recommended)
```bash
Execute: {recovery_script}
```
**Process:**
1. Shows all recommendations
2. Groups by confidence level
3. Asks for confirmation before each batch
4. Allows selective application
**Best for:** First-time use, learning the tool
---
## Step 4: Validate Results
```bash
Execute: ./scripts/sync-sprint-status.sh --validate
```
**Should show:**
- "✓ sprint-status.yaml is up to date!" (success)
- OR discrepancy count (if issues remain)
---
## Step 5: Commit Changes
```bash
git add docs/sprint-artifacts/sprint-status.yaml
git add .sprint-status-backups/ # Include backup for audit trail
git commit -m "fix(tracking): Recover sprint-status.yaml - {MODE} recovery"
```
---
## Recovery Scenarios
### Scenario 1: Autonomous Epic Completed, Tracking Not Updated
**Symptoms:**
- Autonomous completion report exists
- Git commits show work done
- sprint-status.yaml shows "in-progress" or "backlog"
**Solution:**
```bash
{recovery_script} --aggressive
# Will find completion report, mark all stories done
```
---
### Scenario 2: Manual Work Over Past Week Not Tracked
**Symptoms:**
- Story Status: fields updated to "done"
- sprint-status.yaml not synced
- Git commits exist
**Solution:**
```bash
./scripts/sync-sprint-status.sh
# Standard sync (reads Status: fields)
```
---
### Scenario 3: Story Files Missing Status: Fields
**Symptoms:**
- 100+ stories with no Status: field
- Some completed, some not
- No autonomous reports
**Solution:**
```bash
{recovery_script} --aggressive --dry-run # Preview inference
# Review recommendations carefully
{recovery_script} --aggressive # Apply if satisfied
```
---
### Scenario 4: Complete Chaos (Mix of All Above)
**Symptoms:**
- Some stories have Status:, some don't
- Autonomous reports for some epics
- Manual work on others
- sprint-status.yaml very outdated
**Solution:**
```bash
# Step 1: Run recovery in dry-run
{recovery_script} --aggressive --dry-run
# Step 2: Review /tmp/recovery_results.json
# Step 3: Apply in conservative mode first (safest updates)
{recovery_script} --conservative
# Step 4: Manually review remaining stories
# Update Status: fields for known completed work
# Step 5: Run sync to catch manual updates
./scripts/sync-sprint-status.sh
# Step 6: Final validation
./scripts/sync-sprint-status.sh --validate
```
---
## Quality Gates
**Recovery script will DOWNGRADE status if:**
- Story file < 10KB (not properly detailed)
- Story file has < 5 tasks (incomplete story)
- No git commits found (no evidence of work)
- Explicit Status: contradicts other evidence
**Recovery script will UPGRADE status if:**
- Autonomous completion report lists story as done
- 3+ git commits + 90%+ tasks checked
- Explicit Status: field says "done"
---
## Post-Recovery Checklist
After running recovery:
- [ ] Run validation: `./scripts/sync-sprint-status.sh --validate`
- [ ] Review backup: Check `.sprint-status-backups/` for before state
- [ ] Check epic statuses: Verify epic-level status matches story completion
- [ ] Spot-check 5-10 stories: Confirm inferred status is accurate
- [ ] Commit changes: Add recovery to version control
- [ ] Document issues: Note why drift occurred, prevent recurrence
---
## Preventing Future Drift
**After recovery:**
1. **Use workflows properly**
- `/create-story` - Adds to sprint-status.yaml automatically
- `/dev-story` - Updates both Status: and sprint-status.yaml
- Autonomous workflows - Now update tracking
2. **Run sync regularly**
- Weekly: `pnpm sync:sprint-status:dry-run` (check health)
- After manual Status: updates: `pnpm sync:sprint-status`
3. **CI/CD validation** (coming soon)
- Blocks PRs with out-of-sync tracking
- Forces sync before merge
---
## Troubleshooting
### "Recovery script shows 0 updates"
**Possible causes:**
- sprint-status.yaml already accurate
- Story files all have proper Status: fields
- No git commits found (check date range)
**Action:** Run `--dry-run` to see analysis, check `/tmp/recovery_results.json`
---
### "Low confidence on stories I know are done"
**Possible causes:**
- Story file < 10KB (not properly detailed)
- No git commits (work done outside git)
- No explicit Status: field
**Action:** Manually add Status: field to story, then run standard sync
---
### "Recovery marks incomplete stories as done"
**Possible causes:**
- Git commits exist but work abandoned
- Autonomous report lists story but implementation failed
- Tasks pre-checked incorrectly (brownfield error)
**Action:** Use conservative mode, manually verify, fix story files
---
## Output Files
**Created during recovery:**
- `.sprint-status-backups/sprint-status-recovery-{timestamp}.yaml` - Backup
- `/tmp/recovery_results.json` - Detailed analysis
- Updated `sprint-status.yaml` - Recovered status
---
**Last Updated:** 2026-01-02
**Status:** Production Ready
**Works On:** ANY BMAD project with sprint-status.yaml tracking

View File

@ -0,0 +1,30 @@
# Sprint Status Recovery Workflow
name: recover-sprint-status
description: "Recover sprint-status.yaml when tracking has drifted. Analyzes story files, git commits, and autonomous reports to rebuild accurate status."
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
output_folder: "{config_source}:output_folder"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/recover-sprint-status"
instructions: "{installed_path}/instructions.md"
# Inputs
variables:
sprint_status_file: "{implementation_artifacts}/sprint-status.yaml"
story_directory: "{implementation_artifacts}"
recovery_mode: "interactive" # Options: interactive, conservative, aggressive
# Recovery script location
recovery_script: "{project-root}/scripts/recover-sprint-status.sh"
# Standalone so IDE commands get generated
standalone: true
# No web bundle needed
web_bundle: false

View File

@ -0,0 +1,158 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This validates EVERY epic in the project - comprehensive health check</critical>
<step n="1" goal="Discover all epics">
<action>Load {{sprint_status_file}}</action>
<check if="file not found">
<output>❌ sprint-status.yaml not found
Run /bmad:bmm:workflows:sprint-planning first.
</output>
<action>HALT</action>
</check>
<action>Parse development_status section</action>
<action>Extract all epic keys (entries starting with "epic-")</action>
<action>Filter out retrospectives (ending with "-retrospective")</action>
<action>Store as {{epic_list}}</action>
<output>🔍 **Comprehensive Epic Validation**
Found {{epic_count}} epics to validate:
{{#each epic_list}}
- {{this}}
{{/each}}
Starting validation...
</output>
</step>
<step n="2" goal="Validate each epic">
<critical>Run validate-epic-status for EACH epic</critical>
<action>Initialize counters:
- total_stories_scanned = 0
- total_valid_stories = 0
- total_invalid_stories = 0
- total_updates_applied = 0
- epics_validated = []
</action>
<loop foreach="{{epic_list}}">
<action>Set {{current_epic}} = current loop item</action>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Validating {{current_epic}}...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
</output>
<!-- Use Python script for validation logic -->
<action>Execute validation script:
python3 scripts/lib/sprint-status-updater.py --epic {{current_epic}} --mode validate
</action>
<action>Parse script output:
- Story count
- Valid/invalid/missing counts
- Inferred statuses
- Updates needed
</action>
<check if="{{validation_mode}} == fix">
<action>Execute fix script:
python3 scripts/lib/sprint-status-updater.py --epic {{current_epic}} --mode fix
</action>
<action>Count updates applied</action>
<action>Add to total_updates_applied</action>
</check>
<action>Store validation results for {{current_epic}}</action>
<action>Increment totals</action>
<output>✓ {{current_epic}}: {{story_count}} stories, {{valid_count}} valid, {{updates_applied}} updates
</output>
</loop>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All Epics Validated
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
</output>
</step>
<step n="3" goal="Consolidate and report">
<output>
📊 **COMPREHENSIVE VALIDATION RESULTS**
**Epics Validated:** {{epic_count}}
**Stories Analyzed:** {{total_stories_scanned}}
Valid: {{total_valid_stories}} (>=10KB, >=5 tasks)
Invalid: {{total_invalid_stories}} (<10KB or <5 tasks)
Missing: {{total_missing_files}}
**Updates Applied:** {{total_updates_applied}}
**Epic Status Summary:**
{{#each_epic_with_status}}
{{epic_key}}: {{status}} ({{done_count}}/{{total_count}} done)
{{/each}}
**Top Issues:**
{{#if_invalid_stories_exist}}
⚠️ {{total_invalid_stories}} stories need regeneration (/create-story)
{{/if}}
{{#if_missing_files_exist}}
⚠️ {{total_missing_files}} story files missing (create or remove from sprint-status.yaml)
{{/if}}
{{#if_conflicting_evidence}}
⚠️ {{conflict_count}} stories have conflicting evidence (manual review)
{{/if}}
**Health Score:** {{health_score}}/100
(100 = perfect, all stories valid with correct status)
</output>
<action>Write comprehensive report to {{default_output_file}}</action>
<output>💾 Full report: {{default_output_file}}</output>
</step>
<step n="4" goal="Provide actionable recommendations">
<output>
🎯 **RECOMMENDED ACTIONS**
{{#if_health_score_lt_80}}
**Priority 1: Fix Invalid Stories ({{total_invalid_stories}})**
{{#each_invalid_story}}
/create-story-with-gap-analysis # Regenerate {{story_id}}
{{/each}}
{{/if}}
{{#if_missing_files_gt_0}}
**Priority 2: Create Missing Story Files ({{total_missing_files}})**
{{#each_missing}}
/create-story # Create {{story_id}}
{{/each}}
{{/if}}
{{#if_health_score_gte_80}}
✅ **Sprint status is healthy!**
Continue with normal development:
/sprint-status # Check what's next
{{/if}}
**Maintenance:**
- Run /validate-all-epics weekly to catch drift
- After autonomous work, run validation
- Before sprint reviews, validate status accuracy
</output>
</step>
</workflow>

View File

@ -0,0 +1,30 @@
name: validate-all-epics
description: "Validate and fix sprint-status.yaml for ALL epics. Runs validate-epic-status on every epic in parallel, consolidates results, rebuilds accurate sprint-status.yaml."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-all-epics"
instructions: "{installed_path}/instructions.xml"
# Variables
variables:
sprint_status_file: "{implementation_artifacts}/sprint-status.yaml"
validation_mode: "fix" # Options: "report-only", "fix"
parallel_validation: true # Validate epics in parallel for speed
# Sub-workflow
validate_epic_workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-epic-status/workflow.yaml"
# Output
default_output_file: "{story_dir}/.all-epics-validation-report.md"
standalone: true
web_bundle: false

View File

@ -0,0 +1,338 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This is the COMPREHENSIVE AUDIT - validates all stories using Haiku agents</critical>
<critical>Cost: ~$76 for 511 stories with Haiku (vs $793 with Sonnet)</critical>
<step n="1" goal="Discover all story files">
<action>Find all .md files in {{story_dir}}</action>
<action>Filter out meta-documents:
- Files starting with "EPIC-" (completion reports)
- Files starting with "." (progress files)
- Files containing: COMPLETION, SUMMARY, REPORT, SESSION-, REVIEW-, README, INDEX
- Files like "atdd-checklist-", "gap-analysis-", "review-"
</action>
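<!-- One way to express the filter (the substring list mirrors the bullets above; the helper
     name is an assumption):
       from pathlib import Path
       SKIP_SUBSTRINGS = ("COMPLETION", "SUMMARY", "REPORT", "SESSION-", "REVIEW-",
                          "README", "INDEX", "atdd-checklist-", "gap-analysis-", "review-")
       def is_story_file(p: Path) -> bool:
           name = p.name
           if not name.endswith(".md") or name.startswith(("EPIC-", ".")):
               return False
           return not any(s in name for s in SKIP_SUBSTRINGS)
-->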
<check if="{{epic_filter}} provided">
<action>Filter to stories matching: {{epic_filter}}-*.md</action>
</check>
<action>Store as {{story_list}}</action>
<action>Count {{story_count}}</action>
<output>🔍 **Comprehensive Story Audit**
{{#if epic_filter}}**Epic Filter:** {{epic_filter}}{{else}}**Scope:** All epics{{/if}}
**Stories to Validate:** {{story_count}}
**Agent Model:** Haiku 4.5
**Batch Size:** {{batch_size}}
**Estimated Cost:** ~${{estimated_cost}} ({{story_count}} × $0.15/story)
**Estimated Time:** {{estimated_hours}} hours
Starting batch validation...
</output>
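<!-- The estimates above are simple arithmetic. Rates come from this workflow's own notes
     ($0.15/story, batch_size, concurrent_limit, pause_between_batches); the 3.5 min/story
     figure is an assumed midpoint of the documented 2-5 minute range:
       import math
       def estimate(story_count, batch_size=5, concurrent=5, pause_s=30,
                    cost_per_story=0.15, minutes_per_story=3.5):
           batches = math.ceil(story_count / batch_size)
           cost = story_count * cost_per_story
           hours = (story_count * minutes_per_story / concurrent + batches * pause_s / 60) / 60
           return round(cost, 2), round(hours, 1)
-->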
</step>
<step n="2" goal="Batch validate all stories">
<action>Initialize counters:
- stories_validated = 0
- verified_complete = 0
- needs_rework = 0
- false_positives = 0
- in_progress = 0
- total_false_positive_tasks = 0
- total_critical_issues = 0
</action>
<action>Split {{story_list}} into batches of {{batch_size}}</action>
<loop foreach="{{batches}}">
<action>Set {{current_batch}} = current batch</action>
<action>Set {{batch_number}} = loop index + 1</action>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Batch {{batch_number}}/{{total_batches}} ({{batch_size}} stories)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
</output>
<!-- Validate each story in batch -->
<loop foreach="{{current_batch}}">
<action>Set {{story_file}} = current story path</action>
<action>Extract {{story_id}} from filename</action>
<output>{{stories_validated + 1}}/{{story_count}}: Validating {{story_id}}...</output>
<!-- Invoke validate-story-deep workflow -->
<invoke-workflow path="{{validate_story_workflow}}">
<input name="story_file" value="{{story_file}}" />
</invoke-workflow>
<action>Parse validation results:
- category (VERIFIED_COMPLETE, FALSE_POSITIVE, etc.)
- verification_score
- false_positive_count
- false_negative_count
- critical_issues_count
</action>
<action>Store results for {{story_id}}</action>
<action>Increment counters based on category</action>
<output> → {{category}} (Score: {{verification_score}}/100{{#if false_positives > 0}}, {{false_positives}} false positives{{/if}})</output>
<action>Increment stories_validated</action>
</loop>
<output>Batch {{batch_number}} complete. {{stories_validated}}/{{story_count}} total validated.</output>
<!-- Save progress after each batch -->
<action>Write progress to {{progress_file}}:
- stories_validated
- current_batch
- results_so_far
</action>
</loop>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All Stories Validated
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**Total Validated:** {{story_count}}
**Total Tasks Checked:** {{total_tasks_verified}}
</output>
</step>
<step n="3" goal="Consolidate results and calculate platform health">
<action>Calculate platform-wide metrics:
- Overall health score: (verified_complete / story_count) × 100
- False positive rate: (false_positive_stories / story_count) × 100
- Total rework estimate: false_positive_stories × 3h + needs_rework × 2h
</action>
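<!-- The metric formulas above as plain Python (weights of 3h and 2h exactly as stated):
       def platform_metrics(verified, false_pos, rework, total):
           return {
               "health_score": round(verified / total * 100, 1) if total else 0.0,
               "false_positive_rate": round(false_pos / total * 100, 1) if total else 0.0,
               "rework_hours": false_pos * 3 + rework * 2,
           }
-->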
<action>Group results by epic</action>
<action>Identify worst offenders (highest false positive rates)</action>
<output>
📊 **PLATFORM HEALTH ASSESSMENT**
**Overall Health Score:** {{health_score}}/100
**Story Categories:**
- ✅ VERIFIED_COMPLETE: {{verified_complete}} ({{verified_complete_pct}}%)
- ⚠️ NEEDS_REWORK: {{needs_rework}} ({{needs_rework_pct}}%)
- ❌ FALSE_POSITIVES: {{false_positives}} ({{false_positives_pct}}%)
- 🔄 IN_PROGRESS: {{in_progress}} ({{in_progress_pct}}%)
**Task-Level Issues:**
- False positive tasks: {{total_false_positive_tasks}}
- CRITICAL code quality issues: {{total_critical_issues}}
**Estimated Rework:** {{total_rework_hours}} hours
**Epic Breakdown:**
{{#each epic_summary}}
- Epic {{this.epic}}: {{this.health_score}}/100 ({{this.false_positives}} false positives)
{{/each}}
**Worst Offenders (Most False Positives):**
{{#each worst_offenders limit=10}}
- {{this.story_id}}: {{this.false_positive_count}} tasks, score {{this.score}}/100
{{/each}}
</output>
</step>
<step n="4" goal="Generate comprehensive audit report">
<template-output>
# Comprehensive Platform Audit Report
**Generated:** {{date}}
**Stories Validated:** {{story_count}}
**Agent Model:** Haiku 4.5
**Total Cost:** ~${{actual_cost}}
---
## Executive Summary
**Platform Health Score:** {{health_score}}/100
{{#if health_score >= 90}}
✅ **EXCELLENT** - Platform is production-ready with high confidence
{{else if health_score >= 75}}
⚠️ **GOOD** - Minor issues to address, generally solid
{{else if health_score >= 60}}
⚠️ **NEEDS WORK** - Significant rework required before production
{{else}}
❌ **CRITICAL** - Major quality issues found, not production-ready
{{/if}}
**Key Findings:**
- {{verified_complete}} stories verified complete ({{verified_complete_pct}}%)
- {{false_positives}} stories are false positives ({{false_positives_pct}}%)
- {{total_false_positive_tasks}} tasks claimed done but not implemented
- {{total_critical_issues}} CRITICAL code quality issues found
---
## ❌ False Positive Stories ({{false_positives}} total)
**These stories are marked "done" but have significant missing/stubbed code:**
{{#each false_positive_stories}}
### {{this.story_id}} (Score: {{this.score}}/100)
**Current Status:** {{this.current_status}}
**Should Be:** in-progress or ready-for-dev
**Missing/Stubbed:**
{{#each this.false_positive_tasks}}
- {{this.task}}
- {{this.evidence}}
{{/each}}
**Estimated Fix:** {{this.estimated_hours}}h
---
{{/each}}
**Total Rework:** {{false_positive_rework_hours}} hours
---
## ⚠️ Stories Needing Rework ({{needs_rework}} total)
{{#each needs_rework_stories}}
### {{this.story_id}} (Score: {{this.score}}/100)
**Issues:**
- {{this.false_positive_count}} incomplete tasks
- {{this.critical_issues}} CRITICAL quality issues
- {{this.high_issues}} HIGH priority issues
**Top Issues:**
{{#each this.top_issues limit=5}}
- {{this}}
{{/each}}
---
{{/each}}
**Total Rework:** {{needs_rework_hours}} hours
---
## ✅ Verified Complete Stories ({{verified_complete}} total)
**These stories are production-ready with verified code:**
{{#each verified_complete_stories}}
- {{this.story_id}} ({{this.score}}/100)
{{/each}}
---
## 📊 Epic Health Breakdown
{{#each epic_summary}}
### Epic {{this.epic}}
**Stories:** {{this.total}}
**Verified Complete:** {{this.verified}} ({{this.verified_pct}}%)
**False Positives:** {{this.false_positives}}
**Needs Rework:** {{this.needs_rework}}
**Health Score:** {{this.health_score}}/100
{{#if this.health_score < 70}}
⚠️ **ATTENTION NEEDED** - This epic has quality issues
{{/if}}
**Top Issues:**
{{#each this.top_issues limit=3}}
- {{this}}
{{/each}}
---
{{/each}}
---
## 🎯 Recommended Action Plan
### Phase 1: Fix False Positives (CRITICAL - {{false_positive_rework_hours}}h)
{{#each false_positive_stories limit=20}}
{{@index + 1}}. **{{this.story_id}}** ({{this.estimated_hours}}h)
- {{this.false_positive_count}} tasks to implement
- Update status to in-progress
{{/each}}
{{#if false_positives > 20}}
... and {{false_positives - 20}} more (see full list above)
{{/if}}
### Phase 2: Address Rework Items (HIGH - {{needs_rework_hours}}h)
{{#each needs_rework_stories limit=10}}
{{@index + 1}}. **{{this.story_id}}** ({{this.estimated_hours}}h)
- Fix {{this.critical_issues}} CRITICAL issues
- Complete {{this.false_positive_count}} tasks
{{/each}}
### Phase 3: Fix False Negatives (LOW - batch update)
- {{total_false_negative_tasks}} unchecked tasks that are actually complete
- Can batch update checkboxes (low priority)
---
## 💰 Audit Cost Analysis
**This Validation Run:**
- Stories validated: {{story_count}}
- Agent sessions: {{story_count}} (one Haiku agent per story)
- Tokens used: ~{{tokens_used_millions}}M
- Cost: ~${{actual_cost}}
**Remediation Cost:**
- Estimated hours: {{total_rework_hours}}h
- At AI velocity: {{ai_velocity_days}} days of work
- Token cost: ~${{remediation_token_cost}}
**Total Investment:** ${{actual_cost}} (audit) + ${{remediation_token_cost}} (fixes) = ${{total_cost}}
---
## 📅 Next Steps
1. **Immediate:** Fix {{false_positives}} false positive stories
2. **This Week:** Address {{total_critical_issues}} CRITICAL issues
3. **Next Week:** Rework {{needs_rework}} stories
4. **Ongoing:** Re-validate fixed stories to confirm
**Commands:**
```bash
# Validate specific story
/validate-story-deep docs/sprint-artifacts/16e-6-ecs-task-definitions-tier3.md
# Validate specific epic
/validate-all-stories-deep --epic 16e
# Re-run full audit (after fixes)
/validate-all-stories-deep
```
---
**Report Generated By:** validate-all-stories-deep workflow
**Validation Method:** LLM-powered (Haiku 4.5 agents read actual code)
**Confidence Level:** Very High (code-based verification, not regex patterns)
</template-output>
</step>
</workflow>

View File

@ -0,0 +1,36 @@
name: validate-all-stories-deep
description: "Comprehensive platform audit using Haiku agents. Validates ALL stories by reading actual code. The bulletproof validation for production readiness."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-all-stories-deep"
instructions: "{installed_path}/instructions.xml"
# Input variables
variables:
epic_filter: "" # Optional: Only validate specific epic (e.g., "16e")
batch_size: 5 # Validate 5 stories at a time (prevents spawning 511 agents at once!)
concurrent_limit: 5 # Max 5 agents running concurrently
auto_fix: false # If true, auto-update statuses based on validation
pause_between_batches: 30 # Seconds to wait between batches (rate limiting)
# Sub-workflow
validate_story_workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-story-deep/workflow.yaml"
# Agent configuration
agent_model: "haiku" # Cost: ~$66 for 511 stories vs $793 with Sonnet
# Output
default_output_file: "{story_dir}/.comprehensive-audit-{date}.md"
progress_file: "{story_dir}/.validation-progress-{date}.yaml"
standalone: true
web_bundle: false

View File

@ -0,0 +1,411 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This is the COMPREHENSIVE AUDIT - validates every story's tasks against actual codebase</critical>
<step n="1" goal="Discover and categorize stories">
<action>Find all story files in {{story_dir}}</action>
<action>Filter out meta-documents:
- Files starting with "EPIC-" (completion reports)
- Files with "COMPLETION", "SUMMARY", "REPORT" in name
- Files starting with "." (hidden progress files)
- Files like "README", "INDEX", "SESSION-", "REVIEW-"
</action>
<check if="{{epic_filter}} provided">
<action>Filter to stories starting with {{epic_filter}}- (e.g., "16e-")</action>
</check>
<action>Store as {{story_list}}</action>
<action>Count {{story_count}}</action>
<output>🔍 **Comprehensive Story Validation**
{{#if epic_filter}}
**Epic Filter:** {{epic_filter}} only
{{/if}}
**Stories to Validate:** {{story_count}}
**Validation Depth:** {{validation_depth}}
**Parallel Mode:** {{parallel_validation}}
**Estimated Time:** {{estimated_minutes}} minutes
**Estimated Cost:** ~${{estimated_cost}} ({{story_count}} × ~$0.50/story)
This will:
1. Verify all tasks against actual codebase (task-verification-engine.py)
2. Run code quality reviews on files with issues
3. Check for regressions and integration failures
4. Categorize stories: VERIFIED_COMPLETE, NEEDS_REWORK, FALSE_POSITIVE, etc.
5. Generate comprehensive audit report
Starting validation...
</output>
</step>
<step n="2" goal="Run task verification on all stories">
<action>Initialize counters:
- stories_validated = 0
- verified_complete = 0
- needs_rework = 0
- false_positives = 0
- in_progress = 0
- total_false_positive_tasks = 0
- total_tasks_verified = 0
</action>
<loop foreach="{{story_list}}">
<action>Set {{current_story}} = current story file</action>
<action>Extract {{story_id}} from filename</action>
<output>Validating {{counter}}/{{story_count}}: {{story_id}}...</output>
<!-- Run task verification engine -->
<action>Execute: python3 {{task_verification_script}} {{current_story}}</action>
<action>Parse output:
- total_tasks
- checked_tasks
- false_positives
- false_negatives
- verification_score
- task_details (with evidence)
</action>
<action>Categorize story:
IF verification_score >= 95 AND false_positives == 0
→ category = "VERIFIED_COMPLETE"
ELSE IF verification_score >= 80 AND false_positives <= 2
→ category = "COMPLETE_WITH_MINOR_ISSUES"
ELSE IF false_positives > 5 OR verification_score < 50
→ category = "FALSE_POSITIVE" (claimed done but missing code)
ELSE IF verification_score < 80
→ category = "NEEDS_REWORK"
ELSE IF checked_tasks == 0
→ category = "NOT_STARTED"
ELSE
→ category = "IN_PROGRESS"
</action>
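<!-- The categorization above as straight-line Python (branch order exactly as listed):
       def categorize(score, false_positives, checked_tasks):
           if score >= 95 and false_positives == 0:
               return "VERIFIED_COMPLETE"
           if score >= 80 and false_positives <= 2:
               return "COMPLETE_WITH_MINOR_ISSUES"
           if false_positives > 5 or score < 50:
               return "FALSE_POSITIVE"
           if score < 80:
               return "NEEDS_REWORK"
           if checked_tasks == 0:
               return "NOT_STARTED"
           return "IN_PROGRESS"
-->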
<action>Store result:
- story_id
- verification_score
- category
- false_positive_count
- false_negative_count
- current_status (from sprint-status.yaml)
- recommended_status
</action>
<action>Increment counters based on category</action>
<action>Add false_positive_count to total</action>
<action>Add total_tasks to total_tasks_verified</action>
<output> → {{category}} ({{verification_score}}/100, {{false_positives}} false positives)</output>
</loop>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Validation Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**Stories Validated:** {{story_count}}
**Total Tasks Verified:** {{total_tasks_verified}}
**Total False Positives:** {{total_false_positive_tasks}}
</output>
</step>
<step n="3" goal="Code quality review on problem stories" if="{{validation_depth}} == deep OR comprehensive">
<action>Filter stories where:
- category = "FALSE_POSITIVE" OR
- category = "NEEDS_REWORK" OR
- false_positives > 3
</action>
<action>Count {{problem_story_count}}</action>
<check if="{{problem_story_count}} > 0">
<output>
🛡️ **Code Quality Review**
Found {{problem_story_count}} stories with quality issues.
Running multi-agent review on files from these stories...
</output>
<loop foreach="{{problem_stories}}">
<action>Extract file list from story Dev Agent Record</action>
<check if="files exist">
<action>Run /multi-agent-review on files:
- Security audit
- Silent failure detection
- Architecture compliance
- Type safety check
</action>
<action>Categorize review findings by severity</action>
<action>Add to story's issue list</action>
</check>
</loop>
</check>
<check if="{{problem_story_count}} == 0">
<output>✅ No problem stories found - all code quality looks good!</output>
</check>
</step>
<step n="4" goal="Integration verification" if="{{validation_depth}} == comprehensive">
<output>
🔗 **Integration Verification**
Checking for regressions and broken dependencies...
</output>
<action>For stories marked "VERIFIED_COMPLETE":
1. Extract service dependencies from story
2. Check if dependent services still exist
3. Run integration tests if they exist
4. Check for API contract breaking changes
</action>
<action>Detect overlaps:
- Multiple stories implementing same feature
- Duplicate files created
- Conflicting implementations
</action>
<output>
**Regressions Found:** {{regression_count}}
**Overlaps Detected:** {{overlap_count}}
**Integration Tests:** {{integration_tests_run}} ({{integration_tests_passing}} passing)
</output>
</step>
<step n="5" goal="Generate comprehensive report">
<template-output>
# Comprehensive Story Validation Report
**Generated:** {{date}}
**Stories Validated:** {{story_count}}
**Validation Depth:** {{validation_depth}}
**Epic Filter:** {{epic_filter}} {{#if_no_filter}}(all epics){{/if}}
---
## Executive Summary
**Overall Health Score:** {{overall_health_score}}/100
**Story Categories:**
- ✅ **VERIFIED_COMPLETE:** {{verified_complete}} ({{verified_complete_pct}}%)
- ⚠️ **NEEDS_REWORK:** {{needs_rework}} ({{needs_rework_pct}}%)
- ❌ **FALSE_POSITIVES:** {{false_positives}} ({{false_positives_pct}}%)
- 🔄 **IN_PROGRESS:** {{in_progress}} ({{in_progress_pct}}%)
- 📋 **NOT_STARTED:** {{not_started}} ({{not_started_pct}}%)
**Task Verification:**
- Total tasks verified: {{total_tasks_verified}}
- False positive tasks: {{total_false_positive_tasks}} ({{false_positive_rate}}%)
- False negative tasks: {{total_false_negative_tasks}}
**Code Quality:**
- CRITICAL issues: {{critical_issues_total}}
- HIGH issues: {{high_issues_total}}
- Files reviewed: {{files_reviewed}}
---
## ❌ False Positive Stories (Claimed Done, Not Implemented)
{{#each false_positive_stories}}
### {{this.story_id}} (Score: {{this.verification_score}}/100)
**Current Status:** {{this.current_status}}
**Recommended:** in-progress or ready-for-dev
**Issues:**
{{#each this.false_positive_tasks}}
- [ ] {{this.task}}
- Evidence: {{this.evidence}}
{{/each}}
**Action Required:**
- Uncheck {{this.false_positive_count}} tasks
- Implement missing code
- Update sprint-status.yaml to in-progress
{{/each}}
**Total:** {{false_positive_stories_count}} stories
---
## ⚠️ Stories Needing Rework
{{#each needs_rework_stories}}
### {{this.story_id}} (Score: {{this.verification_score}}/100)
**Issues:**
- {{this.false_positive_count}} false positive tasks
- {{this.critical_issue_count}} CRITICAL code quality issues
- {{this.high_issue_count}} HIGH priority issues
**Recommended:**
1. Fix CRITICAL issues first
2. Implement {{this.false_positive_count}} missing tasks
3. Re-run validation
{{/each}}
**Total:** {{needs_rework_count}} stories
---
## ✅ Verified Complete Stories
{{#each verified_complete_stories}}
- {{this.story_id}} ({{this.verification_score}}/100)
{{/each}}
**Total:** {{verified_complete_count}} stories (production-ready)
---
## 📊 Epic Breakdown
{{#each epic_summary}}
### Epic {{this.epic_num}}
**Stories:** {{this.total_count}}
**Verified Complete:** {{this.verified_count}} ({{this.verified_pct}}%)
**False Positives:** {{this.false_positive_count}}
**Needs Rework:** {{this.needs_rework_count}}
**Health Score:** {{this.health_score}}/100
{{/each}}
---
## 🎯 Recommended Actions
### Immediate (CRITICAL)
{{#if false_positive_stories_count > 0}}
**Fix {{false_positive_stories_count}} False Positive Stories:**
{{#each false_positive_stories limit=10}}
1. {{this.story_id}}: Update status to in-progress, implement {{this.false_positive_count}} missing tasks
{{/each}}
{{#if false_positive_stories_count > 10}}
... and {{false_positive_stories_count - 10}} more (see full list above)
{{/if}}
{{/if}}
### Short-term (HIGH Priority)
{{#if needs_rework_count > 0}}
**Address {{needs_rework_count}} Stories Needing Rework:**
- Fix {{critical_issues_total}} CRITICAL code quality issues
- Implement missing tasks
- Re-validate after fixes
{{/if}}
### Maintenance (MEDIUM Priority)
{{#if false_negative_count > 0}}
**Update {{false_negative_count}} False Negative Tasks:**
- Mark complete (code exists but checkbox unchecked)
- Low impact, can batch update
{{/if}}
---
## 💰 Cost Analysis
**Validation Run:**
- Stories validated: {{story_count}}
- API tokens used: ~{{tokens_used}}K
- Cost: ~${{cost}}
**Remediation Estimate:**
- False positives: {{false_positive_stories_count}} × 3h = {{remediation_hours_fp}}h
- Needs rework: {{needs_rework_count}} × 2h = {{remediation_hours_rework}}h
- **Total:** {{total_remediation_hours}}h estimated work
---
## 📅 Next Steps
1. **Fix false positive stories** ({{false_positive_stories_count}} stories)
2. **Address CRITICAL issues** ({{critical_issues_total}} issues)
3. **Re-run validation** on fixed stories
4. **Update sprint-status.yaml** with verified statuses
5. **Run weekly validation** to prevent future drift
---
**Generated by:** /validate-all-stories workflow
**Validation Engine:** task-verification-engine.py v2.0
**Multi-Agent Review:** {{multi_agent_review_enabled}}
</template-output>
</step>
<step n="6" goal="Auto-fix if enabled" if="{{fix_mode}} == true">
<output>
🔧 **Auto-Fix Mode Enabled**
Applying automatic fixes:
1. Update false negative checkboxes (code exists → mark [x])
2. Update sprint-status.yaml with verified statuses
3. Add validation scores to story files
</output>
<loop foreach="{{false_negative_tasks_list}}">
<action>Update story file: Change [ ] to [x] for verified tasks</action>
<output> ✓ {{story_id}}: Checked {{task_count}} false negative tasks</output>
</loop>
<loop foreach="{{status_updates_list}}">
<action>Update sprint-status.yaml using sprint-status-updater.py</action>
<output> ✓ {{story_id}}: {{old_status}} → {{new_status}}</output>
</loop>
<output>
✅ Auto-fix complete
- {{false_negatives_fixed}} tasks checked
- {{statuses_updated}} story statuses updated
</output>
</step>
<step n="7" goal="Summary and recommendations">
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
COMPREHENSIVE VALIDATION COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**Overall Health:** {{overall_health_score}}/100
{{#if overall_health_score >= 90}}
✅ **EXCELLENT** - Platform is production-ready
{{else if overall_health_score >= 75}}
⚠️ **GOOD** - Minor issues to address before production
{{else if overall_health_score >= 60}}
⚠️ **NEEDS WORK** - Significant rework required
{{else}}
❌ **CRITICAL** - Major quality issues found
{{/if}}
**Top Priorities:**
1. Fix {{false_positive_stories_count}} false positive stories
2. Address {{critical_issues_total}} CRITICAL code quality issues
3. Complete {{in_progress_count}} in-progress stories
4. Re-validate after fixes
**Full Report:** {{default_output_file}}
**Summary JSON:** {{validation_summary_file}}
**Next Command:**
/validate-story <story-id> # Deep-dive on specific story
/validate-all-stories --epic 16e # Re-validate specific epic
</output>
</step>
</workflow>

View File

@ -0,0 +1,36 @@
name: validate-all-stories
description: "Comprehensive audit of ALL stories: verify tasks against codebase, run code quality reviews, check integrations. The bulletproof audit for production readiness."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-all-stories"
instructions: "{installed_path}/instructions.xml"
# Input variables
variables:
validation_depth: "deep" # Options: "quick" (tasks only), "deep" (tasks + review), "comprehensive" (full integration)
parallel_validation: true # Run story validations in parallel for speed
fix_mode: false # If true, auto-fix false negatives and update statuses
epic_filter: "" # Optional: Only validate stories from specific epic (e.g., "16e")
# Tools
task_verification_script: "{project-root}/scripts/lib/task-verification-engine.py"
sprint_status_updater: "{project-root}/scripts/lib/sprint-status-updater.py"
# Sub-workflow
validate_story_workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-story/workflow.yaml"
# Output
default_output_file: "{story_dir}/.comprehensive-validation-report-{date}.md"
validation_summary_file: "{story_dir}/.validation-summary-{date}.json"
standalone: true
web_bundle: false

View File

@ -0,0 +1,302 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This is VALIDATION-ONLY mode - NO implementation, only status correction</critical>
<critical>Uses same logic as autonomous-epic but READS instead of WRITES code</critical>
<step n="1" goal="Validate inputs and load epic">
<action>Check if {{epic_num}} was provided</action>
<check if="{{epic_num}} is empty">
<ask>Which epic should I validate? (e.g., 19, 16d, 16e, 9b)</ask>
<action>Store response as {{epic_num}}</action>
</check>
<action>Load {{sprint_status_file}}</action>
<check if="file not found">
<output>❌ sprint-status.yaml not found at: {{sprint_status_file}}
Run /bmad:bmm:workflows:sprint-planning to create it first.
</output>
<action>HALT</action>
</check>
<action>Search for epic-{{epic_num}} entry in sprint_status_file</action>
<action>Extract all story entries for epic-{{epic_num}} (pattern: {{epic_num}}-*)</action>
<action>Count stories found in sprint-status.yaml for this epic</action>
<output>🔍 **Validating Epic {{epic_num}}**
Found {{story_count}} stories in sprint-status.yaml
Scanning story files for REALITY check...
</output>
</step>
<step n="2" goal="Scan and validate all story files">
<critical>This is where we determine TRUTH - not from status fields, but from actual file analysis</critical>
<action>For each story in epic (from sprint-status.yaml):
1. Build story file path: {{story_dir}}/{{story_key}}.md
2. Check if file exists
3. If exists, read FULL file
4. Analyze file content
</action>
<action>For each story file, extract:
- File size in KB
- Total task count (count all "- [ ]" and "- [x]" lines)
- Checked task count (count "- [x]" lines)
- Completion rate (checked / total * 100)
- Explicit Status: field (if present)
- Has proper BMAD structure (12 sections)
- Section count (count ## headings)
</action>
<output>📊 **Story File Quality Analysis**
Analyzing {{story_count}} story files...
</output>
<action>For each story, classify quality:
VALID:
- File size >= 10KB
- Total tasks >= 5
- Has task list structure
INVALID:
- File size < 10KB (incomplete story)
- Total tasks < 5 (not detailed enough)
- File missing entirely
</action>
<action>Store results as {{story_quality_map}}</action>
<output>Quality Summary:
Valid stories: {{valid_count}}/{{story_count}}
Invalid stories: {{invalid_count}}
Missing files: {{missing_count}}
</output>
</step>
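
The counting and classification above can be sketched as a small helper. This is a minimal illustration, not part of the workflow: it assumes story files use standard "- [ ]" / "- [x]" checkboxes, and the 10KB and 5-task thresholds mirror the validation_rules defined later in this workflow's workflow.yaml.

```python
import re
from pathlib import Path

def analyze_story_file(path: Path) -> dict:
    """Classify a story file as VALID/INVALID/MISSING using the step 2 thresholds
    (assumed: >= 10KB and >= 5 tasks)."""
    if not path.exists():
        return {"story": path.stem, "quality": "MISSING"}

    text = path.read_text(encoding="utf-8")
    size_kb = path.stat().st_size / 1024
    checked = len(re.findall(r"^\s*- \[x\]", text, re.MULTILINE | re.IGNORECASE))
    unchecked = len(re.findall(r"^\s*- \[ \]", text, re.MULTILINE))
    total = checked + unchecked
    sections = len(re.findall(r"^## ", text, re.MULTILINE))

    valid = size_kb >= 10 and total >= 5
    return {
        "story": path.stem,
        "size_kb": round(size_kb, 1),
        "total_tasks": total,
        "checked_tasks": checked,
        "completion_rate": round(checked / total * 100, 1) if total else 0.0,
        "section_count": sections,
        "quality": "VALID" if valid else "INVALID",
    }
```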
<step n="3" goal="Cross-reference git commits">
<action>Run git log to find commits mentioning epic stories:
Command: git log --oneline --since="{{git_commit_lookback_days}} days ago"
</action>
<action>Parse commit messages for story IDs matching pattern: {{epic_num}}-\d+[a-z]?</action>
<action>Build map of story_id → commit_count</action>
<output>Git Commit Evidence:
Stories with commits: {{stories_with_commits_count}}
Stories without commits: {{stories_without_commits_count}}
</output>
</step>
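
A hedged sketch of the git cross-reference: shell out to git log and count commits whose subjects mention a story ID for the epic. The assumption that commit subjects contain IDs like "16e-4" follows the regex pattern in step 3; adjust it if your commit conventions differ.

```python
import re
import subprocess
from collections import Counter

def commits_per_story(epic_num: str, lookback_days: int = 30) -> Counter:
    """Count recent commits whose one-line subject mentions a story ID for this epic."""
    log = subprocess.run(
        ["git", "log", "--oneline", f"--since={lookback_days} days ago"],
        capture_output=True, text=True, check=True,
    ).stdout
    pattern = re.compile(rf"\b{re.escape(epic_num)}-\d+[a-z]?\b")
    counts: Counter = Counter()
    for line in log.splitlines():
        for story_id in set(pattern.findall(line)):
            counts[story_id] += 1
    return counts
```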
<step n="4" goal="Check autonomous completion reports">
<action>Search {{story_dir}} for files:
- .epic-{{epic_num}}-completion-report.md
- .autonomous-epic-{{epic_num}}-progress.yaml
</action>
<check if="autonomous report found">
<action>Parse completed_stories list from progress file OR
Parse ✅ story entries from completion report</action>
<action>Store as {{autonomous_completed_stories}}</action>
<output>📋 Autonomous Report Found:
{{autonomous_completed_count}} stories marked complete
</output>
</check>
<check if="no autonomous report">
<output>ℹ️ No autonomous completion report found (manual epic)</output>
</check>
</step>
<step n="5" goal="Infer correct status for each story">
<critical>Use MULTIPLE sources of truth, not just Status: field</critical>
<action>For each story in epic, determine correct status using this logic:</action>
<logic>
Priority 1: Autonomous completion report
IF story in autonomous_completed_stories
→ Status = "done" (VERY HIGH confidence)
Priority 2: Task completion rate + file quality
IF completion_rate >= 90% AND file is VALID (>10KB, >5 tasks)
→ Status = "done" (HIGH confidence)
IF completion_rate 50-89% AND file is VALID
→ Status = "in-progress" (MEDIUM confidence)
IF completion_rate < 50% AND file is VALID
→ Status = "ready-for-dev" (MEDIUM confidence)
Priority 3: Explicit Status: field (if no other evidence)
IF Status: field exists AND matches above inferences
→ Use it (MEDIUM confidence)
IF Status: field conflicts with task completion
→ Prefer task completion (tasks are ground truth)
Priority 4: Git commits (supporting evidence)
IF 3+ commits AND task completion >= 90%
→ Upgrade confidence to VERY HIGH
IF 1-2 commits but task completion < 50%
→ Status = "in-progress" (work started but not done)
Quality Gates:
IF file size < 10KB OR total tasks < 5
→ DOWNGRADE status (can't be "done" if file is incomplete)
→ Mark as "ready-for-dev" (story needs proper creation)
→ Flag for regeneration with /create-story
Missing Files:
IF story file doesn't exist
→ Status = "backlog" (story not created yet)
</logic>
<action>Build map of story_id → inferred_status with evidence and confidence</action>
<output>📊 **Status Inference Complete**
Stories to update:
{{#each_story_needing_update}}
{{story_id}}:
Current: {{current_status_in_yaml}}
Inferred: {{inferred_status}}
Confidence: {{confidence}}
Evidence: {{evidence_summary}}
Quality: {{file_size_kb}}KB, {{total_tasks}} tasks, {{completion_rate}}% done
{{/each}}
</output>
</step>
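
The priority ladder above reduces to a small decision function. A minimal sketch, assuming the quality map from step 2 and the commit counts from step 3 are already available; the 90%/50% thresholds come from workflow.yaml's validation_rules.

```python
def infer_status(story: dict, in_autonomous_report: bool, commit_count: int) -> tuple[str, str]:
    """Return (status, confidence) for one story, mirroring the priority order in step 5."""
    if story["quality"] == "MISSING":
        return "backlog", "high"
    if story["quality"] == "INVALID":
        # Quality gate: an incomplete file can never be "done".
        return "ready-for-dev", "high"
    if in_autonomous_report:
        return "done", "very high"

    rate = story["completion_rate"]
    if rate >= 90:
        # Git commits only upgrade confidence; they never change the status.
        return "done", "very high" if commit_count >= 3 else "high"
    if rate >= 50:
        return "in-progress", "medium"
    if commit_count >= 1:
        return "in-progress", "medium"  # work started but far from done
    return "ready-for-dev", "medium"
```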
<step n="6" goal="Apply updates or report findings">
<check if="{{validation_mode}} == report-only">
<output>📝 **REPORT-ONLY MODE** - No changes will be made
Recommendations saved to: {{default_output_file}}
</output>
<action>Write detailed report to {{default_output_file}}</action>
<action>EXIT workflow</action>
</check>
<check if="{{validation_mode}} == fix OR {{validation_mode}} == strict">
<output>🔧 **FIX MODE** - Updating sprint-status.yaml...
Backing up to: .sprint-status-backups/
</output>
<action>Create backup of {{sprint_status_file}}</action>
<action>For each story needing update:
1. Find story entry in development_status section
2. Update status to inferred_status
3. Add comment: "✅ Validated {{date}} - {{evidence_summary}}"
4. Preserve all other content and structure
</action>
<action>Update epic-{{epic_num}} status based on story completion:
IF all stories have status "done" AND all are valid files
→ epic status = "done"
IF any stories "in-progress" OR "review"
→ epic status = "in-progress"
IF all stories "backlog" OR "ready-for-dev"
→ epic status = "backlog"
</action>
<action>Update last_verified timestamp in header</action>
<action>Save {{sprint_status_file}}</action>
<output>✅ **sprint-status.yaml Updated**
Applied {{updates_count}} story status corrections
Epic {{epic_num}}: {{old_epic_status}} → {{new_epic_status}}
Backup: {{backup_path}}
</output>
</check>
</step>
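
In fix mode, the epic rollup itself is simple bookkeeping. The sketch below covers only that rollup; the actual YAML edit is left to the workflow (or sprint-status-updater.py) because a naive yaml.dump round-trip would drop the comments and ordering this step promises to preserve. The fallback for a mix of done and not-started stories is an assumption, since step 6 does not spell that case out.

```python
def roll_up_epic_status(story_statuses: list[str]) -> str:
    """Derive the epic status from its stories, as described in step 6."""
    if story_statuses and all(s == "done" for s in story_statuses):
        return "done"
    if any(s in ("in-progress", "review") for s in story_statuses):
        return "in-progress"
    if all(s in ("backlog", "ready-for-dev") for s in story_statuses):
        return "backlog"
    return "in-progress"  # mixed done / not-started: work has effectively begun
```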
<step n="7" goal="Identify problem stories requiring action">
<action>Flag stories with issues:
- Missing story files (in sprint-status.yaml but no .md file)
- Invalid files (< 10KB or < 5 tasks)
- Conflicting evidence (Status: says done, tasks unchecked)
- Poor quality (no BMAD sections)
</action>
<output>⚠️ **Problem Stories Requiring Attention:**
{{#if_missing_files}}
**Missing Files ({{missing_count}}):**
{{#each_missing}}
- {{story_id}}: Referenced in sprint-status.yaml but file not found
Action: Run /create-story OR remove from sprint-status.yaml
{{/each}}
{{/if}}
{{#if_invalid_quality}}
**Invalid Quality ({{invalid_count}}):**
{{#each_invalid}}
- {{story_id}}: {{file_size_kb}}KB, {{total_tasks}} tasks
Action: Regenerate with /create-story-with-gap-analysis
{{/each}}
{{/if}}
{{#if_conflicting_evidence}}
**Conflicting Evidence ({{conflict_count}}):**
{{#each_conflict}}
- {{story_id}}: Status: says "{{status_field}}" but {{completion_rate}}% tasks checked
Action: Manual review recommended
{{/each}}
{{/if}}
</output>
</step>
<step n="8" goal="Report results and recommendations">
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Epic {{epic_num}} Validation Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**Epic Status:** {{epic_status}}
**Stories:**
Done: {{done_count}}
In-Progress: {{in_progress_count}}
Review: {{review_count}}
Ready-for-Dev: {{ready_count}}
Backlog: {{backlog_count}}
**Quality:**
Valid: {{valid_count}} (>=10KB, >=5 tasks)
Invalid: {{invalid_count}} (poor quality)
Missing: {{missing_count}} (file not found)
**Updates Applied:** {{updates_count}}
**Next Steps:**
{{#if_invalid_count_gt_0}}
1. Regenerate {{invalid_count}} invalid stories with /create-story
{{/if}}
{{#if_missing_count_gt_0}}
2. Create {{missing_count}} missing story files OR remove from sprint-status.yaml
{{/if}}
{{#if_done_count_eq_story_count}}
3. Epic complete! Consider running /retrospective
{{/if}}
{{#if_in_progress_count_gt_0}}
3. Continue with in-progress stories: /dev-story {{first_in_progress}}
{{/if}}
</output>
<output>💾 Detailed report saved to: {{default_output_file}}</output>
</step>
</workflow>

View File

@ -0,0 +1,34 @@
name: validate-epic-status
description: "Validate and fix sprint-status.yaml for a single epic. Scans story files for task completion, validates quality (>10KB, proper tasks), checks git commits, updates sprint-status.yaml to match REALITY."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-epic-status"
instructions: "{installed_path}/instructions.xml"
# Inputs
variables:
epic_num: "" # User provides (e.g., "19", "16d", "16e")
sprint_status_file: "{implementation_artifacts}/sprint-status.yaml"
validation_mode: "fix" # Options: "report-only", "fix", "strict"
# Validation criteria
validation_rules:
min_story_size_kb: 10 # Stories should be >= 10KB
min_tasks_required: 5 # Stories should have >= 5 tasks
completion_threshold: 90 # 90%+ tasks checked = "done"
git_commit_lookback_days: 30 # Search last 30 days for commits
# Output
default_output_file: "{story_dir}/.epic-{epic_num}-validation-report.md"
standalone: true
web_bundle: false

View File

@ -0,0 +1,370 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This uses HAIKU AGENTS to read actual code and verify task completion - NOT regex patterns</critical>
<step n="1" goal="Load and parse story">
<action>Load story file from {{story_file}}</action>
<check if="file not found">
<output>❌ Story file not found: {{story_file}}</output>
<action>HALT</action>
</check>
<action>Extract story metadata:
- Story ID from filename
- Epic number from "Epic:" field
- Current status from "Status:" or "**Status:**" field
- Files created/modified from Dev Agent Record section
</action>
<action>Extract ALL tasks (pattern: "- [ ]" or "- [x]"):
- Parse checkbox state (checked/unchecked)
- Extract task text
- Count total, checked, unchecked
</action>
<output>📋 **Deep Story Validation: {{story_id}}**
**Epic:** {{epic_num}}
**Current Status:** {{current_status}}
**Tasks:** {{checked_count}}/{{total_count}} checked
**Files Referenced:** {{file_count}}
**Validation Method:** Haiku agents read actual code
**Cost Estimate:** ~$0.13 for this story
Starting task-by-task verification...
</output>
</step>
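
Step 1's parsing amounts to pulling the checkbox list, the Status field, and the Dev Agent Record file list out of the markdown. A rough sketch under the assumption that file references appear as backticked paths inside a "## Dev Agent Record" section; real story layouts may vary, so treat the regexes as starting points.

```python
import re

TASK_RE = re.compile(r"^\s*- \[( |x|X)\]\s+(.*)$", re.MULTILINE)

def parse_story(markdown: str) -> dict:
    """Extract tasks, status, and referenced files from a story file for deep validation."""
    tasks = [
        {"checked": mark.lower() == "x", "text": text.strip()}
        for mark, text in TASK_RE.findall(markdown)
    ]

    # Assumed layout: a "## Dev Agent Record" section listing backticked file paths.
    files: list[str] = []
    section = re.search(r"^## Dev Agent Record\n(.*?)(?=^## |\Z)",
                        markdown, re.MULTILINE | re.DOTALL)
    if section:
        files = re.findall(r"`([^`]+\.[A-Za-z0-9]+)`", section.group(1))

    status = re.search(r"^\**Status:?\**:?\s*(.+)$", markdown, re.MULTILINE)
    return {
        "tasks": tasks,
        "checked_count": sum(t["checked"] for t in tasks),
        "total_count": len(tasks),
        "files": files,
        "status": status.group(1).strip() if status else None,
    }
```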
<step n="2" goal="Verify ALL tasks with single Haiku agent">
<critical>Spawn ONE Haiku agent to verify ALL tasks (avoids 50x agent startup overhead!)</critical>
<output>Spawning Haiku verification agent for {{total_count}} tasks...</output>
<!-- Spawn SINGLE Haiku agent to verify ALL tasks in this story -->
<invoke-task type="Task" model="haiku">
<description>Verify all {{total_count}} story tasks</description>
<prompt>
You are verifying ALL tasks for this user story by reading actual code.
**Story:** {{story_id}}
**Epic:** {{epic_num}}
**Total Tasks:** {{total_count}}
**Files from Story (Dev Agent Record):**
{{#each file_list}}
- {{this}}
{{/each}}
**Tasks to Verify:**
{{#each task_list}}
{{@index}}. [{{#if this.checked}}x{{else}} {{/if}}] {{this.text}}
{{/each}}
---
**Your Job:**
For EACH task above:
1. **Find relevant files** - Use Glob to find files mentioned in task
2. **Read the files** - Use Read tool to examine actual code
3. **Verify implementation:**
- Is the code real, or stubs/TODOs?
- Is there error handling?
- Is multi-tenant isolation enforced (dealerId filters)?
- Are there tests?
- Does it match the task description?
4. **Make judgment for each task**
**Output Format - JSON object with one entry per task in the "tasks" array:**
```json
{
"story_id": "{{story_id}}",
"total_tasks": {{total_count}},
"tasks": [
{
"task_number": 0,
"task_text": "Implement UserService",
"is_checked": true,
"actually_complete": false,
"confidence": "high",
"evidence": "File exists but has 'TODO: Implement findById' on line 45, tests not found",
"issues_found": ["Stub implementation", "Missing tests", "No dealerId filter"],
"recommendation": "Implement real logic, add tests, add multi-tenant isolation"
},
{
"task_number": 1,
"task_text": "Add error handling",
"is_checked": true,
"actually_complete": true,
"confidence": "very_high",
"evidence": "Try-catch blocks in UserService.ts:67-89, proper error logging, tests verify error cases",
"issues_found": [],
"recommendation": "None - task complete"
}
]
}
```
**Be efficient:** Read files once, verify all tasks, return comprehensive JSON.
</prompt>
<subagent_type>general-purpose</subagent_type>
</invoke-task>
<action>Parse agent response (extract JSON)</action>
<action>For each task result:
- Determine verification_status (correct/false_positive/false_negative)
- Categorize into verified_complete, false_positives, false_negatives lists
- Count totals
</action>
<output>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Task Verification Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**✅ Verified Complete:** {{verified_complete_count}}
**❌ False Positives:** {{false_positive_count}} (checked but code missing/poor)
**⚠️ False Negatives:** {{false_negative_count}} (unchecked but code exists)
**❓ Uncertain:** {{uncertain_count}}
**Verification Score:** {{verification_score}}/100
</output>
</step>
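
Sorting the agent's verdicts into the four buckets is mechanical once the JSON is parsed. A sketch assuming the agent returns the object format shown in the prompt above, possibly wrapped in extra prose or a fenced block.

```python
import json
import re

def classify_results(agent_response: str) -> dict:
    """Split per-task verdicts into correct / false_positive / false_negative / uncertain."""
    # The agent may wrap its JSON in prose or a fenced block; pull out the object.
    match = re.search(r"\{.*\}", agent_response, re.DOTALL)
    if not match:
        raise ValueError("No JSON object found in agent response")
    report = json.loads(match.group(0))

    buckets = {"correct": [], "false_positive": [], "false_negative": [], "uncertain": []}
    for task in report["tasks"]:
        if task.get("confidence") == "low":
            buckets["uncertain"].append(task)
        elif task["is_checked"] and not task["actually_complete"]:
            buckets["false_positive"].append(task)   # checked, but code missing/stubbed
        elif not task["is_checked"] and task["actually_complete"]:
            buckets["false_negative"].append(task)   # unchecked, but code exists
        else:
            buckets["correct"].append(task)
    return buckets
```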
<step n="3" goal="Calculate overall story health">
<action>Calculate scores:
- Task accuracy: (correct / total) × 100
- False positive penalty: false_positive_count × -5
- Overall score: max(0, task_accuracy + penalty)
</action>
<action>Determine story category:
IF score >= 95 AND false_positives == 0
→ VERIFIED_COMPLETE
ELSE IF score >= 80 AND false_positives <= 2
→ COMPLETE_WITH_MINOR_ISSUES
ELSE IF false_positives > 5 OR score < 50
→ FALSE_POSITIVE (story claimed done but significant code is missing)
ELSE IF false_positives > 0
→ NEEDS_REWORK
ELSE
→ IN_PROGRESS
</action>
<action>Determine recommended status:
VERIFIED_COMPLETE → "done"
COMPLETE_WITH_MINOR_ISSUES → "review"
FALSE_POSITIVE → "in-progress" or "ready-for-dev"
NEEDS_REWORK → "in-progress"
IN_PROGRESS → "in-progress"
</action>
<output>
📊 **STORY HEALTH ASSESSMENT**
**Current Status:** {{current_status}}
**Recommended Status:** {{recommended_status}}
**Overall Score:** {{overall_score}}/100
**Category:** {{category}}
{{#if category == "VERIFIED_COMPLETE"}}
✅ **Story is production-ready**
- All tasks verified complete
- Code quality confirmed
- No significant issues found
{{/if}}
{{#if category == "FALSE_POSITIVE"}}
❌ **Story claimed done but significant code is missing**
- {{false_positive_count}} tasks checked but not implemented
- Verification score: {{overall_score}}/100 (< 50% = false positive)
- Action: Update status to in-progress, implement missing tasks
{{/if}}
{{#if category == "NEEDS_REWORK"}}
⚠️ **Story needs rework before marking complete**
- {{false_positive_count}} tasks with missing/poor code
- Issues found in verification
- Action: Fix issues, re-verify
{{/if}}
</output>
</step>
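
The scoring and category mapping in step 3 can be written down directly. The numbers are the ones stated above (a -5 penalty per false positive and the 95/80/50 thresholds); mapping FALSE_POSITIVE to "in-progress" rather than "ready-for-dev" is a simplification of the either/or wording.

```python
def assess_story_health(correct: int, false_positives: int, total: int) -> dict:
    """Score the story and map it to a category, following step 3's rules."""
    task_accuracy = (correct / total * 100) if total else 0.0
    score = max(0.0, task_accuracy - 5 * false_positives)

    if score >= 95 and false_positives == 0:
        category = "VERIFIED_COMPLETE"
    elif score >= 80 and false_positives <= 2:
        category = "COMPLETE_WITH_MINOR_ISSUES"
    elif false_positives > 5 or score < 50:
        category = "FALSE_POSITIVE"
    elif false_positives > 0:
        category = "NEEDS_REWORK"
    else:
        category = "IN_PROGRESS"

    recommended = {
        "VERIFIED_COMPLETE": "done",
        "COMPLETE_WITH_MINOR_ISSUES": "review",
        "FALSE_POSITIVE": "in-progress",   # or "ready-for-dev" when little real code exists
        "NEEDS_REWORK": "in-progress",
        "IN_PROGRESS": "in-progress",
    }[category]
    return {"score": round(score, 1), "category": category, "recommended_status": recommended}
```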
<step n="4" goal="Generate detailed validation report">
<template-output>
# Story Validation Report: {{story_id}}
**Generated:** {{date}}
**Validation Method:** LLM-powered deep verification (Haiku 4.5)
**Overall Score:** {{overall_score}}/100
**Category:** {{category}}
---
## Summary
**Story:** {{story_id}}
**Epic:** {{epic_num}}
**Current Status:** {{current_status}}
**Recommended Status:** {{recommended_status}}
**Task Verification:**
- Total: {{total_count}}
- Checked: {{checked_count}}
- Verified Complete: {{verified_complete_count}}
- False Positives: {{false_positive_count}}
- False Negatives: {{false_negative_count}}
---
## Verification Details
{{#if false_positive_count > 0}}
### ❌ False Positives (CRITICAL - Code Claims vs Reality)
{{#each false_positives}}
**Task {{@index + 1}}:** {{this.task}}
**Claimed:** [x] Complete
**Reality:** Code missing or stub implementation
**Evidence:**
{{this.evidence}}
**Issues Found:**
{{#each this.issues_found}}
- {{this}}
{{/each}}
**Recommendation:** {{this.recommendation}}
---
{{/each}}
{{/if}}
{{#if false_negative_count > 0}}
### ⚠️ False Negatives (Unchecked But Working)
{{#each false_negatives}}
**Task {{@index + 1}}:** {{this.task}}
**Status:** [ ] Unchecked
**Reality:** Code exists and working
**Evidence:**
{{this.evidence}}
**Recommendation:** Mark task as complete [x]
---
{{/each}}
{{/if}}
{{#if verified_complete_count > 0}}
### ✅ Verified Complete Tasks
{{verified_complete_count}} tasks verified with actual code review.
{{#if show_all_verified}}
{{#each verified_complete}}
- {{this.task}} ({{this.confidence}} confidence)
{{/each}}
{{/if}}
{{/if}}
---
## Final Verdict
**Overall Score:** {{overall_score}}/100
{{#if category == "VERIFIED_COMPLETE"}}
✅ **VERIFIED COMPLETE**
This story is production-ready:
- All {{total_count}} tasks verified complete
- Code quality confirmed through review
- No significant issues found
- Status "done" is accurate
**Action:** None needed - story is solid
{{/if}}
{{#if category == "FALSE_POSITIVE"}}
❌ **FALSE POSITIVE - Story NOT Actually Complete**
**Problems:**
- {{false_positive_count}} tasks checked but code missing/stubbed
- Verification score: {{overall_score}}/100 (< 50%)
- Story marked "{{current_status}}" but significant work remains
**Required Actions:**
1. Update sprint-status.yaml: {{story_id}} → in-progress
2. Uncheck {{false_positive_count}} false positive tasks
3. Implement missing code
4. Re-run validation after implementation
**Estimated Rework:** {{estimated_rework_hours}} hours
{{/if}}
{{#if category == "NEEDS_REWORK"}}
⚠️ **NEEDS REWORK**
**Problems:**
- {{false_positive_count}} tasks with quality issues
- Some code exists but has problems (TODOs, missing features, poor quality)
**Required Actions:**
{{#each action_items}}
- [ ] {{this}}
{{/each}}
**Estimated Fix Time:** {{estimated_fix_hours}} hours
{{/if}}
{{#if category == "IN_PROGRESS"}}
🔄 **IN PROGRESS** (accurate status)
- {{checked_count}}/{{total_count}} tasks complete
- {{unchecked_count}} tasks remaining
- Current status reflects reality
**No action needed** - continue implementation
{{/if}}
---
**Validation Cost:** ~${{validation_cost}}
**Agent Model:** {{agent_model}}
**Tasks Verified:** {{total_count}}
</template-output>
</step>
<step n="5" goal="Update sprint-status if needed">
<check if="{{recommended_status}} != {{current_status}}">
<ask>Story status should be updated from "{{current_status}}" to "{{recommended_status}}". Update sprint-status.yaml? (y/n)</ask>
<check if="user says yes">
<action>Update sprint-status.yaml:
python3 scripts/lib/sprint-status-updater.py --epic {{epic_num}} --mode fix
</action>
<action>Add validation note to story file Dev Agent Record</action>
<output>✅ Updated {{story_id}}: {{current_status}} → {{recommended_status}}</output>
</check>
</check>
<check if="{{recommended_status}} == {{current_status}}">
<output>✅ Story status is accurate - no changes needed</output>
</check>
</step>
</workflow>

View File

@ -0,0 +1,29 @@
name: validate-story-deep
description: "Deep story validation using Haiku agents to read and verify actual code. Each task gets micro code review to verify implementation quality."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-story-deep"
instructions: "{installed_path}/instructions.xml"
# Input variables
variables:
story_file: "" # Path to story file to validate
# Agent configuration
agent_model: "haiku" # Use Haiku 4.5 for cost efficiency ($0.13/story vs $1.50)
parallel_tasks: true # Validate tasks in parallel (faster)
# Output
default_output_file: "{story_dir}/.validation-{story_id}-{date}.md"
standalone: true
web_bundle: false

View File

@ -0,0 +1,395 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>This performs DEEP validation - not just checkbox counting, but verifying code actually exists and works</critical>
<step n="1" goal="Load and parse story file">
<action>Load story file from {{story_file}}</action>
<check if="file not found">
<output>❌ Story file not found: {{story_file}}
Please provide a valid story file path.
</output>
<action>HALT</action>
</check>
<action>Extract story metadata:
- Story ID (from filename)
- Epic number
- Current status from Status: field
- Priority
- Estimated effort
</action>
<action>Extract all tasks:
- Pattern: "- [ ]" or "- [x]"
- Count total tasks
- Count checked tasks
- Count unchecked tasks
- Calculate completion percentage
</action>
<action>Extract file references from Dev Agent Record:
- Files created
- Files modified
- Files deleted
</action>
<output>📋 **Story Validation: {{story_id}}**
**Epic:** {{epic_num}}
**Current Status:** {{current_status}}
**Tasks:** {{checked_count}}/{{total_count}} complete ({{completion_pct}}%)
**Files Referenced:** {{file_count}}
Starting deep validation...
</output>
</step>
<step n="2" goal="Task-based verification (Deep)">
<critical>Use task-verification-engine.py for DEEP verification (not just file existence)</critical>
<action>For each task in story:
1. Extract task text
2. Note if checked [x] or unchecked [ ]
3. Pass to task-verification-engine.py
4. Receive verification result with:
- should_be_checked: true/false
- confidence: very high/high/medium/low
- evidence: list of findings
- verification_status: correct/false_positive/false_negative/uncertain
</action>
<action>Categorize tasks by verification status:
- ✅ CORRECT: Checkbox matches reality
- ❌ FALSE POSITIVE: Checked but code missing/stubbed
- ⚠️ FALSE NEGATIVE: Unchecked but code exists
- ❓ UNCERTAIN: Cannot verify (low confidence)
</action>
<action>Calculate verification score:
- (correct_tasks / total_tasks) × 100
- Penalize false positives heavily (-5 points each)
- Penalize false negatives lightly (-2 points each)
</action>
<output>
🔍 **Task Verification Results**
**Total Tasks:** {{total_count}}
**✅ CORRECT:** {{correct_count}} tasks (checkbox matches reality)
**❌ FALSE POSITIVES:** {{false_positive_count}} tasks (checked but code missing/stubbed)
**⚠️ FALSE NEGATIVES:** {{false_negative_count}} tasks (unchecked but code exists)
**❓ UNCERTAIN:** {{uncertain_count}} tasks (cannot verify)
**Verification Score:** {{verification_score}}/100
{{#if false_positive_count > 0}}
### ❌ False Positives (CRITICAL - Code Claims vs Reality)
{{#each false_positives}}
**Task:** {{this.task}}
**Claimed:** [x] Complete
**Reality:** {{this.evidence}}
**Action Required:** {{this.recommended_action}}
{{/each}}
{{/if}}
{{#if false_negative_count > 0}}
### ⚠️ False Negatives (Unchecked but Working)
{{#each false_negatives}}
**Task:** {{this.task}}
**Status:** [ ] Unchecked
**Reality:** {{this.evidence}}
**Recommendation:** Mark as complete [x]
{{/each}}
{{/if}}
</output>
</step>
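
The verification score here differs from validate-story-deep only in the light -2 penalty for false negatives. Only the arithmetic is sketched; how task-verification-engine.py is invoked and what it returns beyond these counts is not specified in this step.

```python
def verification_score(correct: int, false_positives: int, false_negatives: int, total: int) -> float:
    """Step 2's score: task accuracy minus 5 per false positive and 2 per false negative."""
    if total == 0:
        return 0.0
    accuracy = correct / total * 100
    return max(0.0, accuracy - 5 * false_positives - 2 * false_negatives)

# Example: 18 correct of 22 tasks, 2 false positives, 1 false negative -> ~69.8/100
```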
<step n="3" goal="Code quality review" if="{{validation_depth}} == deep OR comprehensive">
<action>Extract all files from Dev Agent Record file list</action>
<check if="no files listed">
<output>⚠️ No files listed in Dev Agent Record - cannot perform code review</output>
<action>Skip to step 4</action>
</check>
<action>For each file:
1. Check if file exists
2. Read file content
3. Check for quality issues:
- TODO/FIXME comments without GitHub issues
- `any` types in TypeScript
- Hardcoded values (siteId, dealerId, API keys)
- Missing error handling
- Missing multi-tenant isolation (dealerId filters)
- Missing audit logging on mutations
- Security vulnerabilities (SQL injection, XSS)
</action>
<action>Run multi-agent review if files exist:
- Security audit
- Silent failure detection
- Architecture compliance
- Performance analysis
</action>
<action>Categorize issues by severity:
- CRITICAL: Security, data loss, breaking changes
- HIGH: Missing features, poor quality, technical debt
- MEDIUM: Code smells, minor violations
- LOW: Style issues, nice-to-haves
</action>
<output>
🛡️ **Code Quality Review**
**Files Reviewed:** {{files_reviewed}}
**Files Missing:** {{files_missing}}
**Issues Found:** {{total_issues}}
CRITICAL: {{critical_count}}
HIGH: {{high_count}}
MEDIUM: {{medium_count}}
LOW: {{low_count}}
{{#if critical_count > 0}}
### 🚨 CRITICAL Issues (Must Fix)
{{#each critical_issues}}
**File:** {{this.file}}
**Issue:** {{this.description}}
**Impact:** {{this.impact}}
**Fix:** {{this.recommended_fix}}
{{/each}}
{{/if}}
{{#if high_count > 0}}
### ⚠️ HIGH Priority Issues
{{#each high_issues}}
**File:** {{this.file}}
**Issue:** {{this.description}}
{{/each}}
{{/if}}
**Code Quality Score:** {{quality_score}}/100
</output>
</step>
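
A rough, pattern-based sketch of the per-file checks in step 3. The real review is delegated to the multi-agent pass; these regexes only catch the obvious cases (TODO/FIXME without an issue link, `any` types, hardcoded IDs or secrets), and the exact patterns are assumptions about this codebase's conventions rather than anything the workflow defines.

```python
import re
from pathlib import Path

QUALITY_CHECKS = {
    # severity -> list of (description, illustrative regex)
    "HIGH": [
        ("TODO/FIXME without issue link", re.compile(r"\b(TODO|FIXME)\b(?!.*#\d+)")),
        ("'any' type in TypeScript", re.compile(r":\s*any\b")),
    ],
    "CRITICAL": [
        ("Hardcoded dealerId/siteId", re.compile(r"\b(dealerId|siteId)\s*[:=]\s*['\"]?\d+")),
        ("Possible hardcoded secret", re.compile(r"(api[_-]?key|secret)\s*[:=]\s*['\"][A-Za-z0-9]{16,}", re.I)),
    ],
}

def scan_file(path: Path) -> list[dict]:
    """Return {file, severity, issue, line} findings for one source file."""
    if not path.exists():
        return [{"file": str(path), "severity": "CRITICAL",
                 "issue": "File listed in story but missing", "line": 0}]
    findings = []
    text = path.read_text(encoding="utf-8", errors="ignore")
    for lineno, line in enumerate(text.splitlines(), 1):
        for severity, checks in QUALITY_CHECKS.items():
            for description, pattern in checks:
                if pattern.search(line):
                    findings.append({"file": str(path), "severity": severity,
                                     "issue": description, "line": lineno})
    return findings
```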
<step n="4" goal="Integration verification" if="{{validation_depth}} == comprehensive">
<action>Extract dependencies from story:
- Services called
- APIs consumed
- Database tables used
- Cache keys accessed
</action>
<action>For each dependency:
1. Check if dependency still exists
2. Check if API contract is still valid
3. Run integration tests if they exist
4. Check for breaking changes in dependent stories
</action>
<output>
🔗 **Integration Verification**
**Dependencies Checked:** {{dependency_count}}
{{#if broken_integrations}}
### ❌ Broken Integrations
{{#each broken_integrations}}
**Dependency:** {{this.name}}
**Issue:** {{this.problem}}
**Likely Cause:** {{this.cause}}
**Fix:** {{this.fix}}
{{/each}}
{{/if}}
{{#if all_integrations_ok}}
✅ All integrations verified working
{{/if}}
</output>
</step>
<step n="5" goal="Determine final story status">
<action>Calculate overall story health:
- Task verification score (0-100)
- Code quality score (0-100)
- Integration score (0-100)
- Overall score = weighted average
</action>
<action>Determine recommended status:
IF verification_score >= 95 AND quality_score >= 90 AND no CRITICAL issues
→ VERIFIED_COMPLETE
ELSE IF verification_score >= 80 AND quality_score >= 70
→ COMPLETE_WITH_ISSUES (document issues)
ELSE IF verification_score < 50
→ FALSE_POSITIVE (claimed done but not implemented)
ELSE IF false_positives > 0 OR critical_issues > 0
→ NEEDS_REWORK (code missing or broken)
ELSE
→ IN_PROGRESS (partially complete)
</action>
<output>
📊 **FINAL VERDICT**
**Story:** {{story_id}}
**Current Status:** {{current_status}}
**Recommended Status:** {{recommended_status}}
**Scores:**
Task Verification: {{verification_score}}/100
Code Quality: {{quality_score}}/100
Integration: {{integration_score}}/100
**Overall: {{overall_score}}/100**
**Confidence:** {{confidence_level}}
{{#if recommended_status != current_status}}
### ⚠️ Status Change Recommended
**Current:** {{current_status}}
**Should Be:** {{recommended_status}}
**Reason:**
{{status_change_reason}}
{{/if}}
</output>
</step>
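
Step 5 describes the overall score only as a weighted average; the 50/30/20 weights below are an assumption for illustration, not values defined by the workflow.

```python
def overall_score(verification: float, quality: float, integration: float) -> float:
    """Weighted average of the three sub-scores (assumed weights: 0.5 / 0.3 / 0.2)."""
    return round(0.5 * verification + 0.3 * quality + 0.2 * integration, 1)
```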
<step n="6" goal="Generate actionable report">
<template-output>
# Story Validation Report: {{story_id}}
**Validation Date:** {{date}}
**Validation Depth:** {{validation_depth}}
**Overall Score:** {{overall_score}}/100
---
## Summary
**Story:** {{story_id}} - {{story_title}}
**Epic:** {{epic_num}}
**Current Status:** {{current_status}}
**Recommended Status:** {{recommended_status}}
**Task Completion:** {{checked_count}}/{{total_count}} ({{completion_pct}}%)
**Verification Score:** {{verification_score}}/100
**Code Quality Score:** {{quality_score}}/100
---
## Task Verification Details
{{task_verification_output}}
---
## Code Quality Review
{{code_quality_output}}
---
## Integration Verification
{{integration_output}}
---
## Recommended Actions
{{#if critical_issues}}
### Priority 1: Fix Critical Issues (BLOCKING)
{{#each critical_issues}}
- [ ] {{this.file}}: {{this.description}}
{{/each}}
{{/if}}
{{#if false_positives}}
### Priority 2: Fix False Positives (Code Claims vs Reality)
{{#each false_positives}}
- [ ] {{this.task}} - {{this.evidence}}
{{/each}}
{{/if}}
{{#if high_issues}}
### Priority 3: Address High Priority Issues
{{#each high_issues}}
- [ ] {{this.file}}: {{this.description}}
{{/each}}
{{/if}}
{{#if false_negatives}}
### Priority 4: Update Task Checkboxes (Low Impact)
{{#each false_negatives}}
- [ ] Mark complete: {{this.task}}
{{/each}}
{{/if}}
---
## Next Steps
{{#if recommended_status == "VERIFIED_COMPLETE"}}
✅ **Story is verified complete and production-ready**
- Update sprint-status.yaml: {{story_id}} = done
- No further action required
{{/if}}
{{#if recommended_status == "NEEDS_REWORK"}}
⚠️ **Story requires rework before marking complete**
- Fix {{critical_count}} CRITICAL issues
- Address {{false_positive_count}} false positive tasks
- Re-run validation after fixes
{{/if}}
{{#if recommended_status == "FALSE_POSITIVE"}}
❌ **Story is marked done but not actually implemented**
- Verification score: {{verification_score}}/100 (< 50%)
- Update sprint-status.yaml: {{story_id}} = in-progress or ready-for-dev
- Implement missing tasks before claiming done
{{/if}}
---
**Generated by:** /validate-story workflow
**Validation Engine:** task-verification-engine.py v2.0
</template-output>
</step>
<step n="7" goal="Update story file and sprint-status">
<ask>Apply recommended status change to sprint-status.yaml? (y/n)</ask>
<check if="user says yes">
<action>Update sprint-status.yaml:
- Use sprint-status-updater.py
- Update {{story_id}} to {{recommended_status}}
- Add comment: "Validated {{date}}, score {{overall_score}}/100"
</action>
<action>Update story file:
- Add validation report link to Dev Agent Record
- Add validation score to completion notes
- Update Status: field if changed
</action>
<output>✅ Updated {{story_id}} status: {{current_status}} → {{recommended_status}}</output>
</check>
<check if="user says no">
<output>ℹ️ Status not updated. Validation report saved for reference.</output>
</check>
</step>
</workflow>

View File

@ -0,0 +1,29 @@
name: validate-story
description: "Deep validation of a single story: verify tasks against codebase, run code quality review, check for regressions. Produces verification report with actionable findings."
author: "BMad"
version: "1.0.0"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
implementation_artifacts: "{config_source}:implementation_artifacts"
story_dir: "{implementation_artifacts}"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/validate-story"
instructions: "{installed_path}/instructions.xml"
# Input variables
variables:
story_file: "" # Path to story file (e.g., docs/sprint-artifacts/16e-6-ecs-task-definitions-tier3.md)
validation_depth: "deep" # Options: "quick" (tasks only), "deep" (tasks + code review), "comprehensive" (tasks + review + integration tests)
# Tools
task_verification_script: "{project-root}/scripts/lib/task-verification-engine.py"
# Output
default_output_file: "{story_dir}/.validation-{story_id}-{date}.md"
standalone: true
web_bundle: false