From 5c63f31c0e4366a746b09d0ff661a88755ffba95 Mon Sep 17 00:00:00 2001 From: Caleb <46907094+rotationalphysics495@users.noreply.github.com> Date: Mon, 26 Jan 2026 14:33:25 -0600 Subject: [PATCH] feat(epic-execute): add JSON output parsing and TDD workflow phases Implements the final two improvements from bmad_improvements_v2.md: ## Structured JSON Output (Improvement #6) - New module: scripts/epic-execute-lib/json-output.sh - Functions for extracting and parsing JSON from Claude output - Unified check_phase_completion() with JSON + text fallback - Updated prompts to request JSON result blocks - Added --legacy-output flag to disable JSON parsing ## Test-First Flow (Improvement #7) - New module: scripts/epic-execute-lib/tdd-flow.sh - execute_test_spec_phase() - Generates BDD specs from acceptance criteria - execute_test_impl_phase() - Creates failing tests from specs - execute_test_verification_phase() - Verifies tests fail correctly - Integration with dev phase for TDD context - Added --skip-tdd, --skip-test-spec, --skip-test-impl flags All 7 improvements from the analysis are now complete. Co-Authored-By: Claude Opus 4.5 --- docs/bmad_improvements_v2.md | 45 ++- scripts/epic-execute-lib/json-output.sh | 433 ++++++++++++++++++++++ scripts/epic-execute-lib/tdd-flow.sh | 460 ++++++++++++++++++++++++ scripts/epic-execute.sh | 323 ++++++++++++++--- 4 files changed, 1199 insertions(+), 62 deletions(-) create mode 100644 scripts/epic-execute-lib/json-output.sh create mode 100644 scripts/epic-execute-lib/tdd-flow.sh diff --git a/docs/bmad_improvements_v2.md b/docs/bmad_improvements_v2.md index d8ec1ab0d..6a9e71f02 100644 --- a/docs/bmad_improvements_v2.md +++ b/docs/bmad_improvements_v2.md @@ -405,11 +405,11 @@ Respect these decisions unless you have a specific reason to deviate. --- -### 6. Test-First Enforcement (HIGH IMPACT, HIGH EFFORT) +### 6. Test-First Enforcement (HIGH IMPACT, HIGH EFFORT) ✅ IMPLEMENTED Restructure the flow to enforce TDD principles. -**Proposed New Flow:** +**Implemented Flow:** ``` 1. DESIGN PHASE (Dev) → Plan implementation approach @@ -424,14 +424,29 @@ Restructure the flow to enforce TDD principles. This ensures tests actually test requirements rather than implementation details. +**Implementation:** Module at `scripts/epic-execute-lib/tdd-flow.sh` with functions: +- `execute_test_spec_phase()` - Generates BDD test specifications from acceptance criteria +- `execute_test_impl_phase()` - Creates failing tests from specifications +- `execute_test_verification_phase()` - Verifies tests fail correctly before implementation +- `build_test_spec_context_for_dev()` - Provides test context to dev phase + +**Skip flags:** `--skip-tdd`, `--skip-test-spec`, `--skip-test-impl` + --- -### 7. Structured Output Validation (MEDIUM IMPACT, MEDIUM EFFORT) +### 7. Structured Output Validation (MEDIUM IMPACT, MEDIUM EFFORT) ✅ IMPLEMENTED Replace fragile regex parsing with structured JSON output. +**Implementation:** Module at `scripts/epic-execute-lib/json-output.sh` with functions: +- `extract_json_result()` - Parse JSON from Claude output +- `get_result_status()` - Extract status field +- `get_result_files()` - Extract files_changed array +- `get_result_issues()` - Extract issues for fix loops +- `check_phase_completion()` - Unified completion detection with JSON + text fallback + ```bash -# Add to prompts: +# Prompts now request JSON output: "Output your result as JSON: \`\`\`json { @@ -445,11 +460,13 @@ Replace fragile regex parsing with structured JSON output. } \`\`\`" -# Parse with jq: -result_json=$(echo "$result" | sed -n '/```json/,/```/p' | sed '1d;$d') -status=$(echo "$result_json" | jq -r '.status') +# Parsing with fallback: +check_phase_completion "$result" "dev" "$story_id" +# Returns: 0 (complete), 1 (failed), 2 (unclear) ``` +**Skip flag:** `--legacy-output` (disables JSON parsing, uses text-only detection) + --- ## Implementation Priority Matrix @@ -461,8 +478,8 @@ status=$(echo "$result_json" | jq -r '.status') | 3 | Decision Log | MEDIUM | LOW | ✅ DONE | Easy context preservation | | 4 | Regression Gate | HIGH | MEDIUM | ✅ DONE | Prevents silent breakage | | 5 | Design Phase | HIGH | MEDIUM | ✅ DONE | Catches issues early | -| 6 | Structured JSON Output | MEDIUM | MEDIUM | | Improves reliability | -| 7 | Test-First Flow | HIGH | HIGH | | Fundamental quality improvement | +| 6 | Structured JSON Output | MEDIUM | MEDIUM | ✅ DONE | Improves reliability | +| 7 | Test-First Flow | HIGH | HIGH | ✅ DONE | Fundamental quality improvement | --- @@ -532,8 +549,16 @@ These three changes alone would dramatically improve code reliability with minim **Additionally implemented:** 4. **Regression Gate** - Prevents silent breakage ✅ 5. **Design Phase** - Catches architectural issues early ✅ +6. **Structured JSON Output** - Reliable completion signal parsing ✅ +7. **Test-First Flow** - TDD workflow with test specs before implementation ✅ -**Implementation:** All features are modularized in `scripts/epic-execute-lib/` with graceful degradation and skip flags (`--skip-design`, `--skip-regression`). +**Implementation:** All features are modularized in `scripts/epic-execute-lib/` with graceful degradation and skip flags: +- `--skip-design` - Skip pre-implementation design phase +- `--skip-regression` - Skip regression test gate +- `--skip-tdd` - Skip test-first development phases +- `--skip-test-spec` - Skip test specification phase only +- `--skip-test-impl` - Skip test implementation phase only +- `--legacy-output` - Use legacy text-based output parsing (no JSON) --- diff --git a/scripts/epic-execute-lib/json-output.sh b/scripts/epic-execute-lib/json-output.sh new file mode 100644 index 000000000..6db5e3fbb --- /dev/null +++ b/scripts/epic-execute-lib/json-output.sh @@ -0,0 +1,433 @@ +#!/bin/bash +# +# BMAD Epic Execute - JSON Output Module +# +# Provides functions for structured JSON output parsing to replace +# fragile regex-based completion signal detection. +# +# Usage: Sourced by epic-execute.sh +# + +# ============================================================================= +# JSON Output Variables +# ============================================================================= + +# Whether to use legacy text-based parsing instead of JSON +USE_LEGACY_OUTPUT=false + +# Last extracted JSON result (for reuse within a phase) +LAST_JSON_RESULT="" + +# ============================================================================= +# JSON Output Functions +# ============================================================================= + +# Extract JSON result block from Claude output +# Looks for ```json ... ``` or ```result ... ``` blocks +# Arguments: +# $1 - Full Claude output +# Returns: JSON string or empty if not found +extract_json_result() { + local output="$1" + + # Reset last result + LAST_JSON_RESULT="" + + # Try to extract JSON from code block (```json ... ```) + local json_block + json_block=$(echo "$output" | sed -n '/```json/,/```/p' | sed '1d;$d') + + # If no ```json block, try ```result block + if [ -z "$json_block" ]; then + json_block=$(echo "$output" | sed -n '/```result/,/```/p' | sed '1d;$d') + fi + + # If still no block, try to find raw JSON object at end of output + if [ -z "$json_block" ]; then + # Look for JSON object pattern {"status": ...} + json_block=$(echo "$output" | grep -oE '\{"status":[^}]+\}' | tail -1) + fi + + # Validate JSON if jq is available + if [ -n "$json_block" ] && command -v jq >/dev/null 2>&1; then + if echo "$json_block" | jq . >/dev/null 2>&1; then + LAST_JSON_RESULT="$json_block" + echo "$json_block" + return 0 + fi + elif [ -n "$json_block" ]; then + # jq not available, return raw block + LAST_JSON_RESULT="$json_block" + echo "$json_block" + return 0 + fi + + echo "" + return 1 +} + +# Get the status field from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: Status string (COMPLETE, BLOCKED, FAILED, PASSED, etc.) +get_result_status() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.status // empty' + else + # Fallback: basic pattern matching + echo "$json" | grep -oE '"status":\s*"[^"]+"' | sed 's/.*"\([^"]*\)"$/\1/' + fi +} + +# Get the story_id field from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +get_result_story_id() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.story_id // empty' + else + echo "$json" | grep -oE '"story_id":\s*"[^"]+"' | sed 's/.*"\([^"]*\)"$/\1/' + fi +} + +# Get the summary field from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +get_result_summary() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.summary // empty' + else + echo "$json" | grep -oE '"summary":\s*"[^"]+"' | sed 's/.*"\([^"]*\)"$/\1/' + fi +} + +# Get the files_changed array from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: Newline-separated list of file paths +get_result_files() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.files_changed[]? // empty' + else + # Fallback: basic pattern matching (limited) + echo "$json" | grep -oE '"files_changed":\s*\[[^\]]*\]' | grep -oE '"[^"]+\.[a-z]+"' | tr -d '"' + fi +} + +# Get the concerns array from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: Newline-separated list of concerns +get_result_concerns() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.concerns[]? // empty' + else + echo "" + fi +} + +# Get the issues array from a JSON result (for review/fix phases) +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: JSON array of issues or empty +get_result_issues() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -c '.issues // []' + else + echo "[]" + fi +} + +# Get the tests_added count from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: Number of tests added +get_result_tests_added() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "0" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -r '.tests_added // 0' + else + echo "$json" | grep -oE '"tests_added":\s*[0-9]+' | grep -oE '[0-9]+' || echo "0" + fi +} + +# Get the decisions array from a JSON result +# Arguments: +# $1 - JSON string (optional, uses LAST_JSON_RESULT if not provided) +# Returns: JSON array of decisions +get_result_decisions() { + local json="${1:-$LAST_JSON_RESULT}" + + if [ -z "$json" ]; then + echo "[]" + return 1 + fi + + if command -v jq >/dev/null 2>&1; then + echo "$json" | jq -c '.decisions // []' + else + echo "[]" + fi +} + +# Check phase completion with JSON parsing and text fallback +# Arguments: +# $1 - Full Claude output +# $2 - Phase type (dev, review, fix, arch, test_quality, trace, uat) +# $3 - Story ID (for legacy text matching) +# Returns: 0 if complete/passed, 1 if failed/blocked, 2 if unclear +check_phase_completion() { + local output="$1" + local phase_type="$2" + local story_id="$3" + + # Try JSON parsing first (unless legacy mode) + if [ "$USE_LEGACY_OUTPUT" != true ]; then + local json_result + json_result=$(extract_json_result "$output") + + if [ -n "$json_result" ]; then + local status + status=$(get_result_status "$json_result") + + case "$status" in + COMPLETE|PASSED|COMPLIANT|APPROVED) + return 0 + ;; + BLOCKED|FAILED|VIOLATIONS|CONCERNS) + return 1 + ;; + esac + fi + fi + + # Fallback to legacy text-based parsing + case "$phase_type" in + dev) + if echo "$output" | grep -q "IMPLEMENTATION COMPLETE"; then + return 0 + elif echo "$output" | grep -q "IMPLEMENTATION BLOCKED"; then + return 1 + fi + ;; + review) + if echo "$output" | grep -q "REVIEW PASSED"; then + return 0 + elif echo "$output" | grep -q "REVIEW FAILED"; then + return 1 + fi + ;; + fix) + if echo "$output" | grep -q "FIX COMPLETE"; then + return 0 + elif echo "$output" | grep -q "FIX INCOMPLETE"; then + return 1 + fi + ;; + arch) + if echo "$output" | grep -q "ARCH COMPLIANT"; then + return 0 + elif echo "$output" | grep -q "ARCH VIOLATIONS"; then + return 1 + fi + ;; + test_quality) + if echo "$output" | grep -q "TEST QUALITY APPROVED"; then + return 0 + elif echo "$output" | grep -q "TEST QUALITY FAILED"; then + return 1 + elif echo "$output" | grep -q "TEST QUALITY CONCERNS"; then + # Concerns don't block + return 0 + fi + ;; + trace) + if echo "$output" | grep -q "TRACEABILITY PASS"; then + return 0 + elif echo "$output" | grep -q "TRACEABILITY FAIL"; then + return 1 + elif echo "$output" | grep -q "TRACEABILITY CONCERNS"; then + return 0 + fi + ;; + uat) + if echo "$output" | grep -q "UAT GENERATED"; then + return 0 + fi + ;; + test_gen) + if echo "$output" | grep -q "TEST GENERATION COMPLETE"; then + return 0 + elif echo "$output" | grep -q "TEST GENERATION PARTIAL"; then + return 1 + fi + ;; + esac + + # Unclear result + return 2 +} + +# Build JSON output instruction block for prompts +# Arguments: +# $1 - Phase type (dev, review, fix, arch, test_quality, trace, uat) +# $2 - Story ID +# Returns: Instruction text for prompts +build_json_output_instructions() { + local phase_type="$1" + local story_id="$2" + + cat << 'EOF' + +## Output Format + +After completing your task, output a JSON result block: + +```json +{ + "status": "COMPLETE" | "BLOCKED" | "FAILED" | "PASSED" | "VIOLATIONS" | "CONCERNS", + "story_id": "", + "summary": "", + "files_changed": ["", ""], + "tests_added": , + "decisions": [ + {"what": "", "why": ""} + ], + "issues": [ + {"severity": "HIGH|MEDIUM|LOW", "description": "", "location": ""} + ], + "concerns": [""] +} +``` + +### Status Values by Phase +EOF + + case "$phase_type" in + dev) + cat << EOF + +- **COMPLETE**: Implementation finished successfully +- **BLOCKED**: Cannot proceed due to missing dependencies or unclear requirements + +Then ALSO output the legacy signal for backward compatibility: +- Success: \`IMPLEMENTATION COMPLETE: $story_id\` +- Blocked: \`IMPLEMENTATION BLOCKED: $story_id - [reason]\` +EOF + ;; + review) + cat << EOF + +- **PASSED**: Code review passed (all issues fixed or acceptable) +- **FAILED**: Critical issues remain that need developer attention + +Then ALSO output the legacy signal for backward compatibility: +- Pass: \`REVIEW PASSED: $story_id\` +- Fail: \`REVIEW FAILED: $story_id - [reason]\` +EOF + ;; + fix) + cat << EOF + +- **COMPLETE**: All issues from review have been fixed +- **FAILED**: Unable to fix one or more issues + +Then ALSO output the legacy signal for backward compatibility: +- Complete: \`FIX COMPLETE: $story_id - Fixed N issues\` +- Incomplete: \`FIX INCOMPLETE: $story_id - [reason]\` +EOF + ;; + arch) + cat << EOF + +- **COMPLIANT**: No architecture violations (or all fixed) +- **VIOLATIONS**: Architecture violations that need attention + +Then ALSO output the legacy signal for backward compatibility: +- Compliant: \`ARCH COMPLIANT: $story_id\` +- Violations: \`ARCH VIOLATIONS: $story_id - [summary]\` +EOF + ;; + test_quality) + cat << EOF + +- **APPROVED**: Test quality meets standards (score >= 70) +- **CONCERNS**: Minor quality issues (score 60-69) +- **FAILED**: Test quality below acceptable threshold (score < 60) + +Then ALSO output the legacy signal for backward compatibility: +- Approved: \`TEST QUALITY APPROVED: $story_id - Score: N/100\` +- Concerns: \`TEST QUALITY CONCERNS: $story_id - Score: N/100\` +- Failed: \`TEST QUALITY FAILED: $story_id - Score: N/100\` +EOF + ;; + trace) + cat << EOF + +- **PASSED**: Traceability requirements met (P0=100%, P1>=90%) +- **CONCERNS**: Minor gaps (P1 80-89%) +- **FAILED**: Critical traceability gaps + +Then ALSO output the legacy signal for backward compatibility: +- Pass: \`TRACEABILITY PASS: Epic-$story_id - P0: N%, P1: M%\` +- Fail: \`TRACEABILITY FAIL: Epic-$story_id - X critical gaps\` +EOF + ;; + uat) + cat << EOF + +- **COMPLETE**: UAT document generated successfully + +Then ALSO output the legacy signal: \`UAT GENERATED: \` +EOF + ;; + esac +} diff --git a/scripts/epic-execute-lib/tdd-flow.sh b/scripts/epic-execute-lib/tdd-flow.sh new file mode 100644 index 000000000..bd041f728 --- /dev/null +++ b/scripts/epic-execute-lib/tdd-flow.sh @@ -0,0 +1,460 @@ +#!/bin/bash +# +# BMAD Epic Execute - TDD Flow Module +# +# Provides test-first development (TDD) workflow phases: +# 1. Test Specification - Generate test specs from acceptance criteria +# 2. Test Implementation - Create failing tests from specs +# 3. Test Verification - Verify tests fail appropriately before implementation +# +# Usage: Sourced by epic-execute.sh +# + +# ============================================================================= +# TDD Flow Variables +# ============================================================================= + +# Store test specifications for use across phases +LAST_TEST_SPEC="" + +# Test spec output directory +TEST_SPEC_DIR="" + +# ============================================================================= +# Test Specification Phase +# ============================================================================= + +# Execute test specification phase +# Generates BDD-style test specifications from acceptance criteria +# Arguments: +# $1 - story_file path +execute_test_spec_phase() { + local story_file="$1" + local story_id=$(basename "$story_file" .md) + + # Reset last spec + LAST_TEST_SPEC="" + + log ">>> TEST SPEC PHASE: $story_id (generating test specifications)" + + local story_contents=$(cat "$story_file") + + # Load architecture file if available for context + local arch_contents="" + for search_path in "$PROJECT_ROOT/docs/architecture.md" "$PROJECT_ROOT/docs/architecture/architecture.md" "$PROJECT_ROOT/architecture.md"; do + if [ -f "$search_path" ]; then + arch_contents=$(cat "$search_path") + break + fi + done + + # Get design context if available + local design_context="" + if type get_last_design >/dev/null 2>&1; then + design_context=$(get_last_design) + fi + + local spec_prompt="You are a Test Architect (TEA) generating test specifications from acceptance criteria. + +## Your Task + +Generate test specifications for: $story_id + +Do NOT write test code yet. Output only test specifications in BDD format. + +### CRITICAL RULES +- One test specification per acceptance criterion minimum +- Use Given-When-Then format for all specifications +- Include edge cases and error scenarios +- Assign unique test IDs (format: ${story_id}-E2E-001, ${story_id}-UNIT-001) +- Map each AC explicitly to test specifications + +## Story to Analyze + +**Story Path:** $story_file +**Story ID:** $story_id + + +$story_contents + + +## Architecture Context (for understanding test boundaries) + + +$arch_contents + + +## Design Context (if available) + + +$design_context + + +## Exploration Commands + +First, explore existing test patterns in the codebase: +\`\`\`bash +# Find existing test files +find . -type f \\( -name \"*.spec.ts\" -o -name \"*.test.ts\" -o -name \"*.spec.js\" -o -name \"*.test.js\" \\) | head -10 +# Check test directory structure +ls -la test/ tests/ __tests__/ src/**/__tests__/ 2>/dev/null || true +\`\`\` + +## Required Output + +Output your test specifications in this exact format: + +\`\`\` +TEST SPEC START +story_id: $story_id +generated: $(date '+%Y-%m-%d') + +test_specifications: + +## AC1: + +### ${story_id}-E2E-001: +- Priority: P0|P1|P2 +- Type: e2e|integration|unit +- Given: +- When: +- Then: +- Data: + +### ${story_id}-E2E-002: +- Priority: P1 +- Type: e2e +- Given: +- When: +- Then: + +## AC2: + +### ${story_id}-UNIT-001: +... + +edge_cases: + - + +error_scenarios: + - + +test_file_mapping: + - ${story_id}-E2E-*: + - ${story_id}-UNIT-*: + - ${story_id}-INT-*: + +TEST SPEC END +\`\`\` + +## Completion Signal + +After outputting the spec block: + +1. Output JSON result: +\`\`\`json +{ + \"status\": \"COMPLETE\", + \"story_id\": \"$story_id\", + \"summary\": \"Generated N test specifications for M acceptance criteria\", + \"tests_added\": +} +\`\`\` + +2. Then output: TEST SPEC COMPLETE: $story_id - Generated N specifications" + + if [ "$DRY_RUN" = true ]; then + echo "[DRY RUN] Would execute test spec phase for $story_id" + return 0 + fi + + local result + result=$(claude --dangerously-skip-permissions -p "$spec_prompt" 2>&1) || true + + echo "$result" >> "$LOG_FILE" + + # Extract test spec block + LAST_TEST_SPEC=$(echo "$result" | sed -n '/TEST SPEC START/,/TEST SPEC END/p') + + if [ -n "$LAST_TEST_SPEC" ]; then + # Save to spec directory + TEST_SPEC_DIR="$SPRINT_ARTIFACTS_DIR/test-specs" + mkdir -p "$TEST_SPEC_DIR" + echo "$LAST_TEST_SPEC" > "$TEST_SPEC_DIR/${story_id}-test-spec.md" + + # Save to decision log + if type append_to_decision_log >/dev/null 2>&1; then + append_to_decision_log "TEST_SPEC" "$story_id" "$LAST_TEST_SPEC" + fi + + log_success "Test spec phase complete: $story_id" + log "Saved to: $TEST_SPEC_DIR/${story_id}-test-spec.md" + return 0 + else + log_error "Test spec phase did not produce valid output" + return 1 + fi +} + +# ============================================================================= +# Test Implementation Phase +# ============================================================================= + +# Execute test implementation phase +# Creates failing tests from specifications +# Arguments: +# $1 - story_file path +execute_test_impl_phase() { + local story_file="$1" + local story_id=$(basename "$story_file" .md) + + log ">>> TEST IMPL PHASE: $story_id (implementing failing tests)" + + # Check if we have test specs + if [ -z "$LAST_TEST_SPEC" ]; then + # Try to load from file + if [ -f "$TEST_SPEC_DIR/${story_id}-test-spec.md" ]; then + LAST_TEST_SPEC=$(cat "$TEST_SPEC_DIR/${story_id}-test-spec.md") + else + log_error "No test specifications available for $story_id" + return 1 + fi + fi + + local story_contents=$(cat "$story_file") + + local impl_prompt="You are a Test Architect (TEA) implementing tests from specifications. + +## Your Task + +Implement failing tests for: $story_id + +The tests MUST FAIL initially because the feature is not yet implemented. +This is Test-First Development (TDD). + +### CRITICAL RULES +- Create test files based on the specifications below +- Tests should compile/parse without errors +- Tests should FAIL when run (feature not implemented yet) +- Follow existing test patterns in the codebase +- Use proper fixtures and data factories +- Do NOT implement any feature code + +## Test Specifications to Implement + + +$LAST_TEST_SPEC + + +## Story Context + + +$story_contents + + +## Exploration Commands + +First, examine existing test patterns: +\`\`\`bash +# Find existing test patterns +find . -type f \\( -name \"*.spec.ts\" -o -name \"*.test.ts\" \\) -exec head -50 {} \\; 2>/dev/null | head -100 +# Check for test utilities +ls -la test/utils/ tests/helpers/ __tests__/fixtures/ 2>/dev/null || true +\`\`\` + +## Implementation Guidelines + +1. **File Structure**: Create test files in the appropriate directory +2. **Imports**: Use the project's test framework (jest, mocha, vitest, etc.) +3. **Describe Blocks**: Group tests by acceptance criterion +4. **Test Names**: Include test IDs from specifications +5. **BDD Format**: Use Given-When-Then comments +6. **Assertions**: Write assertions that will FAIL until feature is implemented +7. **Data**: Use factories or fixtures, no hardcoded values + +## Example Test Structure + +\`\`\`typescript +describe('Feature: ', () => { + describe('AC1: ', () => { + test('${story_id}-E2E-001: should ', async () => { + // Given: + const setup = await createTestFixture(); + + // When: + const result = await performAction(setup); + + // Then: + expect(result).toBe(expectedValue); // Will FAIL - not implemented + }); + }); +}); +\`\`\` + +## Completion Signal + +After implementing the tests: + +1. Stage the test files: git add -A +2. Output JSON result: +\`\`\`json +{ + \"status\": \"COMPLETE\", + \"story_id\": \"$story_id\", + \"summary\": \"Implemented N failing tests in M files\", + \"files_changed\": [\"\"], + \"tests_added\": +} +\`\`\` + +3. Then output: TEST IMPL COMPLETE: $story_id - Implemented N tests" + + if [ "$DRY_RUN" = true ]; then + echo "[DRY RUN] Would execute test impl phase for $story_id" + return 0 + fi + + local result + result=$(claude --dangerously-skip-permissions -p "$impl_prompt" 2>&1) || true + + echo "$result" >> "$LOG_FILE" + + # Check completion + local completion_status + if type check_phase_completion >/dev/null 2>&1; then + check_phase_completion "$result" "test_gen" "$story_id" + completion_status=$? + else + if echo "$result" | grep -q "TEST IMPL COMPLETE"; then + completion_status=0 + else + completion_status=2 + fi + fi + + case $completion_status in + 0) + log_success "Test impl phase complete: $story_id" + return 0 + ;; + *) + log_error "Test impl phase did not complete cleanly: $story_id" + return 1 + ;; + esac +} + +# ============================================================================= +# Test Verification Phase +# ============================================================================= + +# Execute test verification phase +# Verifies that tests fail appropriately (compile but don't pass) +# Arguments: +# $1 - story_file path +execute_test_verification_phase() { + local story_file="$1" + local story_id=$(basename "$story_file" .md) + + log ">>> TEST VERIFICATION PHASE: $story_id (verifying tests fail correctly)" + + if [ "$DRY_RUN" = true ]; then + echo "[DRY RUN] Would verify tests fail for $story_id" + return 0 + fi + + # Run tests and expect failures + local test_output="" + local test_exit_code=0 + + if [ -f "$PROJECT_ROOT/package.json" ]; then + # Node.js project + test_output=$(cd "$PROJECT_ROOT" && npm test 2>&1) || test_exit_code=$? + elif [ -f "$PROJECT_ROOT/Cargo.toml" ]; then + test_output=$(cd "$PROJECT_ROOT" && cargo test 2>&1) || test_exit_code=$? + elif [ -f "$PROJECT_ROOT/go.mod" ]; then + test_output=$(cd "$PROJECT_ROOT" && go test ./... 2>&1) || test_exit_code=$? + elif [ -f "$PROJECT_ROOT/requirements.txt" ] || [ -f "$PROJECT_ROOT/pyproject.toml" ]; then + if command -v pytest >/dev/null 2>&1; then + test_output=$(cd "$PROJECT_ROOT" && pytest 2>&1) || test_exit_code=$? + fi + fi + + echo "$test_output" >> "$LOG_FILE" + + # Analyze test output + # We expect: tests compile, tests run, tests FAIL (exit code non-zero) + + # Check for compilation errors (bad - tests should at least compile) + if echo "$test_output" | grep -qiE "syntax error|cannot find module|compilation failed|parse error"; then + log_error "Tests have compilation/syntax errors - fix before proceeding" + echo "$test_output" | grep -iE "syntax error|cannot find module|compilation failed|parse error" | head -10 + return 1 + fi + + # Check for test failures (good - expected in TDD) + if [ $test_exit_code -ne 0 ]; then + # Count failures + local failure_count=0 + failure_count=$(echo "$test_output" | grep -cE "FAIL|failed|failing" || echo "0") + + if [ "$failure_count" -gt 0 ]; then + log_success "Test verification passed: $failure_count test(s) failing as expected" + log "Tests compile and fail appropriately - ready for implementation" + return 0 + else + log_warn "Tests exited with error but no clear failures detected" + return 0 # Proceed anyway - might be framework difference + fi + else + # Tests passed - this is unexpected in TDD before implementation + log_warn "Tests passed unexpectedly - verify tests are actually testing new functionality" + log "This may indicate tests are not properly written or feature already exists" + return 0 # Don't block, but warn + fi +} + +# ============================================================================= +# Helper Functions +# ============================================================================= + +# Get the last test specification for use in other phases +get_last_test_spec() { + echo "$LAST_TEST_SPEC" +} + +# Build test spec context for dev phase prompt +build_test_spec_context_for_dev() { + local story_id="$1" + + if [ -z "$LAST_TEST_SPEC" ]; then + # Try to load from file + if [ -n "$TEST_SPEC_DIR" ] && [ -f "$TEST_SPEC_DIR/${story_id}-test-spec.md" ]; then + LAST_TEST_SPEC=$(cat "$TEST_SPEC_DIR/${story_id}-test-spec.md") + fi + fi + + if [ -z "$LAST_TEST_SPEC" ]; then + echo "" + return + fi + + cat << EOF + +## Test Specifications (TDD) + +The following tests have been written and are FAILING. Your implementation must make these tests pass. + + +$LAST_TEST_SPEC + + +### TDD Implementation Guidelines + +1. Run tests frequently: \`npm test\` (or equivalent) +2. Implement just enough code to make the next test pass +3. Do NOT modify the test files - only implement the feature code +4. All tests in the specification must pass when implementation is complete + +EOF +} diff --git a/scripts/epic-execute.sh b/scripts/epic-execute.sh index b3e80c076..972fd47cb 100755 --- a/scripts/epic-execute.sh +++ b/scripts/epic-execute.sh @@ -36,6 +36,8 @@ LIB_DIR="$SCRIPT_DIR/epic-execute-lib" [ -f "$LIB_DIR/decision-log.sh" ] && source "$LIB_DIR/decision-log.sh" [ -f "$LIB_DIR/regression-gate.sh" ] && source "$LIB_DIR/regression-gate.sh" [ -f "$LIB_DIR/design-phase.sh" ] && source "$LIB_DIR/design-phase.sh" +[ -f "$LIB_DIR/json-output.sh" ] && source "$LIB_DIR/json-output.sh" +[ -f "$LIB_DIR/tdd-flow.sh" ] && source "$LIB_DIR/tdd-flow.sh" STORIES_DIR="$PROJECT_ROOT/docs/stories" SPRINT_ARTIFACTS_DIR="$PROJECT_ROOT/docs/sprint-artifacts" @@ -382,6 +384,10 @@ SKIP_TRACEABILITY=false SKIP_STATIC_ANALYSIS=false SKIP_DESIGN=false SKIP_REGRESSION=false +SKIP_TDD=false +SKIP_TEST_SPEC=false +SKIP_TEST_IMPL=false +LEGACY_OUTPUT=false while [[ $# -gt 0 ]]; do case $1 in @@ -437,6 +443,22 @@ while [[ $# -gt 0 ]]; do SKIP_REGRESSION=true shift ;; + --skip-tdd) + SKIP_TDD=true + shift + ;; + --skip-test-spec) + SKIP_TEST_SPEC=true + shift + ;; + --skip-test-impl) + SKIP_TEST_IMPL=true + shift + ;; + --legacy-output) + LEGACY_OUTPUT=true + shift + ;; -*) echo "Unknown option: $1" exit 1 @@ -465,6 +487,10 @@ if [ -z "$EPIC_ID" ]; then echo " --skip-static-analysis Skip static analysis gate (runs real tooling)" echo " --skip-design Skip pre-implementation design phase" echo " --skip-regression Skip regression test gate" + echo " --skip-tdd Skip test-first development phases" + echo " --skip-test-spec Skip test specification phase only" + echo " --skip-test-impl Skip test implementation phase only" + echo " --legacy-output Use legacy text-based output parsing (no JSON)" exit 1 fi @@ -547,6 +573,12 @@ if [ "$SKIP_REGRESSION" = false ] && type init_regression_baseline >/dev/null 2> init_regression_baseline fi +# Set legacy output mode if requested +if [ "$LEGACY_OUTPUT" = true ] && type -v USE_LEGACY_OUTPUT >/dev/null 2>&1; then + USE_LEGACY_OUTPUT=true + log "Using legacy text-based output parsing" +fi + # Find epic file (supports both epic-39-*.md and epic-039-*.md formats) EPIC_FILE="" # Pad epic ID with leading zero for 3-digit format (e.g., 40 -> 040) @@ -673,6 +705,12 @@ execute_dev_phase() { design_context=$(build_design_context_for_dev "$story_id") fi + # Get test spec context if available (from TDD test spec phase) + local test_spec_context="" + if type build_test_spec_context_for_dev >/dev/null 2>&1; then + test_spec_context=$(build_test_spec_context_for_dev "$story_id") + fi + # Build the dev prompt using BMAD workflow local dev_prompt="You are executing a BMAD dev-story workflow in automated mode. @@ -721,6 +759,7 @@ $workflow_checklist $story_contents $design_context +$test_spec_context ## Previous Implementation Context @@ -743,10 +782,25 @@ $decision_context ## Completion Signals When the workflow completes successfully (all tasks done, tests pass, status set to 'review'): -Output exactly: IMPLEMENTATION COMPLETE: $story_id + +1. Output a JSON result block: +\`\`\`json +{ + \"status\": \"COMPLETE\", + \"story_id\": \"$story_id\", + \"summary\": \"\", + \"files_changed\": [\"\"], + \"tests_added\": , + \"decisions\": [{\"what\": \"\", \"why\": \"\"}] +} +\`\`\` + +2. Then output exactly: IMPLEMENTATION COMPLETE: $story_id If a HALT condition is triggered or implementation is blocked: -Output exactly: IMPLEMENTATION BLOCKED: $story_id - [specific reason] + +1. Output a JSON result block with status \"BLOCKED\" and issues array describing blockers +2. Then output exactly: IMPLEMENTATION BLOCKED: $story_id - [specific reason] ## Begin Execution @@ -765,17 +819,48 @@ Stage all changes with: git add -A (after implementation is complete)" echo "$result" >> "$LOG_FILE" - if echo "$result" | grep -q "IMPLEMENTATION COMPLETE"; then - log_success "Dev phase complete: $story_id" - return 0 - elif echo "$result" | grep -q "IMPLEMENTATION BLOCKED"; then - log_error "Dev phase blocked: $story_id" - echo "$result" | grep "IMPLEMENTATION BLOCKED" - return 1 + # Check completion using JSON parsing with text fallback + local completion_status + if type check_phase_completion >/dev/null 2>&1; then + check_phase_completion "$result" "dev" "$story_id" + completion_status=$? else - log_error "Dev phase did not complete cleanly: $story_id" - return 1 + # Fallback to legacy detection + if echo "$result" | grep -q "IMPLEMENTATION COMPLETE"; then + completion_status=0 + elif echo "$result" | grep -q "IMPLEMENTATION BLOCKED"; then + completion_status=1 + else + completion_status=2 + fi fi + + case $completion_status in + 0) + # Extract decisions for decision log if available + if type get_result_decisions >/dev/null 2>&1 && type append_to_decision_log >/dev/null 2>&1; then + local decisions=$(get_result_decisions) + if [ "$decisions" != "[]" ] && [ -n "$decisions" ]; then + append_to_decision_log "DEV" "$story_id" "Decisions: $decisions" + fi + fi + log_success "Dev phase complete: $story_id" + return 0 + ;; + 1) + log_error "Dev phase blocked: $story_id" + if type get_result_summary >/dev/null 2>&1; then + local summary=$(get_result_summary) + [ -n "$summary" ] && echo "Reason: $summary" + fi + echo "$result" | grep "IMPLEMENTATION BLOCKED" || true + return 1 + ;; + *) + log_error "Dev phase did not complete cleanly: $story_id" + return 1 + ;; + esac } # Global variable to store review findings for fix loop @@ -882,22 +967,45 @@ When the workflow presents options: ## Completion Signals When review passes (all HIGH/MEDIUM issues fixed, all ACs implemented, status set to 'done'): -Output exactly: REVIEW PASSED: $story_id -When review passes but required fixes: -Output exactly: REVIEW PASSED WITH FIXES: $story_id - Fixed N issues +1. Output a JSON result block: +\`\`\`json +{ + \"status\": \"PASSED\", + \"story_id\": \"$story_id\", + \"summary\": \"\", + \"files_changed\": [\"\"], + \"issues\": [] +} +\`\`\` + +2. Then output exactly: REVIEW PASSED: $story_id + Or if fixes were made: REVIEW PASSED WITH FIXES: $story_id - Fixed N issues If review fails (unfixable issues, missing acceptance criteria that YOU cannot fix): -1. First output a structured findings block: + +1. Output a JSON result block with issues: +\`\`\`json +{ + \"status\": \"FAILED\", + \"story_id\": \"$story_id\", + \"summary\": \"\", + \"issues\": [ + {\"severity\": \"HIGH\", \"description\": \"\", \"location\": \"\"}, + {\"severity\": \"MEDIUM\", \"description\": \"\", \"location\": \"\"} + ] +} +\`\`\` + +2. Then output the legacy findings block: \`\`\` REVIEW FINDINGS START - [HIGH] Description of issue 1 (file:line if applicable) -- [HIGH] Description of issue 2 -- [MEDIUM] Description of issue 3 -... all HIGH and MEDIUM issues that need dev attention ... +- [MEDIUM] Description of issue 2 REVIEW FINDINGS END \`\`\` -2. Then output exactly: REVIEW FAILED: $story_id - [summary reason] + +3. Then output exactly: REVIEW FAILED: $story_id - [summary reason] ## Begin Execution @@ -917,25 +1025,56 @@ Stage any fixes with: git add -A" echo "$result" >> "$LOG_FILE" - if echo "$result" | grep -q "REVIEW PASSED"; then - log_success "Review passed: $story_id" - return 0 - elif echo "$result" | grep -q "REVIEW FAILED"; then - log_error "Review failed: $story_id" - echo "$result" | grep "REVIEW FAILED" - - # Extract findings for fix loop - LAST_REVIEW_FINDINGS=$(echo "$result" | sed -n '/REVIEW FINDINGS START/,/REVIEW FINDINGS END/p' | grep -E '^\s*-\s*\[(HIGH|MEDIUM)\]' || true) - - if [ -n "$LAST_REVIEW_FINDINGS" ]; then - log "Captured ${#LAST_REVIEW_FINDINGS} bytes of review findings for fix loop" - fi - - return 1 + # Check completion using JSON parsing with text fallback + local completion_status + if type check_phase_completion >/dev/null 2>&1; then + check_phase_completion "$result" "review" "$story_id" + completion_status=$? else - log_warn "Review did not complete cleanly: $story_id" - return 1 + # Fallback to legacy detection + if echo "$result" | grep -q "REVIEW PASSED"; then + completion_status=0 + elif echo "$result" | grep -q "REVIEW FAILED"; then + completion_status=1 + else + completion_status=2 + fi fi + + case $completion_status in + 0) + log_success "Review passed: $story_id" + return 0 + ;; + 1) + log_error "Review failed: $story_id" + echo "$result" | grep "REVIEW FAILED" || true + + # Extract findings for fix loop - try JSON first, then legacy + if type get_result_issues >/dev/null 2>&1; then + local json_issues=$(get_result_issues) + if [ "$json_issues" != "[]" ] && [ -n "$json_issues" ]; then + # Convert JSON issues to text format for fix phase + LAST_REVIEW_FINDINGS=$(echo "$json_issues" | jq -r '.[] | "- [\(.severity)] \(.description) (\(.location // "unknown"))"' 2>/dev/null || echo "") + fi + fi + + # Fallback to legacy text extraction if JSON didn't work + if [ -z "$LAST_REVIEW_FINDINGS" ]; then + LAST_REVIEW_FINDINGS=$(echo "$result" | sed -n '/REVIEW FINDINGS START/,/REVIEW FINDINGS END/p' | grep -E '^\s*-\s*\[(HIGH|MEDIUM)\]' || true) + fi + + if [ -n "$LAST_REVIEW_FINDINGS" ]; then + log "Captured review findings for fix loop" + fi + + return 1 + ;; + *) + log_warn "Review did not complete cleanly: $story_id" + return 1 + ;; + esac } execute_fix_phase() { @@ -1062,10 +1201,33 @@ $story_contents ## Completion Signals When ALL review issues are successfully fixed: -Output exactly: FIX COMPLETE: $story_id - Fixed [N] issues + +1. Output a JSON result block: +\`\`\`json +{ + \"status\": \"COMPLETE\", + \"story_id\": \"$story_id\", + \"summary\": \"Fixed N issues: \", + \"files_changed\": [\"\"], + \"issues\": [] +} +\`\`\` + +2. Then output exactly: FIX COMPLETE: $story_id - Fixed [N] issues If unable to fix one or more issues: -Output exactly: FIX INCOMPLETE: $story_id - [reason and which issues remain] + +1. Output a JSON result block with remaining issues: +\`\`\`json +{ + \"status\": \"FAILED\", + \"story_id\": \"$story_id\", + \"summary\": \"\", + \"issues\": [{\"severity\": \"HIGH\", \"description\": \"\", \"location\": \"\"}] +} +\`\`\` + +2. Then output exactly: FIX INCOMPLETE: $story_id - [reason and which issues remain] ## Begin Execution @@ -1082,20 +1244,40 @@ Address all review findings now. This is attempt $attempt_num of 3." echo "$result" >> "$LOG_FILE" - if echo "$result" | grep -q "FIX COMPLETE"; then - log_success "Fix phase complete: $story_id (attempt $attempt_num)" - record_fix_attempt "$story_id" "$attempt_num" "success" - return 0 - elif echo "$result" | grep -q "FIX INCOMPLETE"; then - log_error "Fix phase incomplete: $story_id (attempt $attempt_num)" - echo "$result" | grep "FIX INCOMPLETE" - record_fix_attempt "$story_id" "$attempt_num" "failed" - return 1 + # Check completion using JSON parsing with text fallback + local completion_status + if type check_phase_completion >/dev/null 2>&1; then + check_phase_completion "$result" "fix" "$story_id" + completion_status=$? else - log_warn "Fix phase did not complete cleanly: $story_id (attempt $attempt_num)" - record_fix_attempt "$story_id" "$attempt_num" "failed" - return 1 + # Fallback to legacy detection + if echo "$result" | grep -q "FIX COMPLETE"; then + completion_status=0 + elif echo "$result" | grep -q "FIX INCOMPLETE"; then + completion_status=1 + else + completion_status=2 + fi fi + + case $completion_status in + 0) + log_success "Fix phase complete: $story_id (attempt $attempt_num)" + record_fix_attempt "$story_id" "$attempt_num" "success" + return 0 + ;; + 1) + log_error "Fix phase incomplete: $story_id (attempt $attempt_num)" + echo "$result" | grep "FIX INCOMPLETE" || true + record_fix_attempt "$story_id" "$attempt_num" "failed" + return 1 + ;; + *) + log_warn "Fix phase did not complete cleanly: $story_id (attempt $attempt_num)" + record_fix_attempt "$story_id" "$attempt_num" "failed" + return 1 + ;; + esac } # ============================================================================= @@ -1854,12 +2036,49 @@ execute_story_with_fix_loop() { # DESIGN PHASE (Context 0) - Pre-implementation planning if [ "$SKIP_DESIGN" = false ] && type execute_design_phase >/dev/null 2>&1; then if ! execute_design_phase "$story_file"; then - log_warn "Design phase did not complete cleanly for $story_id - proceeding to dev" + log_warn "Design phase did not complete cleanly for $story_id - proceeding" # Don't fail - design is advisory fi fi - # DEV PHASE (Context 1) + # TDD PHASES (Test-First Development) + # Enabled by default, skip with --skip-tdd or individual --skip-test-spec/--skip-test-impl + if [ "$SKIP_TDD" = false ]; then + + # TEST SPEC PHASE - Generate test specifications from acceptance criteria + if [ "$SKIP_TEST_SPEC" = false ] && type execute_test_spec_phase >/dev/null 2>&1; then + if ! execute_test_spec_phase "$story_file"; then + log_warn "Test spec phase did not complete cleanly for $story_id - proceeding" + # Don't fail - spec generation is advisory + fi + fi + + # TEST IMPL PHASE - Create failing tests from specifications + if [ "$SKIP_TEST_IMPL" = false ] && type execute_test_impl_phase >/dev/null 2>&1; then + # Only run if we have test specs (either just generated or loaded from file) + if [ -n "$LAST_TEST_SPEC" ] || [ -f "$TEST_SPEC_DIR/${story_id}-test-spec.md" ] 2>/dev/null; then + if ! execute_test_impl_phase "$story_file"; then + log_warn "Test impl phase did not complete cleanly for $story_id - proceeding" + # Don't fail - test impl is advisory in first iteration + fi + else + log_warn "No test specifications available - skipping test implementation" + fi + fi + + # TEST VERIFICATION PHASE - Verify tests fail appropriately + if [ "$SKIP_TEST_IMPL" = false ] && type execute_test_verification_phase >/dev/null 2>&1; then + # Only verify if we just created tests + if type get_last_test_spec >/dev/null 2>&1 && [ -n "$(get_last_test_spec)" ]; then + if ! execute_test_verification_phase "$story_file"; then + log_warn "Test verification had issues - proceeding to dev phase" + # Don't fail - verification is informational + fi + fi + fi + fi + + # DEV PHASE (Context 1) - Now implements to make tests pass (if TDD enabled) if ! execute_dev_phase "$story_file"; then log_error "Dev phase failed for $story_id" return 1