The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml
You MUST have already loaded and processed: {installed_path}/workflow.yaml
This performs DEEP validation - not just checkbox counting, but verifying code actually exists and works
Load story file from {{story_file}}
If the story file does not exist or cannot be read, HALT
Extract story metadata:
- Story ID (from filename)
- Epic number
- Current status from Status: field
- Priority
- Estimated effort
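The metadata fields above can be pulled with simple patterns. A minimal sketch follows; the filename convention (`story-<epic>.<num>-slug.md`) and the `Status:` / `Priority:` header fields are assumptions about the story format, not part of the engine:

```python
import re

def extract_story_metadata(filename: str, content: str) -> dict:
    """Pull story metadata from the filename and the story body.

    Assumes filenames like 'story-2.3-user-login.md' (assumption) and
    'Status:' / 'Priority:' / 'Estimated Effort:' lines in the story header.
    """
    meta = {}
    m = re.search(r"story-(\d+)\.(\d+)", filename)
    if m:
        meta["story_id"] = f"{m.group(1)}.{m.group(2)}"
        meta["epic_num"] = int(m.group(1))
    for field in ("Status", "Priority", "Estimated Effort"):
        fm = re.search(rf"^{field}:\s*(.+)$", content, re.MULTILINE)
        if fm:
            meta[field.lower().replace(" ", "_")] = fm.group(1).strip()
    return meta
```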
Extract all tasks:
- Pattern: "- [ ]" or "- [x]"
- Count total tasks
- Count checked tasks
- Count unchecked tasks
- Calculate completion percentage
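The extraction and counting steps above can be sketched as follows; the checkbox regex is an illustrative assumption and may need tuning for indented or nested task lists:

```python
import re

# Matches '- [ ] task' and '- [x] task' lines (leading indentation allowed)
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$", re.MULTILINE)

def count_tasks(story_text: str) -> dict:
    """Count checked/unchecked tasks and compute the completion percentage."""
    tasks = CHECKBOX.findall(story_text)
    checked = sum(1 for mark, _ in tasks if mark.lower() == "x")
    total = len(tasks)
    return {
        "total": total,
        "checked": checked,
        "unchecked": total - checked,
        "completion_pct": round(100 * checked / total, 1) if total else 0.0,
    }
```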
Extract file references from Dev Agent Record:
- Files created
- Files modified
- Files deleted
Use task-verification-engine.py for DEEP verification (not just file existence)
For each task in story:
1. Extract task text
2. Note if checked [x] or unchecked [ ]
3. Pass to task-verification-engine.py
4. Receive verification result with:
- should_be_checked: true/false
- confidence: very high/high/medium/low
- evidence: list of findings
- verification_status: correct/false_positive/false_negative/uncertain
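The per-task result shape listed above might be modeled like this; this is a hypothetical mirror of the engine's output, and the real schema of task-verification-engine.py may differ:

```python
from dataclasses import dataclass, field

@dataclass
class VerificationResult:
    """Per-task result as described above (field names assumed from the list)."""
    task_text: str
    should_be_checked: bool
    confidence: str  # "very high" | "high" | "medium" | "low"
    evidence: list = field(default_factory=list)
    verification_status: str = "uncertain"  # correct | false_positive | false_negative | uncertain
```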
Categorize tasks by verification status:
- ✅ CORRECT: Checkbox matches reality
- ❌ FALSE POSITIVE: Checked but code missing/stubbed
- ⚠️ FALSE NEGATIVE: Unchecked but code exists
- ❓ UNCERTAIN: Cannot verify (low confidence)
Calculate verification score:
- (correct_tasks / total_tasks) × 100
- Penalize false positives heavily (-5 points each)
- Penalize false negatives lightly (-2 points each)
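The scoring rules above combine into a single number. A minimal sketch, using the penalties stated above; flooring the result at 0 is an added assumption:

```python
def verification_score(correct: int, total: int,
                       false_positives: int, false_negatives: int) -> float:
    """(correct / total) x 100, minus 5 per false positive and 2 per false negative."""
    if total == 0:
        return 0.0
    base = (correct / total) * 100
    penalized = base - 5 * false_positives - 2 * false_negatives
    return max(0.0, round(penalized, 1))  # floor at 0 (assumption)
```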
Extract all files from Dev Agent Record file list
If the file list is empty, skip to step 4
For each file:
1. Check if file exists
2. Read file content
3. Check for quality issues:
- TODO/FIXME comments without GitHub issues
- `any` types in TypeScript
- Hardcoded values (siteId, dealerId, API keys)
- Missing error handling
- Missing multi-tenant isolation (dealerId filters)
- Missing audit logging on mutations
- Security vulnerabilities (SQL injection, XSS)
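The pattern-based checks in that list can be sketched with a few regexes. These patterns are illustrative assumptions only; reliably detecting missing error handling, tenant isolation, or injection vulnerabilities requires AST-level or semantic analysis, not line regexes:

```python
import re

# Illustrative patterns only; a real scanner would use an AST, not regexes.
QUALITY_PATTERNS = {
    "todo_without_issue": re.compile(r"\b(TODO|FIXME)\b(?!.*#\d+)"),
    "any_type": re.compile(r":\s*any\b"),  # TypeScript 'any' annotation
    "hardcoded_id": re.compile(r"\b(siteId|dealerId)\s*[:=]\s*['\"]?\d+"),
    "api_key_literal": re.compile(r"(api[_-]?key)\s*[:=]\s*['\"][A-Za-z0-9]{16,}", re.I),
}

def scan_quality(source: str) -> list:
    """Return (issue_name, line_number) pairs for each pattern hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pat in QUALITY_PATTERNS.items():
            if pat.search(line):
                hits.append((name, lineno))
    return hits
```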
Run multi-agent review if files exist:
- Security audit
- Silent failure detection
- Architecture compliance
- Performance analysis
Categorize issues by severity:
- CRITICAL: Security, data loss, breaking changes
- HIGH: Missing features, poor quality, technical debt
- MEDIUM: Code smells, minor violations
- LOW: Style issues, nice-to-haves
Extract dependencies from story:
- Services called
- APIs consumed
- Database tables used
- Cache keys accessed
For each dependency:
1. Check if dependency still exists
2. Check if API contract is still valid
3. Run integration tests if they exist
4. Check for breaking changes in dependent stories
Calculate overall story health:
- Task verification score (0-100)
- Code quality score (0-100)
- Integration score (0-100)
- Overall score = weighted average
Determine recommended status:
IF verification_score >= 95 AND quality_score >= 90 AND no CRITICAL issues
→ VERIFIED_COMPLETE
ELSE IF verification_score >= 80 AND quality_score >= 70
→ COMPLETE_WITH_ISSUES (document issues)
ELSE IF false_positives > 0 OR critical_issues > 0
→ NEEDS_REWORK (code missing or broken)
ELSE IF verification_score < 50
→ FALSE_POSITIVE (claimed done but not implemented)
ELSE
→ IN_PROGRESS (partially complete)
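The decision tree above, sketched directly. Branch order matters, thresholds are copied from the rules above, and the function name is illustrative:

```python
def recommend_status(verification_score: float, quality_score: float,
                     false_positives: int, critical_issues: int) -> str:
    """Map scores and issue counts to a recommended story status (order matters)."""
    if verification_score >= 95 and quality_score >= 90 and critical_issues == 0:
        return "VERIFIED_COMPLETE"
    if verification_score >= 80 and quality_score >= 70:
        return "COMPLETE_WITH_ISSUES"
    if false_positives > 0 or critical_issues > 0:
        return "NEEDS_REWORK"
    if verification_score < 50:
        return "FALSE_POSITIVE"
    return "IN_PROGRESS"
```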
# Story Validation Report: {{story_id}}
**Validation Date:** {{date}}
**Validation Depth:** {{validation_depth}}
**Overall Score:** {{overall_score}}/100
---
## Summary
**Story:** {{story_id}} - {{story_title}}
**Epic:** {{epic_num}}
**Current Status:** {{current_status}}
**Recommended Status:** {{recommended_status}}
**Task Completion:** {{checked_count}}/{{total_count}} ({{completion_pct}}%)
**Verification Score:** {{verification_score}}/100
**Code Quality Score:** {{quality_score}}/100
---
## Task Verification Details
{{task_verification_output}}
---
## Code Quality Review
{{code_quality_output}}
---
## Integration Verification
{{integration_output}}
---
## Recommended Actions
{{#if critical_issues}}
### Priority 1: Fix Critical Issues (BLOCKING)
{{#each critical_issues}}
- [ ] {{this.file}}: {{this.description}}
{{/each}}
{{/if}}
{{#if false_positives}}
### Priority 2: Fix False Positives (Code Claims vs Reality)
{{#each false_positives}}
- [ ] {{this.task}} - {{this.evidence}}
{{/each}}
{{/if}}
{{#if high_issues}}
### Priority 3: Address High Priority Issues
{{#each high_issues}}
- [ ] {{this.file}}: {{this.description}}
{{/each}}
{{/if}}
{{#if false_negatives}}
### Priority 4: Update Task Checkboxes (Low Impact)
{{#each false_negatives}}
- [ ] Mark complete: {{this.task}}
{{/each}}
{{/if}}
---
## Next Steps
{{#if recommended_status == "VERIFIED_COMPLETE"}}
✅ **Story is verified complete and production-ready**
- Update sprint-status.yaml: {{story_id}} = done
- No further action required
{{/if}}
{{#if recommended_status == "NEEDS_REWORK"}}
⚠️ **Story requires rework before marking complete**
- Fix {{critical_count}} CRITICAL issues
- Address {{false_positive_count}} false positive tasks
- Re-run validation after fixes
{{/if}}
{{#if recommended_status == "FALSE_POSITIVE"}}
❌ **Story is marked done but not actually implemented**
- Verification score: {{verification_score}}/100 (< 50%)
- Update sprint-status.yaml: {{story_id}} = in-progress or ready-for-dev
- Implement missing tasks before claiming done
{{/if}}
---
**Generated by:** /validate-story workflow
**Validation Engine:** task-verification-engine.py v2.0
Apply recommended status change to sprint-status.yaml? (y/n)
Update sprint-status.yaml:
- Use sprint-status-updater.py
- Update {{story_id}} to {{recommended_status}}
- Add comment: "Validated {{date}}, score {{overall_score}}/100"
Update story file:
- Add validation report link to Dev Agent Record
- Add validation score to completion notes
- Update Status: field if changed