From a1c054006a08a465d99e756999d411f7b03e2571 Mon Sep 17 00:00:00 2001
From: Dicky Moore
Date: Sun, 8 Feb 2026 15:01:58 +0000
Subject: [PATCH] refine workflow contracts for review findings, halt protocol,
 and sprint tracking

---
 .../steps/step-03-execute-review.md           | 79 +++++++++++++-----
 .../steps/step-04-present-and-resolve.md      | 13 ++-
 .../correct-course/workflow.md                | 13 ++-
 .../steps/step-08-mark-task-complete.md       | 81 ++++++++++++++-----
 .../steps/step-09-mark-review-ready.md        | 47 ++++++-----
 .../4-implementation/dev-story/workflow.md    | 40 ++++++---
 6 files changed, 195 insertions(+), 78 deletions(-)

diff --git a/src/bmm/workflows/4-implementation/code-review/steps/step-03-execute-review.md b/src/bmm/workflows/4-implementation/code-review/steps/step-03-execute-review.md
index 6ca2f3896..3fba7d536 100644
--- a/src/bmm/workflows/4-implementation/code-review/steps/step-03-execute-review.md
+++ b/src/bmm/workflows/4-implementation/code-review/steps/step-03-execute-review.md
@@ -12,9 +12,10 @@ reviewFindingsFile: '{story_dir}/review-findings.json'
   Initialize findings artifacts:
   - Set {{review_findings}} = [] (in-memory array)
   - Set {{review_findings_file}} = {reviewFindingsFile}
+  - Set {{review_findings_schema}} = "id,severity,type,summary,detail,file_line,proof,suggested_fix,reviewer,timestamp"
   - Each finding record MUST contain: id, severity, type, summary, detail, file_line, proof, suggested_fix, reviewer, timestamp
-  - `file_line` format MUST be `path/to/file:line`
+  - `file_line` is the required `file:line` locator field and MUST use `path/to/file:line` format
   - `reviewer` value MUST be `senior-dev-review`
   - `timestamp` MUST use system ISO datetime

@@ -35,28 +36,67 @@ reviewFindingsFile: '{story_dir}/review-findings.json'
   1. Read the AC requirement
   2. Search implementation files for evidence
   3. Determine: IMPLEMENTED, PARTIAL, or MISSING using this algorithm:
+     - Parse each AC into explicit clauses (single requirement statements).
+     - Evaluate each clause independently, then derive overall AC status from clause outcomes.
      - IMPLEMENTED:
-       - Direct code evidence exists for ALL AC clauses, and
-       - At least one corroborating test OR deterministic runtime verification exists, and
-       - Any docs/comments are supported by code/test evidence.
+       - EVERY clause has direct code evidence tied to the story execution path, and
+       - Evidence includes at least one strong corroborator for AC behavior (automated test, integration test, or reproducible runtime proof), and
+       - Weak evidence (docs/comments/README) is only supplemental.
      - PARTIAL:
-       - Some AC clauses have direct implementation evidence but one or more clauses are missing OR only indirectly covered, or
-       - Evidence is helper/utility code not clearly wired to the story path, or
-       - Evidence is docs/comments only without strong corroboration.
+       - One or more clauses have direct evidence, but at least one clause lacks direct evidence, OR
+       - Coverage is only indirect (helper/generic utility not proven wired), OR
+       - Evidence is mostly weak and not corroborated by code/tests.
      - MISSING:
-       - No credible code/test/docs evidence addresses the AC clauses.
-  4. Evidence-strength rules:
-     - Code + tests = strong evidence
-     - Code only = medium evidence
-     - Docs/comments/README only = weak evidence (cannot justify IMPLEMENTED alone)
+       - No clause has credible direct implementation evidence, and
+       - No test/runtime proof demonstrates AC behavior.
+  4. Evidence-type rules:
+     - Strong evidence: implementation code plus validating tests/runtime proof.
+     - Medium evidence: implementation code without validating tests.
+     - Weak evidence: comments, README/docs, design notes, screenshots, or unverifiable logs.
+     - Weak evidence alone cannot qualify an AC as IMPLEMENTED.
   5. Indirect evidence rules:
-     - Generic helpers/utilities count as PARTIAL unless explicitly wired by call sites OR integration tests.
+     - Helper functions/utilities count as indirect until explicit call sites or integration coverage prove the AC path.
+     - Generic capability not wired to this story remains PARTIAL.
   6. Severity mapping for AC gaps:
-     - MISSING critical-path AC → HIGH
-     - MISSING non-critical AC → MEDIUM
-     - PARTIAL critical-path AC → HIGH
-     - PARTIAL non-critical AC → MEDIUM
-  7. If AC is PARTIAL or MISSING, append a finding object to {{review_findings}}.
+     - MISSING + security/data-loss/compliance/core user flow risk -> HIGH.
+     - MISSING + non-core behavior or secondary UX/documentation requirement -> MEDIUM.
+     - PARTIAL + security/data-integrity/compliance risk -> HIGH.
+     - PARTIAL + degraded core behavior -> MEDIUM.
+     - PARTIAL + optional/non-critical behavior gap with safe fallback -> LOW.
+  7. Classification examples:
+     - IMPLEMENTED example: AC requires validation + error response, and code path plus passing test covers all clauses.
+     - PARTIAL example: helper exists and one clause passes, but integration path for another clause is unproven.
+     - MISSING example: AC text exists, but no matching code path or tests are found.
+  8. If AC is PARTIAL or MISSING, append a finding object to {{review_findings}} with status, severity, and clause-level proof.
+
+  When creating findings from any action above, populate fields using this mapping:
+  - id:
+    - Git discrepancy: `GIT-DIFF-{{index}}`
+    - AC gap: `AC-{{ac_id}}-{{status}}-{{index}}`
+    - Task mismatch: `TASK-{{task_id}}-MISMATCH-{{index}}`
+    - Code-quality issue: `CQ-{{category}}-{{index}}`
+  - severity:
+    - Use the explicit severity rule from the originating action block
+  - type:
+    - `story-sync` for git/story discrepancies
+    - `acceptance-criteria` for AC gaps
+    - `task-audit` for task completion mismatches
+    - `code-quality` for quality/security/performance/test issues
+  - summary:
+    - One-line, user-facing issue statement
+  - detail:
+    - Include the violated expectation plus the observed behavior
+  - file_line:
+    - `path/to/file:line` evidence anchor (use the most relevant file and line)
+  - proof:
+    - Concrete evidence snippet (code, test output, or git command result)
+  - suggested_fix:
+    - Actionable implementation guidance
+  - reviewer:
+    - `senior-dev-review`
+  - timestamp:
+    - System ISO datetime at finding creation time

@@ -94,7 +134,8 @@ reviewFindingsFile: '{story_dir}/review-findings.json'
   Persist findings contract for downstream step:
   - Save {{review_findings}} as JSON array to {{review_findings_file}}
   - Ensure JSON is valid and each finding includes all required fields
-  - Set {{findings_contract}} = "JSON array at {{review_findings_file}}"
+  - Set {{findings_contract}} = "JSON array at {{review_findings_file}} with schema {{review_findings_schema}}"
+  - Step 4 MUST load findings from {{review_findings_file}} and validate against {{review_findings_schema}} before presenting or resolving

   Example finding record (must match real records):

diff --git a/src/bmm/workflows/4-implementation/code-review/steps/step-04-present-and-resolve.md b/src/bmm/workflows/4-implementation/code-review/steps/step-04-present-and-resolve.md
index 87df0d1ab..be966cbcb 100644
--- a/src/bmm/workflows/4-implementation/code-review/steps/step-04-present-and-resolve.md
+++ b/src/bmm/workflows/4-implementation/code-review/steps/step-04-present-and-resolve.md
@@ -6,11 +6,20 @@ reviewFindingsFile: '{story_dir}/review-findings.json'
 ---

-  Load structured findings from {reviewFindingsFile}
+  Resolve findings artifact input:
+  - Use {{review_findings_file}} from step 3 when present
+  - Otherwise fall back to {reviewFindingsFile}
+  - Set {{review_findings_schema}} = "id,severity,type,summary,detail,file_line,proof,suggested_fix,reviewer,timestamp" if not already set
+
+  Load structured findings JSON array from {{review_findings_file}}
   Validate findings schema for each entry: id, severity, type, summary, detail, file_line, proof, suggested_fix, reviewer, timestamp
-  If findings file missing or malformed: HALT with explicit error and return to step 3 generation
+  Validation contract:
+  - `file_line` is the required `file:line` locator in `path/to/file:line` format
+  - Reject non-array JSON, missing required keys, or invalid file_line formatting
+  - If the findings file is missing, unreadable, or malformed: HALT with explicit error and return to step 3 generation
+
   Categorize findings: HIGH (must fix), MEDIUM (should fix), LOW (nice to fix)
   Set {{fixed_count}} = 0
   Set {{action_count}} = 0

diff --git a/src/bmm/workflows/4-implementation/correct-course/workflow.md b/src/bmm/workflows/4-implementation/correct-course/workflow.md
index 74b188b37..caf77108c 100644
--- a/src/bmm/workflows/4-implementation/correct-course/workflow.md
+++ b/src/bmm/workflows/4-implementation/correct-course/workflow.md
@@ -18,17 +18,24 @@ web_bundle: false
 - `sprint_status` = `{implementation_artifacts}/sprint-status.yaml`
 - `date` (system-generated)
 - `installed_path` = `{project-root}/_bmad/bmm/workflows/4-implementation/correct-course`
-- Note: `installed_path` targets the installed runtime tree under `_bmad/...`; source authoring files are in `src/bmm/workflows/4-implementation/correct-course/...`.
+- `source_path` = `{project-root}/src/bmm/workflows/4-implementation/correct-course`
+- Note: `installed_path` targets the installed runtime tree under `_bmad/...`; `source_path` is the repository authoring path.
 - `default_output_file` = `{planning_artifacts}/sprint-change-proposal-{date}.md`

 Communicate all responses in {communication_language} and generate all documents in {document_output_language}

-Read and follow instructions at: {installed_path}/instructions.md
+Resolve workflow content path:
+- If `{installed_path}/instructions.md` exists and is readable, set {{workflow_path}} = `{installed_path}`
+- Else if `{source_path}/instructions.md` exists and is readable, set {{workflow_path}} = `{source_path}`
+- Else emit an error listing both paths and HALT
+
+Read and follow instructions at: {{workflow_path}}/instructions.md

-Validate against checklist at {installed_path}/checklist.md using {project-root}/_bmad/core/tasks/validate-workflow.md
+If {{workflow_path}} was not set in step 1, repeat the path resolution using checklist.md
+Validate against checklist at {{workflow_path}}/checklist.md using {project-root}/_bmad/core/tasks/validate-workflow.md

diff --git a/src/bmm/workflows/4-implementation/dev-story/steps/step-08-mark-task-complete.md b/src/bmm/workflows/4-implementation/dev-story/steps/step-08-mark-task-complete.md
index 03d1d68f9..4facc56f8 100644
--- a/src/bmm/workflows/4-implementation/dev-story/steps/step-08-mark-task-complete.md
+++ b/src/bmm/workflows/4-implementation/dev-story/steps/step-08-mark-task-complete.md
@@ -10,8 +10,14 @@ nextStepFile: './step-09-mark-review-ready.md'
   Initialize review-tracking variables before checks:
   - If {{resolved_review_items}} is undefined: set {{resolved_review_items}} = []
   - If {{unresolved_review_items}} is undefined: set {{unresolved_review_items}} = []
-  - Set {{review_continuation}} by checking current task title/original task list for prefix "[AI-Review]"
-  - Set {{date}} from system-generated timestamp in project date format
+  - Set {{review_continuation}} = false
+  - If current {{task_title}} starts with "[AI-Review]", set {{review_continuation}} = true
+  - Else scan {{original_task_list}}; if any item starts with "[AI-Review]", set {{review_continuation}} = true
+  - Set {{date}} from system-generated timestamp formatted for project change log entries
+  - Set {{resolved_count}} = length({{resolved_review_items}})
+  - Set {{review_match_threshold}} = 0.60
+  - Define normalize(text): lowercase, trim, remove "[AI-Review]" prefix and punctuation, collapse whitespace
+  - Define token_set(text): unique whitespace-separated normalized tokens

@@ -23,28 +29,58 @@ nextStepFile: './step-09-mark-review-ready.md'
   Extract review item details (severity, description, related AC/file)
-  Add current review task to resolution tracking list: append structured entry to {{resolved_review_items}}
+  Load all items from "Senior Developer Review (AI) → Action Items" as candidate list {{review_action_items}}
+  Set {{task_text_norm}} = normalize(current review follow-up task description)
+  Initialize {{best_match}} = null, {{best_score}} = 0, {{best_shared_tokens}} = 0, {{tie_candidates}} = []
+  For each candidate action item:
+  1. Set {{candidate_text_norm}} = normalize(candidate text)
+  2. If {{task_text_norm}} == {{candidate_text_norm}} OR either contains the other:
+     - set {{candidate_score}} = 1.0 and mark as strong match
+  3. Else:
+     - compute Jaccard score = |token_set(task) ∩ token_set(candidate)| / |token_set(task) ∪ token_set(candidate)|
+     - set {{candidate_score}} to computed score
+  4. Track shared-token count for tie-breaking
+  5. Keep the highest-scoring candidate; if same score, keep the candidate with more shared tokens
+  6. If score and shared-token count both tie, add candidate to {{tie_candidates}}
+
+  Set {{match_found}} = true only if {{best_score}} >= {{review_match_threshold}}

   Mark task checkbox [x] in "Tasks/Subtasks → Review Follow-ups (AI)" section
-  Find matching action item in "Senior Developer Review (AI) → Action Items" using fuzzy matching:
-  1. Normalize strings (lowercase, trim, remove "[AI-Review]" prefix/punctuation)
-  2. Try exact and substring matches first
-  3. If none, compute token-overlap/Jaccard score per candidate
-  4. Select highest-scoring candidate when score >= 0.60
-  5. If tie at best score, prefer the candidate with more shared tokens; log ambiguity
-  Mark that action item checkbox [x] as resolved
-  Log warning and append task to {{unresolved_review_items}}
-  Add resolution note in Dev Agent Record that no corresponding action item was found
+  Mark matched action item checkbox [x] in "Senior Developer Review (AI) → Action Items"
+  Append structured entry to {{resolved_review_items}}:
+  - task: current review follow-up task
+  - matched_action_item: {{best_match}}
+  - match_score: {{best_score}}
+  - resolved_at: {{date}}
+  - status: "matched"
+
+  Log ambiguity warning with tied candidates and selected best_match
+
+  Add to Dev Agent Record → Completion Notes: "✅ Resolved review finding [{{severity}}]: {{description}} (matched action item, score {{best_score}})"
-  Add to Dev Agent Record → Completion Notes: "✅ Resolved review finding [{{severity}}]: {{description}}"
+
+  Log warning: no candidate met threshold {{review_match_threshold}} for task "{{task_text_norm}}"
+  Append structured entry to {{resolved_review_items}}:
+  - task: current review follow-up task
+  - matched_action_item: null
+  - match_score: {{best_score}}
+  - resolved_at: {{date}}
+  - status: "unmatched"
+
+  Append structured entry to {{unresolved_review_items}}:
+  - task: current review follow-up task
+  - reason: "No corresponding action item met fuzzy-match threshold"
+  - best_candidate: {{best_match}}
+  - best_score: {{best_score}}
+  - recorded_at: {{date}}
+
+  Add a resolution note in the Dev Agent Record stating that no corresponding action item was found, even though the follow-up checkbox was still marked complete

@@ -56,7 +92,12 @@ nextStepFile: './step-09-mark-review-ready.md'
   DO NOT mark task complete - fix issues first
-  HALT if unable to fix validation failures
+  If unable to fix validation failures, invoke HALT protocol from dev-story/workflow.md with:
+  - reason_code: DEV-STORY-STEP-08-VALIDATION-FAIL
+  - step_id: step-08-mark-task-complete
+  - message: "Task completion validation failed and remediation was unsuccessful."
+  - required_action: "Fix failing validations/tests, then resume."
+

diff --git a/src/bmm/workflows/4-implementation/dev-story/steps/step-09-mark-review-ready.md b/src/bmm/workflows/4-implementation/dev-story/steps/step-09-mark-review-ready.md
index 1ddae1928..fbdead276 100644
--- a/src/bmm/workflows/4-implementation/dev-story/steps/step-09-mark-review-ready.md
+++ b/src/bmm/workflows/4-implementation/dev-story/steps/step-09-mark-review-ready.md
@@ -10,9 +10,14 @@ nextStepFile: './step-10-closeout.md'
   Confirm File List includes every changed file
   Execute enhanced definition-of-done validation
   Update the story Status to: "review"
-  Initialize sprint tracking state:
-  - If {sprint_status} exists and is readable, load file and set {{current_sprint_status}} from tracking mode/content
-  - If file does not exist, unreadable, or indicates no sprint tracking, set {{current_sprint_status}} = "no-sprint-tracking"
+  Initialize sprint tracking state deterministically before any sprint-status check:
+  - Set {{current_sprint_status}} = "no-sprint-tracking"
+  - Set {{sprint_tracking_enabled}} = false
+  - If {sprint_status} exists and is readable:
+    - Load the FULL file: {sprint_status}
+    - If file content indicates tracking disabled OR development_status section is missing, keep "no-sprint-tracking"
+    - Else set {{current_sprint_status}} = "enabled" and {{sprint_tracking_enabled}} = true
+  - If file missing/unreadable, keep defaults and continue with story-only status update

@@ -31,31 +36,31 @@ nextStepFile: './step-10-closeout.md'

-  Load the FULL file: {sprint_status}
+
   Find development_status key matching {{story_key}}
-  Verify current status is "in-progress" (expected previous state)
-  Update development_status[{{story_key}}] = "review"
-  Save file, preserving ALL comments and structure including STATUS DEFINITIONS
-  ✅ Story status updated to "review" in sprint-status.yaml
+
+  Verify current status is "in-progress" (expected previous state)
+  Update development_status[{{story_key}}] = "review"
+  Save file, preserving ALL comments and structure including STATUS DEFINITIONS
+  ✅ Story status updated to "review" in sprint-status.yaml
+
+  ⚠️ Story file updated, but sprint-status update failed: {{story_key}} not found
+  Story status is set to "review" in file, but sprint-status.yaml may be out of sync.
+
   ℹ️ Story status updated to "review" in story file (no sprint tracking configured)

-  ⚠️ Story file updated, but sprint-status update failed: {{story_key}} not found
-  Story status is set to "review" in file, but sprint-status.yaml may be out of sync.
-
-  HALT - Complete remaining tasks before marking ready for review
-  HALT - Fix regression issues before completing
-  HALT - Update File List with all changed files
-  HALT - Address DoD failures before completing
+  Invoke HALT protocol (reason_code: DEV-STORY-STEP-09-INCOMPLETE-TASKS, step_id: step-09-mark-review-ready, message: "Incomplete tasks remain before review-ready transition.", required_action: "Complete all tasks/subtasks and rerun validations.")
+  Invoke HALT protocol (reason_code: DEV-STORY-STEP-09-REGRESSION-FAIL, step_id: step-09-mark-review-ready, message: "Regression suite has failures.", required_action: "Fix failing tests and rerun full regression suite.")
+  Invoke HALT protocol (reason_code: DEV-STORY-STEP-09-FILE-LIST-INCOMPLETE, step_id: step-09-mark-review-ready, message: "File List does not include all changed files.", required_action: "Update File List with all added/modified/deleted paths.")
+  Invoke HALT protocol (reason_code: DEV-STORY-STEP-09-DOD-FAIL, step_id: step-09-mark-review-ready, message: "Definition-of-done checks failed.", required_action: "Address DoD failures and rerun validation.")

 ## Next

diff --git a/src/bmm/workflows/4-implementation/dev-story/workflow.md b/src/bmm/workflows/4-implementation/dev-story/workflow.md
index ac00070ad..1d338fbe8 100644
--- a/src/bmm/workflows/4-implementation/dev-story/workflow.md
+++ b/src/bmm/workflows/4-implementation/dev-story/workflow.md
@@ -45,22 +45,36 @@ Implement a ready story end-to-end with strict validation gates, accurate progre
 - Change Log
 - Status
 - Execute steps in order and do not skip validation gates.
-- Continue until the story is complete unless a defined HALT condition triggers.
+- Continue until the story is complete unless the HALT protocol below is triggered.

-## HALT Definition
-- HALT triggers:
-  - Required inputs/files are missing or unreadable.
-  - Validation gates fail and cannot be remediated in current step.
-  - Test/regression failures persist after fix attempts.
-  - Story state becomes inconsistent (e.g., malformed task structure preventing safe updates).
+## HALT Protocol (Normative)
+- Scope:
+  - Every `HALT` instruction in this workflow and all `steps/*.md` files MUST use this protocol.
+- Operational definition:
+  - HALT is a deterministic hard-stop event raised when execution cannot safely continue.
+  - A HALT event MUST include:
+    - `reason_code` (stable machine-readable code)
+    - `step_id` (current step file + step number)
+    - `message` (human-readable failure summary)
+    - `required_action` (what user/operator must do before resume)
+- Trigger criteria:
+  - Required inputs/files are missing, unreadable, or malformed.
+  - Validation gates fail and cannot be remediated in the current step.
+  - Test/regression failures persist after attempted fixes.
+  - Story state is inconsistent (for example, malformed task structure preventing safe updates).
 - HALT behavior:
-  - Stop executing further steps immediately.
-  - Persist current story-file edits and workflow state safely.
-  - Emit explicit user-facing error message describing trigger and remediation needed.
-  - Do not apply partial completion marks after HALT.
+  - Stop execution immediately and skip all downstream steps.
+  - Persist workflow checkpoint: current step id, resolved variables, and pending task context.
+  - Persist only already-applied safe edits; do not apply new partial completion marks after HALT.
+  - Emit logger event exactly in this format:
+    - `HALT[{reason_code}] step={step_id} story={story_key|unknown} detail="{message}"`
+  - Emit user-facing prompt exactly in this format:
+    - `Workflow HALTED at {step_id} ({reason_code}): {message}. Required action: {required_action}. Reply RESUME after remediation.`
 - Resume semantics:
-  - Manual resume only after user confirms the blocking issue is resolved.
-  - Resume from the last incomplete step checkpoint, re-running validations before progressing.
+  - Manual resume only (no automatic retry loop).
+  - Resume is checkpoint-based: restart from the halted step after user confirms remediation.
+  - Re-run the failed validation/input check before executing further actions.
+  - If the same HALT condition repeats, stop again with updated evidence.

 ## Execution

 Read fully and follow: `steps/step-01-find-story.md`.
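The step-03/step-04 findings contract above is specified only in prose. As an illustration of that contract, here is a minimal validator sketch; Python and the `validate_findings` name are assumptions for demonstration, since the workflow files do not prescribe an implementation language:

```python
# Sketch of the step-04 validation contract for review-findings.json.
# Field names come from {{review_findings_schema}}; everything else is illustrative.
import json
import re

REQUIRED_FIELDS = [
    "id", "severity", "type", "summary", "detail",
    "file_line", "proof", "suggested_fix", "reviewer", "timestamp",
]
# `file_line` must be a `path/to/file:line` locator.
FILE_LINE_RE = re.compile(r"^.+:\d+$")

def validate_findings(raw: str) -> list[str]:
    """Return a list of validation errors; an empty list means the artifact is valid."""
    errors: list[str] = []
    try:
        findings = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc}"]
    if not isinstance(findings, list):
        return ["top-level JSON value must be an array of findings"]
    for i, finding in enumerate(findings):
        if not isinstance(finding, dict):
            errors.append(f"finding {i}: not an object")
            continue
        for key in REQUIRED_FIELDS:
            if key not in finding:
                errors.append(f"finding {i}: missing required field '{key}'")
        file_line = finding.get("file_line", "")
        if isinstance(file_line, str) and file_line and not FILE_LINE_RE.match(file_line):
            errors.append(f"finding {i}: file_line '{file_line}' is not 'path/to/file:line'")
    return errors
```

Per the step-04 contract, a non-empty error list would trigger the HALT-and-return-to-step-3 path.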
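The step-08 fuzzy-matching rules (normalize, token-set Jaccard score, 0.60 threshold, shared-token tie-break) can be sketched as follows. This is a hypothetical illustration, not part of the patch; the function names are invented, and the `{{tie_candidates}}` ambiguity logging is omitted for brevity:

```python
# Sketch of the step-08 action-item matching heuristic.
import re
import string

REVIEW_MATCH_THRESHOLD = 0.60  # mirrors {{review_match_threshold}}

def normalize(text: str) -> str:
    """Lowercase, trim, strip the "[AI-Review]" prefix and punctuation, collapse whitespace."""
    text = text.strip()
    if text.lower().startswith("[ai-review]"):
        text = text[len("[ai-review]"):]
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def token_set(text: str) -> set:
    """Unique whitespace-separated normalized tokens."""
    return set(normalize(text).split())

def jaccard(a: str, b: str) -> float:
    """Token-overlap score: |A ∩ B| / |A ∪ B|."""
    ta, tb = token_set(a), token_set(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def best_match(task: str, candidates: list):
    """Return (best_candidate, best_score, match_found) per the step-08 rules."""
    best, best_score, best_shared = None, 0.0, -1
    task_norm = normalize(task)
    for cand in candidates:
        cand_norm = normalize(cand)
        if task_norm == cand_norm or task_norm in cand_norm or cand_norm in task_norm:
            score = 1.0  # exact or substring match counts as a strong match
        else:
            score = jaccard(task, cand)
        shared = len(token_set(task) & token_set(cand))
        # Keep the highest score; break score ties by shared-token count.
        if score > best_score or (score == best_score and shared > best_shared):
            best, best_score, best_shared = cand, score, shared
    return best, best_score, best_score >= REVIEW_MATCH_THRESHOLD
```

When `match_found` is false, step-08 records the task under `{{unresolved_review_items}}` with the best candidate and score as evidence.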