From 195664029d9dd5bdc06b19602cfbf787aea457ce Mon Sep 17 00:00:00 2001 From: Scott Jennings Date: Sat, 13 Dec 2025 20:29:11 -0600 Subject: [PATCH] feat: add triage step to code review workflow External agents find ALL possible issues, but not all are relevant. This new step 4 evaluates findings against project context: - Actually Important: AC violations, bugs, real security issues - Context-Dependent: Performance, validation strictness (asks user) - Theoretical/Nitpicking: Micro-optimizations, style preferences (skips) This enables pragmatic code review that focuses on what matters. --- .../code-review/instructions.xml | 145 +++++++++++++++--- 1 file changed, 121 insertions(+), 24 deletions(-) diff --git a/src/modules/bmm/workflows/4-implementation/code-review/instructions.xml b/src/modules/bmm/workflows/4-implementation/code-review/instructions.xml index 87100fef..eddb61eb 100644 --- a/src/modules/bmm/workflows/4-implementation/code-review/instructions.xml +++ b/src/modules/bmm/workflows/4-implementation/code-review/instructions.xml @@ -288,38 +288,135 @@ - - Categorize findings: HIGH (must fix), MEDIUM (should fix), LOW (nice to fix) + + + + + + + External agents are adversarial by design - they find EVERYTHING. + Your job: Apply project context to determine what ACTUALLY matters. + Do NOT blindly accept all findings. Do NOT dismiss valid concerns. + + For EACH finding from {{all_findings}}, evaluate against these criteria: + + + **ACTUALLY IMPORTANT** (always fix): + - AC violations: Story claims something works but it doesn't + - Task fraud: Checkbox marked [x] but code doesn't exist + - Contract violations: Method signature doesn't match documented behavior + - Real bugs: Code will fail at runtime in normal usage + - Security issues for the ACTUAL threat model (game loading own files ≠ public API) + + + **CONTEXT-DEPENDENT** (ask user): + - Performance: Does it matter at this scale? Game data vs. million-record DB + - Validation strictness: Nice-to-have vs. actually needed + - Edge cases: Will they ever happen in practice? + - Thread safety: Is this actually multi-threaded? + + + **THEORETICAL/NITPICKING** (skip unless user insists): + - "Could be exploited if attacker controls X" when attacker never controls X + - Micro-optimizations that save nanoseconds + - Style preferences disguised as bugs + - "Best practice" violations that don't cause problems + - DoS concerns for trusted internal data + + + + + + + + Categorize each finding into one of the three lists with brief reasoning + + + **🔍 FINDINGS TRIAGE** + +I've reviewed {{external_agent_cmd}}'s findings against your project context. +Here's what I think actually matters vs. theoretical concerns: + + + + + +## ✅ ACTUALLY IMPORTANT (Recommend Fix) +These are real issues that affect correctness or violate documented contracts: + +{{#each important_findings}} +- **{{this.id}}**: {{this.summary}} + - *Why it matters*: {{this.reasoning}} +{{/each}} + + + + + + + Add user-selected contextual findings to {{important_findings}} + + + + + +## ⏭️ SKIPPING (Theoretical/Nitpicking) +These findings are technically valid but don't matter for your use case: + +{{#each theoretical_findings}} +- **{{this.id}}**: {{this.summary}} — *{{this.reasoning}}* +{{/each}} + +*(Say "fix all" if you want these addressed anyway)* + + + + + + +**📋 FINAL FIX LIST: {{final_fix_list.length}} issues** + + + + Set {{fixed_count}} = 0 Set {{action_count}} = 0 - **🔥 CODE REVIEW FINDINGS, {user_name}!** + **🔥 CODE REVIEW SUMMARY, {user_name}!** **Story:** {{story_file}} **Review Method:** {{#if external_agent_cmd}}{{external_agent_cmd}} CLI{{else}}built-in{{/if}} - **Git vs Story Discrepancies:** {{git_discrepancy_count}} found - **Issues Found:** {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low + **Raw Findings:** {{all_findings.length}} total + **After Triage:** {{final_fix_list.length}} to address, {{theoretical_findings.length}} skipped - ## 🔴 CRITICAL ISSUES - - Tasks marked [x] but not actually implemented - - Acceptance Criteria not implemented - - Story claims files changed but no git evidence - - Security vulnerabilities - - ## 🟡 MEDIUM ISSUES - - Files changed but not documented in story File List - - Uncommitted changes not tracked - - Performance problems - - Poor test coverage/quality - - Code maintainability issues - - ## 🟢 LOW ISSUES - - Code style improvements - - Documentation gaps - - Git commit message quality + ## Issues to Fix + {{#each final_fix_list}} + - [{{this.severity}}] {{this.id}}: {{this.summary}} + {{/each}} - What should I do with these issues? + What should I do with these {{final_fix_list.length}} issues? 1. **Fix them automatically** - I'll update the code and tests 2. **Create action items** - Add to story Tasks/Subtasks for later @@ -349,7 +446,7 @@ - + Set {{new_status}} = "done"