Compare commits

...

4 Commits

Author SHA1 Message Date
Alex Verkhovsky 6c7f72d146 Merge branch 'main' into feature/more-cynical-review 2025-12-16 23:21:52 -07:00
Brian d5e5796ba3 Merge branch 'main' into feature/more-cynical-review 2025-12-15 07:50:26 +08:00
Alex Verkhovsky 43c6b6e5bd feat(bmm): add automatic adversarial code review to quick-dev workflow
Adds Step 5 to quick-dev that automatically runs adversarial code review
after implementation completes. Captures baseline commit at workflow start
and reviews all changes (tracked + newly created files) using a cynical
reviewer persona via subagent, CLI fallback, or inline self-review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 23:06:32 -07:00
Alex Verkhovsky 65c93c529c feat(bmm): add information-asymmetric adversarial code review
Enhance code review workflow with a two-phase approach:
- Context-aware review (step 3): Uses story knowledge to check implementation
- Asymmetric adversarial review (step 4): Cynical reviewer with no story context
  judges changes purely on technical merit

Key additions:
- Cynical reviewer persona that expects to find problems
- Execution hierarchy: Task tool > CLI fresh context > inline fallback
- Findings consolidation with deduplication across both review phases
- Improved severity assessment (CRITICAL/HIGH/MEDIUM/LOW)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 12:13:46 -07:00
3 changed files with 167 additions and 33 deletions

View File

@@ -104,52 +104,104 @@
</action>
<action>Find at least 3 more specific, actionable issues</action>
</check>
<!-- Store context-aware findings for later consolidation -->
<action>Set {{context_aware_findings}} = all issues found in this step (numbered list with file:line locations)</action>
</step>
<step n="4" goal="Present findings and fix them">
<action>Categorize findings: HIGH (must fix), MEDIUM (should fix), LOW (nice to fix)</action>
<step n="4" goal="Run information-asymmetric adversarial review">
<critical>Reviewer has FULL repo access but NO knowledge of WHY changes were made</critical>
<critical>DO NOT include the story file in the prompt - the asymmetry is about intent, not visibility</critical>
<critical>Reviewer can explore codebase to understand impact, but judges changes on merit alone</critical>
<!-- Construct diff of story-related changes -->
<action>Construct the diff of story-related changes:
- Uncommitted changes: `git diff` + `git diff --cached`
- Committed changes (if story spans commits): `git log --oneline` to find relevant commits, then `git diff base..HEAD`
- Exclude story file from diff: `git diff -- . ':!{{story_path}}'`
</action>
<action>Set {{asymmetric_target}} = the diff output (reviewer can explore repo but is prompted to review this diff)</action>
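The git plumbing above can be sketched as a small POSIX-shell fragment, shown here in a throwaway repo so it runs anywhere; `STORY_PATH` stands in for the workflow's `{{story_path}}` variable:

```shell
# Sketch of the diff construction above, demonstrated in a throwaway repo.
# STORY_PATH stands in for the workflow's {{story_path}} variable.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m init

STORY_PATH="story.md"
echo "story notes" > "$STORY_PATH"   # story file: must be excluded
echo "real change" > app.txt         # implementation change: must appear
git add app.txt "$STORY_PATH"
echo "tweak" >> app.txt              # leave an unstaged edit as well

# Staged + unstaged changes, excluding the story file via a ':!' pathspec.
staged=$(git diff --cached -- . ":!$STORY_PATH")
unstaged=$(git diff -- . ":!$STORY_PATH")
asymmetric_target="$staged
$unstaged"
printf '%s\n' "$asymmetric_target"
```

Note that `git diff` exits 0 even when differences exist (unlike `diff(1)`), so the command substitutions are safe under `set -e`.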
<!-- Execution hierarchy: cleanest context first -->
<check if="Task tool available (can spawn subagent)">
<action>Launch general-purpose subagent with adversarial prompt:
"You are a cynical, jaded code reviewer with zero patience for sloppy work.
A clueless weasel submitted the following changes and you expect to find problems.
Find at least ten issues to fix or improve. Look for what's missing, not just what's wrong.
Number each finding (1., 2., 3., ...). Be skeptical of everything.
Changes to review:
{{asymmetric_target}}"
</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
<check if="no Task tool BUT can use Bash to invoke CLI for fresh context">
<action>Execute adversarial review via CLI (e.g., claude --print) in fresh context with same prompt</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
<check if="cannot create clean slate agent by any means (fallback)">
<action>Execute adversarial prompt inline in main context</action>
<action>Note: the main context is polluted, but the cynical reviewer persona still adds significant value</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
</step>
<step n="5" goal="Consolidate findings and present to user">
<critical>Merge findings from BOTH context-aware review (step 3) AND asymmetric review (step 4)</critical>
<action>Combine {{context_aware_findings}} from step 3 with {{asymmetric_findings}} from step 4</action>
<action>Deduplicate findings:
- Identify findings that describe the same underlying issue
- Keep the more detailed/actionable version
- Note when both reviews caught the same issue (validates severity)
</action>
<action>Assess each finding:
- Is this a real issue or noise/false positive?
- Assign severity: 🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW
</action>
<action>Filter out non-issues:
- Remove false positives
- Remove nitpicks that do not warrant action
- Keep anything that could cause problems in production
</action>
<action>Sort by severity (CRITICAL → HIGH → MEDIUM → LOW)</action>
<action>Set {{fixed_count}} = 0</action>
<action>Set {{action_count}} = 0</action>
<output>**🔥 CODE REVIEW FINDINGS, {user_name}!**
**Story:** {{story_file}}
**Story:** {{story_path}}
**Git vs Story Discrepancies:** {{git_discrepancy_count}} found
**Issues Found:** {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low
**Issues Found:** {{critical_count}} Critical, {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low
## 🔴 CRITICAL ISSUES
- Tasks marked [x] but not actually implemented
- Acceptance Criteria not implemented
- Story claims files changed but no git evidence
- Security vulnerabilities
| # | Severity | Summary | Location |
|---|----------|---------|----------|
{{findings_table}}
## 🟡 MEDIUM ISSUES
- Files changed but not documented in story File List
- Uncommitted changes not tracked
- Performance problems
- Poor test coverage/quality
- Code maintainability issues
## 🟢 LOW ISSUES
- Code style improvements
- Documentation gaps
- Git commit message quality
**{{total_count}} issues found** ({{critical_count}} critical, {{high_count}} high, {{medium_count}} medium, {{low_count}} low)
</output>
<ask>What should I do with these issues?
1. **Fix them automatically** - I'll update the code and tests
1. **Fix them automatically** - I'll fix all HIGH and CRITICAL, you approve each
2. **Create action items** - Add to story Tasks/Subtasks for later
3. **Show me details** - Deep dive into specific issues
3. **Details on #N** - Explain specific issue
Choose [1], [2], or specify which issue to examine:</ask>
<check if="user chooses 1">
<action>Fix all HIGH and MEDIUM issues in the code</action>
<action>Fix all CRITICAL and HIGH issues in the code</action>
<action>Add/update tests as needed</action>
<action>Update File List in story if files changed</action>
<action>Update story Dev Agent Record with fixes applied</action>
<action>Set {{fixed_count}} = number of HIGH and MEDIUM issues fixed</action>
<action>Set {{fixed_count}} = number of CRITICAL and HIGH issues fixed</action>
<action>Set {{action_count}} = 0</action>
</check>
@@ -166,13 +218,13 @@
</check>
</step>
<step n="5" goal="Update story status and sync sprint tracking">
<step n="6" goal="Update story status and sync sprint tracking">
<!-- Determine new status based on review outcome -->
<check if="all HIGH and MEDIUM issues fixed AND all ACs implemented">
<check if="all CRITICAL and HIGH issues fixed AND all ACs implemented">
<action>Set {{new_status}} = "done"</action>
<action>Update story Status field to "done"</action>
</check>
<check if="HIGH or MEDIUM issues remain OR ACs not fully implemented">
<check if="CRITICAL or HIGH issues remain OR ACs not fully implemented">
<action>Set {{new_status}} = "in-progress"</action>
<action>Update story Status field to "in-progress"</action>
</check>

View File

@@ -23,3 +23,11 @@
- [ ] Acceptance criteria satisfied
- [ ] Tech-spec updated (if applicable)
- [ ] Summary provided to user
## Adversarial Review
- [ ] Diff constructed (tracked changes from {baseline_commit} + new untracked files)
- [ ] Adversarial review executed (subagent preferred)
- [ ] Findings presented with severity and classification
- [ ] User chose handling approach (walk through / auto-fix / skip)
- [ ] Findings resolved or acknowledged

View File

@@ -15,6 +15,8 @@
<step n="1" goal="Load project context and determine execution mode">
<action>Record current HEAD as baseline for later review. Run `git rev-parse HEAD` and store the result as {baseline_commit}.</action>
<action>Check if {project_context} exists. If yes, load it - this is your foundational reference for ALL implementation decisions (patterns, conventions, architecture).</action>
<action>Parse user input:
@@ -170,7 +172,7 @@ Use holistic judgment, not mechanical keyword matching.</action>
</step>
<step n="4" goal="Verify and complete">
<step n="4" goal="Verify and transition to review">
<action>Verify: all tasks [x], tests passing, AC satisfied, patterns followed</action>
@@ -185,17 +187,89 @@ Use holistic judgment, not mechanical keyword matching.</action>
**Tests:** {{test_summary}}
**AC Status:** {{ac_status}}
---
Running adversarial code review...
</output>
**Before committing (Recommended): Copy this code review prompt to a different LLM**
<action>Proceed immediately to step 5</action>
</step>
<step n="5" goal="Adversarial code review (automatic)">
<action>Construct diff of all changes since workflow started and capture as {diff_output}:
**Tracked file changes:**
```bash
git diff {baseline_commit}
```
**New files created by this workflow:**
Only include untracked files that YOU actually created during steps 2-4. Do not include pre-existing untracked files. For each new file you created, include its full content as a "new file" addition.
Combine both into {diff_output} for review. Do NOT `git add` anything - this is read-only inspection.</action>
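A minimal sketch of that two-part, read-only capture, demonstrated in a throwaway repo; the list of files the workflow created is assumed to be held in a shell variable:

```shell
# Sketch of the read-only diff capture above, shown in a throwaway repo.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
echo "v1" > lib.txt
git add lib.txt
git -c user.email=a@b -c user.name=demo commit -q -m baseline
baseline_commit=$(git rev-parse HEAD)    # captured at workflow start (step 1)

echo "v2" > lib.txt                      # tracked change made after the baseline
echo "new helper" > helper.txt           # file this workflow created (untracked)
created_files="helper.txt"               # assumed: the workflow tracked what it made

tracked=$(git diff "$baseline_commit")   # read-only: nothing is `git add`ed
untracked=""
for f in $created_files; do
  # Render an untracked file as a "new file" diff without staging it.
  # `git diff --no-index` exits 1 when the inputs differ, hence `|| true`.
  untracked="$untracked
$(git diff --no-index -- /dev/null "$f" || true)"
done
diff_output="$tracked
$untracked"
printf '%s\n' "$diff_output"
```

The `--no-index` trick keeps the inspection staging-free, matching the "Do NOT `git add` anything" constraint above.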
<action>Execute adversarial review using this hierarchy (try in order until one succeeds):
1. **Spawn subagent** (preferred) - pass the diff output along with this prompt:
```
You are a cynical, jaded code reviewer with zero patience for sloppy work. This diff was submitted by a clueless weasel and you expect to find problems. Find at least five issues to fix or improve. Number them. Be skeptical of everything.
<diff>
{diff_output}
</diff>
```
2. **CLI fallback** - pipe diff to `claude --print` with same prompt
3. **Inline self-review** - Review the diff output yourself using the cynical reviewer persona above
</action>
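The three-level hierarchy can be probed mechanically. This sketch only selects a mode (it does not run a review); `TASK_TOOL_AVAILABLE` is a hypothetical flag standing in for "can spawn a subagent", since tool availability is runtime-specific:

```shell
# Probe the execution hierarchy above and pick the freshest context available.
# TASK_TOOL_AVAILABLE is a hypothetical flag: "1" means a subagent can be spawned.
if [ "${TASK_TOOL_AVAILABLE:-0}" = "1" ]; then
  mode="subagent"                        # 1. spawn subagent (preferred)
elif command -v claude >/dev/null 2>&1; then
  mode="cli"                             # 2. `claude --print` in a fresh context
else
  mode="inline"                          # 3. self-review in the main context
fi
echo "adversarial review mode: $mode"
```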
<check if="zero findings returned">
<action>HALT - Zero findings is suspicious. Adversarial review should always find something. Request user guidance.</action>
</check>
<action>Process findings:
- Assign IDs: F1, F2, F3...
- Assign severity: 🔴 Critical | 🟠 High | 🟡 Medium | 🟢 Low
- Classify each: **real** (confirmed issue) | **noise** (false positive) | **uncertain** (needs discussion)
</action>
<output>**Adversarial Review Findings**
| ID | Severity | Classification | Finding |
| --- | -------- | -------------- | ------- |
| F1 | 🟠 | real | ... |
| F2 | 🟡 | noise | ... |
| ... |
</output>
<action>You must explain what was implemented based on {user_skill_level}</action>
<ask>How would you like to handle these findings?
**[1] Walk through** - Discuss each finding individually
**[2] Auto-fix** - Automatically fix issues classified as "real"
**[3] Skip** - Acknowledge and proceed to commit</ask>
<check if="1">
<action>Present each finding one by one. For each, ask: fix now / skip / discuss</action>
<action>Apply fixes as approved</action>
</check>
<check if="2">
<action>Automatically fix all findings classified as "real"</action>
<action>Report what was fixed</action>
</check>
<check if="3">
<action>Acknowledge findings were reviewed and user chose to skip</action>
</check>
<output>**Review complete. Ready to commit.**</output>
<action>Explain what was implemented based on {user_skill_level}</action>
</step>