Compare commits

...

4 Commits

Author SHA1 Message Date
Alex Verkhovsky 6c7f72d146 Merge branch 'main' into feature/more-cynical-review 2025-12-16 23:21:52 -07:00
Brian d5e5796ba3 Merge branch 'main' into feature/more-cynical-review 2025-12-15 07:50:26 +08:00
Alex Verkhovsky 43c6b6e5bd feat(bmm): add automatic adversarial code review to quick-dev workflow
Adds Step 5 to quick-dev that automatically runs adversarial code review
after implementation completes. Captures baseline commit at workflow start
and reviews all changes (tracked + newly created files) using a cynical
reviewer persona via subagent, CLI fallback, or inline self-review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 23:06:32 -07:00
Alex Verkhovsky 65c93c529c feat(bmm): add information-asymmetric adversarial code review
Enhance code review workflow with a two-phase approach:
- Context-aware review (step 3): Uses story knowledge to check implementation
- Asymmetric adversarial review (step 4): Cynical reviewer with no story context
  judges changes purely on technical merit

Key additions:
- Cynical reviewer persona that expects to find problems
- Execution hierarchy: Task tool > CLI fresh context > inline fallback
- Findings consolidation with deduplication across both review phases
- Improved severity assessment (CRITICAL/HIGH/MEDIUM/LOW)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 12:13:46 -07:00
3 changed files with 167 additions and 33 deletions

View File

@@ -104,52 +104,104 @@
</action>
<action>Find at least 3 more specific, actionable issues</action>
</check>
<!-- Store context-aware findings for later consolidation -->
<action>Set {{context_aware_findings}} = all issues found in this step (numbered list with file:line locations)</action>
</step>
<step n="4" goal="Present findings and fix them">
<action>Categorize findings: HIGH (must fix), MEDIUM (should fix), LOW (nice to fix)</action>
<step n="4" goal="Run information-asymmetric adversarial review">
<critical>Reviewer has FULL repo access but NO knowledge of WHY changes were made</critical>
<critical>DO NOT include the story file in the prompt - the asymmetry is about intent, not visibility</critical>
<critical>Reviewer can explore codebase to understand impact, but judges changes on merit alone</critical>
<!-- Construct diff of story-related changes -->
<action>Construct the diff of story-related changes:
- Uncommitted changes: `git diff` + `git diff --cached`
- Committed changes (if story spans commits): `git log --oneline` to find relevant commits, then `git diff base..HEAD`
- Exclude story file from diff: `git diff -- . ':!{{story_path}}'`
</action>
<action>Set {{asymmetric_target}} = the diff output (reviewer can explore repo but is prompted to review this diff)</action>
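The git plumbing above can be sketched as a small POSIX-shell fragment, shown here in a throwaway repo so it runs anywhere; `STORY_PATH` stands in for the workflow's `{{story_path}}` variable:

```shell
# Sketch of the diff construction above, demonstrated in a throwaway repo.
# STORY_PATH stands in for the workflow's {{story_path}} variable.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m init

STORY_PATH="story.md"
echo "story notes" > "$STORY_PATH"   # story file: must be excluded
echo "real change" > app.txt         # implementation change: must appear
git add app.txt "$STORY_PATH"
echo "tweak" >> app.txt              # leave an unstaged edit as well

# Staged + unstaged changes, excluding the story file via a ':!' pathspec.
staged=$(git diff --cached -- . ":!$STORY_PATH")
unstaged=$(git diff -- . ":!$STORY_PATH")
asymmetric_target="$staged
$unstaged"
printf '%s\n' "$asymmetric_target"
```

Note that `git diff` exits 0 even when differences exist (unlike `diff(1)`), so the command substitutions are safe under `set -e`.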
<!-- Execution hierarchy: cleanest context first -->
<check if="Task tool available (can spawn subagent)">
<action>Launch general-purpose subagent with adversarial prompt:
"You are a cynical, jaded code reviewer with zero patience for sloppy work.
A clueless weasel submitted the following changes and you expect to find problems.
Find at least ten issues to fix or improve. Look for what's missing, not just what's wrong.
Number each finding (1., 2., 3., ...). Be skeptical of everything.
Changes to review:
{{asymmetric_target}}"
</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
<check if="no Task tool BUT can use Bash to invoke CLI for fresh context">
<action>Execute adversarial review via CLI (e.g., claude --print) in fresh context with same prompt</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
<check if="cannot create clean slate agent by any means (fallback)">
<action>Execute adversarial prompt inline in main context</action>
<action>Note: the main context is polluted, but the cynical reviewer persona still adds significant value</action>
<action>Collect numbered findings into {{asymmetric_findings}}</action>
</check>
</step>
<step n="5" goal="Consolidate findings and present to user">
<critical>Merge findings from BOTH context-aware review (step 3) AND asymmetric review (step 4)</critical>
<action>Combine {{context_aware_findings}} from step 3 with {{asymmetric_findings}} from step 4</action>
<action>Deduplicate findings:
- Identify findings that describe the same underlying issue
- Keep the more detailed/actionable version
- Note when both reviews caught the same issue (validates severity)
</action>
<action>Assess each finding:
- Is this a real issue or noise/false positive?
- Assign severity: 🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW
</action>
<action>Filter out non-issues:
- Remove false positives
- Remove nitpicks that do not warrant action
- Keep anything that could cause problems in production
</action>
<action>Sort by severity (CRITICAL → HIGH → MEDIUM → LOW)</action>
<action>Set {{fixed_count}} = 0</action>
<action>Set {{action_count}} = 0</action>
<output>**🔥 CODE REVIEW FINDINGS, {user_name}!**
**Story:** {{story_file}}
**Story:** {{story_path}}
**Git vs Story Discrepancies:** {{git_discrepancy_count}} found
**Issues Found:** {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low
**Issues Found:** {{critical_count}} Critical, {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low
## 🔴 CRITICAL ISSUES
- Tasks marked [x] but not actually implemented
- Acceptance Criteria not implemented
- Story claims files changed but no git evidence
- Security vulnerabilities
| # | Severity | Summary | Location |
|---|----------|---------|----------|
{{findings_table}}
## 🟡 MEDIUM ISSUES
- Files changed but not documented in story File List
- Uncommitted changes not tracked
- Performance problems
- Poor test coverage/quality
- Code maintainability issues
## 🟢 LOW ISSUES
- Code style improvements
- Documentation gaps
- Git commit message quality
**{{total_count}} issues found** ({{critical_count}} critical, {{high_count}} high, {{medium_count}} medium, {{low_count}} low)
</output>
<ask>What should I do with these issues?
1. **Fix them automatically** - I'll update the code and tests
1. **Fix them automatically** - I'll fix all HIGH and CRITICAL, you approve each
2. **Create action items** - Add to story Tasks/Subtasks for later
3. **Show me details** - Deep dive into specific issues
3. **Details on #N** - Explain specific issue
Choose [1], [2], or specify which issue to examine:</ask>
<check if="user chooses 1">
<action>Fix all HIGH and MEDIUM issues in the code</action>
<action>Fix all CRITICAL and HIGH issues in the code</action>
<action>Add/update tests as needed</action>
<action>Update File List in story if files changed</action>
<action>Update story Dev Agent Record with fixes applied</action>
<action>Set {{fixed_count}} = number of HIGH and MEDIUM issues fixed</action>
<action>Set {{fixed_count}} = number of CRITICAL and HIGH issues fixed</action>
<action>Set {{action_count}} = 0</action>
</check>
@@ -166,13 +218,13 @@
</check>
</step>
<step n="5" goal="Update story status and sync sprint tracking">
<step n="6" goal="Update story status and sync sprint tracking">
<!-- Determine new status based on review outcome -->
<check if="all HIGH and MEDIUM issues fixed AND all ACs implemented">
<check if="all CRITICAL and HIGH issues fixed AND all ACs implemented">
<action>Set {{new_status}} = "done"</action>
<action>Update story Status field to "done"</action>
</check>
<check if="HIGH or MEDIUM issues remain OR ACs not fully implemented">
<check if="CRITICAL or HIGH issues remain OR ACs not fully implemented">
<action>Set {{new_status}} = "in-progress"</action>
<action>Update story Status field to "in-progress"</action>
</check>

View File

@@ -23,3 +23,11 @@
- [ ] Acceptance criteria satisfied
- [ ] Tech-spec updated (if applicable)
- [ ] Summary provided to user
## Adversarial Review
- [ ] Diff constructed (tracked changes from {baseline_commit} + new untracked files)
- [ ] Adversarial review executed (subagent preferred)
- [ ] Findings presented with severity and classification
- [ ] User chose handling approach (walk through / auto-fix / skip)
- [ ] Findings resolved or acknowledged

View File

@@ -15,6 +15,8 @@
<step n="1" goal="Load project context and determine execution mode">
<action>Record current HEAD as baseline for later review. Run `git rev-parse HEAD` and store the result as {baseline_commit}.</action>
<action>Check if {project_context} exists. If yes, load it - this is your foundational reference for ALL implementation decisions (patterns, conventions, architecture).</action>
<action>Parse user input:
@@ -170,7 +172,7 @@ Use holistic judgment, not mechanical keyword matching.</action>
</step>
<step n="4" goal="Verify and complete">
<step n="4" goal="Verify and transition to review">
<action>Verify: all tasks [x], tests passing, AC satisfied, patterns followed</action>
@@ -185,17 +187,89 @@ Use holistic judgment, not mechanical keyword matching.</action>
**Tests:** {{test_summary}}
**AC Status:** {{ac_status}}
---
Running adversarial code review...
</output>
**Before committing (Recommended): Copy this code review prompt to a different LLM**
<action>Proceed immediately to step 5</action>
</step>
<step n="5" goal="Adversarial code review (automatic)">
<action>Construct diff of all changes since workflow started and capture as {diff_output}:
**Tracked file changes:**
```bash
git diff {baseline_commit}
```
**New files created by this workflow:**
Only include untracked files that YOU actually created during steps 2-4. Do not include pre-existing untracked files. For each new file you created, include its full content as a "new file" addition.
Combine both into {diff_output} for review. Do NOT `git add` anything - this is read-only inspection.</action>
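A minimal sketch of that two-part, read-only capture, demonstrated in a throwaway repo; the list of files the workflow created is assumed to be held in a shell variable:

```shell
# Sketch of the read-only diff capture above, shown in a throwaway repo.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
echo "v1" > lib.txt
git add lib.txt
git -c user.email=a@b -c user.name=demo commit -q -m baseline
baseline_commit=$(git rev-parse HEAD)    # captured at workflow start (step 1)

echo "v2" > lib.txt                      # tracked change made after the baseline
echo "new helper" > helper.txt           # file this workflow created (untracked)
created_files="helper.txt"               # assumed: the workflow tracked what it made

tracked=$(git diff "$baseline_commit")   # read-only: nothing is `git add`ed
untracked=""
for f in $created_files; do
  # Render an untracked file as a "new file" diff without staging it.
  # `git diff --no-index` exits 1 when the inputs differ, hence `|| true`.
  untracked="$untracked
$(git diff --no-index -- /dev/null "$f" || true)"
done
diff_output="$tracked
$untracked"
printf '%s\n' "$diff_output"
```

The `--no-index` trick keeps the inspection staging-free, matching the "Do NOT `git add` anything" constraint above.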
<action>Execute adversarial review using this hierarchy (try in order until one succeeds):
1. **Spawn subagent** (preferred) - pass the diff output along with this prompt:
```
You are a cynical, jaded code reviewer with zero patience for sloppy work. This diff was submitted by a clueless weasel and you expect to find problems. Find at least five issues to fix or improve. Number them. Be skeptical of everything.
<diff>
{diff_output}
</diff>
```
2. **CLI fallback** - pipe diff to `claude --print` with same prompt
3. **Inline self-review** - Review the diff output yourself using the cynical reviewer persona above
</action>
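The three-level hierarchy can be probed mechanically. This sketch only selects a mode (it does not run a review); `TASK_TOOL_AVAILABLE` is a hypothetical flag standing in for "can spawn a subagent", since tool availability is runtime-specific:

```shell
# Probe the execution hierarchy above and pick the freshest context available.
# TASK_TOOL_AVAILABLE is a hypothetical flag: "1" means a subagent can be spawned.
if [ "${TASK_TOOL_AVAILABLE:-0}" = "1" ]; then
  mode="subagent"                        # 1. spawn subagent (preferred)
elif command -v claude >/dev/null 2>&1; then
  mode="cli"                             # 2. `claude --print` in a fresh context
else
  mode="inline"                          # 3. self-review in the main context
fi
echo "adversarial review mode: $mode"
```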
<check if="zero findings returned">
<action>HALT - Zero findings is suspicious. Adversarial review should always find something. Request user guidance.</action>
</check>
<action>Process findings:
- Assign IDs: F1, F2, F3...
- Assign severity: 🔴 Critical | 🟠 High | 🟡 Medium | 🟢 Low
- Classify each: **real** (confirmed issue) | **noise** (false positive) | **uncertain** (needs discussion)
</action>
<output>**Adversarial Review Findings**
| ID | Severity | Classification | Finding |
| --- | -------- | -------------- | ------- |
| F1 | 🟠 | real | ... |
| F2 | 🟡 | noise | ... |
| ... |
</output>
<action>You must explain what was implemented based on {user_skill_level}</action>
<ask>How would you like to handle these findings?
**[1] Walk through** - Discuss each finding individually
**[2] Auto-fix** - Automatically fix issues classified as "real"
**[3] Skip** - Acknowledge and proceed to commit</ask>
<check if="1">
<action>Present each finding one by one. For each, ask: fix now / skip / discuss</action>
<action>Apply fixes as approved</action>
</check>
<check if="2">
<action>Automatically fix all findings classified as "real"</action>
<action>Report what was fixed</action>
</check>
<check if="3">
<action>Acknowledge findings were reviewed and user chose to skip</action>
</check>
<output>**Review complete. Ready to commit.**</output>
<action>Explain what was implemented based on {user_skill_level}</action>
</step>