Compare commits


46 Commits

Author SHA1 Message Date
Alex Verkhovsky 9542a45f97
Merge 59ed596392 into d419ac8a70 2026-01-12 10:56:16 -06:00
Alex Verkhovsky d419ac8a70
feat: add editorial review tasks for structure and prose (#1307)
* feat: add editorial review tasks for structure and prose

Add two complementary editorial review tasks:

- editorial-review-structure.xml: Structural editor that proposes cuts,
  reorganization, and simplification. Includes 5 document archetype models
  (Tutorial, Reference, Explanation, Prompt, Strategic) for targeted evaluation.

- editorial-review-prose.xml: Clinical copy-editor for prose improvements
  using Microsoft Writing Style Guide as baseline.

Both tasks support 'humans' and 'llm' reader types, with different principles for each.

* fix: add content-sacrosanct guardrail to editorial review tasks

Both editorial review tasks (prose and structure) were missing the key
constraint that reviewers should never challenge the ideas/knowledge
themselves—only how clearly they are communicated. This restores the
original design intent.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: align reader_type parameter naming across editorial tasks

Prose task was using 'target_audience' for the humans/llm optimization
flag while structure task correctly separates 'target_audience' (who
reads) from 'reader_type' (optimization mode). Aligns to reader_type.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Brian <bmadcode@gmail.com>
2026-01-13 00:20:04 +08:00
Alex Verkhovsky 59ed596392 refactor(code-review): swap phase order - adversarial first, context-aware second
Reorder dual-phase review so adversarial diff review runs before
context-aware review. This ensures fresh-eyes code quality checks
happen before story-biased validation.
2026-01-06 23:29:41 -08:00
Alex Verkhovsky a2458a5537
Merge branch 'main' into refactor/code-review-sharded-dual-phase 2026-01-06 19:29:31 -08:00
Alex Verkhovsky b73670700b docs(code-review): expand story file validation to include empty and malformed files 2026-01-05 07:39:56 -08:00
Alex Verkhovsky 5a16c3a102 fix(code-review): halt on git command failure instead of silently treating as NO_GIT 2026-01-05 07:27:12 -08:00
Alex Verkhovsky 58e0b6a634 docs(code-review): add generic error handling for git commands 2026-01-05 07:26:50 -08:00
Alex Verkhovsky 2785d382d5 docs(code-review): use {sprint_status} variable instead of expanded path 2026-01-05 07:24:11 -08:00
Alex Verkhovsky 551a2ccb53 docs(code-review): use variable reference for sprint-status path 2026-01-05 07:21:44 -08:00
Alex Verkhovsky 3fc411d9c9 docs(code-review): clarify sprint_status file definition and location 2026-01-05 07:19:31 -08:00
Alex Verkhovsky ec30b580e7 refactor(code-review): use Skip to for flow control directive in substep 4
"Skip to substep 5" correctly communicates jumping past the rest of the git
discovery logic in substep 4 when no git repo is found. "Proceed" would
suggest normal sequential flow, but we are skipping the conditional branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 07:13:41 -08:00
Alex Verkhovsky 9e6e991b53 fix(code-review): correct flow control directive in substep 4
Changed "Skip to substep 6" (which does not exist) to "Proceed to substep 5".
Step only has 5 substeps. After setting NO_GIT flag, workflow continues to
substep 5 (Cross-Reference Story vs Git), not to a non-existent substep 6.

Fixes h2 finding from adversarial review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 07:09:56 -08:00
Alex Verkhovsky dbdaae1be7 refactor(code-review): remove NEXT directive from completion checklist
The checklist validates work done DURING step execution.
The NEXT directive is OUTPUT of completion, not a validation criterion.
It happens AFTER the checklist passes, so it does not belong there.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:34:33 -08:00
Alex Verkhovsky 1636bd5a55 refactor(code-review): remove redundant 'immediately' from halt instruction
'Immediately' is implied by HALT. No timing choice exists.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:33:05 -08:00
Alex Verkhovsky 53045d35b1 refactor(code-review): move NEXT STEP DIRECTIVE after COMPLETION CHECKLIST
Logical flow: verify checklist → then declare next step
Not: declare next step → then verify checklist

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:31:19 -08:00
Alex Verkhovsky b3643af6dc refactor(code-review): remove redundancy and clarify halt instruction
- Remove redundant "Do NOT proceed to the next step" (halt already means this)
- Change "item" to "criterion" (more precise terminology)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:30:31 -08:00
Alex Verkhovsky 4ba6e19303 refactor(code-review): rename SUCCESS METRICS to COMPLETION CHECKLIST
Correct terminology:
- "Metrics" implies quantitative measurement
- These are actually pass/fail criteria for step completion
- Section is self-validation checklist, not measurement data

Reframe as checkpoint before proceeding to next step:
- Add "Before proceeding to the next step, verify ALL of the following:"
- Change "If any metric" to "If any item"
- Explicit instruction: "Do NOT proceed to the next step" if checklist fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:29:52 -08:00
Alex Verkhovsky 38ab12da85 refactor(code-review): remove cargo cult failure modes repetition from step-01
FAILURE MODES section was just inverted SUCCESS METRICS. Not valuable.
Replaced with single catch-all statement: failure to meet any success metric = failure.

Let actual failure modes emerge from usage patterns, not speculation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:17:37 -08:00
Alex Verkhovsky 0ae6799cb6 refactor(code-review): remove project context loading from step-01
Step-01 focus is: load story + discover git changes. Nothing else.

Project context loading belongs in step-04 (Context-Aware Review) where it
provides audit rules, principles, and requirements for validating AC
implementation against project standards.

(See implementation-notes.md for detail)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:04:34 -08:00
Alex Verkhovsky e479b4164c refactor(code-review): add checkpoint for empty git changes and exclude ignored files
Step-01 substep 5:
- If no git changes detected: halt and ask user "Continue anyway?"
  Allows AC audit on story File List even if no code changes in git
- Exclude git-ignored files from discrepancy comparison
  Prevents false positives if story modified only ignored files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 06:02:17 -08:00
Alex Verkhovsky 71a1c325f7 refactor(code-review): add rename detection to git change discovery
Step-01 substep 4:
- Use git diff -M to detect renamed/moved files
- Include deleted, renamed files in git_changed_files
- Adversarial reviewer needs to see deletions (e.g., critical code removed)
- Downstream steps will handle these appropriately (documented in implementation-notes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 05:54:32 -08:00
Alex Verkhovsky 59c58b2e2c refactor(code-review): clean up step-01 substep 3 and add error handling
Substep 3 (Extract File List):
- Removed repetitive wording
- Reference {story_content} variable instead of generic "story file"
- Add error handling: if Dev Agent Record/File List not found, set story_file_list = NO_FILE_LIST
- Consistent with NO_GIT pattern used elsewhere

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 05:47:46 -08:00
Alex Verkhovsky 18ac3c931a refactor(code-review): audit step-01 substeps and success/failure criteria
Step 01 audit findings:
- Substep 3 was extracting items not needed by step-01 (ACs, tasks, changelog)
  Trimmed to only extract story_file_list (needed for git comparison)
- Success/failure criteria now explicitly guard story_content completeness
  since downstream steps depend on the full file content
- Removed "downstream" jargon in favor of "later steps"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 05:42:04 -08:00
Alex Verkhovsky 8fc7db7b97 refactor(code-review): remove implementation notes from step-01
Implementation notes for the workflow should be collected in a dedicated
implementation-notes.md file, not embedded in step files. This keeps each
step focused and defers editorial comments to a separate tracking document.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 05:33:27 -08:00
Alex Verkhovsky 060d5562a4 docs(code-review): clarify fuzzy matching for story identification
- Changed priority 1 from exact to resembles: handles format variations (1 2, 1.2, one-two, one thirty two)
- Explicitly prevents false matches: 1-33 does not match 1-32
- Updated priority 3-4 to use resembles instead of contains: supports typos and TTS errors (paiment, passwd)
- Added examples for number variations and compound spoken formats
- Tested with agent validation: handles typos, format variations, misspellings correctly
2026-01-05 04:24:29 -08:00
Alex Verkhovsky 2bd6e9df1b docs(code-review): clarify step-01 story identification algorithm
- Fixed variable naming convention: backticks for names, curlies only for value substitution
- Rewrote Identify Story section with explicit two-path algorithm (file path vs sprint_status search)
- Added verification step for files not in sprint_status with user confirmation flow
- Clarified matching priority order: exact key > full ID > partial > name > description
- Made loopback instructions consistent and explicit (return to user prompt)
- Improved git_discrepancies description from vague "differences" to concrete "mismatches"
- Tested with 30+ test cases and fresh agent review - algorithm is clear and executable
2026-01-05 04:14:14 -08:00
Alex Verkhovsky 6886e3c8cd refactor(code-review): clarify step-01 description and NO_GIT handling 2026-01-05 02:54:00 -08:00
Alex Verkhovsky 1f5700ea14 refactor(code-review): remove unused thisStepFile/nextStepFile from frontmatter 2026-01-05 02:37:00 -08:00
Alex Verkhovsky 9700da9dc6 refactor(code-review): remove input_file_patterns from workflow.md to prevent context leak 2026-01-05 01:14:37 -08:00
Alex Verkhovsky 0f18c4bcba refactor(code-review): replace discover_inputs protocol with explicit file loading 2026-01-05 01:12:35 -08:00
Alex Verkhovsky ae9b83388c refactor(code-review): reorder phases - adversarial first, context-aware second
- Swap step-03 and step-04: adversarial review now runs before context-aware
- Move discover_inputs from step-01 to step-04 (JIT loading)
- Add input_file_patterns to workflow.md frontmatter
- Adversarial runs lean (just diff + code), context-aware loads planning docs
2026-01-05 01:09:38 -08:00
Alex Verkhovsky 64c32d8c8c refactor(code-review): add web_bundle: false, use "Read and follow" wording
- Add web_bundle: false to frontmatter (workflow needs file access)
- Change "Load and execute" to "Read and follow" (clearer for LLMs)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-05 00:39:17 -08:00
Alex Verkhovsky eae4ad46a1 refactor(code-review): remove unused validation path and checklist 2026-01-04 21:38:04 -08:00
Alex Verkhovsky a8758b0393 refactor(code-review): remove CRITICAL DIRECTIVES, add communication_language 2026-01-04 21:32:10 -08:00
Alex Verkhovsky ac081a27e8 docs(code-review): clarify step file loading in workflow architecture 2026-01-04 21:15:04 -08:00
Alex Verkhovsky 7c914ae8b2 refactor(code-review): inline single-use adversarial task path 2026-01-04 21:05:48 -08:00
Alex Verkhovsky dadca29b09 refactor(code-review): use installed_path variable in step files 2026-01-04 21:00:18 -08:00
Alex Verkhovsky 25f93a3b64 refactor(code-review): simplify workflow.md 2026-01-04 20:59:58 -08:00
Alex Verkhovsky 0f708d0b89 refactor(core): shorten adversarial review task name 2026-01-04 18:28:51 -08:00
Alex Verkhovsky 5fcdae02b5 refactor(code-review): defer finding IDs until consolidation 2026-01-04 05:33:21 -08:00
Alex Verkhovsky b8eeb78cff refactor(adversarial-review): simplify severity/validity classification 2026-01-04 04:13:46 -08:00
Alex Verkhovsky b628eec9fd refactor(code-review): simplify adversarial review task invocation 2026-01-04 04:07:23 -08:00
Alex Verkhovsky f5d949b922 feat(dev-story): capture baseline commit for code-review diff 2026-01-04 03:04:56 -08:00
Alex Verkhovsky 6d1d7d0e72 fix(adversarial-review): add tech-spec exclusion and read-only notes 2026-01-04 02:12:02 -08:00
Alex Verkhovsky 8b6a053d2e fix(code-review): simplify diff exclusion to implementation_artifacts only 2026-01-04 01:41:20 -08:00
Alex Verkhovsky 460c27e29a refactor(code-review): convert to sharded format with dual-phase review
Convert monolithic code-review workflow to step-file architecture:
- workflow.md: Overview and initialization
- step-01: Load story and discover git changes
- step-02: Build review attack plan
- step-03: Context-aware review (validates ACs, audits tasks)
- step-04: Adversarial review (information-asymmetric diff review)
- step-05: Consolidate findings (merge + deduplicate)
- step-06: Resolve findings and update status

Key features:
- Dual-phase review: context-aware + context-independent adversarial
- Information asymmetry: adversarial reviewer sees only diff, no story
- Uses review-adversarial-general.xml via subagent (with fallbacks)
- Findings consolidation with severity (CRITICAL/HIGH/MEDIUM/LOW)
- State variables for cross-step persistence

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 01:07:58 -08:00
15 changed files with 1330 additions and 304 deletions

View File

@ -0,0 +1,91 @@
<task id="_bmad/core/tasks/editorial-review-prose.xml"
name="Editorial Review - Prose"
description="Clinical copy-editor that reviews text for communication issues"
standalone="false">
<objective>Review text for communication issues that impede comprehension and output suggested fixes in a three-column table</objective>
<inputs>
<input name="content" required="true" desc="Cohesive unit of text to review (markdown, plain text, or text-heavy XML)" />
<input name="reader_type" required="false" default="humans" desc="'humans' (default) for standard editorial, 'llm' for precision focus" />
</inputs>
<llm critical="true">
<i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i>
<i>DO NOT skip steps or change the sequence</i>
<i>HALT immediately when halt-conditions are met</i>
<i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i>
<i>You are a clinical copy-editor: precise, professional, neither warm nor cynical</i>
<i>Apply Microsoft Writing Style Guide principles as your baseline</i>
<i>Focus on communication issues that impede comprehension - not style preferences</i>
<i>NEVER rewrite for preference - only fix genuine issues</i>
<i critical="true">CONTENT IS SACROSANCT: Never challenge ideas—only clarify how they're expressed.</i>
<principles>
<i>Minimal intervention: Apply the smallest fix that achieves clarity</i>
<i>Preserve structure: Fix prose within existing structure, never restructure</i>
<i>Skip code/markup: Detect and skip code blocks, frontmatter, structural markup</i>
<i>When uncertain: Flag with a query rather than suggesting a definitive change</i>
<i>Deduplicate: Same issue in multiple places = one entry with locations listed</i>
<i>No conflicts: Merge overlapping fixes into single entries</i>
<i>Respect author voice: Preserve intentional stylistic choices</i>
</principles>
</llm>
<flow>
<step n="1" title="Validate Input">
<action>Check if content is empty or contains fewer than 3 words</action>
<action if="empty or fewer than 3 words">HALT with error: "Content too short for editorial review (minimum 3 words required)"</action>
<action>Validate reader_type is "humans" or "llm" (or not provided, defaulting to "humans")</action>
<action if="reader_type is invalid">HALT with error: "Invalid reader_type. Must be 'humans' or 'llm'"</action>
<action>Identify content type (markdown, plain text, XML with text)</action>
<action>Note any code blocks, frontmatter, or structural markup to skip</action>
</step>
<step n="2" title="Analyze Style">
<action>Analyze the style, tone, and voice of the input text</action>
<action>Note any intentional stylistic choices to preserve (informal tone, technical jargon, rhetorical patterns)</action>
<action>Calibrate review approach based on reader_type parameter</action>
<action if="reader_type='llm'">Prioritize: unambiguous references, consistent terminology, explicit structure, no hedging</action>
<action if="reader_type='humans'">Prioritize: clarity, flow, readability, natural progression</action>
</step>
<step n="3" title="Editorial Review" critical="true">
<action>Review all prose sections (skip code blocks, frontmatter, structural markup)</action>
<action>Identify communication issues that impede comprehension</action>
<action>For each issue, determine the minimal fix that achieves clarity</action>
<action>Deduplicate: If same issue appears multiple times, create one entry listing all locations</action>
<action>Merge overlapping issues into single entries (no conflicting suggestions)</action>
<action>For uncertain fixes, phrase as query: "Consider: [suggestion]?" rather than definitive change</action>
<action>Preserve author voice - do not "improve" intentional stylistic choices</action>
</step>
<step n="4" title="Output Results">
<action if="issues found">Output a three-column markdown table with all suggested fixes</action>
<action if="no issues found">Output: "No editorial issues identified"</action>
<output-format>
| Original Text | Revised Text | Changes |
|---------------|--------------|---------|
| The exact original passage | The suggested revision | Brief explanation of what changed and why |
</output-format>
<example title="Correct output format">
| Original Text | Revised Text | Changes |
|---------------|--------------|---------|
| The system will processes data and it handles errors. | The system processes data and handles errors. | Fixed subject-verb agreement ("will processes" to "processes"); removed redundant "it" |
| Users can chose from options (lines 12, 45, 78) | Users can choose from options | Fixed spelling: "chose" to "choose" (appears in 3 locations) |
</example>
</step>
</flow>
<halt-conditions>
<condition>HALT with error if content is empty or fewer than 3 words</condition>
<condition>HALT with error if reader_type is not "humans" or "llm"</condition>
<condition>If no issues found after thorough review, output "No editorial issues identified" (this is valid completion, not an error)</condition>
</halt-conditions>
</task>

View File

@ -0,0 +1,198 @@
<?xml version="1.0"?>
<!-- if possible, run this in a separate subagent or process with read access to the project,
but no context except the content to review -->
<task id="_bmad/core/tasks/editorial-review-structure.xml"
name="Editorial Review - Structure"
description="Structural editor that proposes cuts, reorganization,
and simplification while preserving comprehension"
standalone="false">
<objective>Review document structure and propose substantive changes
to improve clarity and flow; run this BEFORE copy editing</objective>
<inputs>
<input name="content" required="true"
desc="Document to review (markdown, plain text, or structured content)"/>
<input name="purpose" required="false"
desc="Document's intended purpose (e.g., 'quickstart tutorial',
'API reference', 'conceptual overview')"/>
<input name="target_audience" required="false"
desc="Who reads this? (e.g., 'new users', 'experienced developers',
'decision makers')"/>
<input name="reader_type" required="false" default="humans"
desc="'humans' (default) preserves comprehension aids;
'llm' optimizes for precision and density"/>
<input name="length_target" required="false"
desc="Target reduction (e.g., '30% shorter', 'half the length',
'no limit')"/>
</inputs>
<llm critical="true">
<i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i>
<i>DO NOT skip steps or change the sequence</i>
<i>HALT immediately when halt-conditions are met</i>
<i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i>
<i>You are a structural editor focused on HIGH-VALUE DENSITY</i>
<i>Brevity IS clarity: Concise writing respects limited attention spans and enables effective scanning</i>
<i>Every section must justify its existence; cut anything that delays understanding</i>
<i>True redundancy is failure</i>
<principles>
<i>Comprehension through calibration: Optimize for the minimum words needed to maintain understanding</i>
<i>Front-load value: Critical information comes first; nice-to-know comes last (or goes)</i>
<i>One source of truth: If information appears identically twice, consolidate</i>
<i>Scope discipline: Content that belongs in a different document should be cut or linked</i>
<i>Propose, don't execute: Output recommendations; the user decides what to accept</i>
<i critical="true">CONTENT IS SACROSANCT: Never challenge ideas—only optimize how they're organized.</i>
</principles>
<human-reader-principles>
<i>These elements serve human comprehension and engagement; preserve unless clearly wasteful:</i>
<i>Visual aids: Diagrams, images, and flowcharts anchor understanding</i>
<i>Expectation-setting: "What You'll Learn" helps readers confirm they're in the right place</i>
<i>Reader's Journey: Organize content chronologically (linear progression), not logically (database)</i>
<i>Mental models: Overview before details prevents cognitive overload</i>
<i>Warmth: Encouraging tone reduces anxiety for new users</i>
<i>Whitespace: Admonitions and callouts provide visual breathing room</i>
<i>Summaries: Recaps help retention; they're reinforcement, not redundancy</i>
<i>Examples: Concrete illustrations make abstract concepts accessible</i>
<i>Engagement: "Flow" techniques (transitions, variety) are functional, not "fluff"; they maintain attention</i>
</human-reader-principles>
<llm-reader-principles>
<i>When reader_type='llm', optimize for PRECISION and UNAMBIGUITY:</i>
<i>Dependency-first: Define concepts before usage to minimize hallucination risk</i>
<i>Cut emotional language, encouragement, and orientation sections</i>
<i>
IF concept is well-known from training (e.g., "conventional
commits", "REST APIs"): Reference the standard-don't re-teach it
ELSE: Be explicit-don't assume the LLM will infer correctly
</i>
<i>Use consistent terminology; same word for same concept throughout</i>
<i>Eliminate hedging ("might", "could", "generally"); use direct statements</i>
<i>Prefer structured formats (tables, lists, YAML) over prose</i>
<i>Reference known standards ("conventional commits", "Google style guide") to leverage training</i>
<i>STILL PROVIDE EXAMPLES even for known standards; this grounds the LLM in your specific expectation</i>
<i>Unambiguous references; no unclear antecedents ("it", "this", "the above")</i>
<i>Note: LLM documents may be LONGER than human docs in some areas
(more explicit) while shorter in others (no warmth)</i>
</llm-reader-principles>
<structure-models>
<model name="Tutorial/Guide (Linear)" applicability="Tutorials, detailed guides, how-to articles, walkthroughs">
<i>Prerequisites: Setup/Context MUST precede action</i>
<i>Sequence: Steps must follow strict chronological or logical dependency order</i>
<i>Goal-oriented: clear 'Definition of Done' at the end</i>
</model>
<model name="Reference/Database" applicability="API docs, glossaries, configuration references, cheat sheets">
<i>Random Access: No narrative flow required; user jumps to specific item</i>
<i>MECE: Topics are Mutually Exclusive and Collectively Exhaustive</i>
<i>Consistent Schema: Every item follows identical structure (e.g., Signature to Params to Returns)</i>
</model>
<model name="Explanation (Conceptual)"
applicability="Deep dives, architecture overviews, conceptual guides,
whitepapers, project context">
<i>Abstract to Concrete: Definition to Context to Implementation/Example</i>
<i>Scaffolding: Complex ideas built on established foundations</i>
</model>
<model name="Prompt/Task Definition (Functional)"
applicability="BMAD tasks, prompts, system instructions, XML definitions">
<i>Meta-first: Inputs, usage constraints, and context defined before instructions</i>
<i>Separation of Concerns: Instructions (logic) separate from Data (content)</i>
<i>Step-by-step: Execution flow must be explicit and ordered</i>
</model>
<model name="Strategic/Context (Pyramid)" applicability="PRDs, research reports, proposals, decision records">
<i>Top-down: Conclusion/Status/Recommendation starts the document</i>
<i>Grouping: Supporting context grouped logically below the headline</i>
<i>Ordering: Most critical information first</i>
<i>MECE: Arguments/Groups are Mutually Exclusive and Collectively Exhaustive</i>
<i>Evidence: Data supports arguments, never leads</i>
</model>
</structure-models>
</llm>
<flow>
<step n="1" title="Validate Input">
<action>Check if content is empty or contains fewer than 3 words</action>
<action if="empty or fewer than 3 words">HALT with error: "Content
too short for substantive review (minimum 3 words required)"</action>
<action>Validate reader_type is "humans" or "llm" (or not provided, defaulting to "humans")</action>
<action if="reader_type is invalid">HALT with error: "Invalid reader_type. Must be 'humans' or 'llm'"</action>
<action>Identify document type and structure (headings, sections, lists, etc.)</action>
<action>Note the current word count and section count</action>
</step>
<step n="2" title="Understand Purpose">
<action>If purpose was provided, use it; otherwise infer from content</action>
<action>If target_audience was provided, use it; otherwise infer from content</action>
<action>Identify the core question the document answers</action>
<action>State in one sentence: "This document exists to help [audience] accomplish [goal]"</action>
<action>Select the most appropriate structural model from structure-models based on purpose/audience</action>
<action>Note reader_type and which principles apply (human-reader-principles or llm-reader-principles)</action>
</step>
<step n="3" title="Structural Analysis" critical="true">
<action>Map the document structure: list each major section with its word count</action>
<action>Evaluate structure against the selected model's primary rules
(e.g., 'Does recommendation come first?' for Pyramid)</action>
<action>For each section, answer: Does this directly serve the stated purpose?</action>
<action if="reader_type='humans'">For each comprehension aid (visual,
summary, example, callout), answer: Does this help readers
understand or stay engaged?</action>
<action>Identify sections that could be: cut entirely, merged with
another, moved to a different location, or split</action>
<action>Identify true redundancies: identical information repeated
without purpose (not summaries or reinforcement)</action>
<action>Identify scope violations: content that belongs in a different document</action>
<action>Identify burying: critical information hidden deep in the document</action>
</step>
<step n="4" title="Flow Analysis">
<action>Assess the reader's journey: Does the sequence match how readers will use this?</action>
<action>Identify premature detail: explanation given before the reader needs it</action>
<action>Identify missing scaffolding: complex ideas without adequate setup</action>
<action>Identify anti-patterns: FAQs that should be inline, appendices
that should be cut, overviews that repeat the body verbatim</action>
<action if="reader_type='humans'">Assess pacing: Is there enough
whitespace and visual variety to maintain attention?</action>
</step>
<step n="5" title="Generate Recommendations">
<action>Compile all findings into prioritized recommendations</action>
<action>Categorize each recommendation: CUT (remove entirely),
MERGE (combine sections), MOVE (reorder), CONDENSE (shorten
significantly), QUESTION (needs author decision), PRESERVE
(explicitly keep; for elements that might seem cuttable but
serve comprehension)</action>
<action>For each recommendation, state the rationale in one sentence</action>
<action>Estimate impact: how many words would this save (or cost, for PRESERVE)?</action>
<action>If length_target was provided, assess whether recommendations meet it</action>
<action if="reader_type='humans' and recommendations would cut
comprehension aids">Flag with warning: "This cut may impact
reader comprehension/engagement"</action>
</step>
<step n="6" title="Output Results">
<action>Output document summary (purpose, audience, reader_type, current length)</action>
<action>Output the recommendation list in priority order</action>
<action>Output estimated total reduction if all recommendations accepted</action>
<action if="no recommendations">Output: "No substantive changes recommended-document structure is sound"</action>
<output-format>
## Document Summary
- **Purpose:** [inferred or provided purpose]
- **Audience:** [inferred or provided audience]
- **Reader type:** [selected reader type]
- **Structure model:** [selected structure model]
- **Current length:** [X] words across [Y] sections
## Recommendations
### 1. [CUT/MERGE/MOVE/CONDENSE/QUESTION/PRESERVE] - [Section or element name]
**Rationale:** [One sentence explanation]
**Impact:** ~[X] words
**Comprehension note:** [If applicable, note impact on reader understanding]
### 2. ...
## Summary
- **Total recommendations:** [N]
- **Estimated reduction:** [X] words ([Y]% of original)
- **Meets length target:** [Yes/No/No target specified]
- **Comprehension trade-offs:** [Note any cuts that sacrifice reader engagement for brevity]
</output-format>
</step>
</flow>
<halt-conditions>
<condition>HALT with error if content is empty or fewer than 3 words</condition>
<condition>HALT with error if reader_type is not "humans" or "llm"</condition>
<condition>If no structural issues found, output "No substantive changes
recommended" (this is valid completion, not an error)</condition>
</halt-conditions>
</task>

View File

@ -1,7 +1,7 @@
<!-- if possible, run this in a separate subagent or process with read access to the project,
but no context except the content to review -->
-<task id="_bmad/core/tasks/review-adversarial-general.xml" name="Adversarial Review (General)">
+<task id="_bmad/core/tasks/review-adversarial-general.xml" name="Adversarial Review">
<objective>Cynically review content and produce findings</objective>
<inputs>
@ -9,6 +9,11 @@
</inputs>
<llm critical="true">
+<i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i>
+<i>DO NOT skip steps or change the sequence</i>
+<i>HALT immediately when halt-conditions are met</i>
+<i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i>
<i>You are a cynical, jaded reviewer with zero patience for sloppy work</i>
<i>The content was submitted by a clueless weasel and you expect to find problems</i>
<i>Be skeptical of everything</i>

View File

@ -1,23 +0,0 @@
# Senior Developer Review - Validation Checklist
- [ ] Story file loaded from `{{story_path}}`
- [ ] Story Status verified as reviewable (review)
- [ ] Epic and Story IDs resolved ({{epic_num}}.{{story_num}})
- [ ] Story Context located or warning recorded
- [ ] Epic Tech Spec located or warning recorded
- [ ] Architecture/standards docs loaded (as available)
- [ ] Tech stack detected and documented
- [ ] MCP doc search performed (or web fallback) and references captured
- [ ] Acceptance Criteria cross-checked against implementation
- [ ] File List reviewed and validated for completeness
- [ ] Tests identified and mapped to ACs; gaps noted
- [ ] Code quality review performed on changed files
- [ ] Security review performed on changed files and dependencies
- [ ] Outcome decided (Approve/Changes Requested/Blocked)
- [ ] Review notes appended under "Senior Developer Review (AI)"
- [ ] Change Log updated with review entry
- [ ] Status updated according to settings (if enabled)
- [ ] Sprint status synced (if sprint tracking enabled)
- [ ] Story saved successfully
_Reviewer: {{user_name}} on {{date}}_

View File

@ -1,227 +0,0 @@
<workflow>
<critical>The workflow execution engine is governed by: {project-root}/_bmad/core/tasks/workflow.xml</critical>
<critical>You MUST have already loaded and processed: {installed_path}/workflow.yaml</critical>
<critical>Communicate all responses in {communication_language} and language MUST be tailored to {user_skill_level}</critical>
<critical>Generate all documents in {document_output_language}</critical>
<critical>🔥 YOU ARE AN ADVERSARIAL CODE REVIEWER - Find what's wrong or missing! 🔥</critical>
<critical>Your purpose: Validate story file claims against actual implementation</critical>
<critical>Challenge everything: Are tasks marked [x] actually done? Are ACs really implemented?</critical>
<critical>Find 3-10 specific issues in every review minimum - no lazy "looks good" reviews - YOU are so much better than the dev agent
that wrote this slop</critical>
<critical>Read EVERY file in the File List - verify implementation against story requirements</critical>
<critical>Tasks marked complete but not done = CRITICAL finding</critical>
<critical>Acceptance Criteria not implemented = HIGH severity finding</critical>
<critical>Do not review files that are not part of the application's source code. Always exclude the _bmad/ and _bmad-output/ folders from the review. Always exclude IDE and CLI configuration folders like .cursor/ and .windsurf/ and .claude/</critical>
<step n="1" goal="Load story and discover changes">
<action>Use provided {{story_path}} or ask user which story file to review</action>
<action>Read COMPLETE story file</action>
<action>Set {{story_key}} = extracted key from filename (e.g., "1-2-user-authentication.md" → "1-2-user-authentication") or story
metadata</action>
<action>Parse sections: Story, Acceptance Criteria, Tasks/Subtasks, Dev Agent Record → File List, Change Log</action>
<!-- Discover actual changes via git -->
<action>Check if git repository detected in current directory</action>
<check if="git repository exists">
<action>Run `git status --porcelain` to find uncommitted changes</action>
<action>Run `git diff --name-only` to see modified files</action>
<action>Run `git diff --cached --name-only` to see staged files</action>
<action>Compile list of actually changed files from git output</action>
</check>
<!-- Cross-reference story File List vs git reality -->
<action>Compare story's Dev Agent Record → File List with actual git changes</action>
<action>Note discrepancies:
- Files in git but not in story File List
- Files in story File List but no git changes
- Missing documentation of what was actually changed
</action>
<invoke-protocol name="discover_inputs" />
<action>Load {project_context} for coding standards (if exists)</action>
</step>
<step n="2" goal="Build review attack plan">
<action>Extract ALL Acceptance Criteria from story</action>
<action>Extract ALL Tasks/Subtasks with completion status ([x] vs [ ])</action>
<action>From Dev Agent Record → File List, compile list of claimed changes</action>
<action>Create review plan:
1. **AC Validation**: Verify each AC is actually implemented
2. **Task Audit**: Verify each [x] task is really done
3. **Code Quality**: Security, performance, maintainability
4. **Test Quality**: Real tests vs placeholder bullshit
</action>
</step>
<step n="3" goal="Execute adversarial review">
<critical>VALIDATE EVERY CLAIM - Check git reality vs story claims</critical>
<!-- Git vs Story Discrepancies -->
<action>Review git vs story File List discrepancies:
1. **Files changed but not in story File List** → MEDIUM finding (incomplete documentation)
2. **Story lists files but no git changes** → HIGH finding (false claims)
3. **Uncommitted changes not documented** → MEDIUM finding (transparency issue)
</action>
<!-- Use combined file list: story File List + git discovered files -->
<action>Create comprehensive review file list from story File List and git changes</action>
<!-- AC Validation -->
<action>For EACH Acceptance Criterion:
1. Read the AC requirement
2. Search implementation files for evidence
3. Determine: IMPLEMENTED, PARTIAL, or MISSING
4. If MISSING/PARTIAL → HIGH SEVERITY finding
</action>
<!-- Task Completion Audit -->
<action>For EACH task marked [x]:
1. Read the task description
2. Search files for evidence it was actually done
3. **CRITICAL**: If marked [x] but NOT DONE → CRITICAL finding
4. Record specific proof (file:line)
</action>
<!-- Code Quality Deep Dive -->
<action>For EACH file in comprehensive review list:
1. **Security**: Look for injection risks, missing validation, auth issues
2. **Performance**: N+1 queries, inefficient loops, missing caching
3. **Error Handling**: Missing try/catch, poor error messages
4. **Code Quality**: Complex functions, magic numbers, poor naming
5. **Test Quality**: Are tests real assertions or placeholders?
</action>
<check if="total_issues_found lt 3">
<critical>NOT LOOKING HARD ENOUGH - Find more problems!</critical>
<action>Re-examine code for:
- Edge cases and null handling
- Architecture violations
- Documentation gaps
- Integration issues
- Dependency problems
- Git commit message quality (if applicable)
</action>
<action>Find at least 3 more specific, actionable issues</action>
</check>
</step>
<step n="4" goal="Present findings and fix them">
<action>Categorize findings: HIGH (must fix), MEDIUM (should fix), LOW (nice to fix)</action>
<action>Set {{fixed_count}} = 0</action>
<action>Set {{action_count}} = 0</action>
<output>**🔥 CODE REVIEW FINDINGS, {user_name}!**
**Story:** {{story_file}}
**Git vs Story Discrepancies:** {{git_discrepancy_count}} found
**Issues Found:** {{high_count}} High, {{medium_count}} Medium, {{low_count}} Low
## 🔴 CRITICAL ISSUES
- Tasks marked [x] but not actually implemented
- Acceptance Criteria not implemented
- Story claims files changed but no git evidence
- Security vulnerabilities
## 🟡 MEDIUM ISSUES
- Files changed but not documented in story File List
- Uncommitted changes not tracked
- Performance problems
- Poor test coverage/quality
- Code maintainability issues
## 🟢 LOW ISSUES
- Code style improvements
- Documentation gaps
- Git commit message quality
</output>
<ask>What should I do with these issues?
1. **Fix them automatically** - I'll update the code and tests
2. **Create action items** - Add to story Tasks/Subtasks for later
3. **Show me details** - Deep dive into specific issues
Choose [1], [2], or specify which issue to examine:</ask>
<check if="user chooses 1">
<action>Fix all HIGH and MEDIUM issues in the code</action>
<action>Add/update tests as needed</action>
<action>Update File List in story if files changed</action>
<action>Update story Dev Agent Record with fixes applied</action>
<action>Set {{fixed_count}} = number of HIGH and MEDIUM issues fixed</action>
<action>Set {{action_count}} = 0</action>
</check>
<check if="user chooses 2">
<action>Add "Review Follow-ups (AI)" subsection to Tasks/Subtasks</action>
<action>For each issue: `- [ ] [AI-Review][Severity] Description [file:line]`</action>
<action>Set {{action_count}} = number of action items created</action>
<action>Set {{fixed_count}} = 0</action>
</check>
<check if="user chooses 3">
<action>Show detailed explanation with code examples</action>
<action>Return to fix decision</action>
</check>
</step>
<step n="5" goal="Update story status and sync sprint tracking">
<!-- Determine new status based on review outcome -->
<check if="all HIGH and MEDIUM issues fixed AND all ACs implemented">
<action>Set {{new_status}} = "done"</action>
<action>Update story Status field to "done"</action>
</check>
<check if="HIGH or MEDIUM issues remain OR ACs not fully implemented">
<action>Set {{new_status}} = "in-progress"</action>
<action>Update story Status field to "in-progress"</action>
</check>
<action>Save story file</action>
<!-- Determine sprint tracking status -->
<check if="{sprint_status} file exists">
<action>Set {{current_sprint_status}} = "enabled"</action>
</check>
<check if="{sprint_status} file does NOT exist">
<action>Set {{current_sprint_status}} = "no-sprint-tracking"</action>
</check>
<!-- Sync sprint-status.yaml when story status changes (only if sprint tracking enabled) -->
<check if="{{current_sprint_status}} != 'no-sprint-tracking'">
<action>Load the FULL file: {sprint_status}</action>
<action>Find development_status key matching {{story_key}}</action>
<check if="{{new_status}} == 'done'">
<action>Update development_status[{{story_key}}] = "done"</action>
<action>Save file, preserving ALL comments and structure</action>
<output>✅ Sprint status synced: {{story_key}} → done</output>
</check>
<check if="{{new_status}} == 'in-progress'">
<action>Update development_status[{{story_key}}] = "in-progress"</action>
<action>Save file, preserving ALL comments and structure</action>
<output>🔄 Sprint status synced: {{story_key}} → in-progress</output>
</check>
<check if="story key not found in sprint status">
<output>⚠️ Story file updated, but sprint-status sync failed: {{story_key}} not found in sprint-status.yaml</output>
</check>
</check>
<check if="{{current_sprint_status}} == 'no-sprint-tracking'">
<output> Story status updated (no sprint tracking configured)</output>
</check>
<output>**✅ Review Complete!**
**Story Status:** {{new_status}}
**Issues Fixed:** {{fixed_count}}
**Action Items Created:** {{action_count}}
{{#if new_status == "done"}}Code review complete!{{else}}Address the action items and continue development.{{/if}}
</output>
</step>
</workflow>

View File

@ -0,0 +1,122 @@
---
name: 'step-01-load-story'
description: "Compare story's file list against git changes"
---
# Step 1: Load Story and Discover Changes
---
## STATE VARIABLES (capture now, persist throughout)
These variables MUST be set in this step and available to all subsequent steps:
- `story_path` - Path to the story file being reviewed
- `story_key` - Story identifier (e.g., "1-2-user-authentication")
- `story_content` - Complete, unmodified file content from story_path (loaded in substep 2)
- `story_file_list` - Files claimed in story's Dev Agent Record → File List
- `git_changed_files` - Files actually changed according to git
- `git_discrepancies` - Mismatches between `story_file_list` and `git_changed_files`
---
## EXECUTION SEQUENCE
### 1. Identify Story
Ask user: "Which story would you like to review?"
**Try input as direct file path first:**
If input resolves to an existing file:
- Verify it's in {sprint_status} with status `review` or `done`
- If verified → set `story_path` to that file path
- If NOT verified → Warn user the file is not in {sprint_status} (or wrong status). Ask: "Continue anyway?"
- If yes → set `story_path`
- If no → return to user prompt (ask "Which story would you like to review?" again)
**Search {sprint_status}** (if input is not a direct file):
Search for stories with status `review` or `done`. Match by priority:
1. Story number resembles input closely enough (e.g., "1-2" matches "1 2", "1.2", "one dash two", "one two"; "1-32" matches "one thirty two"). Do NOT match if numbers differ (e.g., "1-33" does not match "1-32")
2. Exact story name/key (e.g., "1-2-user-auth-api")
3. Story name/title resembles input closely enough
4. Story description resembles input closely enough
**Resolution:**
- **Single match**: Confident. Set `story_path`, proceed to substep 2
- **Multiple matches**: Uncertain. Present all candidates to user. Wait for selection. Set `story_path`, proceed to substep 2
- **No match**: Ask user to clarify or provide the full story path. Return to user prompt (ask "Which story would you like to review?" again)
### 2. Load Story File
**Load file content:**
Read the complete contents of {story_path} and assign to `story_content` WITHOUT filtering, truncating, or summarizing. If {story_path} cannot be read, is empty, or clearly does not contain a story: report the error to the user and HALT the workflow.
**Extract story identifier:**
Verify the filename ends with `.md` extension. Remove `.md` to get `story_key` (e.g., "1-2-user-authentication.md" → "1-2-user-authentication"). If filename doesn't end with `.md` or the result is empty: report the error to the user and HALT the workflow.
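A minimal shell sketch of that extraction and its guards; the example filename is illustrative, and the real input is the filename portion of `story_path`:
```bash
# Sketch: derive story_key from the story filename (example value, not a real input)
story_file="1-2-user-authentication.md"
case "$story_file" in
  *.md) story_key="${story_file%.md}" ;;   # strip the .md extension
  *)    echo "Error: story filename must end in .md" >&2; exit 1 ;;
esac
[ -n "$story_key" ] || { echo "Error: empty story_key" >&2; exit 1; }
echo "$story_key"   # -> 1-2-user-authentication
```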
### 3. Extract File List from Story
Extract `story_file_list` from the Dev Agent Record → File List section of {story_content}.
**If Dev Agent Record or File List section not found:** Report to user and set `story_file_list` = NO_FILE_LIST.
### 4. Discover Git Changes
Check if git repository exists.
**If NOT a git repo:** Set `git_changed_files` = NO_GIT, `git_discrepancies` = NO_GIT. Skip to substep 5.
**If git repo detected:**
```bash
git status --porcelain             # uncommitted changes, including new/deleted files
git diff -M --name-only            # unstaged modifications, rename-aware
git diff -M --cached --name-only   # staged modifications, rename-aware
```
If any git command fails: Report the error to the user and HALT the workflow.
Compile `git_changed_files` = union of modified, staged, new, deleted, and renamed files.
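As a sketch, the union could be compiled mechanically like this; the output filename is a placeholder, and any equivalent aggregation works:
```bash
# Sketch: dedup'd union of uncommitted, unstaged, and staged changes
{
  git status --porcelain | awk '{print $NF}'   # strip status codes; keeps rename targets
  git diff -M --name-only
  git diff -M --cached --name-only
} | sort -u > git_changed_files.txt
```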
### 5. Cross-Reference Story vs Git
**If {git_changed_files} is empty:**
Ask user: "No git changes detected. Continue anyway?"
- If **no**: HALT the workflow
- If **yes**: Continue to comparison
**Compare {story_file_list} with {git_changed_files}:**
Exclude git-ignored files from the comparison (run `git check-ignore` if needed).
Set `git_discrepancies` with categories:
- **files_in_git_not_story**: Files changed in git but not in story File List
- **files_in_story_not_git**: Files in story File List but no git changes (excluding git-ignored)
- **uncommitted_undocumented**: Uncommitted changes not tracked in story
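A sketch of the first two categories, assuming both lists were written to plain-text files (the filenames are hypothetical; the uncommitted-undocumented category still needs judgment about what the story tracks):
```bash
# Sketch: set differences between git reality and story claims
sort -u git_changed_files.txt -o git.sorted
sort -u story_file_list.txt -o story.sorted
comm -23 git.sorted story.sorted > files_in_git_not_story.txt   # changed in git, absent from story
comm -13 git.sorted story.sorted > story_only.txt               # claimed in story, no git change
# Drop git-ignored paths before reporting the second category
grep -vxF -f <(git check-ignore --stdin < story_only.txt) story_only.txt \
  > files_in_story_not_git.txt
```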
---
## COMPLETION CHECKLIST
Before proceeding to the next step, verify ALL of the following:
- `story_path` identified and loaded
- `story_key` extracted
- `story_content` captured completely and unmodified
- `story_file_list` compiled from Dev Agent Record (or NO_FILE_LIST if not found)
- `git_changed_files` discovered via git commands (or NO_GIT if not a git repo)
- `git_discrepancies` calculated
**If any criterion is not met:** Report to the user and HALT the workflow.
---
## NEXT STEP DIRECTIVE
**CRITICAL:** When this step completes, explicitly state:
"**NEXT:** Loading `step-02-build-attack-plan.md`"

View File

@ -0,0 +1,155 @@
---
name: 'step-02-adversarial-review'
description: 'Lean adversarial review - context-independent diff analysis, no story knowledge'
---
# Step 2: Adversarial Review (Information Asymmetric)
**Goal:** Perform context-independent adversarial review of code changes. Reviewer sees ONLY the diff - no story, no ACs, no context about WHY changes were made.
<critical>Reviewer has FULL repo access but NO knowledge of WHY changes were made</critical>
<critical>DO NOT include story file in prompt - asymmetry is about intent, not visibility</critical>
<critical>This catches issues a fresh reviewer would find that story-biased review might miss</critical>
---
## AVAILABLE STATE
From previous steps:
- `{story_path}`, `{story_key}`
- `{story_file_list}` - Files listed in story's File List section
- `{git_changed_files}` - Files changed according to git
- `{baseline_commit}` - From story file Dev Agent Record
---
## STATE VARIABLE (capture now)
- `{diff_output}` - Complete diff of changes
- `{asymmetric_findings}` - Findings from adversarial review
---
## EXECUTION SEQUENCE
### 1. Construct Diff
Build complete diff of all changes for this story.
**Step 1a: Read baseline from story file**
Extract `Baseline Commit` from the story file's Dev Agent Record section.
- If found and not "NO_GIT": use as `{baseline_commit}`
- If "NO_GIT" or missing: proceed to fallback
**Step 1b: Construct diff (with baseline)**
If `{baseline_commit}` is a valid commit hash:
```bash
git diff {baseline_commit} -- ':!{implementation_artifacts}'
```
This captures all changes (committed + uncommitted) since dev-story started.
**Step 1c: Fallback (no baseline)**
If no baseline available, review current state of files in `{story_file_list}`:
- Read each file listed in the story's File List section
- Review as full file content (not a diff)
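A hedged sketch of the 1b/1c branch; `baseline_commit` and `implementation_artifacts` mirror the workflow placeholders rather than literal shell variables, and the output filename is illustrative:
```bash
# Sketch: diff against the baseline when it resolves to a real commit, else fall back
if git cat-file -e "${baseline_commit}^{commit}" 2>/dev/null; then
  git diff "${baseline_commit}" -- ":!${implementation_artifacts}" > diff_output.patch
else
  echo "No usable baseline; reviewing full content of File List files instead" >&2
fi
```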
**Include in `{diff_output}`:**
- All modified tracked files (except files in `{implementation_artifacts}` - asymmetry requires hiding intent)
- All new files created for this story
- Full content for new files
**Note:** Do NOT `git add` anything - this is read-only inspection.
### 2. Invoke Adversarial Review
With `{diff_output}` constructed, invoke the review task. If possible, use information asymmetry: run this step, and only it, in a separate subagent or process with read access to the project, but no context except the `{diff_output}`.
```xml
<invoke-task>Review {diff_output} using {project-root}/_bmad/core/tasks/review-adversarial-general.xml</invoke-task>
```
**Platform fallback:** If task invocation not available, load the task file and execute its instructions inline, passing `{diff_output}` as the content.
The task should: review `{diff_output}` and return a list of findings.
### 3. Process Adversarial Findings
Capture findings from adversarial review.
**If zero findings:** HALT - this is suspicious. Re-analyze or ask for guidance.
Evaluate severity (Critical, High, Medium, Low) and validity (Real, Noise, Undecided).
Add each finding to `{asymmetric_findings}` (no IDs yet - assigned after merge):
```
{
source: "adversarial",
severity: "...",
validity: "...",
description: "...",
location: "file:line (if applicable)"
}
```
### 4. Phase 1 Summary
Present adversarial findings:
```
**Phase 1: Adversarial Review Complete**
**Reviewer Context:** Pure diff review (no story knowledge)
**Findings:** {count}
- Critical: {count}
- High: {count}
- Medium: {count}
- Low: {count}
**Validity Assessment:**
- Real: {count}
- Noise: {count}
- Undecided: {count}
Proceeding to attack plan construction...
```
---
## NEXT STEP DIRECTIVE
**CRITICAL:** When this step completes, explicitly state:
"**NEXT:** Loading `step-03-build-attack-plan.md`"
---
## SUCCESS METRICS
- Diff constructed from correct source (uncommitted or commits)
- Story file excluded from diff
- Task invoked with diff as input
- Adversarial review executed
- Findings captured with severity and validity
- `{asymmetric_findings}` populated
- Phase summary presented
- Explicit NEXT directive provided
## FAILURE MODES
- Including story file in diff (breaks asymmetry)
- Skipping adversarial review entirely
- Accepting zero findings without halt
- Invoking task without providing diff input
- Missing severity/validity classification
- Not storing findings for consolidation
- No explicit NEXT directive at step completion

View File

@ -0,0 +1,147 @@
---
name: 'step-03-build-attack-plan'
description: 'Extract ACs and tasks, create comprehensive review plan for context-aware phase'
---
# Step 3: Build Review Attack Plan
**Goal:** Extract all reviewable items from story and create attack plan for context-aware review phase.
---
## AVAILABLE STATE
From previous steps:
- `{story_path}` - Path to the story file
- `{story_key}` - Story identifier
- `{story_file_list}` - Files claimed in story
- `{git_changed_files}` - Files actually changed (git)
- `{git_discrepancies}` - Mismatches between story claims and git reality
- `{asymmetric_findings}` - Findings from Phase 1 (adversarial review)
---
## STATE VARIABLES (capture now)
- `{acceptance_criteria}` - All ACs extracted from story
- `{tasks_with_status}` - All tasks with their [x] or [ ] status
- `{comprehensive_file_list}` - Union of story files + git files
- `{review_attack_plan}` - Structured plan for context-aware phase
---
## EXECUTION SEQUENCE
### 1. Extract Acceptance Criteria
Parse all Acceptance Criteria from story:
```
{acceptance_criteria} = [
{ id: "AC1", requirement: "...", testable: true/false },
{ id: "AC2", requirement: "...", testable: true/false },
...
]
```
Note any ACs that are vague or untestable.
### 2. Extract Tasks with Status
Parse all Tasks/Subtasks with completion markers:
```
{tasks_with_status} = [
{ id: "T1", description: "...", status: "complete" ([x]) or "incomplete" ([ ]) },
{ id: "T1.1", description: "...", status: "complete" or "incomplete" },
...
]
```
Flag any tasks marked complete [x] for verification.
### 3. Build Comprehensive File List
Merge `{story_file_list}` and `{git_changed_files}`:
```
{comprehensive_file_list} = union of:
- Files in story Dev Agent Record
- Files changed according to git
- Deduped and sorted
```
Exclude from review:
- `_bmad/`, `_bmad-output/`
- `.cursor/`, `.windsurf/`, `.claude/`
- IDE/editor config files
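For illustration, assuming the two lists are in plain-text files (names hypothetical), the merge and exclusions reduce to:
```bash
# Sketch: merge story-claimed and git-changed files, then drop excluded folders
cat story_file_list.txt git_changed_files.txt | sort -u \
  | grep -vE '^(_bmad/|_bmad-output/|\.cursor/|\.windsurf/|\.claude/)' \
  > comprehensive_file_list.txt
```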
### 4. Create Review Attack Plan
Structure the `{review_attack_plan}`:
```
PHASE 1: Adversarial Review (Step 2) [COMPLETE - {asymmetric_findings} findings]
├── Fresh code review without story context
│ └── {asymmetric_findings} items to consolidate
PHASE 2: Context-Aware Review (Step 4)
├── Git vs Story Discrepancies
│ └── {git_discrepancies} items
├── AC Validation
│ └── {acceptance_criteria} items to verify
├── Task Completion Audit
│ └── {tasks_with_status} marked [x] to verify
└── Code Quality Review
└── {comprehensive_file_list} files to review
```
### 5. Preview Attack Plan
Present to user (brief summary):
```
**Review Attack Plan**
**Story:** {story_key}
**Phase 1 (Adversarial - Complete):** {asymmetric_findings count} findings from fresh review
**Phase 2 (Context-Aware - Starting):**
- ACs to verify: {count}
- Tasks marked complete: {count}
- Files to review: {count}
- Git discrepancies detected: {count}
Proceeding with context-aware review...
```
---
## NEXT STEP DIRECTIVE
**CRITICAL:** When this step completes, explicitly state:
"**NEXT:** Loading `step-04-context-aware-review.md`"
---
## SUCCESS METRICS
- All ACs extracted with testability assessment
- All tasks extracted with completion status
- Comprehensive file list built (story + git)
- Exclusions applied correctly
- Attack plan structured for context-aware phase
- Summary presented to user
- Explicit NEXT directive provided
## FAILURE MODES
- Missing AC extraction
- Not capturing task completion status
- Forgetting to merge story + git files
- Not excluding IDE/config directories
- Skipping attack plan structure
- No explicit NEXT directive at step completion

View File

@ -0,0 +1,182 @@
---
name: 'step-04-context-aware-review'
description: 'Story-aware validation: verify ACs, audit task completion, check git discrepancies'
---
# Step 4: Context-Aware Review
**Goal:** Perform story-aware validation - verify AC implementation, audit task completion, review code quality with full story context.
<critical>VALIDATE EVERY CLAIM - Check git reality vs story claims</critical>
<critical>You KNOW the story requirements - use that knowledge to find gaps</critical>
---
## AVAILABLE STATE
From previous steps:
- `{story_path}`, `{story_key}`
- `{story_file_list}`, `{git_changed_files}`, `{git_discrepancies}`
- `{acceptance_criteria}`, `{tasks_with_status}`
- `{comprehensive_file_list}`, `{review_attack_plan}`
- `{asymmetric_findings}` - From Phase 1 (adversarial review)
---
## STATE VARIABLE (capture now)
- `{context_aware_findings}` - All findings from this phase
Initialize `{context_aware_findings}` as empty list.
---
## EXECUTION SEQUENCE
### 0. Load Planning Context (JIT)
Load planning documents for AC validation against system design:
- **Architecture**: `{planning_artifacts}/*architecture*.md` (or sharded: `{planning_artifacts}/*architecture*/*.md`)
- **UX Design**: `{planning_artifacts}/*ux*.md` (if UI review relevant)
- **Epic**: `{planning_artifacts}/*epic*/epic-{epic_num}.md` (the epic containing this story)
These provide the design context needed to validate AC implementation against system requirements.
### 1. Git vs Story Discrepancies
Review `{git_discrepancies}` and create findings:
| Discrepancy Type | Severity |
| --- | --- |
| Files changed but not in story File List | Medium |
| Story lists files but no git changes | High |
| Uncommitted changes not documented | Medium |
For each discrepancy, add to `{context_aware_findings}` (no IDs yet - assigned after merge):
```
{
source: "git-discrepancy",
severity: "...",
description: "...",
evidence: "file: X, git says: Y, story says: Z"
}
```
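A sketch of that table-to-finding mapping, assuming the three input sets were computed during git discovery (all names are illustrative):
```python
# Illustrative mapping from the severity table above; the input sets are
# assumed precomputed (story File List, git-changed files, uncommitted files).
def discrepancy_findings(story_files: set[str], git_files: set[str],
                         uncommitted: set[str]) -> list[dict]:
    findings = []
    for f in sorted(git_files - story_files):
        findings.append({"source": "git-discrepancy", "severity": "Medium",
                         "description": f"{f} changed but not in story File List",
                         "evidence": f"file: {f}, git says: changed, story says: not listed"})
    for f in sorted(story_files - git_files):
        findings.append({"source": "git-discrepancy", "severity": "High",
                         "description": f"{f} listed in story but has no git changes",
                         "evidence": f"file: {f}, git says: unchanged, story says: modified"})
    for f in sorted(uncommitted - story_files):
        findings.append({"source": "git-discrepancy", "severity": "Medium",
                         "description": f"Uncommitted change to {f} not documented",
                         "evidence": f"file: {f}, git says: uncommitted, story says: not listed"})
    return findings
```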
### 2. Acceptance Criteria Validation
For EACH AC in `{acceptance_criteria}`:
1. Read the AC requirement
2. Search implementation files in `{comprehensive_file_list}` for evidence
3. Determine status: IMPLEMENTED, PARTIAL, or MISSING
4. If PARTIAL or MISSING → add High severity finding
Add to `{context_aware_findings}`:
```
{
source: "ac-validation",
severity: "High",
description: "AC {id} not fully implemented: {details}",
evidence: "Expected: {ac}, Found: {what_was_found}"
}
```
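As a rough illustration only - real evidence gathering means reading the implementation, but the status bucketing reduces to something like this, with keyword matching standing in for semantic judgment:
```python
# Crude sketch: keyword hits stand in for actually reading the code.
def ac_status(ac_keywords: list[str], file_contents: dict[str, str]) -> str:
    hits = sum(
        any(kw.lower() in text.lower() for text in file_contents.values())
        for kw in ac_keywords
    )
    if hits == len(ac_keywords):
        return "IMPLEMENTED"
    return "PARTIAL" if hits else "MISSING"
```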
### 3. Task Completion Audit
For EACH task marked [x] in `{tasks_with_status}`:
1. Read the task description
2. Search files for evidence it was actually done
3. **Critical**: If marked [x] but NOT DONE → Critical finding
4. Record specific proof (file:line) if done
If the completion claim is false, add to `{context_aware_findings}`:
```
{
source: "task-audit",
severity: "Critical",
description: "Task marked complete but not implemented: {task}",
evidence: "Searched: {files}, Found: no evidence of {expected}"
}
```
### 4. Code Quality Review (Context-Aware)
For EACH file in `{comprehensive_file_list}`:
Review with STORY CONTEXT (you know what was supposed to be built):
- **Security**: Missing validation for AC-specified inputs?
- **Performance**: Are scale requirements the story mentions actually met?
- **Error Handling**: Edge cases from AC covered?
- **Test Quality**: Tests actually verify ACs or just placeholders?
- **Architecture Compliance**: Follows patterns in architecture doc?
Add findings to `{context_aware_findings}` with appropriate severity.
### 5. Minimum Finding Check
<critical>If total findings < 3, NOT LOOKING HARD ENOUGH</critical>
Re-examine for:
- Edge cases not covered by implementation
- Documentation gaps
- Integration issues with other components
- Dependency problems
- Comments missing for complex logic
---
## PHASE 2 SUMMARY
Present context-aware findings:
```
**Phase 2: Context-Aware Review Complete**
**Findings:** {count}
- Critical: {count}
- High: {count}
- Medium: {count}
- Low: {count}
Proceeding to findings consolidation...
```
Store `{context_aware_findings}` for consolidation in step 5.
---
## NEXT STEP DIRECTIVE
**CRITICAL:** When this step completes, explicitly state:
"**NEXT:** Loading `step-05-consolidate-findings.md`"
---
## SUCCESS METRICS
- All git discrepancies reviewed and findings created
- Every AC checked for implementation evidence
- Every [x] task verified with proof
- Code quality reviewed with story context
- Minimum 3 findings (push harder if not)
- `{context_aware_findings}` populated
- Phase summary presented
- Explicit NEXT directive provided
## FAILURE MODES
- Accepting "looks good" with < 3 findings
- Not verifying [x] tasks with actual evidence
- Missing AC validation
- Ignoring git discrepancies
- Not storing findings for consolidation
- No explicit NEXT directive at step completion


@ -0,0 +1,158 @@
---
name: 'step-05-consolidate-findings'
description: 'Merge and deduplicate findings from both review phases'
---
# Step 5: Consolidate Findings
**Goal:** Merge findings from adversarial review (Phase 1) and context-aware review (Phase 2), deduplicate, and present unified findings table.
---
## AVAILABLE STATE
From previous steps:
- `{story_path}`, `{story_key}`
- `{asymmetric_findings}` - Findings from Phase 1 (step 2 - adversarial review)
- `{context_aware_findings}` - Findings from Phase 2 (step 4 - context-aware review)
---
## STATE VARIABLE (capture now)
- `{consolidated_findings}` - Merged, deduplicated findings
---
## EXECUTION SEQUENCE
### 1. Merge All Findings
Combine both finding lists:
```
all_findings = {context_aware_findings} + {asymmetric_findings}
```
### 2. Deduplicate Findings
Identify duplicates (same underlying issue found by both phases):
**Duplicate Detection Criteria:**
- Same file + same line range
- Same issue type (e.g., both about error handling in same function)
- Overlapping descriptions
**Resolution Rule:**
Keep the MORE DETAILED version:
- If context-aware finding has AC reference → keep that
- If adversarial finding has better technical detail → keep that
- When in doubt, keep context-aware (has more context)
Note which findings were merged (for transparency in the summary).
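A sketch of one workable heuristic, assuming each finding carries the `description`, `evidence`, and file-location fields shown earlier (the 0.5 overlap threshold is an arbitrary illustration):
```python
# Illustrative only: same file + overlapping descriptions counts as a
# duplicate; token overlap stands in for real semantic comparison.
def is_duplicate(a: dict, b: dict) -> bool:
    same_file = a.get("location") == b.get("location")
    words_a = set(a["description"].lower().split())
    words_b = set(b["description"].lower().split())
    overlap = len(words_a & words_b) / max(len(words_a | words_b), 1)
    return same_file and overlap > 0.5

def resolve(context_aware: dict, adversarial: dict) -> dict:
    # Keep the more detailed version; prefer context-aware in a tie.
    if "AC" in context_aware.get("description", ""):
        return context_aware
    if len(adversarial.get("evidence", "")) > len(context_aware.get("evidence", "")):
        return adversarial
    return context_aware
```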
### 3. Normalize Severity
Apply consistent severity scale (Critical, High, Medium, Low).
### 4. Filter Noise
Review adversarial findings marked as Noise:
- If clearly false positive (e.g., style preference, not actual issue) → exclude
- If questionable → keep with Undecided validity
- If context reveals it's actually valid → upgrade to Real
**Do NOT filter:**
- Any Critical or High severity
- Any context-aware findings (they have story context)
### 5. Sort and Number Findings
Sort by severity (Critical → High → Medium → Low), then assign IDs: F1, F2, F3, etc.
Build `{consolidated_findings}`:
```markdown
| ID | Severity | Source | Description | Location |
|----|----------|--------|-------------|----------|
| F1 | Critical | task-audit | Task 3 marked [x] but not implemented | src/auth.ts |
| F2 | High | ac-validation | AC2 partially implemented | src/api/*.ts |
| F3 | High | adversarial | Missing error handling in API calls | src/api/client.ts:45 |
| F4 | Medium | git-discrepancy | File changed but not in story | src/utils.ts |
| F5 | Low | adversarial | Magic number should be constant | src/config.ts:12 |
```
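A minimal ordering sketch; Python's stable sort preserves within-severity order, so the F-IDs come out deterministic:
```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def number_findings(findings: list[dict]) -> list[dict]:
    # sorted() is stable: within a severity, original order is kept
    ordered = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
    for i, f in enumerate(ordered, start=1):
        f["id"] = f"F{i}"
    return ordered
```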
### 6. Present Consolidated Findings
```markdown
**Consolidated Code Review Findings**
**Story:** {story_key}
**Summary:**
- Total findings: {count}
- Critical: {count}
- High: {count}
- Medium: {count}
- Low: {count}
**Deduplication:** {merged_count} duplicate findings merged
---
## Findings by Severity
### Critical (Must Fix)
{list critical findings with full details}
### High (Should Fix)
{list high findings with full details}
### Medium (Consider Fixing)
{list medium findings}
### Low (Nice to Fix)
{list low findings}
---
**Phase Sources:**
- Adversarial (Phase 1): {count} findings
- Context-Aware (Phase 2): {count} findings
```
---
## NEXT STEP DIRECTIVE
**CRITICAL:** When this step completes, explicitly state:
"**NEXT:** Loading `step-06-resolve-and-update.md`"
---
## SUCCESS METRICS
- All findings merged from both phases
- Duplicates identified and resolved (kept more detailed)
- Severity normalized consistently
- Noise filtered appropriately (but not excessively)
- Consolidated table created
- `{consolidated_findings}` populated
- Summary presented to user
- Explicit NEXT directive provided
## FAILURE MODES
- Missing findings from either phase
- Not detecting duplicates (double-counting issues)
- Inconsistent severity assignment
- Filtering real issues as noise
- Not storing consolidated findings
- No explicit NEXT directive at step completion


@ -0,0 +1,213 @@
---
name: 'step-06-resolve-and-update'
description: 'Present findings, fix or create action items, update story and sprint status'
---
# Step 6: Resolve Findings and Update Status
**Goal:** Present findings to user, handle resolution (fix or action items), update story file and sprint status.
---
## AVAILABLE STATE
From previous steps:
- `{story_path}`, `{story_key}`
- `{consolidated_findings}` - Merged findings from step 5
- `sprint_status` = `{implementation_artifacts}/sprint-status.yaml`
---
## STATE VARIABLES (capture now)
- `{fixed_count}` - Number of issues fixed
- `{action_count}` - Number of action items created
- `{new_status}` - Final story status
---
## EXECUTION SEQUENCE
### 1. Present Resolution Options
```markdown
**Code Review Findings for {user_name}**
**Story:** {story_key}
**Total Issues:** {consolidated_findings.count}
{consolidated_findings_table}
---
**What should I do with these issues?**
**[1] Fix them automatically** - I'll update the code and tests
**[2] Create action items** - Add to story Tasks/Subtasks for later
**[3] Walk through** - Discuss each finding individually
**[4] Show details** - Deep dive into specific issues
Choose [1], [2], [3], [4], or specify which issue (e.g., "F3"):
```
### 2. Handle User Choice
**Option [1]: Fix Automatically**
1. For each CRITICAL and HIGH finding:
- Apply the fix in the code
- Add/update tests if needed
- Record what was fixed
2. Update story Dev Agent Record → File List if files changed
3. Add "Code Review Fixes Applied" entry to Change Log
4. Set `{fixed_count}` = number of issues fixed
5. Set `{action_count}` = 0, unless remaining MEDIUM/LOW findings are logged as action items
**Option [2]: Create Action Items**
1. Add "Review Follow-ups (AI)" subsection to Tasks/Subtasks
2. For each finding:
```
- [ ] [AI-Review][{severity}] {description} [{location}]
```
3. Set `{action_count}` = number of action items created
4. Set `{fixed_count}` = 0
**Option [3]: Walk Through**
For each finding in order:
1. Present finding with full context and code snippet
2. Ask: **[f]ix now / [s]kip / [d]iscuss more**
3. If fix: Apply fix immediately, increment `{fixed_count}`
4. If skip: Note as acknowledged, optionally create action item
5. If discuss: Provide more detail, repeat choice
6. Continue to next finding
After all processed, summarize what was fixed/skipped.
**Option [4]: Show Details**
1. Present expanded details for specific finding(s)
2. Return to resolution choice
### 3. Determine Final Status
Evaluate completion:
**If ALL conditions met:**
- All CRITICAL issues fixed
- All HIGH issues fixed or have action items
- All ACs verified as implemented
Set `{new_status}` = "done"
**Otherwise:**
Set `{new_status}` = "in-progress"
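A sketch of the gate, assuming findings carry illustrative `fixed` / `action_item` flags set during resolution:
```python
def final_status(findings: list[dict], acceptance_criteria: list[dict]) -> str:
    critical_open = any(
        f["severity"] == "Critical" and not f.get("fixed") for f in findings
    )
    high_open = any(
        f["severity"] == "High" and not (f.get("fixed") or f.get("action_item"))
        for f in findings
    )
    acs_done = all(ac.get("status") == "IMPLEMENTED" for ac in acceptance_criteria)
    return "done" if not (critical_open or high_open) and acs_done else "in-progress"
```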
### 4. Update Story File
1. Update story Status field to `{new_status}`
2. Add review notes to Dev Agent Record:
```markdown
## Senior Developer Review (AI)
**Date:** {date}
**Reviewer:** AI Code Review
**Findings Summary:**
- CRITICAL: {count} ({fixed}/{action_items})
- HIGH: {count} ({fixed}/{action_items})
- MEDIUM: {count}
- LOW: {count}
**Resolution:** {approach_taken}
**Files Modified:** {list if fixes applied}
```
3. Update Change Log:
```markdown
- [{date}] Code review completed - {outcome_summary}
```
4. Save story file
### 5. Sync Sprint Status
Check if `{sprint_status}` file exists:
**If exists:**
1. Load `{sprint_status}`
2. Find `{story_key}` in development_status
3. Update status to `{new_status}`
4. Save file, preserving ALL comments and structure
```
Sprint status synced: {story_key} → {new_status}
```
**If not exists or key not found:**
```
Sprint status sync skipped (no sprint tracking or key not found)
```
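One way to honor the comment-preservation requirement is a round-trip YAML loader; this sketch assumes `ruamel.yaml` is available and that `development_status` is a flat key-to-status mapping:
```python
from ruamel.yaml import YAML

def sync_sprint_status(path: str, story_key: str, new_status: str) -> bool:
    yaml = YAML()  # round-trip mode: keeps comments and key order on dump
    with open(path) as fh:
        data = yaml.load(fh)
    statuses = data.get("development_status") or {}
    if story_key not in statuses:
        return False
    statuses[story_key] = new_status
    with open(path, "w") as fh:
        yaml.dump(data, fh)
    return True
```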
### 6. Completion Output
```markdown
**Code Review Complete!**
**Story:** {story_key}
**Final Status:** {new_status}
**Issues Fixed:** {fixed_count}
**Action Items Created:** {action_count}
{if new_status == "done"}
Code review passed! Story is ready for final verification.
{else}
Address the action items and run another review cycle.
{endif}
---
**Next Steps:**
- Commit changes (if fixes applied)
- Run tests to verify fixes
- Address remaining action items (if any)
- Mark story complete when all items resolved
```
---
## WORKFLOW COMPLETE
This is the final step. The Code Review workflow is now complete.
---
## SUCCESS METRICS
- Resolution options presented clearly
- User choice handled correctly
- Fixes applied cleanly (if chosen)
- Action items created correctly (if chosen)
- Story status determined correctly
- Story file updated with review notes
- Sprint status synced (if applicable)
- Completion summary provided
## FAILURE MODES
- Not presenting resolution options
- Fixing without user consent
- Not updating story file
- Wrong status determination (done when issues remain)
- Not syncing sprint status when it exists
- Missing completion summary


@ -0,0 +1,39 @@
---
name: code-review
description: 'Code review for dev-story output. Audits acceptance criteria against implementation, performs adversarial diff review, can auto-fix with approval. A different LLM than the implementer is recommended.'
web_bundle: false
---
# Code Review Workflow
## WORKFLOW ARCHITECTURE: STEP FILES
- This file (workflow.md) stays in context throughout
- Each step file is read just before processing (current step stays at end of context)
- State persists via variables: `{story_path}`, `{story_key}`, `{context_aware_findings}`, `{asymmetric_findings}`
---
## INITIALIZATION
### Configuration Loading
Load config from `{project-root}/_bmad/bmm/config.yaml` and resolve:
- `user_name`, `communication_language`, `user_skill_level`, `document_output_language`
- `planning_artifacts`, `implementation_artifacts`
- `date` as system-generated current datetime
- ✅ You MUST ALWAYS write output in your agent communication style, using the configured `{communication_language}`
### Paths
- `installed_path` = `{project-root}/_bmad/bmm/workflows/4-implementation/code-review`
- `project_context` = `**/project-context.md` (load if exists)
- `sprint_status` = `{implementation_artifacts}/sprint-status.yaml`
---
## EXECUTION
Read and follow `steps/step-01-load-story.md` to begin the workflow.


@ -1,51 +0,0 @@
# Review Story Workflow
name: code-review
description: "Perform an ADVERSARIAL Senior Developer code review that finds 3-10 specific problems in every story. Challenges everything: code quality, test coverage, architecture compliance, security, performance. NEVER accepts `looks good` - must find minimum issues and can auto-fix with user approval."
author: "BMad"
# Critical variables from config
config_source: "{project-root}/_bmad/bmm/config.yaml"
user_name: "{config_source}:user_name"
communication_language: "{config_source}:communication_language"
user_skill_level: "{config_source}:user_skill_level"
document_output_language: "{config_source}:document_output_language"
date: system-generated
planning_artifacts: "{config_source}:planning_artifacts"
implementation_artifacts: "{config_source}:implementation_artifacts"
output_folder: "{implementation_artifacts}"
sprint_status: "{implementation_artifacts}/sprint-status.yaml"
# Workflow components
installed_path: "{project-root}/_bmad/bmm/workflows/4-implementation/code-review"
instructions: "{installed_path}/instructions.xml"
validation: "{installed_path}/checklist.md"
template: false
variables:
# Project context
project_context: "**/project-context.md"
story_dir: "{implementation_artifacts}"
# Smart input file references - handles both whole docs and sharded docs
# Priority: Whole document first, then sharded version
# Strategy: SELECTIVE LOAD - only load the specific epic needed for this story review
input_file_patterns:
architecture:
description: "System architecture for review context"
whole: "{planning_artifacts}/*architecture*.md"
sharded: "{planning_artifacts}/*architecture*/*.md"
load_strategy: "FULL_LOAD"
ux_design:
description: "UX design specification (if UI review)"
whole: "{planning_artifacts}/*ux*.md"
sharded: "{planning_artifacts}/*ux*/*.md"
load_strategy: "FULL_LOAD"
epics:
description: "Epic containing story being reviewed"
whole: "{planning_artifacts}/*epic*.md"
sharded_index: "{planning_artifacts}/*epic*/index.md"
sharded_single: "{planning_artifacts}/*epic*/epic-{{epic_num}}.md"
load_strategy: "SELECTIVE_LOAD"
standalone: true
web_bundle: false


@ -219,6 +219,17 @@
<output> No sprint status file exists - story progress will be tracked in story file only</output>
<action>Set {{current_sprint_status}} = "no-sprint-tracking"</action>
</check>
<!-- Capture baseline commit for code review -->
<check if="git is available">
<action>Capture current HEAD commit: `git rev-parse HEAD`</action>
<action>Store as {{baseline_commit}}</action>
<action>Write to story file Dev Agent Record: "**Baseline Commit:** {{baseline_commit}}"</action>
</check>
<check if="git is NOT available">
<action>Set {{baseline_commit}} = "NO_GIT"</action>
<action>Write to story file Dev Agent Record: "**Baseline Commit:** NO_GIT"</action>
</check>
</step>
<step n="5" goal="Implement task following red-green-refactor cycle">


@ -51,7 +51,11 @@ Use best-effort diff construction:
### Capture as {diff_output}
**Include in `{diff_output}`:**
- All modified tracked files (except `{tech_spec_path}` if tech-spec mode - asymmetry requires hiding intent)
- All new files created during this workflow
- Full content for new files
**Note:** Do NOT `git add` anything - this is read-only inspection.
@ -75,7 +79,7 @@ The task should: review `{diff_output}` and return a list of findings.
Capture the findings from the task output.
**If zero findings:** HALT - this is suspicious. Re-analyze or request user guidance.
Evaluate severity (Critical, High, Medium, Low) and validity (Real, Noise, Undecided).
DO NOT exclude findings based on severity or validity unless explicitly asked to do so.
Order findings by severity.
Number the ordered findings (F1, F2, F3, etc.).
@ -92,6 +96,7 @@ With findings in hand, load `step-06-resolve-findings.md` for user to choose res
## SUCCESS METRICS
- Diff constructed from baseline_commit
- Tech-spec excluded from diff when in tech-spec mode (information asymmetry)
- New files included in diff
- Task invoked with diff as input
- Findings received
@ -100,6 +105,7 @@ With findings in hand, load `step-06-resolve-findings.md` for user to choose res
## FAILURE MODES
- Missing baseline_commit (can't construct accurate diff)
- Including tech_spec_path in diff when in tech-spec mode (breaks asymmetry)
- Not including new untracked files in diff
- Invoking task without providing diff input
- Accepting zero findings without questioning