feat(skills): add edge case hunter as parallel review layer in PR review (#1791)

* feat(skills): add edge case hunter as parallel review layer in PR review

Wire review-edge-case-hunter.xml into bmad-os-review-pr as a second
review layer running in parallel with the adversarial review. Both
subagents receive the same PR diff concurrently. Findings are merged,
deduplicated, and tagged by source before tone transformation.

* fix(core): resolve contradictions in edge case hunter task spec

- Show array wrapper [{}] in output-format example to match JSON array
  contract, and document empty array [] as valid output
- Consolidate empty-content handling: step 1 now defers to halt-conditions
  instead of defining separate "ask and abort" behavior
- Zero-findings halt no longer contradicts JSON contract: re-analyze once,
  then return [] instead of ambiguous "HALT or re-analyze"
- Soften "Execute ALL steps" to acknowledge halt-conditions can interrupt

* fix(review): address PR #1791 review feedback

- Remove "Task tool" reference per maintainer; use generic "subagents"
- Fix nested triple-backtick fencing with four-tick outer fence
- Widen location format to support multi-line ranges and hunk refs
- Add JSON-safety constraint to guard_snippet field
- Tighten input loading to "strictly from provided input"
- Replace vague "unreadable" with "cannot be decoded as text"
- Replace vague "increased scrutiny" with concrete re-analysis checklist
- Resolve HALT-immediately vs re-analysis conflict in LLM instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Alex Verkhovsky 2026-03-06 19:21:28 -07:00 committed by GitHub
parent abf3d25f06
commit f7846fd5eb
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
3 changed files with 82 additions and 18 deletions

View File

@ -1,6 +1,6 @@
---
name: bmad-os-review-pr
description: Adversarial PR review tool (Raven's Verdict). Cynical deep review transformed into professional engineering findings. Use when user asks to 'review a PR' and provides a PR url or id.
description: Dual-layer PR review tool (Raven's Verdict). Runs adversarial cynical review and edge case hunter in parallel, merges and deduplicates findings into professional engineering output. Use when user asks to 'review a PR' and provides a PR url or id.
---
Read `prompts/instructions.md` and execute.

View File

@ -93,13 +93,17 @@ gh pr diff {PR_NUMBER} [--repo {REPO}] --name-only | grep -E '\.(png|jpg|jpeg|gi
Store list of binary files to skip. Note them in final output.
## Adversarial Review
## Review Layers
### 1.1 Run Cynical Review
**Launch steps 1.1 and 1.2 as parallel subagents.** Both receive the same PR diff and run concurrently. Wait for both to complete before proceeding to step 1.3.
### 1.1 Run Cynical Review (subagent)
Spawn a subagent with the following prompt. Pass the full PR diff as context.
**INTERNAL PERSONA - Never post this directly:**
Task: You are a cynical, jaded code reviewer with zero patience for sloppy work. This PR was submitted by a clueless weasel and you expect to find problems. Find at least five issues to fix or improve in it. Number them. Be skeptical of everything. Ultrathink.
Task: You are a cynical, jaded code reviewer with zero patience for sloppy work. This PR was submitted by a clueless weasel and you expect to find problems. Find at least five issues to fix or improve in it. Number them. Be skeptical of everything.
Output format:
@ -124,14 +128,64 @@ Likely tag:
- Add `[likely]` to findings with high confidence, e.g. with direct evidence
- Sort findings by severity (Critical → Moderate → Minor), not by confidence
### 1.2 Run Edge Case Hunter (subagent)
Spawn a subagent that executes the task defined in `_bmad/core/tasks/review-edge-case-hunter.xml`. Pass the full PR diff as the `content` input. Omit `also_consider` unless the user specified extra focus areas.
The task returns a JSON array of objects, each with: `location`, `trigger_condition`, `guard_snippet`, `potential_consequence`.
**Map each JSON finding to the standard finding format:**
````markdown
### [NUMBER]. [trigger_condition] [likely]
**Severity:** [INFERRED_EMOJI] [INFERRED_LEVEL]
**`[location]`** — [trigger_condition]. [potential_consequence].
**Suggested fix:**
```
[guard_snippet]
```
````
Severity inference rules for edge case findings:
- **Critical** — data loss, security, or crash conditions (null deref, unhandled throw, auth bypass)
- **Moderate** — logic errors, silent wrong results, race conditions
- **Minor** — cosmetic edge cases, unlikely boundary conditions
Add `[likely]` to all edge case findings — they are derived from mechanical path tracing, so confidence is inherently high.
If the edge case hunter returns zero findings or halts, note it internally and proceed — step 1.1 findings still stand.
### 1.3 Merge and Deduplicate
Combine the findings from step 1.1 (adversarial) and step 1.2 (edge case hunter) into a single list.
**Deduplication rules:**
1. Compare each edge case finding against each adversarial finding
2. Two findings are duplicates if they reference the same file location AND describe the same gap (use description similarity — same function/variable/condition mentioned)
3. When a duplicate is found, keep the version with more specificity (usually the edge case hunter's, since it includes `guard_snippet`)
4. Mark the kept finding with the source that produced it
**After dedup, renumber all findings sequentially and sort by severity (Critical → Moderate → Minor).**
Tag each finding with its source:
- `[Adversarial]` — from step 1.1 only
- `[Edge Case]` — from step 1.2 only
- `[Both]` — flagged by both layers (deduped)
## Tone Transformation
**Transform the cynical output into cold engineering professionalism.**
**Transform the merged findings into cold engineering professionalism.**
**Transformation rules:**
1. Remove all inflammatory language, insults, assumptions about the author
2. Keep all technical substance, file references, severity ratings and likely tag
2. Keep all technical substance, file references, severity ratings, likely tag, and **source tags**
3. Replace accusatory phrasing with neutral observations:
- ❌ "The author clearly didn't think about..."
- ✅ "This implementation may not account for..."
@ -140,6 +194,7 @@ Likely tag:
- ✅ "This pattern has historically caused issues in production environments"
5. Add the suggested fixes.
6. Keep suggestions actionable and specific
7. Edge case hunter findings need no persona cleanup, but still apply professional formatting consistently
Output format after transformation:
@ -149,18 +204,20 @@ Output format after transformation:
**Title:** {PR_TITLE}
**Author:** @{AUTHOR}
**Branch:** {HEAD} → {BASE}
**Review layers:** Adversarial + Edge Case Hunter
---
### Findings
[TRANSFORMED FINDINGS HERE]
[TRANSFORMED FINDINGS HERE — each tagged with source]
---
### Summary
**Critical:** {COUNT} | **Moderate:** {COUNT} | **Minor:** {COUNT}
**Sources:** {ADVERSARIAL_COUNT} adversarial | {EDGE_CASE_COUNT} edge case | {BOTH_COUNT} both
[BINARY_FILES_NOTE if any]

View File

@ -15,18 +15,18 @@ Ignore the rest of the codebase unless the provided content explicitly reference
</inputs>
<output-format>Return ONLY a valid JSON array of objects. Each object must contain exactly these four fields and nothing else:
{
"location": "file:line",
[{
"location": "file:start-end (or file:line when single line, or file:hunk when exact line unavailable)",
"trigger_condition": "one-line description (max 15 words)",
"guard_snippet": "minimal code sketch that closes the gap",
"guard_snippet": "minimal code sketch that closes the gap (single-line escaped string, no raw newlines or unescaped quotes)",
"potential_consequence": "what could actually go wrong (max 15 words)"
}
No extra text, no explanations, no markdown wrapping.</output-format>
}]
No extra text, no explanations, no markdown wrapping. An empty array [] is valid when no unhandled paths are found.</output-format>
<llm critical="true">
<i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i>
<i>MANDATORY: Execute steps in the flow section IN EXACT ORDER</i>
<i>DO NOT skip steps or change the sequence</i>
<i>HALT immediately when halt-conditions are met</i>
<i>When a halt-condition triggers, follow its specific instruction exactly</i>
<i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i>
<i>Your method is exhaustive path enumeration — mechanically walk every branch, not hunt by intuition</i>
@ -38,8 +38,8 @@ No extra text, no explanations, no markdown wrapping.</output-format>
<flow>
<step n="1" title="Receive Content">
<action>Load the content to review from provided input or context</action>
<action>If content to review is empty, ask for clarification and abort task</action>
<action>Load the content to review strictly from provided input</action>
<action>If content is empty, or cannot be decoded as text, return empty array [] and stop</action>
<action>Identify content type (diff, full file, or function) to determine scope rules</action>
</step>
@ -51,13 +51,20 @@ No extra text, no explanations, no markdown wrapping.</output-format>
<action>Collect only the unhandled paths as findings - discard handled ones silently</action>
</step>
<step n="3" title="Present Findings">
<step n="3" title="Validate Completeness">
<action>Recheck every conditional for missing else/default</action>
<action>Recheck every input for null/empty/wrong-type</action>
<action>Recheck loop bounds for off-by-one and empty-collection</action>
<action>Add any newly found unhandled paths to findings; discard confirmed-handled ones</action>
</step>
<step n="4" title="Present Findings">
<action>Output findings as a JSON array following the output-format specification exactly</action>
</step>
</flow>
<halt-conditions>
<condition>HALT if content is empty or unreadable</condition>
<condition>If content is empty or cannot be decoded as text, return empty array [] and stop</condition>
</halt-conditions>
</task>