diff --git a/docs/sprint-artifacts/tech-spec-pr-review-tool.md b/docs/sprint-artifacts/tech-spec-pr-review-tool.md
deleted file mode 100644
index 4f3470ec..00000000
--- a/docs/sprint-artifacts/tech-spec-pr-review-tool.md
+++ /dev/null
@@ -1,183 +0,0 @@
-# Tech-Spec: PR Review Tool (Raven's Verdict)
-
-**Created:** 2025-12-06
-**Status:** Completed
-
-## Overview
-
-### Problem Statement
-
-External contributors submit PRs to the upstream repository without context on code quality expectations. Maintainers need a way to provide deep, thorough code review feedback without spending hours manually reviewing every PR. Automated tools like CodeRabbit handle surface-level checks, but high-compute, human-triggered deep reviews are missing.
-
-### Solution
-
-A portable prompt file that any LLM agent can execute to:
-
-1. Fetch PR diff and full files via `gh` CLI
-2. Run an adversarial code review (cynical, thorough)
-3. Transform the tone from "cynical asshole" to "cold engineering professional"
-4. Post the findings as a comment on the PR
-
-### Scope
-
-**In Scope:**
-
-- Manual trigger via any LLM agent (Claude Code, Cursor, Windsurf, etc.)
-- Review of GitHub PRs using `gh` CLI
-- Adversarial review with severity + confidence ratings
-- Tone transformation before posting
-- Preview and explicit confirmation before posting
-
-**Out of Scope:**
-
-- Automated triggers (webhooks, GitHub Actions)
-- Integration with CodeRabbit or other tools
-- Review of non-GitHub repositories
-- Persistent storage or history tracking
-
-## Context for Development
-
-### Codebase Patterns
-
-- Maintainer tools live in `tools/maintainer/`
-- Prompts are simple markdown files with clear instructions
-- Existing pattern: `review-adversarial.md` (3 lines, direct, effective)
-
-### Files to Reference
-
-- `tools/maintainer/fix-elicitation-wording.md` - Example of agent prompt in maintainer tools
-- `.claude/commands/review-adversarial.md` - Base cynical reviewer prompt to adapt
-
-### Technical Decisions
-
-| Decision       | Choice                                    | Rationale                                       |
-| -------------- | ----------------------------------------- | ----------------------------------------------- |
-| Location       | `tools/maintainer/pr-review/`             | Maintainer tooling, separate from product       |
-| Invocation     | Prompt file + PR URL/number               | Portable across all LLM platforms               |
-| PR data source | `gh` CLI                                  | Already available, handles auth                 |
-| Review input   | Diff + full files                         | Diff for focus, full files for tangents         |
-| Tone transform | Same session, Phase 2 with `task:` prefix | Spawns sub-agent if available, inline otherwise |
-| Output format  | Numbered, freeform, severity + confidence | Scannable, actionable                           |
-
-## Implementation Plan
-
-### Tasks
-
-- [x] Task 1: Create `tools/maintainer/pr-review/` directory structure
-- [x] Task 2: Write `review-prompt.md` - the main prompt file with all phases
-- [x] Task 3: Write `README.md` - usage instructions for maintainers
-- [ ] Task 4: Test with a real PR on the upstream repo
-- [ ] Task 5: Iterate based on output quality
-
-### File Structure
-
-```
-tools/maintainer/pr-review/
-  ├── README.md           # How to use
-  ├── review-prompt.md    # The main prompt file
-  └── output/             # Local backup folder (gitignored)
-```
-
-### Prompt File Structure (`review-prompt.md`)
-
-```
-## Phase 0: Pre-flight Checks
-- Verify PR number/URL provided (if not, STOP and ask)
-- Check PR size via gh pr view --json
-- Confirm repo if different from upstream
-- Note binary files to skip
-
-## Phase 1: Adversarial Review
-- Fetch diff + full files
-- Run cynical review
-- Output numbered findings with severity + confidence
-
-## Phase 2: Tone Transform
-- task: Rewrite findings as cold engineering professional
-- Preserve severity/confidence markers
-- Remove inflammatory language, keep substance
-
-## Phase 3: Post
-- Preview full comment
-- Ask for explicit confirmation
-- Post via gh pr comment
-- Handle auth failure gracefully
-```
-
-### Acceptance Criteria
-
-- [ ] AC 1: Given a PR URL, when agent reads prompt, then it fetches PR data via `gh` without hallucinating PR numbers
-- [ ] AC 2: Given PR data, when review runs, then findings are numbered with severity (🔴🟡🟢) and confidence (High/Medium/Low %)
-- [ ] AC 3: Given cynical output, when tone transform runs, then language is professional but findings retain substance
-- [ ] AC 4: Given transformed output, when user confirms, then comment posts to PR via `gh pr comment`
-- [ ] AC 5: Given missing PR number, when agent starts, then it stops and asks user explicitly
-- [ ] AC 6: Given PR from different repo, when agent detects mismatch, then it asks user to confirm before proceeding
-- [ ] AC 7: Given PR with >50 files or >5000 lines, when pre-flight runs, then agent warns and asks to proceed or focus
-- [ ] AC 8: Given auth failure during post, when error occurs, then review is saved locally and error is displayed loudly
-- [ ] AC 9: Given PR with binary files, when fetching diff, then binaries are skipped with a note
-
-## Additional Context
-
-### Dependencies
-
-- `gh` CLI installed and authenticated
-- Any LLM agent capable of running bash commands
-
-### Sandboxed Execution Rules
-
-The prompt MUST enforce:
-
-- ❌ No inferring PR from conversation history
-- ❌ No looking at git branches, recent commits, or local state
-- ❌ No guessing or assuming PR numbers
-- ✅ Use ONLY explicit PR number/URL from user message
-- ✅ If missing, STOP and ask: "What PR number or URL should I review?"
-
-### Severity Scale
-
-| Level       | Meaning                                                 |
-| ----------- | ------------------------------------------------------- |
-| 🔴 Critical | Security issue, data loss risk, or broken functionality |
-| 🟡 Moderate | Bug, performance issue, or significant code smell       |
-| 🟢 Minor    | Style, naming, minor improvement opportunity            |
-
-### Confidence Scale
-
-| Level           | Meaning                              |
-| --------------- | ------------------------------------ |
-| High (>80%)     | Definitely an issue                  |
-| Medium (40-80%) | Likely an issue, might need context  |
-| Low (<40%)      | Possible issue, could be intentional |
-
-### Example Output Format
-
-```markdown
-## PR Review: #1234
-
-### 1. Unbounded query in user search
-
-**Severity:** 🔴 Critical | **Confidence:** High (>80%)
-
-The search endpoint at `src/api/search.ts:47` doesn't limit results, which could return thousands of rows and cause memory issues.
-
-**Suggestion:** Add `.limit(100)` or implement pagination.
-
-### 2. Missing null check in callback
-
-**Severity:** 🟡 Moderate | **Confidence:** Medium (40-80%)
-
-The callback at `src/handlers/webhook.ts:23` could be undefined if the event type is unregistered.
-
-**Suggestion:** Add defensive check: `if (callback) callback(event)`
-
----
-
-_Review generated by Raven's Verdict - Deep PR Review Tool_
-```
-
-### Notes
-
-- The "cynical asshole" phase is internal only - never posted
-- Tone transform must happen before any external output
-- When in doubt, ask the user - never assume
-- This is a POC - iterate based on real usage