refactor(bmad-prd): replace mechanical checklist+renderer with quality rubric and LLM-synthesized report

The structural checklist + Python renderer produced mechanical pass/warn/fail reports that didn't speak to actual PRD quality, and additional reviewers (adversarial) wrote separate review-*.md files that never made it into the HTML. Replaces that pipeline with:

- A judgment rubric across seven PRD-quality dimensions (decision-readiness, substance over theater, strategic coherence, done-ness clarity, scope honesty, downstream usability, shape fit) that adapts to stakes and PRD shape. The rubric walker writes review-rubric.md with per-dimension verdicts.
- An HTML skeleton with TEMPLATE_* placeholders that the synthesis pass fills directly: no substitution engine, no Python.
- A synthesis pipeline in references/validate.md: the parent reads every review-*.md, fills the skeleton, writes validation-report.html plus a markdown twin, and opens the HTML via webbrowser. It folds every reviewer's findings into one report; the grade derives from rubric verdicts and severity counts.
- Drops scripts/render-validation-html.py and scripts/tests/ entirely.
- finalize_reviewers defaults to empty (adversarial removed from defaults: too brutal and frequently wrong against PRDs; teams can append it in override TOML).
- Headless mode now writes both HTML and markdown and skips the browser-open step.
2026-05-15 21:44:53 -05:00 · 2026-05-15 21:44:53 -05:00 · 6dbafbf08a
parent a586c0fa64
commit 6dbafbf08a
8 changed files with 418 additions and 557 deletions
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/SKILL.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/SKILL.md
@ -1,6 +1,6 @@
 ---
 name: bmad-prd
-description: Create, update, or validate a PRD. Use when the user wants help producing, editing, validating, or analyzing a PRD.
+description: Create, update, or validate a PRD. Use when the user wants help producing, editing, or validating a PRD.
 ---
 # BMad PRD
@ -62,13 +62,13 @@ Decisions, assumptions, open questions, and out-of-scope volunteers land in `.de
 Used by the Validate intent and at Finalize step 3.
-Assemble the menu: structural validator against `{workflow.validation_checklist_template}` + each entry in `{workflow.finalize_reviewers}` + any ad-hoc reviewers the artifact warrants. Stakes-calibrated — hobby/solo may run quietly or skip; higher stakes get the explicit all/subset/skip menu.
+Assemble the menu: rubric walker against `{workflow.validation_checklist_template}` (the PRD quality rubric) + each entry in `{workflow.finalize_reviewers}` + any ad-hoc reviewers the artifact warrants. Stakes-calibrated — hobby/solo may run quietly or skip; higher stakes get the explicit all/subset/skip menu.
-Dispatch entries as parallel subagents against `prd.md` (and `addendum.md` if present) using the standard prefix convention (`skill:` / `file:` / plain text). Each writes its full review to `{doc_workspace}/review-{slug}.md` and returns ONLY a compact summary (verdict, top 2-5 findings, file path) — the parent never holds full review text. If subagents are unavailable, run sequentially: write the file *before* anything else, then flush the review from working context.
+Dispatch entries as parallel subagents against `prd.md` (and `addendum.md` if present) using the standard prefix convention (`skill:` / `file:` / plain text). Each writes its full review to `{doc_workspace}/review-{slug}.md` and returns ONLY a compact summary (verdict, top 2-5 findings, file path) — the parent never holds full review text. The rubric walker uses the prompt and output format in `references/validate.md`. If subagents are unavailable, run sequentially: write the file *before* anything else, then flush the review from working context.
 Surface findings tiered, never dumped. Lead with a one-sentence gate verdict, then walk critical + high findings; medium/low roll into a single tail ("plus N more in {file}"). Read the full `review-{slug}.md` only when the user drills into a specific finding. Per finding: autofix, discuss, defer to open items, or ignore.
-Under Validate intent, the structural validator additionally runs the rendering pipeline in `references/validate.md`.
+Under Validate intent, the parent additionally runs the synthesis pipeline in `references/validate.md` — folding every selected reviewer's output into a single HTML + markdown report and opening the HTML.
 ## Finalize
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/assets/prd-validation-checklist.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/assets/prd-validation-checklist.md
@ -1,33 +1,135 @@
-# PRD Validation Checklist
-Loaded by the PRD validator subagent. For each item, return `{id, status: pass|fail|warn|n/a, severity: low|medium|high|critical, location, note}`. Skip items not applicable to the agreed stakes. Cite specific PRD locations — never abstract criticism.
-## Quality
-- **Q-1. Information density.** Sentences carry weight. Flag filler, hedging, and conversational padding.
-- **Q-2. Measurability.** Where measurement matters, FRs and Success Metrics are measurable; subjective adjectives flagged. Counter-metrics named when Success Metrics exist.
-- **Q-3. Traceability.** Every FR includes `Realizes UJ-X` referencing a UJ in §2.4. Every SM includes `Validates FR-X` referencing one or more FRs. Cross-references resolve.
-- **Q-4. Vision and JTBDs concrete.** Vision is specific and stands alone — not a generic feature list. JTBDs are audience-grounded, not abstract.
-- **Q-5. Non-Goals explicit.** A Non-Goals section is present where it would do real work; inline `[NON-GOAL]` and `[v2]` callouts where omissions would otherwise be silently assumed.
-- **Q-6. Dual-audience and self-contained.** Each section makes sense pulled out alone (cross-references via Glossary terms, not "see above"); the PRD is readable by humans and structured cleanly for downstream source-extraction by UX, architecture, and story-creation workflows.
-- **Q-7. FR testability.** Every FR has at least one testable consequence (verifiable condition with measurable outcome). Flag hand-waves like "system handles X gracefully" or "reasonable performance."
-## Discipline
-- **D-1. Capabilities, not implementation.** FRs describe what users/systems can do, not how. Flag technology names, library choices, architecture decisions.
-- **D-2. Input fidelity.** Requirements from input documents (brief, research, prior PRD) are still in scope or explicitly handled via Non-Goals or `[ASSUMPTION]`.
-- **D-3. Personas grounded.** If personas exist, they are research-grounded or marked `[ILLUSTRATIVE]`. Each persona drives at least one decision.
-- **D-4. No innovation theater.** Novelty claims are real, not invented.
-## Structural integrity
-- **S-1. Glossary integrity.** Every domain noun is defined in the Glossary and used identically throughout — including in FR descriptions, consequences, UJ flows, and SM definitions. Flag drift (case, plural, synonyms) and candidate missing-term entries.
-- **S-2. ID continuity.** FR / UJ / SM / Story IDs are contiguous, unique, and cross-references resolve.
-- **S-3. Assumptions Index.** Every inline `[ASSUMPTION: ...]` appears in the Assumptions Index and vice versa.
-- **S-4. Open-items density.** Count Open Questions + `[ASSUMPTION]` + `[NOTE FOR PM]`. Red flag if density is high relative to the agreed stakes.
-- **S-5. UJ persona linkage.** Every UJ names a persona from §2 by exact label. Flag floating UJs that don't tie to a defined persona.
-## Stakes-gated
-- **STK-1. Required sections.** The PRD includes the sections the agreed stakes and product type warrant.
-- **STK-2. UJ density.** For non-hobby/non-solo scope, at least two named-persona UJs present in §2.4. Hobby/solo scope is exempt.
+# PRD Quality Rubric
+A judgment rubric for the validator subagent. Walk the PRD with these dimensions in mind and write substantive findings — not box-ticking. The goal is a review that tells the user whether this PRD is *good*, not whether it has the right section headers.
+Most PRDs do not need every dimension scrutinized equally. Calibrate to the agreed stakes, the PRD's shape (consumer product, internal tool, regulatory update, technical capability spec), and what the PRD itself is trying to do. Be specific — cite locations, quote phrases, name what's missing. Abstract criticism is failure of nerve.
+## How to use this rubric
+1. Read the full PRD (and addendum.md if present) before writing anything.
+2. For each of the seven dimensions below, form a judgment — *strong / adequate / thin / broken* — backed by specifics from the PRD.
+3. Write findings only where they add information. A `strong` dimension may need no findings; a `broken` one needs concrete, fixable ones.
+4. Severity ranks impact on the PRD's usefulness, not how easy the fix is. A vague Vision statement is *critical* even though it's a one-paragraph fix; a glossary drift might be *low* even though it appears in many places.
+5. The overall verdict is your synthesis — 2–3 sentences that name what holds up and what's at risk. Earn it with the dimension judgments.
+## Output format
+Write findings to `{doc_workspace}/review-rubric.md`:
+```markdown
+# PRD Quality Review — {prd_name}
+## Overall verdict
+[2–3 sentences. What holds up, what's at risk. Earned by the dimension judgments below.]
+## Decision-readiness — [strong | adequate | thin | broken]
+[1–3 paragraphs of judgment with specific PRD locations.]
+### Findings
+- **[critical|high|medium|low]** [Title] (§ location) — [Note]. *Fix:* [suggested fix].
+## Substance over theater — [verdict]
+...
+(repeat for each dimension)
+## Mechanical notes
+[Glossary drift, ID continuity, broken cross-refs, Assumptions Index roundtrip. Lighter weight — these matter for downstream but don't drive the overall verdict.]
+```
+## The seven dimensions
+### 1. Decision-readiness
+Can a decision-maker act on this PRD? Are the trade-offs surfaced honestly, or has the PRD smoothed everything to neutral? Would someone pushing back find their objection acknowledged or dodged?
+Look for:
+- Decisions that are stated as decisions, not buried as "considerations."
+- Trade-offs named with what was given up, not just what was chosen.
+- Open Questions that are actually open — not rhetorical questions with an answer in the next sentence.
+- `[NOTE FOR PM]` callouts at real tensions, not at safe checkpoints.
+Red flag: a PRD where every choice "balances" everything, every NFR is "important," every persona "values" the product.
+### 2. Substance over theater
+Is the content earned, or is it furniture? Distinguish:
+- **Persona theater** — invented persona detail unmarked as `[ILLUSTRATIVE]`. Personas that don't drive a single decision in the PRD. More than four personas. Personas whose only function is to make the PRD look thorough.
+- **Innovation theater** — claimed novelty that isn't novel. Differentiation sections written because the template had one, not because Discovery surfaced something.
+- **NFR theater** — copied boilerplate ("system must be scalable / secure / reliable") without product-specific thresholds.
+- **Vision theater** — a Vision statement that could swap into any PRD in this category without change.
+Flag what reads like furniture, even if it's well-written furniture.
+### 3. Strategic coherence
+Does the PRD have a thesis? Do the features serve a unified arc, or is it a list of capabilities someone wanted?
+Look for:
+- A stated thesis the PRD bets on (problem framing, user insight, market move).
+- Feature prioritization that follows from the thesis — not from "what's easy first."
+- Success Metrics that validate the thesis, not metrics that just measure activity (DAU/MAU when the thesis is about engagement quality is a tell).
+- Counter-metrics named when SMs exist.
+- Coherent MVP scope kind — problem-solving, experience, platform, or revenue — with scope logic that matches.
+Red flag: a PRD that reads as a backlog with section headings.
+### 4. Done-ness clarity
+Would an engineer reading this PRD know what "done" looks like for each FR?
+Look for:
+- FRs with at least one testable consequence per FR — verifiable condition, measurable outcome.
+- "System handles X gracefully," "reasonable performance," "user-friendly" — flag every one.
+- Acceptance criteria implied or explicit. Sometimes the FR's consequences carry this; sometimes the PRD genuinely needs an Acceptance section.
+- For non-functional sections (UX, performance, security): bounds, not adjectives.
+This is the dimension downstream story creation will lean on hardest. Be unforgiving here.
+### 5. Scope honesty
+Are omissions explicit, or is the reader meant to infer them?
+Look for:
+- A Non-Goals section where it would do real work — and `[NON-GOAL for MVP]` / `[v2 — out of MVP]` callouts where omissions could be silently assumed.
+- `[ASSUMPTION: …]` tags on inferences the user didn't directly confirm, indexed at the end.
+- `[NOTE FOR PM]` callouts at deferred decisions and unresolved tensions.
+- De-scoping proposed honestly, not done silently.
+Open-items density: count Open Questions + `[ASSUMPTION]` + `[NOTE FOR PM]` callouts relative to stakes. High counts on a low-stakes PRD is fine; high counts on a green-light-to-build PRD is a blocker.
+### 6. Downstream usability
+If this PRD feeds UX, architecture, or story creation, can those workflows source-extract from it cleanly?
+Look for:
+- Glossary present; every domain noun used identically across FRs, UJs, SM definitions.
+- FR / UJ / SM IDs contiguous, unique, and cross-references that resolve.
+- Each section makes sense pulled out alone — cross-references via Glossary terms, not "see above."
+- UJs each name a persona from §2 by exact label; no floating UJs.
+For standalone PRDs (no downstream), this dimension matters less — say so.
+### 7. Shape fit
+Has the PRD been forced into a shape that doesn't match the product?
+- Consumer product / multi-stakeholder B2B / meaningful UX → UJs and personas are load-bearing.
+- Internal tool, single-operator role → capability spec shape; UJs may be overhead; SMs may be operational rather than user-facing.
+- Regulatory or compliance update → constraint traceability is non-negotiable; UJs may be irrelevant.
+- Hobby / solo → rigor light, substance bar still applies.
+- Brownfield → existing-code references must be accurate; new UJs and existing UJs must be distinguished.
+- Chain-top (feeds UX → architecture → stories) → downstream usability matters more; standalone PRDs can be lighter on traceability.
+Flag PRDs that are over-formalized (UJ density for a single-operator tool) or under-formalized (consumer product with no personas or UJs).
+## Mechanical notes
+Cover these as a tail section, not a primary dimension. They matter for downstream but don't drive the verdict on whether the PRD is good.
+- Glossary drift (case, plural, synonyms across the PRD).
+- ID continuity (gaps, duplicates, unresolved cross-references).
+- Assumptions Index roundtrip (every inline `[ASSUMPTION]` indexed; index entries all appear inline).
+- UJ persona linkage (each UJ names a defined persona by exact label).
+- Required sections present for the agreed stakes and product type.
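To make the `review-rubric.md` output format in the rubric concrete, here is a minimal Python sketch that tallies the per-dimension verdicts and the severity counts from such a file. It is illustrative only and not part of this commit, which removes the Python renderer and leaves synthesis to the parent LLM. `summarize_review` and both regular expressions are hypothetical names, and the sketch assumes verdict headings and finding bullets follow the format shown in the rubric, with or without the square brackets.

```python
import re
from collections import Counter

SEVERITIES = ("critical", "high", "medium", "low")

# Matches a dimension heading: "## <name> <em dash> <verdict>".
# \u2014 is the em dash used in the rubric's output format; brackets are optional.
DIMENSION_RE = re.compile(
    r"^##\s+(?P<name>.+?)\s+\u2014\s+\[?(?P<verdict>strong|adequate|thin|broken)\]?\s*$"
)
# Matches the start of a finding bullet: "- **<severity>** ...".
FINDING_RE = re.compile(r"^-\s+\*\*\[?(?P<sev>critical|high|medium|low)\]?\*\*")


def summarize_review(markdown: str) -> dict:
    """Collect dimension verdicts and finding counts by severity (hypothetical helper)."""
    verdicts = {}
    counts = Counter()
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        heading = DIMENSION_RE.match(line)
        if heading:
            verdicts[heading.group("name")] = heading.group("verdict")
            continue
        finding = FINDING_RE.match(line)
        if finding:
            counts[finding.group("sev")] += 1
    return {
        "verdicts": verdicts,
        "severity_counts": {sev: counts[sev] for sev in SEVERITIES},
    }
```

Headings without a verdict, such as "Overall verdict" and "Mechanical notes", are skipped because they carry no dash-separated verdict, which mirrors the rubric's statement that mechanical notes do not drive the overall verdict.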
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/assets/validation-report-template.html
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/assets/validation-report-template.html
@ -1,8 +1,30 @@
 <!DOCTYPE html>
 <!--
  PRD Validation Report — skeleton template.
  This file is a starter the synthesis pass fills in directly. There is no
  substitution engine. The LLM:
    1. Reads {doc_workspace}/review-rubric.md and every review-{slug}.md from
       additional reviewers.
    2. Copies this skeleton.
    3. Replaces the placeholder content (everything between TEMPLATE markers)
       with the consolidated review, preserving the structure and CSS.
    4. Writes the result to {doc_workspace}/validation-report.html.
    5. Writes a markdown twin to {doc_workspace}/validation-report.md.
  Visual rules the LLM must preserve:
    - The container width, the color tokens, the typography.
    - One dimension = one collapsible <section class="dimension">.
    - Verdict pill uses the verdict-* class matching its judgment.
    - Severity badge uses the sev-* class matching its level.
    - Each extra reviewer (adversarial, etc.) gets its own collapsible section
      below the rubric dimensions.
    - The footer always shows the artifact paths and timestamp.
 -->
 <html lang="en">
 <head>
 <meta charset="utf-8">
-<title>PRD Validation: $prd_name</title>
+<title>PRD Validation: TEMPLATE_PRD_NAME</title>
 <style>
  :root {
    --bg: #fafaf9;
@ -10,14 +32,17 @@
    --border: #e7e5e4;
    --text: #1c1917;
    --muted: #78716c;
-    --pass: #22c55e;
-    --warn: #eab308;
-    --fail: #ef4444;
-    --na: #94a3b8;
+
+    --verdict-strong: #16a34a;
+    --verdict-adequate: #65a30d;
+    --verdict-thin: #d97706;
+    --verdict-broken: #dc2626;
    --sev-low: #64748b;
    --sev-medium: #ca8a04;
    --sev-high: #ea580c;
    --sev-critical: #dc2626;
    --grade-exc: #16a34a;
    --grade-good: #65a30d;
    --grade-fair: #d97706;
@ -29,14 +54,14 @@
    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Inter, system-ui, sans-serif;
    background: var(--bg);
    color: var(--text);
-    line-height: 1.55;
+    line-height: 1.6;
    font-size: 15px;
  }
  .container { max-width: 960px; margin: 0 auto; padding: 32px 24px 64px; }
  header.report-header {
    display: flex;
-    align-items: center;
+    align-items: flex-start;
    justify-content: space-between;
    gap: 24px;
    padding-bottom: 16px;
@ -63,63 +88,90 @@
    border: 1px solid var(--border);
    border-left: 3px solid var(--muted);
    border-radius: 8px;
-    padding: 16px 20px;
+    padding: 18px 22px;
    margin-bottom: 24px;
-    color: var(--text);
+    font-size: 15.5px;
    font-size: 15px;
  }
-  .synthesis:empty { display: none; }
+  .synthesis p { margin: 0 0 10px; }
  .synthesis p:last-child { margin-bottom: 0; }
-  .scoreboard {
+  .dimension-summary {
    display: grid;
    grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
    gap: 10px;
    margin-bottom: 24px;
  }
  .dim-card {
    background: var(--surface);
    border: 1px solid var(--border);
    border-radius: 8px;
-    padding: 18px 20px;
+    padding: 12px 14px;
    margin-bottom: 24px;
  }
-  .score-bar { margin: 0 0 14px; line-height: 0; }
+  .dim-card .dim-name { font-size: 13px; color: var(--muted); margin-bottom: 6px; }
-  .score-stats { display: flex; gap: 22px; font-size: 14px; flex-wrap: wrap; }
+  .dim-card .dim-verdict { font-size: 14px; font-weight: 600; }
  .score-stats span { display: inline-flex; align-items: center; gap: 6px; }
  .score-stats .dot { width: 10px; height: 10px; border-radius: 50%; display: inline-block; }
  .dot-pass { background: var(--pass); }
  .dot-warn { background: var(--warn); }
  .dot-fail { background: var(--fail); }
  .dot-na { background: var(--na); }
  .total-count { margin-left: auto; color: var(--muted); }
-  section.category { margin-bottom: 16px; }
+  section.dimension, section.reviewer-section { margin-bottom: 14px; }
-  section.category details {
+  section.dimension details, section.reviewer-section details {
    background: var(--surface);
    border: 1px solid var(--border);
    border-radius: 8px;
    overflow: hidden;
  }
-  section.category summary {
+  section summary {
    padding: 14px 20px;
    cursor: pointer;
    user-select: none;
    list-style: none;
    display: flex;
    align-items: center;
    gap: 12px;
  }
-  section.category summary::-webkit-details-marker { display: none; }
+  section summary::-webkit-details-marker { display: none; }
-  section.category summary::before {
+  section summary::before {
    content: "▸";
    display: inline-block;
    margin-right: 10px;
    color: var(--muted);
    transition: transform 0.15s ease;
  }
-  section.category details[open] summary::before { transform: rotate(90deg); }
+  section details[open] summary::before { transform: rotate(90deg); }
-  section.category summary h2 { display: inline; margin: 0; font-size: 16px; font-weight: 600; letter-spacing: -0.005em; }
+  section summary h2 {
-  section.category .count { color: var(--muted); font-weight: 400; margin-left: 6px; font-size: 14px; }
+    display: inline;
    margin: 0;
    font-size: 16px;
    font-weight: 600;
    letter-spacing: -0.005em;
    flex: 1;
  }
  .verdict-pill {
    font-size: 11px;
    padding: 3px 10px;
    border-radius: 999px;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.04em;
    color: white;
  }
  .verdict-strong { background: var(--verdict-strong); }
  .verdict-adequate { background: var(--verdict-adequate); }
  .verdict-thin { background: var(--verdict-thin); }
  .verdict-broken { background: var(--verdict-broken); }
-  article.finding { padding: 16px 20px; border-top: 1px solid var(--border); }
+  .dim-body { padding: 4px 20px 18px; }
-  article.finding-fail { background: rgba(239, 68, 68, 0.025); }
+  .dim-judgment { color: var(--text); font-size: 14.5px; }
  .dim-judgment p { margin: 0 0 10px; }
  .findings-list { padding: 0 20px 4px; }
  article.finding {
    padding: 14px 0;
    border-top: 1px solid var(--border);
  }
  article.finding:first-child { border-top: none; }
  article.finding header {
    display: flex;
    align-items: center;
    gap: 10px;
    flex-wrap: wrap;
-    margin-bottom: 8px;
+    margin-bottom: 6px;
  }
  .badge {
    font-size: 10.5px;
@ -130,18 +182,32 @@
    letter-spacing: 0.04em;
    line-height: 1.4;
  }
  .badge-pass { background: rgba(34, 197, 94, 0.12); color: #15803d; }
  .badge-warn { background: rgba(234, 179, 8, 0.14); color: #854d0e; }
  .badge-fail { background: rgba(239, 68, 68, 0.12); color: #b91c1c; }
  .badge-na { background: rgba(148, 163, 184, 0.16); color: #475569; }
  .badge-sev-low { background: rgba(100, 116, 139, 0.12); color: var(--sev-low); }
  .badge-sev-medium { background: rgba(202, 138, 4, 0.14); color: var(--sev-medium); }
  .badge-sev-high { background: rgba(234, 88, 12, 0.14); color: var(--sev-high); }
  .badge-sev-critical { background: rgba(220, 38, 38, 0.14); color: var(--sev-critical); }
  .finding-id { font-family: ui-monospace, "SF Mono", Menlo, monospace; font-size: 12px; color: var(--muted); }
  .finding-title { margin: 0; font-size: 15px; font-weight: 500; flex: 1; min-width: 200px; }
-  .finding-location, .finding-note, .finding-fix { margin-top: 6px; font-size: 14px; color: var(--text); }
+  .finding-location { font-family: ui-monospace, "SF Mono", Menlo, monospace; font-size: 12.5px; color: var(--muted); }
-  .finding-location strong, .finding-fix strong { color: var(--muted); font-weight: 500; }
+  .finding-note, .finding-fix { margin-top: 6px; font-size: 14px; }
  .finding-fix strong { color: var(--muted); font-weight: 500; }
  .reviewer-source {
    font-size: 12px;
    color: var(--muted);
    font-family: ui-monospace, "SF Mono", Menlo, monospace;
  }
  .mechanical {
    background: var(--surface);
    border: 1px solid var(--border);
    border-radius: 8px;
    padding: 16px 20px;
    margin-top: 24px;
    font-size: 14px;
  }
  .mechanical h3 { margin: 0 0 10px; font-size: 14px; font-weight: 600; color: var(--muted); text-transform: uppercase; letter-spacing: 0.04em; }
  .mechanical ul { margin: 0; padding-left: 20px; }
  .mechanical li { margin-bottom: 4px; color: var(--text); }
  footer.report-footer {
    margin-top: 40px;
@ -156,33 +222,102 @@
 </head>
 <body>
  <div class="container">
    <!-- TEMPLATE: header. Fill prd name, prd path, grade text & class. -->
    <header class="report-header">
      <div class="title">
-        <h1>$prd_name — Validation Report</h1>
+        <h1>TEMPLATE_PRD_NAME — Validation Report</h1>
-        <div class="subtitle">$prd_path</div>
+        <div class="subtitle">TEMPLATE_PRD_PATH</div>
      </div>
-      <div class="grade $grade_class">$grade</div>
+      <div class="grade TEMPLATE_GRADE_CLASS">TEMPLATE_GRADE</div>
    </header>
-    <div class="synthesis">$overall_synthesis</div>
-
-    <div class="scoreboard">
-      <div class="score-bar">$score_svg</div>
-      <div class="score-stats">
-        <span><span class="dot dot-pass"></span>$passed pass</span>
-        <span><span class="dot dot-warn"></span>$warned warn</span>
-        <span><span class="dot dot-fail"></span>$failed fail</span>
-        <span><span class="dot dot-na"></span>$na n/a</span>
-        <span class="total-count">$total items checked</span>
-      </div>
+    <!-- TEMPLATE: overall synthesis paragraphs. Lift directly from
+         review-rubric.md "Overall verdict" section; expand if extra reviewers
+         materially shift the picture. Wrap each paragraph in <p>. -->
+    <div class="synthesis">
+      <p>TEMPLATE_SYNTHESIS_PARAGRAPH</p>
     </div>
-    $categories_html
+    <!-- TEMPLATE: dimension summary cards. One per rubric dimension. The
         dim-verdict text uses one of: strong | adequate | thin | broken. -->
    <div class="dimension-summary">
      <div class="dim-card">
        <div class="dim-name">Decision-readiness</div>
        <div class="dim-verdict" style="color: var(--verdict-TEMPLATE_VERDICT)">TEMPLATE_VERDICT_TEXT</div>
      </div>
      <!-- repeat for each of the seven dimensions -->
    </div>
    <!-- TEMPLATE: one section per rubric dimension. Skip a dimension entirely
         if the rubric review marked it n/a for this PRD (e.g. downstream
         usability for a standalone PRD). Open the section by default if
         verdict is thin or broken. -->
    <section class="dimension">
      <details open>
        <summary>
          <h2>Decision-readiness</h2>
          <span class="verdict-pill verdict-TEMPLATE_VERDICT">TEMPLATE_VERDICT_TEXT</span>
        </summary>
        <div class="dim-body">
          <div class="dim-judgment">
            <p>TEMPLATE_DIMENSION_JUDGMENT</p>
          </div>
        </div>
        <div class="findings-list">
          <!-- TEMPLATE: zero or more findings -->
          <article class="finding">
            <header>
              <span class="badge badge-sev-TEMPLATE_SEVERITY">TEMPLATE_SEVERITY</span>
              <h3 class="finding-title">TEMPLATE_FINDING_TITLE</h3>
              <span class="finding-location">TEMPLATE_LOCATION</span>
            </header>
            <div class="finding-note">TEMPLATE_FINDING_NOTE</div>
            <div class="finding-fix"><strong>Fix:</strong> TEMPLATE_SUGGESTED_FIX</div>
          </article>
        </div>
      </details>
    </section>
    <!-- TEMPLATE: one section per extra reviewer that ran (adversarial, etc.).
         Skip this block entirely if only the rubric walker ran. -->
    <section class="reviewer-section">
      <details>
        <summary>
          <h2>Adversarial review</h2>
          <span class="reviewer-source">TEMPLATE_REVIEWER_SOURCE_FILE</span>
        </summary>
        <div class="dim-body">
          <div class="dim-judgment">
            <p>TEMPLATE_REVIEWER_PREAMBLE</p>
          </div>
        </div>
        <div class="findings-list">
          <article class="finding">
            <header>
              <span class="badge badge-sev-TEMPLATE_SEVERITY">TEMPLATE_SEVERITY</span>
              <h3 class="finding-title">TEMPLATE_FINDING_TITLE</h3>
              <span class="finding-location">TEMPLATE_LOCATION</span>
            </header>
            <div class="finding-note">TEMPLATE_FINDING_NOTE</div>
            <div class="finding-fix"><strong>Fix:</strong> TEMPLATE_SUGGESTED_FIX</div>
          </article>
        </div>
      </details>
    </section>
    <!-- TEMPLATE: mechanical notes — short, bulleted. Skip if there are none. -->
    <div class="mechanical">
      <h3>Mechanical notes</h3>
      <ul>
        <li>TEMPLATE_MECHANICAL_NOTE</li>
      </ul>
    </div>
    <footer class="report-footer">
      <div class="meta">
-        <span>Checklist: $checklist_path</span>
+        <span>Rubric: TEMPLATE_RUBRIC_PATH</span>
-        <span>Generated: $timestamp</span>
+        <span>Generated: TEMPLATE_TIMESTAMP</span>
      </div>
    </footer>
  </div>
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/customize.toml
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/customize.toml
@ -43,17 +43,21 @@ on_complete = ""
 # to enforce a different structure (e.g. regulated-industry, internal-tool, investor-input).
 prd_template = "assets/prd-template.md"
-# Validation checklist used at the Validate intent and at Finalize step 3.
-# A subagent walks the checklist against prd.md and returns structured findings.
-# Override the path in team/user TOML to enforce an org-specific checklist
-# (regulated-industry compliance, investor-pitch standards, etc.).
+# PRD quality rubric used at the Validate intent and at Finalize step 3.
+# A subagent walks the rubric against prd.md and writes a substantive review
+# organized by quality dimensions (decision-readiness, substance, strategic
+# coherence, etc.). Override the path in team/user TOML to enforce an
+# org-specific rubric (regulated-industry compliance, investor-pitch standards,
+# etc.). The filename "checklist" is retained for back-compat with override
+# files; the content is a judgment rubric, not a boolean checklist.
 validation_checklist_template = "assets/prd-validation-checklist.md"
-# HTML template used to render validation findings into a styled, scannable
-# report. The renderer (scripts/render-validation-html.py) substitutes
-# structured findings + summary stats into this template; the template is
-# fully overridable to match org branding. The default uses inline CSS, no
-# external dependencies, and native HTML <details> for collapse — no JS.
+# HTML skeleton the synthesis pass fills directly when consolidating reviewer
+# outputs into a validation report. No substitution engine — the parent LLM
+# reads every {doc_workspace}/review-*.md, fills the skeleton's TEMPLATE_*
+# placeholders, and writes the result. Fully overridable to match org branding.
+# Uses inline CSS, no external dependencies, and native HTML <details> for
+# collapse — no JS.
 validation_report_template = "assets/validation-report-template.html"
 # Run folder location. The PRD, optional addendum, decision log, and optional
@ -140,6 +144,4 @@ external_handoffs = []
 #
 # Resolved on-demand by the authoring skill (not pulled at activation): only
 # when entering the Validate intent or assembling the gate at Finalize step 3.
-finalize_reviewers = [
-  "skill:bmad-review-adversarial-general",
-]
+finalize_reviewers = []
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/references/headless.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/references/headless.md
@ -36,4 +36,4 @@ End with the JSON response (full schemas with examples in `assets/headless-schem
 **Update.** Apply the change, log to `.decision-log.md` with rationale, and surface any conflict-with-prior-decision in `conflicts_with_prior_decisions[]` in the JSON status. Halt `blocked` if intent is ambiguous.
-**Validate.** Always write `validation-report.md` to `{doc_workspace}` regardless of finding count. Always include `"offer_to_update": true` in the JSON status. Run the renderer without `--open`.
+**Validate.** Always write both `validation-report.html` and `validation-report.md` to `{doc_workspace}` regardless of finding count. Always include `"offer_to_update": true` in the JSON status. Skip the browser-open step in `references/validate.md` — write the artifacts and return.
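The headless rule above comes down to a single branch at the end of the Validate flow: both artifacts must exist, and only the interactive path opens the HTML. The following is a minimal Python sketch of that branch, assuming the HTML is opened with Python's standard `webbrowser` module (the commit message says the report opens via webbrowser). `finish_validate`, its parameters, and the keys it returns are hypothetical and are not the schema in `assets/headless-schemas`.

```python
import webbrowser
from pathlib import Path
from typing import Callable


def finish_validate(
    doc_workspace: Path,
    headless: bool,
    opener: Callable[[str], bool] = webbrowser.open,
) -> dict:
    """Check both report artifacts exist; open the HTML only when not headless."""
    html = doc_workspace / "validation-report.html"
    markdown = doc_workspace / "validation-report.md"

    # Both files are required regardless of mode or finding count.
    missing = [p.name for p in (html, markdown) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing report artifacts: {missing}")

    opened = False
    if not headless:
        # webbrowser.open takes a URL, so convert the absolute path to file://
        opened = bool(opener(html.resolve().as_uri()))

    return {"html": str(html), "markdown": str(markdown), "opened": opened}
```

Passing the opener as a parameter is a design choice for this sketch only: it lets the headless path be exercised without launching a browser.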
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/references/validate.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/references/validate.md
@ -1,6 +1,6 @@
 # Validate
-The Validate intent playbook. Standalone — this intent critiques an existing PRD without changing it and ends after the user has seen the report; it does not run Finalize. The structural-validator rendering pipeline below is also reused for mid-session report requests during Create/Update.
+The Validate intent playbook. Standalone — this intent critiques an existing PRD without changing it and ends after the user has seen the report; it does not run Finalize. The synthesis pipeline below is also reused for mid-session report requests during Create/Update.
 ## Orient
@ -8,47 +8,89 @@ Source-extract against `.decision-log.md`, any original inputs, and the PRD/adde
 ## Run the Reviewer Gate
-Run the Reviewer Gate (see SKILL.md) against `prd.md` (and `addendum.md` if present). The structural checklist validator is one entry in the gate menu; under Validate intent it additionally runs the rendering pipeline below. The Finalize discipline pass during Create/Update does NOT render a report — its findings stay in-conversation.
+Run the Reviewer Gate (see SKILL.md) against `prd.md` (and `addendum.md` if present). The rubric walker is the default entry in the gate menu; under Validate intent it additionally runs the synthesis pipeline below. The Finalize discipline pass during Create/Update does NOT render a report — findings stay in-conversation.
-## Structural validator pipeline
+## Rubric-walker pipeline
-The structural validator subagent walks `{workflow.validation_checklist_template}` against `prd.md` (and `addendum.md` if present) and writes findings to `{doc_workspace}/validation-findings.json`:
+The rubric walker is the primary review entry. Spawn it as a subagent with this prompt:
-```json
+> You are validating a PRD against the quality rubric at `{workflow.validation_checklist_template}`. Read the full rubric first, then read `prd.md` (and `addendum.md` if present). Form a judgment per dimension — *strong / adequate / thin / broken* — and write findings only where they add information. Cite specific PRD locations and quote phrases. Severity ranks impact on the PRD's usefulness, not how easy the fix is. Write your review to `{doc_workspace}/review-rubric.md` in the format the rubric specifies. Return ONLY a compact summary (overall verdict, dimension verdicts, finding counts by severity, file path).
-{
+
-  "prd_name": "Example Product",
+The Reviewer Gate may also dispatch additional reviewers from `{workflow.finalize_reviewers}` (empty by default; teams can append entries such as adversarial-general in their override TOML) and any ad-hoc reviewers the parent judges warranted. Each writes its review to `{doc_workspace}/review-{slug}.md` and returns a compact summary. Run all reviewers in parallel.
-  "prd_path": "{doc_workspace}/prd.md",
+
-  "checklist_path": "{workflow.validation_checklist_template}",
+## Synthesis pipeline
-  "timestamp": "2026-01-15T09:14:00",
+
-  "overall_synthesis": "2-3 sentences of judgment about the PRD's overall state — what holds up, what's at risk. Written by the subagent, not the parent.",
+Once every selected reviewer has returned, the parent synthesizes one consolidated report. **Do not skip this step under Validate intent** — it produces the persistent artifact the user opens.
-  "findings": [
+
-    {
+### Inputs
-      "id": "Q-7",
+
-      "category": "Quality",
+- `{doc_workspace}/review-rubric.md` — primary, structured by the seven dimensions
-      "title": "FR testability",
+- Zero or more `{doc_workspace}/review-{slug}.md` files — extra reviewers (adversarial, etc.)
-      "status": "warn",
+- `{workflow.validation_report_template}` — the HTML skeleton
-      "severity": "medium",
+
-      "location": "§4.2 Feature Name, FR-3",
+### What the synthesis pass does
-      "note": "FR-3's consequences include 'system handles errors gracefully' which is not testable.",
+
-      "suggested_fix": "Replace with a specific testable condition, e.g. 'System returns HTTP 4xx for invalid input within 200ms.'"
+1. Read every reviewer file in `{doc_workspace}/review-*.md`.
-    }
+2. Fill the HTML skeleton:
-  ]
+   - **Header.** PRD name, path. Grade derived from the rubric verdicts and severity counts: *Excellent* = all dimensions strong/adequate, no high/critical findings · *Good* = ≤1 thin dimension, no critical findings · *Fair* = multiple thin dimensions or any high finding · *Poor* = any broken dimension or any critical finding. Set the matching `grade-excellent | grade-good | grade-fair | grade-poor` class.
-}
+   - **Synthesis block.** Lift the rubric's *Overall verdict* paragraph as the lead; if adversarial or ad-hoc reviewers materially shift the picture, add a second paragraph that names what they surfaced.
   - **Dimension summary cards.** One per dimension that was assessed. Colored verdict text. Skip dimensions the rubric marked n/a for this PRD (e.g. downstream usability for a standalone PRD).
   - **Dimension sections.** One `<section class="dimension">` per assessed dimension, in rubric order. `<details open>` for *thin* and *broken*; closed for *strong* and *adequate*. Each contains the dimension judgment (the prose from review-rubric.md) and the findings list.
   - **Reviewer sections.** One `<section class="reviewer-section">` per extra reviewer that ran. The source file path goes in the `<span class="reviewer-source">`. Closed by default. Adversarial findings keep their adversarial voice — do not soften.
   - **Mechanical notes.** Bullet list from the rubric's "Mechanical notes" section. Skip the block if empty.
   - **Footer.** Rubric path, ISO timestamp.
 3. Write the filled HTML to `{doc_workspace}/validation-report.html`.
 4. Write the markdown twin to `{doc_workspace}/validation-report.md` (same content, grouped by severity rather than by dimension — see format below; this is the canonical form for downstream re-reading).
 5. Open the HTML in the default browser:
   ```bash
   python3 -c "import webbrowser, pathlib; webbrowser.open(pathlib.Path('{doc_workspace}/validation-report.html').resolve().as_uri())"
   ```
   Skip the open step in headless mode (see `references/headless.md`).
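The grade rule in step 2 reads naturally as a precedence check, worst grade first. Below is a minimal sketch in Python, not part of the skill (the synthesis pass applies the rule directly, with no script). The function name `derive_grade`, the input shapes (`verdicts` maps a dimension name to its verdict, `severity_counts` maps a severity to a count), and the tie-breaking order for overlapping conditions are all assumptions made for illustration.

```python
from collections import Counter


def derive_grade(verdicts: dict[str, str], severity_counts: dict[str, int]) -> tuple[str, str]:
    """Return (grade, css_class) from rubric verdicts and finding counts.

    Hypothetical helper: checks the worst grade first, so overlapping
    conditions (e.g. one thin dimension plus a high finding) resolve downward.
    """
    # Dimensions the rubric marked n/a are not assessed and do not count.
    tally = Counter(v.lower() for v in verdicts.values() if v.lower() != "n/a")
    critical = severity_counts.get("critical", 0)
    high = severity_counts.get("high", 0)

    if tally["broken"] > 0 or critical > 0:
        grade = "Poor"
    elif tally["thin"] > 1 or high > 0:
        grade = "Fair"
    elif tally["thin"] == 1:
        grade = "Good"
    else:
        grade = "Excellent"
    # CSS class names match the grade-* classes named in the Header step.
    return grade, f"grade-{grade.lower()}"
```

Under these assumptions, `derive_grade({"Decision-readiness": "strong", "Scope honesty": "thin"}, {"medium": 2})` returns `("Good", "grade-good")`.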
 ### Markdown twin format
 ```markdown
 # Validation Report — {prd_name}
 - **PRD:** `{prd_path}`
 - **Rubric:** `{rubric_path}`
 - **Run at:** {ISO timestamp}
 - **Grade:** {Excellent | Good | Fair | Poor}
 ## Overall verdict
 {synthesis paragraphs}
 ## Dimension verdicts
 - Decision-readiness — {verdict}
 - Substance over theater — {verdict}
 - (etc. for each assessed dimension)
 ## Findings by severity
 ### Critical (n)
 **[Dimension or Reviewer]** — Title (§ location)
 {Note}
 Fix: {suggested fix}
 ### High (n)
 ...
 ### Medium (n)
 ...
 ### Low (n)
 ...
 ## Mechanical notes
 - {bullet}
 ## Reviewer files
 - `review-rubric.md`
 - `review-adversarial-general.md` (if present)
 - (etc.)
 ```
-Required per-finding fields: `id`, `status` (`pass` | `warn` | `fail` | `n/a`), `severity` (`low` | `medium` | `high` | `critical`). Optional: `category` (renderer derives from ID prefix if omitted), `title`, `location` (cite specifics, never abstract criticism), `note`, `suggested_fix`.
+Re-running validation overwrites the consolidated report in place. The individual `review-*.md` files are preserved so the user can drill in.
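Because the individual review files persist between runs, enumerating them is the first thing any re-synthesis needs. A minimal sketch in Python, assuming a hypothetical helper `collect_review_files` that takes the resolved `{doc_workspace}` path; putting `review-rubric.md` first reflects its role as the primary input, and the alphabetical order of the remaining files is an assumption.

```python
from pathlib import Path


def collect_review_files(doc_workspace: Path) -> list[Path]:
    """List review-*.md files, with review-rubric.md first when present.

    Hypothetical helper: the consolidated validation-report.* files do not
    match the review-*.md pattern, so they are never picked up as inputs.
    """
    files = sorted(doc_workspace.glob("review-*.md"))
    rubric = [p for p in files if p.name == "review-rubric.md"]
    extras = [p for p in files if p.name != "review-rubric.md"]
    return rubric + extras
```

The same list could feed the "Reviewer files" section at the bottom of the markdown twin.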
 ### Render
 ```bash
 python3 {skill-root}/scripts/render-validation-html.py \
  --findings {doc_workspace}/validation-findings.json \
  --template {workflow.validation_report_template} \
  --output {doc_workspace}/validation-report.html \
  --open
 ```
 Use `--open` interactive, omit headless. Writes HTML + markdown twin + JSON summary on stdout; re-running overwrites in place. Update mode reads the markdown form when rolling findings into a revision.
 ## Close
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/render-validation-html.py
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/render-validation-html.py
@ -1,290 +0,0 @@
 #!/usr/bin/env python3
 # /// script
 # requires-python = ">=3.10"
 # ///
 """Render a PRD validation findings JSON into HTML + markdown reports.
 Reads structured findings produced by the validator subagent, groups them by
 category (explicit `category` field, else derived from ID prefix), computes a
 pass/warn/fail summary and grade, substitutes into the configured HTML
 template, writes a markdown companion at the same path with `.md` extension,
 and optionally opens the HTML in the default browser.
 """
 import argparse
 import html
 import json
 import string
 import sys
 import webbrowser
 from datetime import datetime
 from pathlib import Path
 CATEGORY_FROM_PREFIX = {
    "Q": "Quality",
    "D": "Discipline",
    "S": "Structural integrity",
    "STK": "Stakes-gated",
    "M": "Mechanical",
 }
 CATEGORY_ORDER = ["Quality", "Discipline", "Structural integrity", "Stakes-gated", "Mechanical"]
 def category_for(finding: dict) -> str:
    explicit = finding.get("category")
    if explicit:
        return explicit
    fid = finding.get("id", "")
    prefix = fid.split("-", 1)[0] if "-" in fid else fid
    return CATEGORY_FROM_PREFIX.get(prefix, prefix or "Other")
 def compute_stats(findings: list[dict]) -> dict:
    total = len(findings)
    by_status = {"pass": 0, "warn": 0, "fail": 0, "n/a": 0}
    failed_critical = 0
    failed_high = 0
    for f in findings:
        status = (f.get("status") or "n/a").lower()
        if status in by_status:
            by_status[status] += 1
        if status == "fail":
            sev = (f.get("severity") or "low").lower()
            if sev == "critical":
                failed_critical += 1
            elif sev == "high":
                failed_high += 1
    return {
        "total": total,
        "passed": by_status["pass"],
        "warned": by_status["warn"],
        "failed": by_status["fail"],
        "na": by_status["n/a"],
        "failed_critical": failed_critical,
        "failed_high": failed_high,
    }
 def grade_from(stats: dict) -> tuple[str, str]:
    if stats["failed_critical"] > 0:
        return "Poor", "grade-poor"
    if stats["failed_high"] >= 1 or stats["failed"] >= 4:
        return "Fair", "grade-fair"
    if stats["failed"] > 0 or stats["warned"] > 2:
        return "Good", "grade-good"
    return "Excellent", "grade-excellent"
 def render_score_bar(stats: dict, width: int = 480, height: int = 22) -> str:
    total = max(stats["total"], 1)
    p = stats["passed"] / total * width
    w = stats["warned"] / total * width
    f = stats["failed"] / total * width
    n = stats["na"] / total * width
    return (
        f'<svg width="{width}" height="{height}" viewBox="0 0 {width} {height}" role="img" '
        f'aria-label="Pass / warn / fail / n-a breakdown">'
        f'<rect x="0" y="0" width="{p:.1f}" height="{height}" fill="#22c55e"/>'
        f'<rect x="{p:.1f}" y="0" width="{w:.1f}" height="{height}" fill="#eab308"/>'
        f'<rect x="{p + w:.1f}" y="0" width="{f:.1f}" height="{height}" fill="#ef4444"/>'
        f'<rect x="{p + w + f:.1f}" y="0" width="{n:.1f}" height="{height}" fill="#94a3b8"/>'
        f"</svg>"
    )
 def render_finding(f: dict) -> str:
    status = (f.get("status") or "n/a").lower()
    severity = (f.get("severity") or "low").lower()
    fid = html.escape(f.get("id") or "")
    title = html.escape(f.get("title") or fid)
    location = html.escape(f.get("location") or "")
    note = html.escape(f.get("note") or "")
    fix = html.escape(f.get("suggested_fix") or "")
    status_class = "na" if status == "n/a" else status
    parts = [
        f'<article class="finding finding-{status_class}">',
        '<header>',
        f'<span class="badge badge-status badge-{status_class}">{status.upper()}</span>',
        f'<span class="badge badge-severity badge-sev-{severity}">{severity}</span>',
        f'<span class="finding-id">{fid}</span>',
        f'<h3 class="finding-title">{title}</h3>',
        '</header>',
    ]
    if location:
        parts.append(f'<div class="finding-location"><strong>Location:</strong> {location}</div>')
    if note:
        parts.append(f'<div class="finding-note">{note}</div>')
    if fix:
        parts.append(f'<div class="finding-fix"><strong>Suggested fix:</strong> {fix}</div>')
    parts.append("</article>")
    return "\n".join(parts)
 def render_category(name: str, findings: list[dict]) -> str:
    items = "\n".join(render_finding(f) for f in findings)
    name_e = html.escape(name)
    return (
        f'<section class="category">'
        f"<details open>"
        f'<summary><h2>{name_e} <span class="count">({len(findings)})</span></h2></summary>'
        f"{items}"
        f"</details>"
        f"</section>"
    )
 SEVERITY_ORDER = ["critical", "high", "medium", "low"]
 def render_finding_md(f: dict) -> str:
    status = (f.get("status") or "n/a").upper()
    severity = (f.get("severity") or "low").lower()
    fid = f.get("id") or ""
    title = f.get("title") or fid
    location = f.get("location") or ""
    note = f.get("note") or ""
    fix = f.get("suggested_fix") or ""
    lines = [f"### [{status}] {fid} — {title} _(severity: {severity})_"]
    if location:
        lines.append(f"- **Location:** {location}")
    if note:
        lines.append(f"- **Finding:** {note}")
    if fix:
        lines.append(f"- **Suggested fix:** {fix}")
    return "\n".join(lines)
 def render_markdown_report(data: dict, findings: list[dict], stats: dict, grade: str) -> str:
    prd_name = data.get("prd_name") or "PRD"
    prd_path = data.get("prd_path") or ""
    checklist_path = data.get("checklist_path") or ""
    timestamp = data.get("timestamp") or datetime.now().isoformat(timespec="seconds")
    synthesis = data.get("overall_synthesis") or ""
    out = [
        f"# Validation Report — {prd_name}",
        "",
        f"- **PRD:** `{prd_path}`",
        f"- **Checklist:** `{checklist_path}`",
        f"- **Run at:** {timestamp}",
        f"- **Grade:** {grade}",
        "",
        f"**Summary:** {stats['passed']} pass · {stats['warned']} warn · {stats['failed']} fail · {stats['na']} n/a "
        f"(total {stats['total']}; critical fails: {stats['failed_critical']}, high fails: {stats['failed_high']})",
    ]
    if synthesis:
        out += ["", "## Overall synthesis", "", synthesis]
    # Group by severity then status: failed criticals first, then highs, etc.
    by_sev: dict[str, list[dict]] = {s: [] for s in SEVERITY_ORDER}
    other: list[dict] = []
    for f in findings:
        sev = (f.get("severity") or "low").lower()
        if sev in by_sev:
            by_sev[sev].append(f)
        else:
            other.append(f)
    out += ["", "## Findings by severity"]
    any_findings = False
    for sev in SEVERITY_ORDER:
        items = by_sev[sev]
        if not items:
            continue
        any_findings = True
        out += ["", f"### {sev.capitalize()} ({len(items)})", ""]
        out += [render_finding_md(f) for f in items]
    if other:
        any_findings = True
        out += ["", f"### Other ({len(other)})", ""]
        out += [render_finding_md(f) for f in other]
    if not any_findings:
        out += ["", "_No findings._"]
    return "\n".join(out) + "\n"
 def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(description="Render PRD validation findings to HTML.")
    parser.add_argument("--findings", required=True, help="Path to validation-findings.json")
    parser.add_argument("--template", required=True, help="Path to HTML template")
    parser.add_argument("--output", required=True, help="Path to write the rendered HTML")
    parser.add_argument("--open", action="store_true", help="Open the rendered HTML in the default browser")
    args = parser.parse_args(argv)
    findings_path = Path(args.findings)
    template_path = Path(args.template)
    output_path = Path(args.output)
    try:
        data = json.loads(findings_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        print(f"error: findings file not found: {findings_path}", file=sys.stderr)
        return 1
    except json.JSONDecodeError as e:
        print(f"error: findings file is not valid JSON ({findings_path}): {e}", file=sys.stderr)
        return 1
    try:
        template = template_path.read_text(encoding="utf-8")
    except FileNotFoundError:
        print(f"error: template file not found: {template_path}", file=sys.stderr)
        return 1
    findings = data.get("findings", []) or []
    by_cat: dict[str, list[dict]] = {}
    for f in findings:
        by_cat.setdefault(category_for(f), []).append(f)
    sorted_cats = sorted(
        by_cat.keys(),
        key=lambda c: (CATEGORY_ORDER.index(c) if c in CATEGORY_ORDER else 99, c),
    )
    categories_html = "\n".join(render_category(c, by_cat[c]) for c in sorted_cats)
    stats = compute_stats(findings)
    grade, grade_class = grade_from(stats)
    score_svg = render_score_bar(stats)
    timestamp = data.get("timestamp") or datetime.now().isoformat(timespec="seconds")
    substitutions = {
        "prd_name": html.escape(str(data.get("prd_name") or "PRD")),
        "prd_path": html.escape(str(data.get("prd_path") or "")),
        "checklist_path": html.escape(str(data.get("checklist_path") or "")),
        "timestamp": html.escape(timestamp),
        "overall_synthesis": html.escape(str(data.get("overall_synthesis") or "")),
        "grade": grade,
        "grade_class": grade_class,
        "total": str(stats["total"]),
        "passed": str(stats["passed"]),
        "failed": str(stats["failed"]),
        "warned": str(stats["warned"]),
        "na": str(stats["na"]),
        "score_svg": score_svg,
        "categories_html": categories_html,
    }
    rendered = string.Template(template).safe_substitute(substitutions)
    output_path.write_text(rendered, encoding="utf-8")
    md_path = output_path.with_suffix(".md")
    md_path.write_text(render_markdown_report(data, findings, stats, grade), encoding="utf-8")
    print(json.dumps({
        "output": str(output_path),
        "markdown": str(md_path),
        "grade": grade,
        "stats": stats,
    }))
    if args.open:
        webbrowser.open(output_path.resolve().as_uri())
    return 0
 if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/tests/test_render_validation_html.py
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/tests/test_render_validation_html.py
@ -1,130 +0,0 @@
 #!/usr/bin/env python3
 # /// script
 # requires-python = ">=3.10"
 # ///
 """Unit tests for render-validation-html.py.
 Run from the skill root:
    python3 -m unittest scripts/tests/test_render_validation_html.py
 """
 import importlib.util
 import sys
 import unittest
 from pathlib import Path
 # Load the sibling module via importlib because its filename has a hyphen.
 SCRIPT_PATH = Path(__file__).resolve().parents[1] / "render-validation-html.py"
 spec = importlib.util.spec_from_file_location("render_validation_html", SCRIPT_PATH)
 rvh = importlib.util.module_from_spec(spec)
 sys.modules["render_validation_html"] = rvh
 spec.loader.exec_module(rvh)
 class CategoryForTests(unittest.TestCase):
    def test_explicit_category_wins(self):
        self.assertEqual(rvh.category_for({"id": "Q-1", "category": "Custom"}), "Custom")
    def test_known_prefixes_map(self):
        self.assertEqual(rvh.category_for({"id": "Q-3"}), "Quality")
        self.assertEqual(rvh.category_for({"id": "D-2"}), "Discipline")
        self.assertEqual(rvh.category_for({"id": "S-1"}), "Structural integrity")
        self.assertEqual(rvh.category_for({"id": "STK-1"}), "Stakes-gated")
        self.assertEqual(rvh.category_for({"id": "M-9"}), "Mechanical")
    def test_unknown_prefix_falls_through(self):
        self.assertEqual(rvh.category_for({"id": "X-1"}), "X")
    def test_id_without_hyphen_used_directly(self):
        self.assertEqual(rvh.category_for({"id": "FOO"}), "FOO")
    def test_empty_id_yields_other(self):
        self.assertEqual(rvh.category_for({}), "Other")
 class GradeThresholdTests(unittest.TestCase):
    def _stats(self, **kw):
        base = {"total": 0, "passed": 0, "warned": 0, "failed": 0, "na": 0,
                "failed_critical": 0, "failed_high": 0}
        base.update(kw)
        return base
    def test_any_critical_fail_is_poor(self):
        grade, cls = rvh.grade_from(self._stats(failed=1, failed_critical=1))
        self.assertEqual(grade, "Poor")
        self.assertEqual(cls, "grade-poor")
    def test_single_high_fail_is_fair(self):
        grade, _ = rvh.grade_from(self._stats(failed=1, failed_high=1))
        self.assertEqual(grade, "Fair")
    def test_four_failures_is_fair_even_without_high(self):
        grade, _ = rvh.grade_from(self._stats(failed=4))
        self.assertEqual(grade, "Fair")
    def test_three_failures_no_high_is_good(self):
        grade, _ = rvh.grade_from(self._stats(failed=3))
        self.assertEqual(grade, "Good")
    def test_many_warnings_drops_to_good(self):
        grade, _ = rvh.grade_from(self._stats(warned=3))
        self.assertEqual(grade, "Good")
    def test_clean_run_is_excellent(self):
        grade, cls = rvh.grade_from(self._stats(passed=10))
        self.assertEqual(grade, "Excellent")
        self.assertEqual(cls, "grade-excellent")
    def test_two_warnings_still_excellent(self):
        grade, _ = rvh.grade_from(self._stats(passed=5, warned=2))
        self.assertEqual(grade, "Excellent")
 class ComputeStatsTests(unittest.TestCase):
    def test_counts_by_status_and_severity(self):
        findings = [
            {"status": "pass"},
            {"status": "warn"},
            {"status": "fail", "severity": "critical"},
            {"status": "fail", "severity": "high"},
            {"status": "fail", "severity": "low"},
            {"status": "n/a"},
        ]
        stats = rvh.compute_stats(findings)
        self.assertEqual(stats["total"], 6)
        self.assertEqual(stats["passed"], 1)
        self.assertEqual(stats["warned"], 1)
        self.assertEqual(stats["failed"], 3)
        self.assertEqual(stats["na"], 1)
        self.assertEqual(stats["failed_critical"], 1)
        self.assertEqual(stats["failed_high"], 1)
    def test_missing_status_treated_as_na(self):
        stats = rvh.compute_stats([{}, {}])
        self.assertEqual(stats["na"], 2)
    def test_empty_findings(self):
        stats = rvh.compute_stats([])
        self.assertEqual(stats["total"], 0)
 class ScoreBarTests(unittest.TestCase):
    def test_renders_svg_with_four_segments(self):
        stats = {"total": 4, "passed": 1, "warned": 1, "failed": 1, "na": 1}
        svg = rvh.render_score_bar(stats, width=400, height=20)
        self.assertIn("<svg", svg)
        self.assertEqual(svg.count("<rect"), 4)
        self.assertIn('fill="#22c55e"', svg)  # pass
        self.assertIn('fill="#eab308"', svg)  # warn
        self.assertIn('fill="#ef4444"', svg)  # fail
        self.assertIn('fill="#94a3b8"', svg)  # n/a
    def test_zero_total_does_not_divide_by_zero(self):
        stats = {"total": 0, "passed": 0, "warned": 0, "failed": 0, "na": 0}
        svg = rvh.render_score_bar(stats)
        self.assertIn("<svg", svg)
        self.assertEqual(svg.count("<rect"), 4)
 if __name__ == "__main__":
    unittest.main()