refactor(bmad-prd): aggressive trim and quality fixes from analysis

SKILL.md cut 49% (143 -> 84 lines); references/validate.md 35%; references/headless.md 22%. Package-wide ~21% reduction.

Customization sweep:
- {finalize_reviewers} -> {workflow.finalize_reviewers} (was a silent override no-op)
- Add canonical ## Conventions block
- Rename decision-log.md -> .decision-log.md across SKILL, refs, schemas
- customize.toml: validation_checklist -> validation_checklist_template, output_dir -> prd_output_path, output_folder_name -> run_folder_pattern; lifecycle comments on external_sources / external_handoffs

New behavior:
- Activation step 4: explicit by-name greeting using {user_name}, {communication_language} persisted for every turn (not just greeting)
- Right-skill scan on first message before brain dump
- Neutral defaults when config.yaml is missing; never block on it
- Headless: Detection section + per-intent Inputs section + 'partial' status semantics + Update conflict-override rationale
- Brownfield Update bootstraps .decision-log.md via subagent when absent
- Reviewer Gate findings-overwhelm fix: tiered surfacing (verdict -> critical/high -> medium/low rolled into tail)
- Discovery edge cases: conditional MVP question, persona/UJ push contextually triggered, working modes renamed as outcomes (Fast path / Coaching path)
- Subagent discipline: Finalize step 2 return-format contract, step 5 structure->prose ordering, explicit no-subagent fallback

Tests:
- scripts/tests/test_render_validation_html.py (17 passing, covers grade thresholds, category mapping, stats, score-bar rendering)
2026-05-15 17:29:04 -05:00 · 2026-05-15 17:29:04 -05:00 · a586c0fa64
parent 71bac22a60
commit a586c0fa64
7 changed files with 224 additions and 145 deletions
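
Reviewer note: the 'partial' status semantics this commit adds to headless mode can be read as a small decision rule. The sketch below is one possible reading, not code from the package: the function name `headless_status` and its parameters are hypothetical, and the skill itself states these rules as prose instructions in `references/headless.md`.

```python
from typing import Literal

Status = Literal["complete", "partial", "blocked"]


def headless_status(
    artifact_produced: bool,
    open_questions: list[str],
    critical_inputs_inferred: bool,
) -> Status:
    """Hypothetical helper: map a headless run's outcome to its JSON status."""
    if not artifact_produced:
        # blocked = no artifact produced
        return "blocked"
    if open_questions or critical_inputs_inferred:
        # partial = caller should review before downstream use
        return "partial"
    # complete = the artifact stands on its own
    return "complete"
```

Under this reading, a Create run that had to infer its brief reports `partial` even when it produces a full `prd.md`.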
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/SKILL.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/SKILL.md
@@ -4,137 +4,81 @@ description: Create, update, or validate a PRD. Use when the user wants help pro
 ---
 # BMad PRD

-## Overview
+Coach the user to a PRD they are proud of. Guide; do not do the thinking for them.

-You are an expert PM facilitator. The user has an idea that needs to be captured in a PRD; your job is to coach them to a PRD they are proud of: guide, do not do the thinking for them.
+## Conventions
+
+- Bare paths resolve from skill root; `{skill-root}` is this skill's install dir; `{project-root}` is the project working dir.
+- `{workflow.<name>}` resolves to fields in `customize.toml`'s `[workflow]` table (overrides win per BMad merge rules).
+- `{doc_workspace}` is the bound run folder. `.decision-log.md` is the canonical ledger at its root; `addendum.md` is the optional technical-how sidecar.

 ## On Activation

-1. Resolve customization: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`. On failure, fall back to reading `{skill-root}/customize.toml` directly and proceed with built-in defaults.
-2. Execute each entry in `{workflow.activation_steps_prepend}` in order.
-3. Treat every entry in `{workflow.persistent_facts}` as foundational context. Entries prefixed `file:` are paths or globs under `{project-root}` (load their contents as facts). All others are facts verbatim.
-4. Note `{workflow.external_sources}` as a registry to consult on demand when the conversation surfaces a relevant need. Do not query preemptively. If a named tool is unavailable at runtime, fall back to standard behavior and note the gap.
-5. Load `{project-root}/_bmad/bmm/config.yaml` (and `config.user.yaml` if present). Resolve `{user_name}`, `{communication_language}`, `{document_output_language}`, `{planning_artifacts}`, `{project_name}`, `{date}`.
-6. Detect mode and intent. If headless, read `references/headless.md` and follow it for the whole run. Otherwise greet `{user_name}` in `{communication_language}`. Detect intent from the user's first message and existing files:
-   - **Create**: no existing PRD resolves.
-   - **Update**: an existing `prd.md` is referenced or located.
-   - **Validate** (or *analyze*): user supplies an artifact and asks for feedback, or wants critique without changes.
+1. Resolve customization: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`. On failure, read `{skill-root}/customize.toml` directly and use defaults.
+2. Run `{workflow.activation_steps_prepend}`. Treat `{workflow.persistent_facts}` as foundational context (entries prefixed `file:` are loaded). Note `{workflow.external_sources}` as an on-demand registry; never query preemptively.
+3. Load `{project-root}/_bmad/bmm/config.yaml` (+ `config.user.yaml` if present). Resolve `{user_name}`, `{communication_language}`, `{document_output_language}`, `{planning_artifacts}`, `{project_name}`, `{date}`. Missing keys → neutral defaults; never block.
+4. If headless, follow `references/headless.md` for the whole run. Otherwise greet the user **by name** using `{user_name}` and **in their language** using `{communication_language}` — and stay in `{communication_language}` for every turn for the entire run, not just the greeting. Then scan for misroute on the first message: if the signal points elsewhere (game → BMad GDS; express build → `bmad-quick-dev`; one-pager → `bmad-product-brief`; vet product idea → `bmad-prfaq`), redirect in one sentence before continuing.
+5. Detect intent: **Create** (no PRD), **Update** (existing PRD), **Validate** (critique only). If ambiguous, ask.
+6. Run `{workflow.activation_steps_append}`.

-   If two intents could apply, confirm with one question.
-7. Execute each entry in `{workflow.activation_steps_append}` in order.
+## Intent Modes

-## Session Posture
+**Create.** Bind `{doc_workspace}` to `{workflow.prd_output_path}/{workflow.run_folder_pattern}/`. Write `prd.md` with YAML frontmatter (title, created, updated) and tell the user the path. Run `## Discovery`, then `## Finalize`.

- Plain and direct. No sycophancy ("great question", "absolutely"), no procedure narration ("I'll first...", "let me now..."), no length-nudge phrases ("concise", "brief").
- Record as the user surfaces it. Decisions, assumptions, open questions land in `decision-log.md` or inline tags before the next turn.
- Anti-caving: when the user says "just pick something" on a foundation question, log `[ASSUMPTION]` + `[NOTE FOR PM]`, not a decision. When evidence is thin, say so; don't draft around the gap. When a validator surfaces a blocker, fix or log; don't rationalize.
+**Update.** Reconcile the PRD with a change signal. Source-extract against PRD, addendum, `.decision-log.md`, and original inputs (extract, don't ingest). If `.decision-log.md` is missing, spawn a one-time bootstrap subagent to reverse-engineer a thin log from the PRD before continuing. Surface conflicts with prior decisions before applying. Then `## Finalize`.

-## Intent Operating Modes
-
-**Create.** A PRD the user is proud of, drawn out through real conversation. Discovery first, drafting second. Bind `{doc_workspace}` to a fresh folder at `{workflow.output_dir}/{workflow.output_folder_name}/` and write `prd.md` there with YAML frontmatter (title, created, updated); tell the user the workspace path. Version and state transitions live in `decision-log.md`. When drafting is complete, proceed to `## Finalize`.
-
-**Update.** Reconcile an existing PRD with a change signal. Orient via source extractors (see `## Constraints` → Extract, don't ingest) against the PRD, addendum, `decision-log.md`, and original inputs, then run the `## Discovery` posture against the change signal. Surface conflicts with prior decisions before changing: read `decision-log.md`, identify decisions touching the same area as the change, and for each, check whether the change reverses it, contradicts a stated assumption, or removes a non-goal; if so, raise it to the user before applying. Once changes are applied, proceed to `## Finalize`.
-
-**Validate** (or *analyze*). Critique an existing PRD without changing it. Load `references/validate.md` and follow it.
+**Validate** (or *analyze*). Critique without changing. Load `references/validate.md`.

 ## Discovery

-For Create and Update intent, run Discovery in this order: **Brain dump → Four-dimension read → Right-skill check → Working mode.** Never skip ahead.
+Order: **Brain dump → Four-dimension read → Working mode.** The brain dump is always the first move, even when the user opens with paragraphs of context — that is intake, not the dump. Read what exists; ask only what is missing. A simple "anything else?" surfaces what they almost forgot.

-### Posture
+Four dimensions: **Stakes** (rigor + section depth), **Audience** (tone + approval needs), **Existing inputs** (greenfield vs brownfield framing — brownfield means parts of the PRD reference, not relitigate), **Downstream depth** (standalone, or top of UX → architecture → epics → stories chain).

-Coach, do not quiz. Push hardest on PRD Discipline risks: unexamined assumptions, capability-vs-implementation confusion, term drift, scope creep, ambiguity for downstream readers.
+Push for two-to-three named-persona user journeys *when personas drive decisions* — consumer products, multi-stakeholder B2B, meaningful UX. Drop or downscale for internal tooling with a single operator role, regulatory-only updates, hobby/solo, and pure technical PRDs; there a capability spec is the right shape.

-Push for two to three named-persona user journeys even when the user does not propose them. UJs are how teams discover persona gaps, scope ambiguity, and missed failure handling. Solo/hobby scope can drop to one or none.
+Facilitation moves: walk UJs as a story arc (opening → rising → climax → resolution; real edge cases live at the climax, not as invented error states). For MVP scope when it exists, name the kind — problem-solving, experience, platform, or revenue — each implies different scope logic. Infer and confirm beats quiz ("I'm assuming X works like Y — right?" beats multiple choice). Load `references/probing.md` for the seven probe categories and the PRD/solution-design boundary.

-Facilitation moves to keep handy. For User Journeys, walk the story shape: opening scene, rising action, climax (the moment value is delivered), resolution; real edge cases surface at the climax, not as invented error states. For MVP scope, ask which kind of MVP this is before listing in/out: problem-solving (proves the problem is solved), experience (proves the interaction model), platform (proves the base others build on), or revenue (proves willingness to pay); each implies different scope logic. For feature decisions, state the inference and invite correction rather than quizzing the user ("I'm assuming X works like Y, is that right?" beats "should X work like Y or Z?").
+**Working mode.** Once the read is complete, offer the choice in the user's language:

-Load `references/probing.md` for the seven probing categories, critical-assumption scans, and the PRD / solution-design boundary; apply in Discovery and Update.
+- **Fast path:** I batch remaining critical gaps, draft the full PRD, you review.
+- **Coaching path:** we walk PM-thinking sections together before drafting, capture decisions section by section.

-### Brain dump
+The workspace persists, so they can stop and resume. Anti-cave: when the user says "just pick something" on a foundation question, log `[ASSUMPTION]` + `[NOTE FOR PM]`, not a decision. When evidence is thin, say so; do not draft around the gap.

-**Always the first move for Create and Update intent, no exceptions.** Even when the user opens with a few sentences of context, that is intake — not the brain dump. Before any questions (especially multiple-choice), explicitly invite them to unload everything they have in mind: full picture, inputs, ideas, WHY. Read what exists first; ask only what is missing *after* the dump. A simple "anything else?" often surfaces what they almost forgot.
-
-Anti-pattern: treating the user's opening message as the full picture and jumping to assumptions, intake questions, or working-mode choices. The user may have a lot more to say; ask before assuming. The more they give us, the more we have to work with and the faster the process will be also for the user as we can help tighten and organize their own thinking and creativity as part of the PRD process as the user is the expert.
-
-### Four-dimension read
-
-Before drafting, read the situation across four dimensions; they determine the PRD's shape:
-
- **Stakes.** Calibrates rigor, section depth, and which adapt-in clusters apply.
- **Audience.** Drives tone, evidence requirements, and approval sections.
- **Existing inputs.** Existing artifacts mean those parts of the PRD reference, not relitigate. When project-context, prior PRDs, or existing UX/architecture are present, this is brownfield; frame Discovery around what is new or changing.
- **Downstream depth.** Standalone spec, or top of a UX → architecture → epics → stories chain? Affects how much to encode vs. defer.
-
-### Right-skill check
-
-Sanity-check that PRD is the right tool before drafting:
-
- **Games** → suggest BMad Game Dev Studio (GDS).
- **Small scope, wants a captured artifact** (small tweak to existing codebase, single doc to point at) → stay here, use the adapt-in Stories cluster for an all-inclusive document.
- **Express implementation** (build now, no planning chain, no captured artifact, small intent or story) → suggest skill `bmad-quick-dev` instead.
-
-### Working mode
-
-Working-mode choice comes *after* the brain dump and four-dimension read, never before. Once the situational read is complete, offer the user a choice before proceeding (one sentence per option):
-
- **Express:** resolve any remaining critical gaps in a short batch, then draft the full PRD at once.
- **Facilitative:** work through the sections that require PM thinking before drafting. Capture all decisions in the log, section to section. Draft after the key sections are walked. The goal is that the user has authored the thinking, not just answered intake questions.
-
-In both modes, resolve decisions conversationally rather than silently deferring them into `[ASSUMPTION]` tags. Only use `[ASSUMPTION]` when the answer requires research or external input they cannot provide in the moment.
+Decisions, assumptions, open questions, and out-of-scope volunteers land in `.decision-log.md` (or `addendum.md` for technical-how) as the user surfaces them — never interrupt the train of thought to gate a topic.

 ## PRD Discipline

-### Artifact shape
+**Shape.** Features grouped; FRs nested with globally numbered stable IDs. Cross-cutting NFRs in their own section; skip traceability matrices. Capabilities, not implementation — tech choices live in `addendum.md`. Load `{workflow.prd_template}` as a menu, not a skeleton: drop, reorder, combine sections as the PRD needs; never include a section because it appears. Counter-metrics named when Success Metrics exist.

- **Features grouped, FRs nested.** Features open with behavioral description; FRs nested and numbered globally for stable IDs. Cross-cutting NFRs in their own section; skip traceability matrices.
- **Capabilities, not implementation.** FRs describe what users or systems can do, not how. Tech choices go in addendum.
- **Right-size to purpose.** Load `{workflow.prd_template}` as the shape menu. Treat it as a menu, not a skeleton: drop, reorder, combine, or rename sections as the PRD needs; never include a section just because it appears. The essential spine is the floor, not the ceiling — hobby keeps it short, enterprise layers in clusters from the template's adapt-in menu.
- **Counter-metrics named.** When Success Metrics is present, name what NOT to optimize.
+**Substance.** Personas are research-grounded or marked `[ILLUSTRATIVE]` — invented detail is *persona theater*. Two to four max, and they must drive decisions. Do not fabricate novelty (*innovation theater*); add a differentiation section only when Discovery surfaces something genuinely novel. Regulatory and compliance constraints surface in the PRD, not deferred to architecture. Inferences without direct user confirmation are tagged `[ASSUMPTION: ...]` inline and indexed at the end.

-### Substance
+**Scope honesty.** `[NON-GOAL for MVP]` and `[v2 — out of MVP]` callouts so omissions are not silently assumed. Never silently de-scope; propose phasing, never impose. `[NOTE FOR PM]` callouts at deferred decisions or unresolved tensions.

- **Personas, when used, are research-grounded or marked `[ILLUSTRATIVE]`.** Invented detail is *persona theater*: false specificity the team builds for. Personas must drive decisions; two to four max.
- **No innovation theater.** Don't fabricate novelty; add a differentiation section only when Discovery surfaced something genuinely novel.
- **Domain awareness.** Regulatory or compliance constraints surface in the PRD, not deferred to architecture.
- **Assumptions visible.** Inferences without direct user confirmation are tagged `[ASSUMPTION: ...]` inline and indexed at the end.
-
-### Honesty about scope
-
- **Non-Goals explicit.** Pair with inline `[NON-GOAL for MVP]` and `[v2 — out of MVP]` callouts so omissions aren't silently assumed.
- **Never silently de-scope.** Nothing the user explicitly included drops without asking. Propose phasing; never impose it.
- **`[NOTE FOR PM]` callouts** at decision points the user deferred or left tension on.
-
-## Constraints
-
- **File roles.** `decision-log.md`: every decision, change, and version transition. `addendum.md`: Technical-how detail to addendum immediately when the user volunteers it.
- **Continuity across sessions.** If a prior draft exists in `{workflow.output_dir}`, offer to resume; surface open items first.
- **Extract, don't ingest.** Never load source documents into the parent context wholesale. Delegate to subagents to extract what's relevant; the parent assembles from extracts.
+**Extract, don't ingest.** Source documents go to subagents for extraction; the parent assembles from extracts. Never load source documents into the parent context wholesale.

 ## Reviewer Gate

-Run during the Validate intent and at Finalize step 3.
+Used by the Validate intent and at Finalize step 3.

-**Menu.** Structural checklist validator (against `{workflow.validation_checklist}`) + each entry in `{finalize_reviewers}` (resolved on-demand from `customize.toml`) + any ad-hoc reviewers warranted by the artifact content (composed as plain-text entries on the fly). Stakes-calibrated: hobby/solo may run defaults quietly or skip; higher stakes get the explicit menu with all/subset/skip.
+Assemble the menu: structural validator against `{workflow.validation_checklist_template}` + each entry in `{workflow.finalize_reviewers}` + any ad-hoc reviewers the artifact warrants. Stakes-calibrated — hobby/solo may run quietly or skip; higher stakes get the explicit all/subset/skip menu.

-**Dispatch.** Spawn the selection as parallel subagents against `prd.md` (and `addendum.md` if present), by prefix:
- `skill:NAME` — invoke named skill.
- `file:PATH` — spawn an adversarial subagent giving the agent the file path to use as review lens.
- `text` — follow the instructions use directly as the subagent's prompt.
+Dispatch entries as parallel subagents against `prd.md` (and `addendum.md` if present) using the standard prefix convention (`skill:` / `file:` / plain text). Each writes its full review to `{doc_workspace}/review-{slug}.md` and returns ONLY a compact summary (verdict, top 2-5 findings, file path) — the parent never holds full review text. If subagents are unavailable, run sequentially: write the file *before* anything else, then flush the review from working context.

-**Subagent contract.** Each reviewer writes its full review to `{doc_workspace}/review-{slug}.md` and returns ONLY a compact summary (verdict, top 2-5 findings succinct, file path). The parent never holds full review text unless tool does not support subagents. Under Validate intent, the structural validator additionally runs the rendering pipeline in `references/validate.md`.
+Surface findings tiered, never dumped. Lead with a one-sentence gate verdict, then walk critical + high findings; medium/low roll into a single tail ("plus N more in {file}"). Read the full `review-{slug}.md` only when the user drills into a specific finding. Per finding: autofix, discuss, defer to open items, or ignore.

-**Walk-through.** Present the findings in an organized facilitative fashion. Per finding user can choose to: autofix, discuss, defer to open items, or ignore.
+Under Validate intent, the structural validator additionally runs the rendering pipeline in `references/validate.md`.

 ## Finalize

-At entry, name the sequence for the user in one sentence: walk the decision log, reconcile inputs, run the reviewer pass, resolve open items, then polish over the final content. Polish goes last so it does not have to redo work after reviewer fixes.
+Tell the user the sequence in one sentence, then walk it. Polish goes last so it does not redo work after reviewer fixes.

-1. Decision log audit: walk `decision-log.md` with the user; each entry captured in PRD, in addendum, or set aside.
-2. Input reconciliation: subagent per user-supplied input against `prd.md` + `addendum.md`; surface gaps, especially qualitative ideas (tone, voice, feel) the FR structure silently drops. Must happen before polish.
-3. Reviewer pass: run the `## Reviewer Gate` against `prd.md`. Resolve before polish.
-4. Triage all Open Questions, `[ASSUMPTION]` tags, and `[NOTE FOR PM]` callouts. A phase-blocker is an open item that would make the PRD unsafe for the next phase of work (UX, architecture, epics); a non-blocker is deferrable to a later cycle with an owner and revisit condition. Surface only phase-blockers one at a time; resolve before calling the PRD ready. Log deferred items to `decision-log.md`. If phase-blocker count is high, flag it.
-5. Polish: apply `{workflow.doc_standards}` to `prd.md` and `addendum.md` via parallel subagents.
-6. External handoffs: execute `{workflow.external_handoffs}` entries; surface returned URLs/IDs. Skip and flag unavailable tools.
-7. Record finalization to `decision-log.md`. Share all artifact paths. Invoke `bmad-help` to share possible steps.
+1. **Decision log audit.** Walk `.decision-log.md` with the user; each entry captured in PRD, in addendum, or set aside.
+2. **Input reconciliation.** Subagent per user-supplied input against `prd.md` + `addendum.md`. Each writes its extract to `{doc_workspace}/reconcile-{slug}.md` and returns ONLY a compact summary (input name, gaps 2-5, file path). Surface gaps — especially qualitative ideas (tone, voice, feel) the FR structure silently drops. Must happen before polish.
+3. **Reviewer pass.** Run `## Reviewer Gate`. Resolve before polish.
+4. **Triage open items.** All Open Questions, `[ASSUMPTION]` tags, `[NOTE FOR PM]` callouts. Phase-blockers (would make the PRD unsafe for UX/architecture/epics) surfaced one at a time and resolved; non-blockers deferred with owner + revisit condition logged to `.decision-log.md`. If phase-blocker count is high, flag it.
+5. **Polish.** Apply `{workflow.doc_standards}` to `prd.md` and `addendum.md` in declared order (structural passes before prose — prose should not polish soon-to-be-cut text). Parallelize across documents, sequential within.
+6. **External handoffs.** Execute `{workflow.external_handoffs}`; surface returned URLs/IDs. Skip and flag unavailable tools.
+7. **Close.** Record finalization to `.decision-log.md`. Share artifact paths. Common next: `bmad-create-ux-design`, `bmad-create-architecture`, `bmad-create-epics-and-stories`; invoke `bmad-help` for authoritative routing.
 8. Run `{workflow.on_complete}` if non-empty.
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/assets/headless-schemas.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/assets/headless-schemas.md
@@ -18,7 +18,7 @@ Every headless run ends with one of these payloads. Omit keys for artifacts not
  "intent": "create",
  "prd": "{doc_workspace}/prd.md",
  "addendum": "{doc_workspace}/addendum.md",
-  "decision_log": "{doc_workspace}/decision-log.md",
+  "decision_log": "{doc_workspace}/.decision-log.md",
  "open_questions": [],
  "assumptions": [],
  "external_handoffs": [
@@ -34,7 +34,7 @@ Every headless run ends with one of these payloads. Omit keys for artifacts not
  "status": "complete",
  "intent": "update",
  "prd": "{doc_workspace}/prd.md",
-  "decision_log": "{doc_workspace}/decision-log.md",
+  "decision_log": "{doc_workspace}/.decision-log.md",
  "changes_summary": "1-3 sentences describing what changed and why",
  "conflicts_with_prior_decisions": [],
  "open_questions": [],
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/customize.toml
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/customize.toml
@@ -47,7 +47,7 @@ prd_template = "assets/prd-template.md"
 # A subagent walks the checklist against prd.md and returns structured findings.
 # Override the path in team/user TOML to enforce an org-specific checklist
 # (regulated-industry compliance, investor-pitch standards, etc.).
-validation_checklist = "assets/prd-validation-checklist.md"
+validation_checklist_template = "assets/prd-validation-checklist.md"

 # HTML template used to render validation findings into a styled, scannable
 # report. The renderer (scripts/render-validation-html.py) substitutes
@@ -57,9 +57,10 @@ validation_checklist = "assets/prd-validation-checklist.md"
 validation_report_template = "assets/validation-report-template.html"

 # Run folder location. The PRD, optional addendum, decision log, and optional
-# validation report all land inside `{output_dir}/{output_folder_name}/`.
-output_dir = "{planning_artifacts}/prds"
-output_folder_name = "prd-{project_name}-{date}"
+# validation report all land inside `{prd_output_path}/{run_folder_pattern}/`.
+# Resume-check scans `{prd_output_path}` for prior unfinished runs.
+prd_output_path = "{planning_artifacts}/prds"
+run_folder_pattern = "prd-{project_name}-{date}"

 # Document standards applied to human-consumed docs at finalize. Each entry is
 # a `skill:`, `file:`, or plain-text directive; the parent LLM applies the
@@ -90,6 +91,11 @@ doc_standards = [
 # tool needs. If a named MCP tool is unavailable at runtime, the LLM falls
 # back to standard behavior and notes the gap. Empty by default.
 #
+# Lifecycle note: distinct from persistent_facts. persistent_facts are loaded
+# once at activation and kept in mind for the whole run; external_sources are
+# a registry consulted on demand and only when the conversation surfaces a
+# matching need.
+#
 # Examples (set in team/user override TOML):
 #   "When researching internal product context, consult corp:kb_search (database='product-docs') before web search."
 #   "For competitive landscape during Discovery, query corp:competitive_db with category={project_name}."
@@ -106,6 +112,9 @@ external_sources = []
 # status; local files always exist regardless. Fires automatically — users
 # can opt out in their prompt for a specific run. Empty by default.
 #
+# Lifecycle note: distinct from persistent_facts and external_sources.
+# Fired once at Finalize step 6, never during Discovery or drafting.
+#
 # Examples (set in team/user override TOML):
 #   "After finalize, upload prd.md and addendum.md to Confluence via corp:confluence_upload (space_key='PROD', parent_page='PRDs', label='prd', author={user_name})."
 #   "Mirror the PRD to Notion via notion:create_page (database_id='abc123', title='PRD: '+{project_name})."
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/references/headless.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/references/headless.md
@@ -2,23 +2,38 @@

 Load this file when bmad-prd is invoked headless (no interactive user). Follow it for the whole run.

+## Detection
+
+Headless mode is in effect when any of the following is true:
+
+- the invoking caller sets a `headless: true` flag (or equivalent argument the harness exposes),
+- the invocation is from another skill or a non-interactive runner (no TTY, no user message stream),
+- `{workflow.activation_steps_prepend}` includes an entry that explicitly declares headless,
+- the first message comes from an automation context that pre-supplies all inputs and asks for an artifact path back.
+
+When ambiguous, default to interactive.
+
+## Inputs the caller is expected to provide
+
+The caller passes inputs in their first message (free-form structured payload; no fixed schema, but every field below should be present when applicable):
+
+- `intent` — `"create"`, `"update"`, or `"validate"`. If absent, infer from the artifact set.
+- For **Create**: a brief or product spec the LLM works from (plain text, file path, or URL), plus any persona/scope notes; `doc_workspace` if a specific run folder is required (otherwise the workflow binds the default).
+- For **Update**: the existing `prd.md` path (or a workspace path that contains one), and a change signal (the request: what to change and why).
+- For **Validate**: the existing `prd.md` path (or workspace path), and optionally a checklist override path. Workspace defaults to the PRD's containing directory.
+
+Anything the caller does not provide is either inferred from inputs/workspace or recorded as `assumptions[]` / `open_questions[]` in the JSON status. Do not invent persona detail, success metrics, or scope decisions to fill gaps — record them.
+
 ## General

-Do not ask. Complete the intent using what is provided, what exists in `{doc_workspace}`, or what you can discover yourself. If intent remains ambiguous after inference, halt with a `blocked` JSON status and a `reason` field — do not prompt. Do not greet.
+Do not ask. Complete the intent using what is provided, what exists in `{doc_workspace}`, or what you can discover yourself. If intent remains ambiguous after inference, halt with `status: "blocked"` and a `reason` field — do not prompt. Do not greet.

-End with a JSON response listing status, intent, and artifact paths. The `intent` field must match the detected intent: `"create"`, `"update"`, or `"validate"`. Omit keys for artifacts not produced. Full schemas with examples for each intent are in `assets/headless-schemas.md`. Minimal shape:
+Populate `assumptions[]` with every value you inferred without direct caller confirmation; populate `open_questions[]` with every gap that needs a human decision. Use `status: "partial"` when the artifact was produced but `open_questions[]` is non-empty or critical inputs were inferred (Create with no brief; Update with a vague signal acted on best-effort; Validate that could not load the checklist). `complete` = stands on its own; `partial` = caller should review before downstream use; `blocked` = no artifact produced.

-```json
-{
-  "status": "complete",
-  "intent": "validate",
-  "validation_report": "{doc_workspace}/validation-report.md",
-  "offer_to_update": true
-}
-```
+End with the JSON response (full schemas with examples in `assets/headless-schemas.md`). The `intent` field must match the detected intent. Omit keys for artifacts not produced.

 ## Mode-specific overrides

-**Update.** Log the reversal to `decision-log.md`, then apply. Halt `blocked` if intent is ambiguous.
+**Update.** Apply the change, log to `.decision-log.md` with rationale, and surface any conflict-with-prior-decision in `conflicts_with_prior_decisions[]` in the JSON status. Halt `blocked` if intent is ambiguous.

-**Validate.** Always write `validation-report.md` to `{doc_workspace}` regardless of finding count. Always include `"offer_to_update": true` in the JSON status block.
+**Validate.** Always write `validation-report.md` to `{doc_workspace}` regardless of finding count. Always include `"offer_to_update": true` in the JSON status. Run the renderer without `--open`.
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/references/probing.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/references/probing.md
@ -9,10 +9,10 @@ One probe per turn. When multiple gaps surface, pick the most load-bearing; reco
 | Trigger | Probe | Where it lands |
 |---|---|---|
 | Vague outcome ("better", "faster", "love it") | "What does it look like when it's working? How would you know?" | §7 Success Metrics, or Open Question |
-| Evidence conflict (implied behavior change contradicts existing PRD/code/decision) | "Current behavior is X per [source]. Is the change to Y deliberate?" | `decision-log.md`; `[NOTE FOR PM]` in affected feature |
+| Evidence conflict (implied behavior change contradicts existing PRD/code/decision) | "Current behavior is X per [source]. Is the change to Y deliberate?" | `.decision-log.md`; `[NOTE FOR PM]` in affected feature |
 | Scope ambiguity ("all users", "the checkout flow") | "Does this apply to X? Is Y in or out?" | §5 Non-Goals or inline `[NON-GOAL]` |
 | Missing failure UX (happy path only) | "What does the user see when X fails? Acceptable, or needs handling?" | Feature/FR; Open Question if undecided |
-| Competing commitments (two requirements in tension) | "When A and B conflict, which wins?" | `decision-log.md` + counter-metric in §7 |
+| Competing commitments (two requirements in tension) | "When A and B conflict, which wins?" | `.decision-log.md` + counter-metric in §7 |
 | Data availability (SM depends on data not verified to exist) | "Does the data exist today in measurable form? Baseline?" | SM definition or `[ASSUMPTION]` |
 | Design readiness (UI implied without design) | "Reviewed design exists, or Open Question?" | Open Question; ref `bmad-create-ux-design` |

--- a/src/bmm-skills/2-plan-workflows/bmad-prd/references/validate.md
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/references/validate.md
@ -1,28 +1,24 @@
 # Validate

-The full Validate intent playbook. Also covers the structural-validator rendering pipeline used during mid-session report requests.
-
-Critique an existing PRD without changing it. Standalone; does NOT enter `## Finalize`.
+The Validate intent playbook. Standalone — this intent critiques an existing PRD without changing it and ends after the user has seen the report; it does not run Finalize. The structural-validator rendering pipeline below is also reused for mid-session report requests during Create/Update.

 ## Orient

-Source-extract against `decision-log.md`, any original inputs, and the PRD/addendum themselves. Delegate to subagents per `## Constraints` → Extract, don't ingest; the parent assembles from extracts.
+Source-extract against `.decision-log.md`, any original inputs, and the PRD/addendum themselves. Delegate to subagents per PRD Discipline → "Extract, don't ingest" (in SKILL.md); the parent assembles from extracts.

 ## Run the Reviewer Gate

-Run the `## Reviewer Gate` (in SKILL.md) against `prd.md` (and `addendum.md` if present). The gate handles menu assembly, parallel dispatch, the subagent contract, and the section-by-section walk-through.
-
-The structural checklist validator is one entry in the gate menu; under Validate intent it additionally runs the rendering pipeline below. The Finalize discipline pass during Create/Update does NOT render a report — its findings stay in-conversation.
+Run the Reviewer Gate (see SKILL.md) against `prd.md` (and `addendum.md` if present). The structural checklist validator is one entry in the gate menu; under Validate intent it additionally runs the rendering pipeline below. The Finalize discipline pass during Create/Update does NOT render a report — its findings stay in-conversation.

 ## Structural validator pipeline

-The structural validator subagent walks `{workflow.validation_checklist}` against `prd.md` (and `addendum.md` if present) and writes findings to `{doc_workspace}/validation-findings.json`:
+The structural validator subagent walks `{workflow.validation_checklist_template}` against `prd.md` (and `addendum.md` if present) and writes findings to `{doc_workspace}/validation-findings.json`:

 ```json
 {
  "prd_name": "Example Product",
  "prd_path": "{doc_workspace}/prd.md",
-  "checklist_path": "{workflow.validation_checklist}",
+  "checklist_path": "{workflow.validation_checklist_template}",
  "timestamp": "2026-01-15T09:14:00",
  "overall_synthesis": "2-3 sentences of judgment about the PRD's overall state — what holds up, what's at risk. Written by the subagent, not the parent.",
  "findings": [
@ -40,21 +36,10 @@ The structural validator subagent walks `{workflow.validation_checklist}` agains
 }
 ```

-Per-finding fields:
-
- `id` (required) — checklist item ID (e.g., `Q-1`, `D-2`, `STK-1`, or org-custom prefixes).
- `category` (optional) — explicit category name; if omitted, the renderer maps from the ID prefix.
- `title` (optional but recommended) — the checklist item's short name.
- `status` — `pass` | `warn` | `fail` | `n/a`.
- `severity` — `low` | `medium` | `high` | `critical`.
- `location` (optional) — section/line/range in the PRD where the finding lives. Cite specifics, never abstract criticism.
- `note` (optional) — the finding itself, in one or two sentences.
- `suggested_fix` (optional) — concrete next action.
+Required per-finding fields: `id`, `status` (`pass` | `warn` | `fail` | `n/a`), `severity` (`low` | `medium` | `high` | `critical`). Optional: `category` (renderer derives from ID prefix if omitted), `title`, `location` (cite specifics, never abstract criticism), `note`, `suggested_fix`.

 ### Render

-After the subagent writes findings:
-
 ```bash
 python3 {skill-root}/scripts/render-validation-html.py \
  --findings {doc_workspace}/validation-findings.json \
@ -63,12 +48,8 @@ python3 {skill-root}/scripts/render-validation-html.py \
  --open
 ```

-Include `--open` for interactive runs (auto-opens in default browser). Omit `--open` in headless runs.
-
-The script writes two artifacts side-by-side: the HTML report at `--output`, and a markdown companion at the same path with `.md` extension (e.g. `validation-report.md`). Both are always produced when the script runs. It computes pass/warn/fail/na counts, derives a grade (Excellent / Good / Fair / Poor) from critical-fail and total-fail counts, renders an inline SVG score bar in the HTML, groups findings by category, and returns a one-line JSON summary on stdout: `{"output": "...", "markdown": "...", "grade": "...", "stats": {...}}`.
-
-Re-running validation overwrites existing report files in place. Markdown form is what Update mode reads when rolling findings into a revision.
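+Use `--open` for interactive runs; omit it in headless runs. The script writes the HTML report, a markdown companion at the same path with a `.md` extension, and a one-line JSON summary on stdout; re-running overwrites existing reports in place. Update mode reads the markdown form when rolling findings into a revision.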

 ## Close

-Surface all artifact paths and summaries to the user. The Reviewer Gate's walk-through covers section-by-section review of findings; the rendered HTML/markdown report is the persistent artifact. Always offer to roll findings into an Update; Validate itself does not change `prd.md`.
+Surface artifact paths; the rendered HTML/markdown is the persistent artifact. Always offer to roll findings into an Update.
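The per-finding field rules in the validate.md hunk could be checked before rendering with something like the sketch below. `check_finding` is a hypothetical helper, not part of the skill or the renderer; the field names and allowed values are taken from the "Required per-finding fields" line. The renderer's unit tests in this commit treat a missing `status` as n/a, so a strict check like this one is an assumption about how a caller might pre-validate, not a description of what the script does.

```python
ALLOWED_STATUS = {"pass", "warn", "fail", "n/a"}
ALLOWED_SEVERITY = {"low", "medium", "high", "critical"}
REQUIRED_FIELDS = ("id", "status", "severity")


def check_finding(finding: dict) -> list[str]:
    """Hypothetical pre-render check; returns a list of problems (empty if none)."""
    problems = []
    # Report every required field that is absent.
    for field in REQUIRED_FIELDS:
        if field not in finding:
            problems.append(f"missing required field: {field}")
    # Report values outside the documented enumerations.
    status = finding.get("status")
    if status is not None and status not in ALLOWED_STATUS:
        problems.append(f"unknown status: {status!r}")
    severity = finding.get("severity")
    if severity is not None and severity not in ALLOWED_SEVERITY:
        problems.append(f"unknown severity: {severity!r}")
    return problems
```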
--- a/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/tests/test_render_validation_html.py
+++ b/src/bmm-skills/2-plan-workflows/bmad-prd/scripts/tests/test_render_validation_html.py
@ -0,0 +1,130 @@
+#!/usr/bin/env python3
+# /// script
+# requires-python = ">=3.10"
+# ///
+"""Unit tests for render-validation-html.py.
+
+Run from the skill root:
+    python3 -m unittest scripts/tests/test_render_validation_html.py
+"""
+
+import importlib.util
+import sys
+import unittest
+from pathlib import Path
+
+# Load the sibling module via importlib because its filename has a hyphen.
+SCRIPT_PATH = Path(__file__).resolve().parents[1] / "render-validation-html.py"
+spec = importlib.util.spec_from_file_location("render_validation_html", SCRIPT_PATH)
+rvh = importlib.util.module_from_spec(spec)
+sys.modules["render_validation_html"] = rvh
+spec.loader.exec_module(rvh)
+
+
+class CategoryForTests(unittest.TestCase):
+    def test_explicit_category_wins(self):
+        self.assertEqual(rvh.category_for({"id": "Q-1", "category": "Custom"}), "Custom")
+
+    def test_known_prefixes_map(self):
+        self.assertEqual(rvh.category_for({"id": "Q-3"}), "Quality")
+        self.assertEqual(rvh.category_for({"id": "D-2"}), "Discipline")
+        self.assertEqual(rvh.category_for({"id": "S-1"}), "Structural integrity")
+        self.assertEqual(rvh.category_for({"id": "STK-1"}), "Stakes-gated")
+        self.assertEqual(rvh.category_for({"id": "M-9"}), "Mechanical")
+
+    def test_unknown_prefix_falls_through(self):
+        self.assertEqual(rvh.category_for({"id": "X-1"}), "X")
+
+    def test_id_without_hyphen_used_directly(self):
+        self.assertEqual(rvh.category_for({"id": "FOO"}), "FOO")
+
+    def test_empty_id_yields_other(self):
+        self.assertEqual(rvh.category_for({}), "Other")
+
+
+class GradeThresholdTests(unittest.TestCase):
+    def _stats(self, **kw):
+        base = {"total": 0, "passed": 0, "warned": 0, "failed": 0, "na": 0,
+                "failed_critical": 0, "failed_high": 0}
+        base.update(kw)
+        return base
+
+    def test_any_critical_fail_is_poor(self):
+        grade, cls = rvh.grade_from(self._stats(failed=1, failed_critical=1))
+        self.assertEqual(grade, "Poor")
+        self.assertEqual(cls, "grade-poor")
+
+    def test_single_high_fail_is_fair(self):
+        grade, _ = rvh.grade_from(self._stats(failed=1, failed_high=1))
+        self.assertEqual(grade, "Fair")
+
+    def test_four_failures_is_fair_even_without_high(self):
+        grade, _ = rvh.grade_from(self._stats(failed=4))
+        self.assertEqual(grade, "Fair")
+
+    def test_three_failures_no_high_is_good(self):
+        grade, _ = rvh.grade_from(self._stats(failed=3))
+        self.assertEqual(grade, "Good")
+
+    def test_many_warnings_drops_to_good(self):
+        grade, _ = rvh.grade_from(self._stats(warned=3))
+        self.assertEqual(grade, "Good")
+
+    def test_clean_run_is_excellent(self):
+        grade, cls = rvh.grade_from(self._stats(passed=10))
+        self.assertEqual(grade, "Excellent")
+        self.assertEqual(cls, "grade-excellent")
+
+    def test_two_warnings_still_excellent(self):
+        grade, _ = rvh.grade_from(self._stats(passed=5, warned=2))
+        self.assertEqual(grade, "Excellent")
+
+
+class ComputeStatsTests(unittest.TestCase):
+    def test_counts_by_status_and_severity(self):
+        findings = [
+            {"status": "pass"},
+            {"status": "warn"},
+            {"status": "fail", "severity": "critical"},
+            {"status": "fail", "severity": "high"},
+            {"status": "fail", "severity": "low"},
+            {"status": "n/a"},
+        ]
+        stats = rvh.compute_stats(findings)
+        self.assertEqual(stats["total"], 6)
+        self.assertEqual(stats["passed"], 1)
+        self.assertEqual(stats["warned"], 1)
+        self.assertEqual(stats["failed"], 3)
+        self.assertEqual(stats["na"], 1)
+        self.assertEqual(stats["failed_critical"], 1)
+        self.assertEqual(stats["failed_high"], 1)
+
+    def test_missing_status_treated_as_na(self):
+        stats = rvh.compute_stats([{}, {}])
+        self.assertEqual(stats["na"], 2)
+
+    def test_empty_findings(self):
+        stats = rvh.compute_stats([])
+        self.assertEqual(stats["total"], 0)
+
+
+class ScoreBarTests(unittest.TestCase):
+    def test_renders_svg_with_four_segments(self):
+        stats = {"total": 4, "passed": 1, "warned": 1, "failed": 1, "na": 1}
+        svg = rvh.render_score_bar(stats, width=400, height=20)
+        self.assertIn("<svg", svg)
+        self.assertEqual(svg.count("<rect"), 4)
+        self.assertIn('fill="#22c55e"', svg)  # pass
+        self.assertIn('fill="#eab308"', svg)  # warn
+        self.assertIn('fill="#ef4444"', svg)  # fail
+        self.assertIn('fill="#94a3b8"', svg)  # n/a
+
+    def test_zero_total_does_not_divide_by_zero(self):
+        stats = {"total": 0, "passed": 0, "warned": 0, "failed": 0, "na": 0}
+        svg = rvh.render_score_bar(stats)
+        self.assertIn("<svg", svg)
+        self.assertEqual(svg.count("<rect"), 4)
+
+
+if __name__ == "__main__":
+    unittest.main()
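For readers without `render-validation-html.py` at hand, the grade thresholds that `GradeThresholdTests` pins down can be summarized in a sketch. This is one rule set consistent with those tests, not the script's actual implementation: `grade_from_sketch` is a hypothetical name, the `grade-fair` and `grade-good` class names are assumed by analogy with the two the tests confirm, and the handling of one or two non-high failures (graded Good here) is a guess the tests do not cover.

```python
def grade_from_sketch(stats: dict) -> tuple[str, str]:
    """Hypothetical reconstruction of the grade rules implied by the tests."""
    failed = stats.get("failed", 0)
    if stats.get("failed_critical", 0) > 0:
        # Any critical failure forces the lowest grade.
        return "Poor", "grade-poor"
    if stats.get("failed_high", 0) > 0 or failed >= 4:
        # A high-severity failure, or four or more failures of any severity.
        return "Fair", "grade-fair"  # class name assumed
    if failed > 0 or stats.get("warned", 0) >= 3:
        # Assumption: any remaining failure, or three or more warnings.
        return "Good", "grade-good"  # class name assumed
    return "Excellent", "grade-excellent"
```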