BMAD-METHOD/evals/bmm-skills
Brian 1c1abaa5b1
refactor(bmm): streamline bmad-product-brief into lean facilitator (#2370)
* refactor(bmm): streamline bmad-product-brief into lean facilitator

Rewrites bmad-product-brief as an outcome-driven, intent-routed
facilitator with a configurable reviewer panel.

- Replace scripted prompt files with a concise SKILL.md
- Add brief-shape detection and intent routing
- Make the reviewer panel config-driven via customize.toml
- Harden the required-reviewer gate
- Add helper scripts and tests for panel resolution
- Replace embedded resources with a single brief-template asset
- Register skill metadata in module.yaml

* refactor(bmm): tighten bmad-product-brief discovery and polish flow

- Discovery now invites a brain dump and source material upfront,
  followed by "anything else?" before drilling
- Create and Update modes explicitly invoke Discovery before drafting
- Calibrate "make them sweat" to ease as the brief firms up
- Update mode regenerates distillate.md when changes apply
- Replace polish_skills with polish_passes: polymorphic entries
  (skill:/file:/plain-text), parallel subagents, applied automatically
  so the user sees a polished draft, not a polish review
- Default persistent_facts to empty with opt-in examples instead of an
  unbounded project-context.md glob
- Remove research-librarian, field-researcher, and skeptic agents no
  longer needed with this rework

* refactor(bmm): make persistence and resume explicit in bmad-product-brief

- Replace "Hold a working memory" with "Persistence is real-time": the
  workspace exists on disk and the user knows the path the moment Create
  intent is confirmed; decision-log.md is canonical memory so walk-away
  leaves nothing in the conversation
- Add "Continuity across sessions": prior in-progress drafts for this
  project surface a resume offer

* test(bmm): scaffold evals for bmad-product-brief

Adds an eval suite for bmad-product-brief following the Anthropic skill-creator
schema, plus a new "Extract, don't ingest" constraint on the skill itself.

Skill change:
- New constraint: source artifacts (user-provided or run-discovered) enter the
  parent conversation as relevance-filtered extracts via subagents, not loaded
  wholesale. Keeps the parent context lean against transcripts, brainstorms,
  research reports, code, web results, and prior briefs.

Evals (evals/bmm-skills/bmad-product-brief/):
- evals.json: 8 artifact/behavioral evals covering Create one-shot,
  source-memo ingest, Update with contradiction surfacing, Validate inline,
  Headless mode, brainstorm filtering, research-report filtering, persona
  filtering. All scenarios use fictional entities (InsuLens, Branfield CC,
  Forkbird Kitchen, Mossridge Library, Sproutkeeper, Hatchet & Loop Studio,
  Brightway, Pantry Bridge).
- triggers.json: 15 description-firing checks (7 should-fire, 8 should-not-fire)
  to catch under-triggering and adjacent-skill poaching.
- files/: realistic fixtures including a brainstorm with the relevant idea
  buried at the end, a 3000-word market research report with the relevant
  section in the middle, and customer interviews with the target persona in
  position 3 of 4 — each shaped to test that filtering happens against the
  user's stated focus regardless of where the relevant material sits.

Eval directory placement: top-level evals/ outside src/, matching the convention
in anthropics/skills (zero of 17 production skills include an evals/ subdir;
their skill-creator places dev workspaces as siblings to skill folders). Keeps
evals out of any installer or marketplace.json distribution path.

* refactor(bmm): rename brief workflow knobs and resolve PR review

- polish_passes -> doc_standards (TOML key + SKILL.md reference). Name
  generalizes for every doc-producing skill: encodes standards applied at
  finalize, not options.
- {run_folder} -> {doc_workspace}. Bound in Create to
  {workflow.output_dir}/{workflow.output_folder_name}/; reused by Update,
  Validate, Headless, and Finalize as the active brief's folder.
- On Activation Step 4 now resolves {date} explicitly (used by
  output_folder_name default).
- Headless Mode: ambiguous-intent fallback is a `blocked` JSON status with
  a `reason` field, no prompting. Resolves the activation/headless
  contradiction flagged in review.

* test(bmm): overhaul product-brief evals into A/B/C pattern split

Refactor evals from 8 numbered single-shot tests into 16 typed evals:
- Pattern A (A1-A8): artifact-correctness tests with headless prompts and
  precise, falsifiable expectations (no invented facts, right-sized output,
  file boundary enforcement)
- Pattern B (B1-B8): process-discipline tests verifying decision-log fidelity,
  polish phase ordering, contradiction detection, and distillate generation
- Pattern C (C1): config-compliance test for custom output paths and
  document_output_language

Also tighten SKILL.md: add dependencies frontmatter, clarify headless override
autonomy in Update (proceed when intent is clear; block when ambiguous), and
make distillate skip explicit when bmad-distillator is not installed.
2026-05-09 17:24:28 -05:00
..
bmad-product-brief refactor(bmm): streamline bmad-product-brief into lean facilitator (#2370) 2026-05-09 17:24:28 -05:00