From c19f6cd72a874b85ac00f100626c19136b5e120d Mon Sep 17 00:00:00 2001
From: Brian Madison
Date: Sat, 9 May 2026 18:59:37 -0500
Subject: [PATCH] fix(bmad-product-brief): tighten update/validate rules and eval expectations

- Update: audit trail (decision-log + addendum) is now mandatory before modifying brief.md in headless mode; distillate regeneration is required
- Validate: always emits offer_to_update in headless JSON output
- Headless: added validate example to JSON status block docs
- Evals B8: add 900s timeout, replace hard 30%-smaller check with meaningful-condensation expectation
- Eval B9: sharpen right-sized expectation wording
---
 evals/bmm-skills/bmad-product-brief/evals.json | 7 ++++---
 .../1-analysis/bmad-product-brief/SKILL.md | 14 +++++++++++---
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/evals/bmm-skills/bmad-product-brief/evals.json b/evals/bmm-skills/bmad-product-brief/evals.json
index 98eacdf5a..424a64341 100644
--- a/evals/bmm-skills/bmad-product-brief/evals.json
+++ b/evals/bmm-skills/bmad-product-brief/evals.json
@@ -238,12 +238,13 @@
 {
 "id": "B8",
 "_pattern": "process-discipline",
+ "timeout": 900,
 "prompt": "Run headless. Create a product brief for InsuLens (smartphone app that pairs with thermal imaging accessories for homeowner insulation audits, target suburban homeowners 35-65 with houses pre-2000, 50 user interviews with 78% willingness to pay $49, Series A pitch input). Generate a distillate \u2014 this brief will feed downstream PRD work.",
- "expected_output": "distillate.md exists alongside brief.md and decision-log.md. The distillate is meaningfully shorter than the brief. Content of the distillate matches the brief without introducing new facts. The transcript shows the bmad-distillator subagent invoked.",
+ "expected_output": "distillate.md exists alongside brief.md and decision-log.md. The distillate is a meaningful condensation of the brief. Content of the distillate matches the brief without introducing new facts. The transcript shows the bmad-distillator subagent invoked.",
 "files": [],
 "expectations": [
 "distillate.md exists in the run folder alongside brief.md and decision-log.md",
- "distillate.md is shorter than brief.md (file size, in characters, is at least 30% smaller)",
+ "distillate.md is a meaningful condensation of brief.md \u2014 substantially more concise and capturing only the key decisions, target audience, validation evidence, and known unknowns needed for downstream PRD work, not a near-verbatim copy",
 "distillate.md does not introduce facts or claims not present in brief.md (no inventions on compression)",
 "The transcript contains a Skill tool call invoking bmad-distillator"
 ]
@@ -259,7 +260,7 @@
 "The final JSON status block artifact paths reference test-output/ rather than _bmad-output/",
 "brief.md body is written in Spanish \u2014 the majority of prose content (headings, section bodies) is in Spanish, not English",
 "brief.md covers the TaskFlow concept: freelancer daily planning, multi-client context, the sticky-notes-plus-calendar-plus-spreadsheet problem",
- "brief.md is right-sized for a bootstrapped side project (250-600 words, no investor-grade framing such as TAM/SAM/SOM or Series A language)",
+ "brief.md is right-sized for a bootstrapped side project — appropriate depth and scope for a solo-founder app with no investor audience, no TAM/SAM/SOM framing, no Series A language, and no sections that pad for enterprise credibility",
 "The assistant's non-document output (transcript text content outside of brief.md) contains at least one marker of British informal register (e.g., 'mate', 'cheers', 'brilliant', 'sorted', 'innit', 'blimey', 'proper', 'right then', or equivalent pub-idiom phrasing)"
 ]
 }
diff --git a/src/bmm-skills/1-analysis/bmad-product-brief/SKILL.md b/src/bmm-skills/1-analysis/bmad-product-brief/SKILL.md
index a03d83cef..0d26145af 100644
--- a/src/bmm-skills/1-analysis/bmad-product-brief/SKILL.md
+++ b/src/bmm-skills/1-analysis/bmad-product-brief/SKILL.md
@@ -29,13 +29,13 @@ Briefs produced here are honest, right-sized to purpose, and built for what come
 
 **Create.** A brief the user is proud of, that meets their needs, drawn out through real conversation — do not assume: instead converse and understand, and then help craft the best product brief for their needs. Begin in `## Discovery` before drafting; the brief comes after the picture is on the table. Shape follows the product and need. Treat `{workflow.brief_template}` as a starting structure, not a contract: drop sections that do not earn their place, add sections the product needs, reorder freely - create sections for specialized domains or concerns also as needed. The brief serves the product's story, not the template's shape. Bind `{doc_workspace}` to a fresh folder at `{workflow.output_dir}/{workflow.output_folder_name}/` and write `brief.md` there with YAML frontmatter (title, status, created, updated). For Update and Validate, `{doc_workspace}` is the existing folder of the brief being targeted.
 
-**Update.** Reconcile an existing brief with a change signal (edit request, downstream artifact, anything). Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — past decisions and rejected ideas matter. Then run the `## Discovery` posture against the change signal before proposing changes. Identify what is now stale or wrong, propose changes, apply on agreement, bump `updated`. If the change signal contradicts prior decisions, surface the conflict before changing anything. In headless mode, if the prompt clearly signals intent to override the contradicted decision, proceed with the change and autonomously write the full audit trail — a new `decision-log.md` entry naming the reversal and its rationale, and an `addendum.md` override section — without waiting for user confirmation; if intent to override is ambiguous, halt with `blocked` status naming the specific conflict. If the change is fundamental, name it as a re-draft and offer Create instead. If `distillate.md` exists, regenerate it after changes are applied (re-invoke `bmad-distillator` per Finalize step 3); if unavailable, flag the distillate as stale.
+**Update.** Reconcile an existing brief with a change signal (edit request, downstream artifact, anything). Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — past decisions and rejected ideas matter. Then run the `## Discovery` posture against the change signal before proposing changes. Identify what is now stale or wrong, propose changes, apply on agreement, bump `updated`, and write a new `decision-log.md` entry recording what changed and why — every update, clean or override, must be logged. If the change signal contradicts prior decisions, surface the conflict before changing anything. In headless mode, if the prompt clearly signals intent to override the contradicted decision, write the full audit trail first, then apply the change — you must: (1) add a new entry to `decision-log.md` naming the decision being reversed and its rationale, (2) add an override section to `addendum.md` (creating it if absent). Both are mandatory before modifying `brief.md`; do not wait for user confirmation. If intent to override is ambiguous, halt with `blocked` status naming the specific conflict. If the change is fundamental, name it as a re-draft and offer Create instead. If `distillate.md` exists, you must regenerate it after changes are applied by invoking `bmad-distillator`; this step is required, not optional. If `bmad-distillator` is unavailable, flag the distillate as stale in the JSON output.
 
-**Validate.** Honest critique against the brief's own purpose. Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — a validation that ignores prior decisions, rejected ideas, or context the user supplied is shallow. Cite specific lines. Caveat what cannot be evaluated. Return inline — no separate file unless asked. Offer to roll findings into an Update.
+**Validate.** Honest critique against the brief's own purpose. Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — a validation that ignores prior decisions, rejected ideas, or context the user supplied is shallow. Cite specific lines. Caveat what cannot be evaluated. Return inline — no separate file unless asked. Always offer to roll findings into an Update, even in headless mode — include `"offer_to_update": true` in the JSON status block.
 
 ## Headless Mode
 
-When invoked headless, do not ask. Complete the intent using what is provided, what exists in `{doc_workspace}`, or what you can discover yourself. If intent remains ambiguous after inference, halt with a `blocked` JSON status and a `reason` field — do not prompt. End with a JSON response listing status, intent, and artifact paths, for example:
+When invoked headless, do not ask. Complete the intent using what is provided, what exists in `{doc_workspace}`, or what you can discover yourself. If intent remains ambiguous after inference, halt with a `blocked` JSON status and a `reason` field — do not prompt. End with a JSON response listing status, intent, and artifact paths. The `intent` field must match the detected intent: `"create"`, `"update"`, or `"validate"`. Examples:
 
 ```json
 {
@@ -49,6 +49,14 @@ When invoked headless, do not ask. Complete the intent using what is provided, w
 }
 ```
 
+```json
+{
+ "status": "complete",
+ "intent": "validate",
+ "offer_to_update": true
+}
+```
+
 Omit keys for artifacts that were not produced.
 
 ## Discovery