Merge branch 'main' into bma-d/add-automator

This commit is contained in:
Brian 2026-05-12 20:04:16 -05:00 committed by GitHub
commit fd0018900e
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
39 changed files with 1879 additions and 752 deletions

@ -0,0 +1,137 @@
---
title: "Forensic Investigation"
description: How bmad-investigate treats every issue like a crime scene, grades evidence, and produces a structured case file engineers can act on
sidebar:
  order: 6
---
You hand `bmad-investigate` a crash log, a stack trace, or just a "this used to work, now it doesn't". The skill takes
on the investigator's discipline for the duration of the run. It does not start fixing. It opens a case file.
Every finding gets graded. Every hypothesis gets a status. Wrong turns are kept, not erased. The deliverable is a
document another engineer can pick up cold.
This page explains why investigation is its own discipline, and what the skill buys you that a regular dev workflow
doesn't.
## The Problem With "Just Debug It"
Normal debugging blends three things: looking at evidence, reasoning about cause, and changing code to test the theory.
When they're blended, two failure modes show up.
The first is **narrative lock-in**. The first plausible story becomes the working theory, and every observation gets
bent to fit it. The bug stays unfixed until someone gives up and starts over. Hours later.
The second is **evidence amnesia**. You traced something, ruled it out, but didn't write down why. Two days later, with
fresh eyes, you trace it again. Or worse, a colleague picks up the bug and re-runs the same dead end you already
eliminated.
The skill's design is a direct response to both.
## Evidence Grading
Every finding in an investigation is one of three things.
- **Confirmed.** Directly observed in logs, code, or dumps; cited with a specific reference (a `path:line`, a log
timestamp, a commit hash). If someone asks "how do you know?", you point at the citation.
- **Deduced.** Logically follows from confirmed evidence; the reasoning chain is shown. If a step in the chain is wrong,
the deduction is wrong, and you can see exactly which step.
- **Hypothesized.** Plausible but unconfirmed. States what evidence would confirm or refute, and declares upfront what
would close it. Hypotheses are explicitly *not facts*.
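To make the three grades concrete, a graded findings section could look something like the sketch below. The layout, file paths, timestamps, and commit hash are all illustrative, not the skill's fixed template:

```md
## Confirmed
- Worker process exits with SIGKILL at 14:02:11 UTC (app.log:4812).
- `max_memory` is set to 256 MB (config/worker.yaml:23).

## Deduced
- The worker is being OOM-killed: the SIGKILL lands at the same
  timestamp the resident set crosses 256 MB (from the two items above).

## Hypothesized
- The batch-size increase in commit `abc1234` drove the memory growth.
  Would confirm: a memory profile of a run at the old batch size.
```

Each Confirmed bullet carries its own citation, so "how do you know?" is answerable line by line.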
The grading is not about being humble. It's about making the case file readable. A reader can scan the Confirmed section
to know what is true, the Deduced section to know what follows, and the Hypothesized section to know what is still open.
Confusion between the three is the most common reason investigations spiral.
## Stronghold First
Investigation never starts from a theory. It starts from one piece of confirmed evidence and expands outward. That
evidence might be a specific error message, a stack frame, or a timestamped log entry.
This is the opposite of how investigations often go. Someone has a hunch, builds a theory, and then hunts for evidence
that supports it. The hunch can be right; the *method* is fragile because it makes confirmation bias the default.
A stronghold is a fact you can return to when reasoning gets murky. If a deduction takes you somewhere strange, you can
walk it back to the stronghold and try a different branch. Without one, you don't know which step to undo.
When evidence is sparse, the skill says so and switches to hypothesis-driven exploration: form hypotheses from what's
available, identify what would test each, present a prioritized data-collection list. Missing evidence is itself a
finding.
## Hypothesis Discipline
Hypotheses are never deleted from the case file. When evidence confirms or refutes one, its **Status** field updates
from Open to Confirmed or Refuted, and a **Resolution** explains what evidence settled it.
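As a sketch of what that looks like in practice, a refuted hypothesis might remain in the case file as an entry like the following, where the names and the cited evidence are hypothetical:

```md
### H2: Cache eviction races with the writer
- Status: Refuted
- Resolution: Eviction runs on a single thread (cache/evict.go:41), and
  the corrupted entries predate the first eviction cycle in the logs.
```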
This rule has a real cost. Case files grow. The benefit is real too. The full reasoning history becomes part of the
deliverable. Six months later, when a similar bug surfaces, the next investigator can read the original case file and
see which paths were already eliminated and why. Without that history, every new investigator re-runs the same dead
ends.
It also disciplines the present-tense investigator. If you can't delete a wrong hypothesis, you have to disprove it
with cited evidence. Quietly dropping it when it becomes inconvenient is no longer an option.
## Challenge the Premise
The user's description of the problem is a hypothesis, not a fact. "The cache is broken" is something a user *believes*.
Before the skill builds an investigation around it, the technical claims are verified independently. If the evidence
contradicts the premise, the report says so directly.
This is the forensic instinct: the witness's account is data, not truth. Sometimes the reported bug is real but
mislabeled. Sometimes the described symptom is downstream of a different cause. Investigations that take the premise as
gospel diagnose the wrong defect, and the bug returns in a slightly different form.
## A Calibrated Walk
The skill is one procedure, not two modes. It calibrates how much defect-chasing versus how much area-exploration the
input demands, on a continuous scale.
A symptom-driven case (a ticket, a crash, an error message, a "this used to work") leans into hypothesis tracking,
timeline reconstruction, and a fix direction. A no-symptom case (understanding a module before you touch it, evaluating
reusability, building a mental model) leans into I/O mapping, control-flow filtering, and a verification plan. Most
real cases sit somewhere between, and the case file reflects whichever balance the evidence required.
The discipline is the same regardless of where on the scale a case lands: stronghold first, evidence grading, hypothesis
tracking, never erase. The output is always at `{implementation_artifacts}/investigations/{slug}-investigation.md`, with
sections that don't apply to a given case left empty or omitted.
When a deep bug requires understanding a broader subsystem, the procedure folds in the I/O mapping, control-flow
filtering, working-backward-from-outputs, and cross-component boundary tracing techniques inline. The area model lands
in the same case file. There is no mode switch.
## Methodology Lives in the Skill
The investigator's discipline is a property of the skill itself. Whoever invokes `bmad-investigate` takes on the
methodology and communication style for the run: clinical precision, evidence-first language, no hedging, case-file
framing. When the skill ends, the caller returns to its prior voice. No persona swap, just a tone shift from the skill's
principles.
This matters because investigation and implementation reward different instincts. Investigators are slow and precise.
Implementers are fast and confident. The same brain doing both in one session tends to do neither well. The skill
carves out the investigative posture inline, without a context switch to a separate identity.
## What You Get
A completed investigation file:
- Separates Confirmed findings (with citations) from Deductions and Hypotheses
- Preserves all hypotheses ever formed, with their final Status and Resolution
- Reconstructs a timeline of events from multiple evidence sources
- Identifies data gaps and what they would resolve
- Provides actionable conclusions grounded in evidence
- Includes a reproduction plan when a root cause is identified
- Maintains an investigation backlog of paths still to explore
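Taken together, the bullets above suggest a case-file skeleton along these lines. The section names are one plausible rendering, not the skill's mandated layout:

```md
# Investigation: {slug}
## Summary
## Confirmed Findings
## Deductions
## Hypotheses        <!-- each with Status and Resolution -->
## Timeline
## Data Gaps
## Conclusions and Fix Direction
## Reproduction Plan
## Investigation Backlog
```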
Hand it to an engineer who was not present and they understand what happened, what is known, and what remains uncertain.
That's the bar.
## The Bigger Idea
Most "AI debugging" today blends evidence, reasoning, and code changes into one stream of plausible-looking text. The
signal is hard to find, the dead ends repeat, and the case file, if there is one, is a chat log nobody wants to read.
`bmad-investigate` treats investigation as a discipline with its own deliverable. Evidence has a grade. Hypotheses have
a status. Wrong turns are documented, not erased. The case file outlives the session.
When the next bug shows up that looks like one you've seen before, you have somewhere to start that isn't a blank
prompt.

@ -0,0 +1,157 @@
---
title: "Code Investigation"
description: How bmad-investigate treats every issue like the scene of an investigation, grades the evidence, and produces a structured case file engineers can act on
sidebar:
  order: 6
---
You hand `bmad-investigate` a crash log, a stack trace, or just a "this used to work, now it doesn't". The skill takes
on the investigator's discipline for the duration of the run. It does not start fixing. It opens a case file.
Every finding gets a grade. Every hypothesis has a status. Wrong turns are kept, not erased. The deliverable is a
document another engineer can pick up cold.
This page explains why investigation is a discipline in its own right, and what the skill gives you that a regular
development workflow does not.
## The Problem With "Just Debug It"
Classic debugging blends three activities: examining the evidence, reasoning about the cause, and modifying the code to
test the theory. When they are blended, two failure modes appear.
The first is **narrative lock-in**[^1]. The first plausible story becomes the working theory, and every observation gets
twisted to confirm it. The bug stays unfixed until someone gives up and starts over from scratch. Hours later.
The second is **evidence amnesia**. You traced something, ruled it out, but did not write down why. Two days later, with
fresh eyes, you trace it again. Worse still, a colleague picks up the bug and re-runs the same dead end you had already
eliminated.
The skill's design is a direct response to these two failure modes.
## Evidence Grading
Every finding in an investigation belongs to one of three categories.
- **Confirmed.** Directly observed in the logs, the code, or the dumps; cited with a specific reference (a
  `path:line`, a log timestamp, a commit hash). If someone asks "how do you know?", you point at the citation.
- **Deduced.** Follows logically from confirmed evidence; the reasoning chain is explicit. If a step in the chain is
  wrong, the deduction is wrong, and you can see precisely which step.
- **Hypothesized.** Plausible but unconfirmed. States what evidence would confirm or refute it, and declares upfront
  what would close it. Hypotheses are explicitly *not facts*.
The grading is not a posture of humility. It makes the case file readable. A reader can scan the Confirmed section to
know what is true, the Deduced section to know what follows from it, and the Hypothesized section to know what remains
open. Confusing the three is the leading reason investigations spiral.
## Stronghold First
An investigation never starts from a theory. It starts from a single piece of confirmed evidence and expands outward
from there. That evidence might be a specific error message, a stack frame, or a timestamped log entry.
This is the opposite of how investigations often go: someone has a hunch, builds a theory, then hunts for the evidence
that supports it. The hunch may be correct; the *method* is fragile because it makes confirmation bias[^2] the default
behavior.
A stronghold is a fact you can come back to when the reasoning gets murky. If a deduction takes you somewhere strange,
you can walk back to the stronghold and try another branch. Without one, you do not know which step to undo.
When evidence is sparse, the skill says so and switches to hypothesis-driven exploration: formulate hypotheses from
what is available, identify what would test each one, and present a prioritized list of data to collect. The absence of
evidence is itself a finding.
## Hypothesis Discipline
Hypotheses are never deleted from the case file. When evidence confirms or refutes one, its **Status** field moves from
Open to Confirmed or Refuted, and a **Resolution** explains which evidence settled it.
This rule has a real cost: case files grow. The benefit is real too. The complete reasoning history becomes part of the
deliverable. Six months later, when a similar bug surfaces, the next investigator can read the original case file and
see which paths were already eliminated and why. Without that history, every new investigator re-runs the same dead
ends.
It also disciplines the investigator in the present. If you cannot delete a wrong hypothesis, you have to refute it
with cited evidence. Quietly dropping it when it becomes inconvenient is no longer an option.
## Challenge the Premise
The user's description of the problem is a hypothesis, not a fact. "The cache is broken" is something the user
*believes*. Before the skill builds an investigation around it, the technical claims are verified independently. If the
evidence contradicts the premise, the report says so directly.
This is the investigator's instinct: the witness's account is data, not truth. Sometimes the reported bug is real but
mislabeled. Sometimes the described symptom is downstream of a different cause. Investigations that take the premise at
face value diagnose the wrong defect, and the bug comes back in a slightly different form.
## A Calibrated Walk
The skill is a single procedure, not two modes. It calibrates how much defect investigation versus how much area
exploration the input demands, on a continuous scale.
A symptom-driven case (a ticket, a crash, an error message, a "this used to work") leans toward hypothesis tracking,
timeline reconstruction, and a fix direction. A no-symptom case (understanding a module before touching it, evaluating
reusability, building a mental model) leans toward input/output mapping, control-flow filtering, and a verification
plan. Most real cases sit somewhere in between, and the case file reflects whichever balance the evidence required.
The discipline is the same wherever a case lands on the scale: stronghold first, evidence grading, hypothesis tracking,
never erase. The output is always `{implementation_artifacts}/investigations/{slug}-investigation.md`, with sections
that do not apply to a given case left empty or omitted.
When a deep bug requires understanding a broader subsystem, the procedure folds in, inline, the techniques of
input/output mapping, control-flow filtering, working backward from the outputs, and tracing cross-component
boundaries[^3]. The area model lands in the same case file. No mode switch.
## Methodology Lives in the Skill
The investigator's discipline is a property of the skill itself. Whoever invokes `bmad-investigate` adopts the
methodology and the communication style for the run: clinical precision, evidence-centered language, no needless
hedging, case-file presentation. When the skill ends, the caller gets its former voice back. No persona change, just a
shift in tone driven by the skill's principles.
This matters because investigation and implementation reward different instincts. Investigators are slow and precise.
Implementers are fast and confident. The same brain doing both in a single session ends up doing both badly. The skill
delimits the investigative posture inline, without a context switch to a separate identity.
## What You Get
A completed investigation file:
- Separates Confirmed findings (with citations) from Deductions and Hypotheses
- Preserves every hypothesis ever formulated, with its final Status and its Resolution
- Reconstructs a timeline of events from multiple evidence sources
- Identifies data gaps and what they would resolve
- Provides actionable conclusions grounded in the evidence
- Includes a reproduction plan when a root cause is identified
- Maintains an investigation backlog of paths still to explore
Hand it to an engineer who was not there, and they understand what happened, what is known, and what remains uncertain.
That's the bar.
## The Bigger Idea
Most of today's "AI debugging" blends evidence, reasoning, and code changes into a single stream of plausible-looking
text. The signal is hard to find, the dead ends repeat, and the case file, if one exists, is a chat log nobody wants to
read.
`bmad-investigate` treats investigation as a discipline with its own deliverable. Evidence has a grade. Hypotheses have
a status. Wrong turns are documented, not erased. The case file outlives the session.
When the next bug that looks like one you have already seen appears, you will have a starting point that is not an
empty prompt.
## Glossary
[^1]: **Narrative lock-in**: a cognitive phenomenon in which reasoning adopts the first plausible explanation and
progressively builds on it, becoming harder and harder to abandon even in the face of contrary evidence.
[^2]: **Confirmation bias**: the cognitive tendency to seek out, interpret, and favor information that confirms
pre-existing beliefs, while ignoring or downplaying information that contradicts them.
[^3]: **Boundary crossing**: a transition between two distinct execution zones (language, process, machine,
client/server, code/configuration). Boundaries concentrate bugs because each side assumes the other behaved as
documented.

@ -5,13 +5,23 @@ sidebar:
  order: 1
---
The BMad Method (BMM) is a module of the BMad ecosystem, designed to follow the best practices of context engineering
and planning. AI agents work best with clear, structured context. The BMM system builds that context progressively
across 4 distinct phases — each phase, and several optional workflows within each phase, produce documents that feed
the next phase, so agents always know what to build and why.
The rationale and concepts come from agile methodologies that have been used successfully across the industry as a
mental frame of reference.
If at any time you are unsure what to do, the `bmad-help` skill will help you stay on track or know what to do next.
You can always refer to this page as well — but `bmad-help` is fully interactive and much faster if you have already
installed the BMad Method. Additionally, if you use different modules that have extended the BMad Method or added other
complementary non-extension modules — `bmad-help` evolves to know everything that is available and give you the best
in-the-moment advice.
One final important note: Every workflow below can be run directly with the tool of your choice via a skill, or by
first loading an agent and using the agent's menu entry.
<iframe src="/workflow-map-diagram-fr.html" title="BMad Method workflow map diagram" width="100%" height="100%" style="border-radius: 8px; border: 1px solid #334155; min-height: 900px;"></iframe>
@ -21,14 +31,15 @@ One final important note: Every workflow below can be run directly
## Phase 1: Analysis (Optional)
Explore the problem space and validate ideas before committing to planning. [**Learn what each tool does and when to
use it**](../explanation/analysis-phase.md).
| Workflow | Purpose | Output |
|----------|---------|--------|
| `bmad-brainstorming` | Brainstorm project ideas with the guided support of a brainstorming coach | `brainstorming-report.md` |
| `bmad-domain-research`, `bmad-market-research`, `bmad-technical-research` | Validate market, technical, or domain assumptions | Research report |
| `bmad-product-brief` | Capture the strategic vision — ideal when your concept is clear | `product-brief.md` |
| `bmad-prfaq` | Working Backwards — stress-test and shape your product concept | `prfaq-{project}.md` |
## Phase 2: Planning
@ -36,60 +47,75 @@ Define what to build and for whom.
| Workflow | Purpose | Output |
|----------|---------|--------|
| `bmad-create-prd` | Define the requirements (FRs/NFRs)[^1] | `PRD.md`[^2] |
| `bmad-create-ux-design` | Design the user experience (when UX matters) | `ux-spec.md` |
## Phase 3: Solutioning
Decide how to build it and break the work down into stories.
| Workflow | Purpose | Output |
|----------|---------|--------|
| `bmad-create-architecture` | Make technical decisions explicit | `architecture.md` with ADRs[^3] |
| `bmad-create-epics-and-stories` | Break requirements down into implementable work | Epic files with stories |
| `bmad-check-implementation-readiness` | Pre-implementation check | Pass/Concerns/Fail decision |
## Phase 4: Implementation
Build, one story at a time. Coming soon: full automation of phase 4!
| Workflow | Purpose | Output |
|----------|---------|--------|
| `bmad-sprint-planning` | Initialize tracking (once per project to sequence the development cycle) | `sprint-status.yaml` |
| `bmad-create-story` | Prepare the next story for implementation | `story-[slug].md` |
| `bmad-dev-story` | Implement the story | Working code + tests |
| `bmad-code-review` | Validate implementation quality | Approved or changes requested |
| `bmad-correct-course` | Handle significant mid-sprint changes | Updated plan or reorientation |
| `bmad-sprint-status` | Track sprint progress and story status | Sprint status update |
| `bmad-retrospective` | Review after an epic is completed | Lessons learned |
| `bmad-investigate` | Case investigation with evidence-graded conclusions, calibrated to the input | `{slug}-investigation.md` |
## Quick Dev (Parallel Track)
Skip phases 1-3 for small, well-understood work.
| Workflow | Purpose | Output |
|----------|---------|--------|
| `bmad-quick-dev` | Unified fast flow — clarifies intent, plans, implements, reviews, and presents | `spec-*.md` + code |
## Context Management
Each document becomes the context for the next phase. The PRD[^2] tells the architect which constraints matter. The
architecture tells the dev agent which patterns to follow. Story files provide focused, complete context for
implementation. Without this structure, agents make inconsistent decisions.
### Project Context
:::tip[Recommended]
Create `project-context.md` to make sure AI agents follow your project's rules and preferences. This file works like a
constitution for your project — it guides implementation decisions across all workflows. This optional file can be
generated at the end of architecture creation, or, in an existing project, it can also be generated to capture what
matters to keep aligned with current conventions.
:::
**How to create it:**
- **Manually** — Create `_bmad-output/project-context.md` with your tech stack and implementation rules
- **Generate it** — Run `bmad-generate-project-context` to auto-generate it from your architecture or codebase
[**Learn more about project-context.md**](../explanation/project-context.md)
## Glossary
[^1]: FR / NFR (Functional / Non-Functional Requirement): requirements describing, respectively, **what the system must
do** (features, expected behaviors) and **how it must do it** (performance, security, reliability, usability
constraints, etc.).
[^2]: PRD (Product Requirements Document): the reference document describing the product's goals, user needs, expected
features, constraints, and success criteria, so that teams align on what to build and why.
[^3]: ADR (Architecture Decision Record): a document that records an architecture decision, its context, the options
considered, the choice made, and its consequences, ensuring traceability and understanding of technical decisions over
time.
View File
sidebar:
  order: 1
---
The BMad Method (BMM) is a module in the BMad Ecosystem, built around best practices of context engineering and planning. AI agents work best with clear, structured context. The BMM system builds that context progressively across 4 distinct phases - each phase, and multiple optional workflows within each phase, produce documents that inform the next, so agents always know what to build and why.
The rationale and concepts come from agile methodologies that have been used across the industry with great success as a mental framework.
If at any time you are unsure what to do, the `bmad-help` skill will help you stay on track or know what to do next. You can always come back to this page for reference too - but `bmad-help` is fully interactive and much quicker if you have already installed the BMad Method. Additionally, if you are using modules that extend the BMad Method, or other complementary non-extension modules, `bmad-help` evolves to know everything available and give you the best in-the-moment advice.
Final important note: every workflow below can be run directly with your tool of choice via its skill, or by loading an agent first and using the entry from the agent's menu.
<iframe src="/workflow-map-diagram.html" title="BMad Method Workflow Map Diagram" width="100%" height="100%" style="border-radius: 8px; border: 1px solid #334155; min-height: 900px;"></iframe>
## Phase 1: Analysis (Optional)
Explore the problem space and validate ideas before committing to planning. [**Learn what each tool does and when to use it**](../explanation/analysis-phase.md).
| Workflow                                                                  | Purpose                                                                    | Produces                  |
|---------------------------------------------------------------------------|----------------------------------------------------------------------------|---------------------------|
| `bmad-brainstorming`                                                      | Brainstorm project ideas with guided facilitation of a brainstorming coach | `brainstorming-report.md` |
| `bmad-domain-research`, `bmad-market-research`, `bmad-technical-research` | Validate market, technical, or domain assumptions                          | Research findings         |
| `bmad-product-brief`                                                      | Capture strategic vision — best when your concept is clear                 | `product-brief.md`        |
| `bmad-prfaq`                                                              | Working Backwards — stress-test and forge your product concept             | `prfaq-{project}.md`      |
## Phase 2: Planning
Define what to build and for whom.
| Workflow                | Purpose                                  | Produces     |
|-------------------------|------------------------------------------|--------------|
| `bmad-create-prd`       | Define requirements (FRs/NFRs)           | `PRD.md`     |
| `bmad-create-ux-design` | Design user experience (when UX matters) | `ux-spec.md` |
## Phase 3: Solutioning
Decide how to build it and break work into stories.
| Workflow                              | Purpose                                    | Produces                    |
|---------------------------------------|--------------------------------------------|-----------------------------|
| `bmad-create-architecture`            | Make technical decisions explicit          | `architecture.md` with ADRs |
| `bmad-create-epics-and-stories`       | Break requirements into implementable work | Epic files with stories     |
| `bmad-check-implementation-readiness` | Gate check before implementation           | PASS/CONCERNS/FAIL decision |
## Phase 4: Implementation

Build it, one story at a time. Coming soon, full phase 4 automation!
| Workflow               | Purpose                                                                            | Produces                      |
|------------------------|------------------------------------------------------------------------------------|-------------------------------|
| `bmad-sprint-planning` | Initialize tracking (once per project to sequence the dev cycle)                   | `sprint-status.yaml`          |
| `bmad-create-story`    | Prepare next story for implementation                                              | `story-[slug].md`             |
| `bmad-dev-story`       | Implement the story                                                                | Working code + tests          |
| `bmad-code-review`     | Validate implementation quality                                                    | Approved or changes requested |
| `bmad-correct-course`  | Handle significant mid-sprint changes                                              | Updated plan or re-routing    |
| `bmad-sprint-status`   | Track sprint progress and story status                                             | Sprint status update          |
| `bmad-retrospective`   | Review after epic completion                                                       | Lessons learned               |
| `bmad-investigate`     | Forensic case investigation with evidence-graded findings, calibrated to the input | `{slug}-investigation.md`     |
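To make the tracking artifact concrete, here is a rough sketch of what `sprint-status.yaml` could contain; the field names and values are illustrative guesses, not the skill's actual schema:

```yaml
# Illustrative sketch only; the real schema produced by bmad-sprint-planning may differ
sprint: 3
epic: user-authentication
stories:
  - slug: login-form
    status: done
  - slug: password-reset
    status: in-progress
  - slug: session-management
    status: todo
```

The point of the file is a single machine-readable source of truth that `bmad-create-story`, `bmad-dev-story`, and `bmad-sprint-status` can all consult to sequence the dev cycle.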
## Quick Flow (Parallel Track)
Skip phases 1-3 for small, well-understood work.
| Workflow         | Purpose                                                                   | Produces           |
|------------------|---------------------------------------------------------------------------|--------------------|
| `bmad-quick-dev` | Unified quick flow — clarify intent, plan, implement, review, and present | `spec-*.md` + code |
## Context Management
Each document becomes context for the next phase. The PRD tells the architect what constraints matter. The architecture tells the dev agent which patterns to follow. Story files give focused, complete context for implementation. Without this structure, agents make inconsistent decisions.
### Project Context
:::tip[Recommended]
Create `project-context.md` to ensure AI agents follow your project's rules and preferences. This file works like a constitution for your project — it guides implementation decisions across all workflows. This optional file can be generated at the end of Architecture Creation, or generated in an existing project to capture what's important to keep aligned with current conventions.
:::
**How to create it:**
View File
{
"skill_name": "bmad-product-brief",
"_design_notes": "16 single-shot evals across two patterns. Pattern A (A1-A8) tests artifact correctness given complete inputs in headless mode. Pattern B (B1-B8) tests process discipline (decision log fidelity, polish execution, phase ordering, intent boundaries, distillate generation) by inspecting transcript and side-artifacts. Facilitation/conversation-quality evals are deferred to a future multi-turn simulator.",
"evals": [
{
"id": "A1",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a product brief for InsuLens.\n\nContext (use exactly this \u2014 do not invent):\n- Product: a smartphone app that pairs with off-the-shelf $200 thermal imaging accessories (FLIR ONE Pro and Seek Compact Pro). The app guides homeowners through a structured walkthrough and produces a professional-grade insulation audit in under 20 minutes.\n- Target: suburban homeowners aged 35-65 with houses built before 2000 (poor original insulation, rising energy bills).\n- Validation evidence: 50 user interviews completed in Q4 2025; 78% expressed willingness to pay $49 for a one-time audit if results were credible.\n- Stakes: this brief is the primary input investors will read before our first Series A pitch call.\n- Hardware dependency: requires a thermal imaging accessory (we do not manufacture hardware).\n- Known unknowns: insurance/warranty implications of homeowner-driven audits; whether the 78% intent translates to paid conversion at scale.\n- Distillate: yes, generate one \u2014 the brief will feed downstream PRD work.\n\nRight-size for investor-stage rigor. Output a JSON status block at the end with status, intent, and artifact paths.",
"expected_output": "A run folder containing brief.md (with valid YAML frontmatter), decision-log.md, and distillate.md. Brief is 1-2 pages, addresses target audience, hardware dependency, validation evidence, and surfaces unknowns alongside knowns. Final assistant message includes JSON with status='complete', intent='create', and artifact paths.",
"files": [],
"expectations": [
"A run folder is created with brief.md and decision-log.md",
"brief.md has YAML frontmatter containing all four required fields: title, status, created (ISO 8601), updated (ISO 8601)",
"brief.md frontmatter status is 'draft' or 'final' (not missing or empty)",
"brief.md word count is between 250 and 1500",
"brief.md identifies the suburban-homeowner-aged-35-65 target audience",
"brief.md references the thermal imaging hardware dependency (FLIR ONE Pro / Seek Compact Pro or equivalent)",
"brief.md references the 50-user interview validation OR the 78% willingness-to-pay finding",
"brief.md surfaces at least one explicit unknown or unvalidated assumption",
"brief.md does not introduce facts absent from the input prompt (no invented competitors, fabricated metrics, or unstated partnerships)",
"distillate.md exists in the run folder",
"The final assistant message contains a JSON object with keys 'status', 'intent', and at least one artifact path",
"The JSON 'intent' value is 'create'"
]
},
{
"id": "A2",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief for our app idea.",
"expected_output": "Headless mode with insufficient context should halt with status='blocked' and a reason field. No (or only skeleton) brief should be written. The skill must not invent a product to draft against.",
"files": [],
"expectations": [
"The final assistant message contains a JSON object with key 'status' set to 'blocked'",
"The JSON output includes a 'reason' field explaining what context is missing",
"If brief.md exists at all, its body content (excluding frontmatter) is fewer than 100 words OR is a placeholder skeleton \u2014 the skill did not invent a product",
"The transcript contains no clarifying questions sent back to the user (headless mode honored)"
]
},
{
"id": "A3",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief for our neighborhood compost coordinator app idea \u2014 we are moving forward with it. Q2 brainstorming session notes are at evals/bmm-skills/bmad-product-brief/files/q2-brainstorm.md; pull only what is relevant to the compost concept.",
"expected_output": "Brief focuses tightly on the compost coordinator concept. Source brainstorm is filtered, not ingested wholesale. Decision-log records that filtering occurred.",
"files": ["evals/bmm-skills/bmad-product-brief/files/q2-brainstorm.md"],
"expectations": [
"brief.md addresses the neighborhood compost coordinator concept",
"brief.md does not introduce content from unrelated brainstorm topics (weather + mood, meditation chime, podcasting tool, craft beer subscription, AI sommelier, office plants, ride coordinator, cookbook app, AR home staging)",
"brief.md word count is between 250 and 1500",
"brief.md incorporates at least 2 specific details from the compost section of the brainstorm (e.g., two-sided market with apartment dwellers and home compost-pile owners, hyperlocal neighborhood scope, free-at-launch with eventual subscription, Portland Sunnyside/Hawthorne pilot)",
"decision-log.md indicates the brainstorm was filtered for relevance, not ingested whole"
]
},
{
"id": "A4",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Validate the brief at evals/bmm-skills/bmad-product-brief/files/mossridge-brief/brief.md \u2014 the Mossridge Public Library board meets Monday and we need this to land. Read the addendum and decision-log in the same folder first. Cite specific sections, identify weaknesses, caveat what cannot be evaluated. Return inline only \u2014 no separate validation file.",
"expected_output": "Inline critique citing specific sections from the input brief. No new files. Caveats at least one claim that cannot be evaluated from the brief alone. Offers to roll findings into an Update.",
"files": [
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/brief.md",
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/addendum.md",
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/decision-log.md"
],
"expectations": [
"The final output cites specific section names or line content from the input brief (not generic feedback)",
"The output identifies at least one specific weakness or area for improvement in the input brief",
"The output explicitly caveats at least one claim that cannot be evaluated from the brief alone (e.g., community demand, funding feasibility, volunteer sustainability)",
"The output offers to roll findings into an Update (or equivalent next-step proposal)",
"The final assistant message contains a JSON object with intent='validate'"
]
},
{
"id": "A5",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief for: a weekend-project iOS app called Sproutkeeper that reminds houseplant owners when to water their plants based on plant type and indoor humidity sensor data. Target is hobbyist plant owners. MVP scope only, single-developer side project, no investors, no team, just personal evening project.",
"expected_output": "Lightweight brief right-sized to a side project. Low rigor. No investor-grade framing. Probably no distillate unless the side-project user explicitly asked.",
"files": [],
"expectations": [
"The final assistant message contains a JSON object with intent='create'",
"brief.md exists at the path referenced in the JSON output",
"brief.md is right-sized for a side project (closer to 250-500 words than 1500)",
"brief.md does not include investor-grade framing (no 'Series A inputs', 'TAM/SAM/SOM', 'go-to-market strategy' boilerplate when the user said this is a personal evening project)",
"The transcript contains no clarifying questions to the user",
"Sections that do not earn their place for a side project are dropped or kept minimal (e.g., no extensive Risk or Success Criteria padding)"
]
},
{
"id": "A6",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief from this memo. It is from our last working group on a new microcredential program at Branfield Community College. Memo is at evals/bmm-skills/bmad-product-brief/files/branfield-memo.md. Use what is there; do not re-elicit facts already present.",
"expected_output": "Brief reflects content from the memo. No re-asking for facts already present. Decision-log notes ingestion of the memo.",
"files": ["evals/bmm-skills/bmad-product-brief/files/branfield-memo.md"],
"expectations": [
"brief.md incorporates at least 3 distinct facts or decisions present in the input memo",
"decision-log.md references having used the memo as source material",
"The transcript does not ask the user to re-state the program name, target student, or core curriculum focus if those are present in the memo",
"brief.md does not invent program details not present in the memo"
]
},
{
"id": "A7",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief for Brightway \u2014 our smart bike helmet with crash detection, turn signals, and braking lights. Meridian Insights produced a market research report on e-mobility at evals/bmm-skills/bmad-product-brief/files/meridian-mobility-report.md. Use only what is relevant to the safety helmet category \u2014 do not let the e-scooter, charging-infrastructure, or bike-share segments bleed into the brief.",
"expected_output": "Brief focuses on the smart bike helmet concept. Pulls relevant findings from the helmet section. Other mobility segments do not appear.",
"files": ["evals/bmm-skills/bmad-product-brief/files/meridian-mobility-report.md"],
"expectations": [
"brief.md addresses the Brightway smart bike helmet concept",
"brief.md does not introduce content from unrelated mobility segments (e-scooters, charging infrastructure, bike-share, vehicle-to-grid)",
"brief.md word count is between 250 and 1500",
"brief.md incorporates at least 2 specific findings from the smart helmet section of the report (e.g., market sizing, key players, crash detection technology trends, regulatory or insurance landscape)",
"decision-log.md indicates the report was filtered to the helmet category rather than ingested whole"
]
},
{
"id": "A8",
"_pattern": "artifact-correctness",
"prompt": "Run headless. Create a brief for Pantry Bridge \u2014 a meal-kit subscription targeted at adults 65+ who live alone and want fresh meals without grocery shopping. Customer research transcripts are at evals/bmm-skills/bmad-product-brief/files/pantry-bridge-interviews.md. Pull what is relevant from the older-adult interviews; do not conflate insights from the working-parent, student, or corporate-buyer personas.",
"expected_output": "Brief focuses on the older-adult target persona. Eleanor's interview drives the insights. Other personas do not pollute the brief.",
"files": ["evals/bmm-skills/bmad-product-brief/files/pantry-bridge-interviews.md"],
"expectations": [
"brief.md addresses the Pantry Bridge older-adult meal-kit concept",
"brief.md does not conflate insights from non-target personas (working parent Susan, college student Marcus, corporate cafeteria buyer Dimitri)",
"brief.md word count is between 250 and 1500",
"brief.md incorporates at least 2 specific insights from Eleanor's interview (e.g., grocery-trip difficulty, portion sizing, dietary restrictions, social aspects of meals, trust concerns)",
"decision-log.md notes which interviews were used and which were excluded"
]
},
{
"id": "B1",
"_pattern": "process-discipline",
"prompt": "Run headless. Create a brief for HelmStack \u2014 an open-source observability platform for distributed systems.\n\nWe have made these specific decisions and want each captured in the decision log with rationale:\n\n1. Pricing: Free open-source core; paid SaaS at $29/seat/month. Rejected paid-one-shot-license model because it would limit network effects in the OSS community.\n2. Launch: Invite-only beta for 6 weeks before public launch. Rejected open public launch \u2014 operational risk too high before stability is proven on real workloads.\n3. Stack: TypeScript + Postgres for the backend. Rejected Go + MongoDB \u2014 TypeScript aligned better with our team's existing skills and the frontend codebase.\n4. ICP: 5-50 person engineering teams for MVP. Rejected enterprise-first focus because the sales cycle is too long for our capital runway.\n5. Self-host: SaaS-only at launch; self-host arrives in v2. Rejected concurrent self-host because it would slow shipping velocity past our funding window.\n\nProduce brief.md, decision-log.md, and a distillate.",
"expected_output": "Decision log contains all five named decisions with rationale captured. Brief reflects the decisions but the decision log is the canonical record.",
"files": [],
"expectations": [
"decision-log.md exists in the run folder",
"decision-log.md captures the pricing decision (free OSS + $29/seat SaaS) with the rejected alternative (paid one-shot license) and rationale (network effects)",
"decision-log.md captures the invite-only-beta decision with the rejected alternative (open public launch) and rationale (operational risk before stability)",
"decision-log.md captures the platform-stack decision (TypeScript + Postgres) with the rejected alternative (Go + MongoDB) and rationale (team skills / frontend alignment)",
"decision-log.md captures the ICP decision (5-50 person eng teams) with rationale referencing sales cycle / runway",
"decision-log.md captures the self-host-timing decision (SaaS-only at launch, self-host v2) with rationale (shipping velocity / funding window)"
]
},
{
"id": "B2",
"_pattern": "process-discipline",
"prompt": "Run headless. Create a brief for HelmStack \u2014 an open-source observability platform for distributed systems.\n\nWe have made these specific decisions and want each captured in the decision log with rationale:\n\n1. Pricing: Free open-source core; paid SaaS at $29/seat/month. Rejected paid-one-shot-license model because it would limit network effects in the OSS community.\n2. Launch: Invite-only beta for 6 weeks before public launch. Rejected open public launch \u2014 operational risk too high before stability is proven on real workloads.\n3. Stack: TypeScript + Postgres for the backend. Rejected Go + MongoDB \u2014 TypeScript aligned better with our team's existing skills and the frontend codebase.\n4. ICP: 5-50 person engineering teams for MVP. Rejected enterprise-first focus because the sales cycle is too long for our capital runway.\n5. Self-host: SaaS-only at launch; self-host arrives in v2. Rejected concurrent self-host because it would slow shipping velocity past our funding window.\n\nProduce brief.md, decision-log.md, and a distillate.",
"expected_output": "Brief is consistent with the decision log: every decision in the log is reflected in the brief, and no claim in the brief is absent from the input prompt or the log. Tests bidirectional fidelity.",
"files": [],
"expectations": [
"brief.md mentions the OSS-core + paid-SaaS pricing structure",
"brief.md references the invite-only-beta launch sequencing OR identifies the launch model consistent with the decision log",
"brief.md references the platform-stack choice (TypeScript + Postgres) OR is silent on stack \u2014 but does not contradict it (no mention of Go, MongoDB, etc.)",
"brief.md identifies 5-50 person eng teams as the ICP (or equivalent \u2014 small-to-mid-size eng teams)",
"brief.md does not introduce decisions, competitors, partnerships, metrics, or product features absent from both the input prompt and decision-log.md (no invented facts)",
"Each substantive decision in decision-log.md has a corresponding reflection in brief.md (no log-to-brief drops)"
]
},
{
"id": "B3",
"_pattern": "process-discipline",
"prompt": "Run headless. Create a product brief for InsuLens.\n\nContext (use exactly this \u2014 do not invent):\n- Product: a smartphone app that pairs with off-the-shelf $200 thermal imaging accessories (FLIR ONE Pro and Seek Compact Pro). The app guides homeowners through a structured walkthrough and produces a professional-grade insulation audit in under 20 minutes.\n- Target: suburban homeowners aged 35-65 with houses built before 2000.\n- Validation: 50 user interviews completed in Q4 2025; 78% willingness to pay $49 for a one-time audit.\n- Stakes: Series A pitch input.\n- Hardware: requires a thermal accessory (we do not manufacture hardware).\n\nProduce brief.md, decision-log.md, and a distillate. Run the polish phase before presenting.",
"expected_output": "The transcript shows the polish phase executing \u2014 the skill invokes bmad-editorial-review-structure and bmad-editorial-review-prose, either via the Skill tool directly or via Agent tool calls whose description or prompt targets those editorial skills. Both passes must occur after the initial draft is written and before the final JSON status block.",
"files": [],
"expectations": [
"The transcript contains either a Skill tool call invoking bmad-editorial-review-structure, OR an Agent tool call whose description or prompt references structural review or bmad-editorial-review-structure",
"The transcript contains either a Skill tool call invoking bmad-editorial-review-prose, OR an Agent tool call whose description or prompt references prose review or bmad-editorial-review-prose",
"Both editorial-pass dispatches (Skill or Agent) occur after the first Write tool call that creates brief.md",
"Both editorial-pass dispatches (Skill or Agent) occur before the final assistant message containing the JSON status block"
]
},
{
"id": "B4",
"_pattern": "process-discipline",
"prompt": "Run headless. Create a product brief for InsuLens.\n\nContext (use exactly this \u2014 do not invent):\n- Product: a smartphone app that pairs with off-the-shelf $200 thermal imaging accessories (FLIR ONE Pro and Seek Compact Pro). Walkthrough produces a professional-grade insulation audit in under 20 minutes.\n- Target: suburban homeowners aged 35-65 with houses built before 2000.\n- Validation: 50 user interviews; 78% willingness to pay $49.\n- Stakes: Series A pitch input.\n- Hardware: requires a thermal accessory.\n\nProduce brief.md, decision-log.md, and a distillate. Follow the standard Create flow: workspace setup, draft, finalize (decision log audit, polish, distillate, close-out).",
"expected_output": "Workspace setup happens before drafting. Draft happens before polish. Polish happens before distillate generation. Distillate generation happens before the final close-out JSON block. Each phase boundary is observable in the transcript.",
"files": [],
"expectations": [
"The first Write tool call to decision-log.md OR brief.md (skeleton) occurs before the substantive Write that produces the full brief body",
"The polish-phase Skill tool calls (bmad-editorial-review-structure and/or bmad-editorial-review-prose) occur after the brief body is written",
"The bmad-distillator Skill tool call (or distillate.md write) occurs after the polish-phase Skill tool calls",
"The final JSON status block in the assistant message occurs after distillate.md is written or skipped with explanation"
]
},
{
"id": "B5",
"_pattern": "process-discipline",
"prompt": "Run headless. Update the brief at evals/bmm-skills/bmad-product-brief/files/forkbird-brief/brief.md \u2014 we have decided to add B2B catering services for corporate events, in addition to the direct-to-consumer delivery model. Read the existing decision-log.md and addendum.md in the same folder first.",
"expected_output": "The skill MUST detect the contradiction with the prior 'rejected B2B catering for MVP' decision (in decision-log.md) before applying the change. Acceptable resolutions: (a) halt with blocked status surfacing the conflict, or (b) apply the change with addendum.md capturing the override and rationale. Brief must not silently flip without acknowledging the prior decision.",
"files": [
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/brief.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/addendum.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/decision-log.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/distillate.md"
],
"expectations": [
"The transcript or output explicitly references the prior 'rejected B2B catering for MVP' decision from decision-log.md",
"The contradiction is surfaced before the brief body is modified (a Read of decision-log.md occurs before the Edit/Write to brief.md, AND the conflict is named in the assistant output)",
"Either the JSON status is 'blocked' with the conflict in the reason field, OR addendum.md is updated with an override entry capturing the rationale for reversing the prior decision",
"If the brief is updated, decision-log.md gains a new entry referencing the catering reversal",
"If the brief is updated, the YAML frontmatter 'updated' field is later than the original 'created' field"
]
},
{
"id": "B6",
"_pattern": "process-discipline",
"prompt": "Run headless. Update the brief at evals/bmm-skills/bmad-product-brief/files/forkbird-brief/brief.md \u2014 we have signed our fifth chef partner (Chicago metro). Add this to the existing operating-model and what's-known sections. Read the existing decision-log.md first.",
"expected_output": "Clean update \u2014 does not contradict any prior decision. Brief gets updated, decision-log gains a new entry, distillate is regenerated, YAML 'updated' bumps but 'created' stays the same. No spurious addendum since this is a status update, not an override.",
"files": [
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/brief.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/addendum.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/decision-log.md",
"evals/bmm-skills/bmad-product-brief/files/forkbird-brief/distillate.md"
],
"expectations": [
"brief.md is updated to reflect the signed fifth chef partner in Chicago",
"brief.md frontmatter 'updated' field is later than the original 'created' timestamp; 'created' is unchanged",
"decision-log.md contains a new entry referencing the fifth chef signing",
"distillate.md is regenerated (modification timestamp newer than the input fixture)",
"The transcript does not surface a fictional contradiction \u2014 this is a clean update, not an override of a prior decision"
]
},
{
"id": "B7",
"_pattern": "process-discipline",
"prompt": "Run headless. Validate the brief at evals/bmm-skills/bmad-product-brief/files/mossridge-brief/brief.md \u2014 we are presenting to the library board Monday. Read the addendum and decision-log in the same folder. Cite specific sections. Return inline only.",
"expected_output": "Validate is read-only. No new files created. No existing files modified. Critique returned inline in the assistant output.",
"files": [
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/brief.md",
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/addendum.md",
"evals/bmm-skills/bmad-product-brief/files/mossridge-brief/decision-log.md"
],
"expectations": [
"No new files appear in the mossridge-brief artifacts directory after the run (only the three input files)",
"The input brief.md, addendum.md, and decision-log.md are byte-identical to the staged fixtures (no Edit/Write tool calls modified them)",
"The transcript contains no Write tool calls and no Edit tool calls targeting the mossridge-brief folder",
"The final assistant message contains a JSON object with intent='validate'"
]
},
{
"id": "B8",
"_pattern": "process-discipline",
"timeout": 900,
"prompt": "Run headless. Create a product brief for InsuLens (smartphone app that pairs with thermal imaging accessories for homeowner insulation audits, target suburban homeowners 35-65 with houses pre-2000, 50 user interviews with 78% willingness to pay $49, Series A pitch input). Generate a distillate \u2014 this brief will feed downstream PRD work.",
"expected_output": "distillate.md exists alongside brief.md and decision-log.md. The distillate is a meaningful condensation of the brief. Content of the distillate matches the brief without introducing new facts. The transcript shows the bmad-distillator subagent invoked.",
"files": [],
"expectations": [
"distillate.md exists in the run folder alongside brief.md and decision-log.md",
"distillate.md is a meaningful condensation of brief.md \u2014 substantially more concise and capturing only the key decisions, target audience, validation evidence, and known unknowns needed for downstream PRD work, not a near-verbatim copy",
"distillate.md does not introduce facts or claims not present in brief.md (no inventions on compression)",
"The transcript contains a Skill tool call invoking bmad-distillator"
]
},
{
"id": "C1",
"_pattern": "config-compliance",
"prompt": "Run headless. Create a product brief for TaskFlow \u2014 a lightweight daily planning app for freelancers who juggle multiple clients. Core idea: a single daily view that pulls together tasks, time blocks, and client context so the freelancer always knows what to work on next. Target is independent freelancers, 1-3 clients at a time, who currently manage their day across sticky notes, calendar apps, and spreadsheets. MVP is mobile-first. No investors \u2014 the founder is bootstrapping.",
"expected_output": "Brief written in Spanish (document_output_language=Spanish). Assistant's conversational output reflects the configured British-accent communication style. Brief lands at the custom output path (test-output/artifacts/briefs/...) rather than the default _bmad-output path. Brief is right-sized for a bootstrapped solo project.",
"files": [],
"expectations": [
"brief.md exists under test-output/artifacts/briefs/ (the custom planning_artifacts path), not under _bmad-output/",
"The final JSON status block artifact paths reference test-output/ rather than _bmad-output/",
"brief.md body is written in Spanish \u2014 the majority of prose content (headings, section bodies) is in Spanish, not English",
"brief.md covers the TaskFlow concept: freelancer daily planning, multi-client context, the sticky-notes-plus-calendar-plus-spreadsheet problem",
"brief.md is right-sized for a bootstrapped side project — appropriate depth and scope for a solo-founder app with no investor audience, no TAM/SAM/SOM framing, no Series A language, and no sections that pad for enterprise credibility",
"The assistant's non-document output (transcript text content outside of brief.md) contains at least one marker of British informal register (e.g., 'mate', 'cheers', 'brilliant', 'sorted', 'innit', 'blimey', 'proper', 'right then', or equivalent pub-idiom phrasing)"
]
}
]
}

View File

@ -0,0 +1,46 @@
# Working Group Notes — Microcredential Program
**Branfield Community College**
**Meeting:** 2026-04-22
**Attendees:** Provost, Workforce Dev Director, Chair of Industry Advisory Board, two faculty leads (Data Analytics, Healthcare Admin), Financial Aid Director
## Why we're doing this
Regional employer survey (Q1 2026) showed 340+ unfilled mid-skill jobs in the three-county area. State workforce board approved a $1.4M grant if we can launch by fall 2027 with at least three tracks. Existing AAS programs are too long for working adults — average completion 3.5 years.
## What we're building
Six-month stackable microcredentials. Three tracks at launch:
1. **Data Analytics** (SQL, Excel/Power BI, intro Python). Faculty lead Marisol Reyes. Strongest employer demand. Will be MVP — first to launch, used to validate format.
2. **Healthcare Admin** (medical coding, EHR systems, patient workflow). Faculty lead Dev Patel. Aging population in region drives demand.
3. **Sustainable Construction** (green building practices, retrofit basics, code compliance). New faculty hire required.
Stackable means credits transfer into related AAS or BAS later if the student wants.
## Decisions made today
- **Data Analytics is MVP.** Launch fall 2027, others phase in spring/fall 2028. Validate format before scaling.
- **Hybrid delivery.** Two evenings/week in person + asynchronous online. Board rejected pure-online (concerns about adult learner outcomes data).
- **Stipend program.** Up to $3,000/student for low-income students, funded from the state grant. Means-tested.
- **Industry Advisory Board** has approval authority on curriculum. Three employers committed (regional hospital, mid-size data consultancy, county housing authority). All three commit to interview every graduate.
- **Cohort cap: 24 per track per term.** Driven by classroom size and faculty load.
## Open questions
- Childcare for evening sessions — can we partner with the campus childcare center? Deferred to next meeting.
- Marketing — provost wants to know cost per enrolled student before approving budget. Need workforce dev to model.
- Do we offer a tuition payment plan in addition to the stipend? Financial aid director thinks yes; provost wants to see uptake projections first.
## What we're NOT doing
- Not pursuing pure-online delivery (rejected — see above).
- Not launching all three tracks at once (rejected — risk concentration, faculty bandwidth).
- Not building employer-customized cohorts (rejected — too operationally complex for MVP).
## Next steps
- Workforce Dev: marketing cost model by 2026-05-15.
- Provost: childcare partnership exploratory conversation.
- Faculty leads: draft data analytics curriculum outline by 2026-06-01.
- Reconvene 2026-05-20.

View File

@ -0,0 +1,40 @@
# Addendum — Forkbird Kitchen
## Options considered (and not taken)
### B2B / corporate catering
Considered as a parallel revenue stream from day one. Rejected for MVP. Different operational rhythm (bulk orders, fixed delivery windows, invoiced billing), different customer (procurement, not eaters), different unit economics. Splitting attention at launch risked degrading both. Revisit if consumer foundation is established by month 12.
### Subscription / meal plan
Considered as a recurring-revenue layer. Rejected for MVP. Operationally expensive at our planned scale: requires demand forecasting per subscriber, kitchen scheduling locked further out, and packaging/refrigerated handling we are not yet equipped for. Reasonable to revisit once kitchen utilization stabilizes.
### Retail / grocery channel
Considered (refrigerated meals in Whole Foods, Sprouts). Rejected for MVP. Different product (cold meals, longer shelf life, different texture profile), different go-to-market (broker relationships, slotting fees, category management). Parked for year 2 — would require a separate product line, not a channel extension.
### Lower-priced everyday tier
Considered. Rejected for now. The brand position is chef-driven; introducing a value tier alongside risks the premium signal in marketplace search ranking and review patterns. Explored alternative of separate brand for value tier; deferred.
## Personas (extended)
**The plant-based weekday professional.** Lives in a dense urban neighborhood, orders 4–6 times a month, splits between own-cooking and delivery. Sources of dissatisfaction with current options: chain plant-based menus feel formulaic, fine-dining plant-based is too expensive for weeknight, marketplace search surfaces too many low-quality options.
**The dietary-flex household member.** One person in a household is plant-based by preference; the other(s) are not. Ordering pattern is "tonight one of us wants Forkbird, the other wants something else." We benefit from being a dependable single-cuisine option that doesn't require negotiating across diets.
## Sizing notes
- Total addressable: ~6.2M urban professionals across 5 metros eating plant-based 3+ times/week (based on 2024 Plant Based Foods Association data, urban segmentation).
- Serviceable addressable (within delivery radius of planned kitchens at launch): ~840K.
- Realistic Y1 capture (per metro forecast): 0.4% of SAM = 3,360 active customers across all metros.
## Sourcing standard — exact wording
"For each dish on the menu, we publish the source of every ingredient that represents at least 5% of cost. We commit that at least 60% of total ingredient weight is sourced within 200 miles of the kitchen preparing that dish. Both numbers are auditable; we publish them per-dish in the app. If we cannot meet the 60% local threshold for a dish, the dish does not ship."
## Technical constraints
- Marketplace integration (DoorDash, UberEats, Grubhub) requires their menu management API. We are using a third-party middleware (Olo) to avoid maintaining three separate integrations.
- Ingredient transparency display requires structured data per dish. We need an ingredient-master database; current option is to extend our recipe-management software vendor.

View File

@ -0,0 +1,56 @@
---
title: Forkbird Kitchen — Product Brief
status: final
created: 2026-02-14
updated: 2026-02-14
---
# Forkbird Kitchen
## What it is
A delivery-only ghost kitchen brand offering chef-driven plant-based meals in five US metros: San Francisco, New York, Los Angeles, Seattle, and Chicago. Launch operating model is direct-to-consumer through our own iOS/Android app and the major third-party marketplaces (DoorDash, UberEats, Grubhub).
## Who it's for
Urban professionals aged 28–45 who eat plant-based meals at least three times a week, value chef-driven food over chain alternatives, and order delivery 4+ times monthly. Initial geographic focus is dense neighborhoods within 3-mile delivery radii of partner kitchens.
We are not building for: families with children (different ticket size and ordering pattern), occasional plant-based eaters (price sensitivity too high for our positioning), or office lunch (different time-of-day operation).
## Why it wins
Three things are deliberately stacked:
1. **Chef partnerships, not chef-as-marketing.** Each metro has a named chef (with prior fine-dining or notable plant-based credit) who designs the rotating menu and earns equity in that metro's P&L. They are not endorsers; they are operators.
2. **Ingredient sourcing standards.** Published per-dish: where it came from, how it was farmed, what portion of cost it represents. No dish ships if we can't source within 200 miles for ≥60% of ingredient weight. This is auditable, not marketing copy.
3. **Speed without cars.** Average ticket-to-door is 28 minutes from order placement, achieved by tight delivery radii and dense order density per kitchen. Long delivery erodes plant-based texture more than animal protein — speed is product, not logistics.
## Operating model
Five kitchens, one per metro, each leased space inside an existing food-prep facility. No customer-facing storefronts. App orders go through our stack; marketplace orders pass through their stacks. Menu rotates every six weeks per chef.
Pricing tier: $14–$22 per entrée before delivery. We are deliberately at chef-driven positioning, not value positioning.
## What's known
- Demand validated through three pop-up dinners in SF and NY (Q4 2025). 480 covers, 78% repeat intent based on post-event survey.
- Operating partner identified in each metro. Leases signed for SF, NY, LA. Seattle and Chicago in negotiation.
- Three of five chefs signed; two in active conversations.
## What's unknown
- Whether ingredient-sourcing transparency is a differentiator at point of sale (in-app) or only in marketing. Our hypothesis is "both" but we have not tested in-app.
- Marketplace economics. DoorDash takes 15–30% depending on tier; we are modeling the lower tier but have not negotiated.
- Whether the 3-mile radius holds outside SF/NY (lower density in LA/Chicago).
## Risks
- Chef churn. If a metro chef leaves, the metro brand loses its anchor. Mitigation: equity vesting over 24 months, named-chef terms in operating agreement.
- Sourcing cost volatility. 60% local-within-200-miles can spike with weather/supply disruption. We have not modeled the worst case.
- Marketplace dependency. If DoorDash terms shift adversely, our blended margin is at risk. We are deliberately building the owned-app channel to reduce this dependency.
## Success criteria for first 12 months
- 4 of 5 metros operating profitably at the unit level (kitchen + chef + delivery economics) by month 9
- 30% of orders through owned app (vs. marketplaces) by month 12
- Chef retention 100% through year 1

View File

@ -0,0 +1,27 @@
# Decision Log — Forkbird Kitchen
## 2026-01-08
- **Brand position: chef-driven, premium plant-based.** Considered value tier; rejected for MVP. Premium positioning is the wedge against marketplace generic plant-based.
## 2026-01-12
- **Five-metro launch: SF, NY, LA, Seattle, Chicago.** Considered three-metro start; rejected as not enough density to test the chef-equity model meaningfully.
- **Ghost kitchen, no storefront.** Storefronts ruled out — capex too high for MVP, dilutes the speed advantage.
## 2026-01-19
- **Pricing tier $14–$22 per entrée.** Modeled against three competitor sets: chain plant-based, fine-dining plant-based delivery, generic mid-tier delivery. Sits cleanly above chain, below fine-dining.
- **Chef equity in metro P&L.** Rejected flat fee + revenue share alternative; equity creates the operator incentive we want.
## 2026-01-26
- **Rejected B2B catering segment for MVP.** Different operational rhythm and customer; would split attention at launch and risk degrading both consumer and B2B execution. Revisit in year 2 if consumer foundation is solid. (Discussion: 2 hours; chef partners weighed in against splitting focus; CFO modeled the dilution effect on consumer kitchen utilization.)
- **Rejected subscription model for MVP.** Operationally expensive at planned scale; revisit once kitchen utilization stabilizes.
## 2026-02-02
- **Sourcing standard: 60% within 200 miles, published per-dish.** Considered weaker thresholds (50% / 250 miles); rejected as not differentiating enough to be worth publishing. The number has to be defensible.
- **Marketplace channel mix: own app + DoorDash + UberEats + Grubhub.** Considered own-app only; rejected as too slow on demand acquisition. Considered marketplaces only; rejected — own app is critical to long-term margin.
## 2026-02-09
- **Six-week menu rotation per chef.** Considered four-week (more freshness) and eight-week (more operational stability). Six is the compromise; reassess after first two cycles.
- **Marketing budget: 60% acquisition / 40% brand.** Rejected pure-acquisition because chef-driven positioning needs brand-level signal that paid acquisition alone won't carry.
## 2026-02-14
- **Brief finalized for Series A inputs.** Status moved to final.

View File

@ -0,0 +1,28 @@
# Forkbird Kitchen (Distillate)
**What:** Delivery-only ghost kitchen brand serving chef-driven plant-based meals across five US metros (SF, NYC, LA, Seattle, Chicago) via own app and marketplaces (DoorDash, UberEats, Grubhub).
**Audience:** Urban professionals 28–45 who eat plant-based 3+ times/week and order delivery 4+ times monthly.
**Differentiation (deliberately stacked):**
- Named chef per metro with equity in metro P&L (operator, not endorser)
- Auditable per-dish sourcing: ≥60% ingredient weight within 200 miles
- 28-min average ticket-to-door via tight 3-mile delivery radii
**Operating model:** Five leased ghost-kitchen spaces, one per metro. Menu rotates every six weeks per chef. Pricing $14–$22 per entrée before delivery.
**Validated:**
- 480 covers across three SF/NY pop-ups (Q4 2025), 78% repeat intent
- Three of five chefs signed; LA/SF/NY leases signed
- Three of five operating partners identified
**Open:**
- Whether per-dish sourcing transparency moves conversion in-app (untested)
- Marketplace economics (DoorDash terms unconfirmed)
- 3-mile radius outside high-density metros (LA/Chicago)
**Scope explicitly excluded for MVP:** B2B/corporate catering, subscription, retail/grocery, lower-priced value tier. All revisit-able in year 2.
**Key risks:** chef churn, sourcing cost volatility, marketplace dependency.
**Y1 success criteria:** 4/5 metros unit-profitable by month 9; 30% orders through own app by month 12; 100% chef retention.

View File

@ -0,0 +1,116 @@
# E-Mobility Market Report 2026
**Prepared by:** Meridian Insights
**Date:** Q2 2026
**Coverage:** North America, with comparative reference to EU markets
**Engagement code:** MI-2026-EMOB-007
---
## Executive Summary
The e-mobility category continues a multi-year structural shift from "alternative transportation" to mainstream mobility infrastructure. North American unit volume across e-bikes, e-scooters, and connected safety hardware grew 18% year-over-year in 2025, against a 6% growth rate for traditional bicycles. Three macro factors are durably reshaping the category: regulatory clarity at the state level (29 US states now have explicit e-bike classifications, up from 14 in 2022), insurance industry interest in telematics-style risk pricing, and a generational shift in commuting preferences among the 28-44 cohort.
This report covers six segments of the broader e-mobility landscape: e-bike retail, e-scooter regulation, bike-share systems, charging infrastructure, smart helmet hardware, and grid-integration trends. Findings are synthesized from 142 stakeholder interviews, 18 retailer site visits, government regulatory filings, and proprietary point-of-sale data from 4,200 specialty retail outlets.
---
## Methodology
Quantitative data was sourced from Meridian's proprietary Mobility Retail Panel (MRP), which aggregates POS data from independent specialty retailers and select chain operators. Where panel data is incomplete or lagging, we supplemented with manufacturer-reported shipment volumes and customs/import filings. Qualitative findings draw on 142 interviews conducted between November 2025 and March 2026 with retailers, fleet operators, regulators, manufacturers, and end users.
Helmet category sizing uses a separate methodology described in Section 8, blending CPSC compliance filings, manufacturer disclosures, and a sample purchase-intent survey of 3,400 cyclists.
---
## Section 3: Market Sizing — Total E-Mobility
The North American e-mobility market reached an estimated $14.7B in retail volume in 2025, up from $12.5B in 2024. The largest segment by volume is e-bikes at $7.2B, followed by e-scooter retail at $2.8B (excluding shared-fleet operations), bike-share and dockless mobility services at $2.1B, charging infrastructure at $1.8B, and connected safety hardware at $0.8B.
Compound annual growth rate (CAGR) forecasts through 2030 vary substantially by segment. We forecast 14% CAGR for e-bikes, 6% for e-scooters (decelerating as the regulatory regime stabilizes), 9% for bike-share, 22% for charging infrastructure (driven by both bike and scooter charging), and 31% for connected safety hardware (off a smaller base). Vehicle-to-grid (V2G) integration is too early to forecast reliably; we treat it as an emerging segment.
---
## Section 4: E-Bike Market Deep Dive
E-bikes represent the largest single segment by retail value. The 2025 unit mix favored Class 1 (pedal-assist, max assisted speed 20 mph) at 58% of units, Class 2 (throttle, max 20 mph) at 24%, and Class 3 (pedal-assist, max 28 mph) at 18%. Class 3 is the fastest-growing classification on a unit basis, driven by suburban commuter demand.
Manufacturer concentration shifted in 2025. The top 10 brands by unit volume now hold 64% of the market, up from 51% in 2022 — consolidation that mirrors patterns seen in the traditional bicycle market in the early 2000s. Specialized, Trek, and Cannondale (operating their respective electric sub-brands) represent the top three. Direct-to-consumer brands (Rad Power, Lectric, Aventon) collectively hold approximately 19% of retail value.
Retail channel split favored independent specialty bike shops at 47% of unit volume, with direct-to-consumer at 28%, big-box retail at 17%, and e-commerce marketplaces (Amazon, Walmart.com) at 8%. The independent specialty channel commands a price premium of approximately 22% over comparable D2C alternatives, attributed to in-store fitting, post-sale service relationships, and higher-margin component upgrades.
Notable trends in 2025: cargo e-bike sub-segment grew 41% YoY (small base, dense urban geographies); battery range claims continue to drift upward with manufacturer claims of 60+ mile range becoming standard for $2,500+ price points; bottom-bracket motor placement (mid-drive) gained share over hub-drive in the $3,000+ tier.
---
## Section 5: E-Scooter Regulatory Landscape
The North American e-scooter regulatory environment matured significantly during 2024-2025 after several years of municipal experimentation and reactive policymaking. Forty-one US cities now operate under what we classify as "stable" regulatory regimes (defined as: explicit operating permit framework, defined sidewalk/bike-lane rules, helmet provisions, and revenue-share or fee structures with the city). This is up from 19 cities in 2022.
The regulatory shift has compressed operator margins. Permit fees and per-trip surcharges in major markets (Los Angeles, Chicago, Atlanta, Denver) range from $0.15 to $0.42 per trip, against average ride revenue of $5.40. Several major operators have exited markets where permit economics have proven unviable; Lime exited five secondary US markets in 2025 citing exactly this reason.
Helmet requirements remain inconsistent. Thirteen US states require helmets for riders under 18 only; seven require them for all riders; the rest leave it to municipalities. Enforcement is widely acknowledged to be minimal even where mandates exist. EU markets are substantially stricter, with mandatory helmet provisions in France, Germany, and Italy applying to all e-scooter riders.
Insurance treatment is also fragmenting. Five US states have classified e-scooters as "motor vehicles" requiring liability coverage, raising the floor on operating costs for shared-fleet providers. Most states still treat them as bicycles for insurance purposes.
---
## Section 6: Bike-Share and Dockless Mobility
Docked bike-share systems (Citi Bike, Divvy, Bluebikes, Capital Bikeshare) continue stable, slow growth. Capital Bikeshare reported 5.1M trips in 2025 (5% growth); Citi Bike reported 38M (8% growth). Docked systems benefit from station infrastructure that creates predictability for riders and meters demand-side adoption.
Dockless bike-share (without fixed stations) is largely consolidated; the experimentation phase ended in 2023. Lyft operates the dominant national network through its acquired bike-share division, with regional players in select markets. Operating economics for dockless are structurally weaker than docked due to vehicle redistribution costs, vandalism rates, and the absence of station-driven advertising revenue.
A notable trend is the convergence of bike-share and dockless e-bike subscription models. Several operators now offer monthly memberships that include unlimited 30-minute trips on dockless e-bikes within a service zone. Adoption is concentrated in dense urban cores where car-free lifestyles are practical.
---
## Section 7: Charging Infrastructure Trends
Charging infrastructure for e-bikes and e-scooters has emerged as a meaningful sub-segment, growing 28% in 2025. The dominant form factor remains residential at-home wall chargers (87% of installed base), but commercial charging — at workplaces, transit stations, and apartment buildings — is the fastest-growing sub-segment.
Standardization remains a constraint. Battery interfaces have not converged; Bosch, Shimano, and various proprietary systems coexist. The European Union's USB-C mandate for portable electronics has not yet extended to e-mobility; industry observers expect regulatory pressure to follow within 3-5 years.
Workplace charging is increasingly common in tech and creative-industry employers; we estimate 31% of large urban employers in tech-heavy metros now offer workplace e-bike charging, up from 12% in 2022. Apartment buildings lag — 7% of class-A multifamily properties offer common-area charging, with retrofit cost cited as the primary barrier.
Public charging at transit hubs (subway/light rail stations) remains a stated priority across most major metro transit authorities, but actual installation lags policy commitments significantly. Funding fragmentation and permitting delays are the consistently cited bottlenecks.
---
## Section 8: Smart Helmet Category
The connected safety hardware category — colloquially "smart helmets" — is the smallest segment we cover by retail value but has the strongest growth profile. The North American smart helmet market reached $810M in retail value in 2025, up from $480M in 2023, representing a 30% CAGR. We forecast $2.4B by 2030, contingent on the resolution of two open questions detailed below.
**Category definition.** We define "smart helmets" as helmets that include at least one connected safety feature: turn signals (typically wireless-controlled), braking lights (auto-activated via accelerometer), crash detection (auto-notification to emergency contacts on detected impact), or integrated navigation/audio (bone-conduction speakers, often paired with smartphone apps). Helmets with passive integrated lighting only (no connectivity) are excluded from this category and tracked under traditional helmet retail.
**Key players.** The category remains fragmented; no single manufacturer commands more than 15% market share. Top five by 2025 retail volume: Lumos Helmet (US, market leader at ~14% share with strong DTC presence), Sena Technologies (Korea, intercom heritage, ~11%), Coros (US/China, multi-sport, ~9%), Specialized ANGi (US, premium tier at ~7%), and POC Aid (Sweden, premium safety positioning at ~6%). Approximately 30 smaller brands hold the remaining share.
**Crash detection technology.** Two architectures dominate: single-accelerometer crash detection (lower cost, higher false-positive rate) and multi-sensor fusion (accelerometer + gyroscope + GPS movement signature, lower false-positive rate but higher BOM cost). Insurance industry sources indicate that multi-sensor systems are likely to become a baseline requirement for any insurance discount programs, given that single-accelerometer systems triggered roughly 1 false alert per 47 hours of riding in our test panel.
**Regulatory landscape.** Smart helmets sit at the intersection of two regulatory regimes: the Consumer Product Safety Commission's bicycle helmet standard (16 CFR 1203, governing impact protection) and the Federal Communications Commission's regulation of intentional radiators (governing the radio components for Bluetooth/cellular). Compliance with both is non-trivial. Eight smart helmet brands have had FCC Part 15 violations issued since 2023, typically for emissions exceeding limits during compliance testing. EU markets additionally require EN 1078 certification for the helmet shell; this is widely held but adds 3-5 months to a typical product development timeline.
**Insurance industry interest.** Major auto insurers (State Farm, Progressive, Geico, Nationwide) are actively piloting telematics-style discount programs for cyclists who use connected safety helmets. The proposed structure mirrors auto-insurance "good driver" discount frameworks, with discounts of 5-15% on cycling-specific insurance riders or umbrella policies. As of Q1 2026, three insurers have public pilot programs and one (Progressive) has announced general availability for 2027. This could materially accelerate category adoption if discounts materialize at the upper end of the proposed range.
**Distribution.** D2C dominates at 58% of retail value, reflecting the still-emerging category and the absence of strong channel inventory in independent bike shops. The specialty bike shop channel is growing rapidly (up from 12% to 22% of retail value over 2023-2025) as the category gains category-management attention from major distributors. Big-box channels (REI, Dick's Sporting Goods) are present but shallow in selection — typically 4-8 SKUs versus 40+ in dedicated specialty.
**Open questions for the segment.** Our growth forecast is conditioned on (a) the proportion of insurers that follow Progressive into general availability of connected-safety discounts; (b) whether multi-sensor crash detection becomes a category baseline (lifting ASP) or remains a premium-tier feature; and (c) whether the current high false-positive rate of single-accelerometer systems triggers a consumer backlash that suppresses category trust before insurance discounts arrive. The downside scenario produces a 2030 category size of $1.4B versus our base-case $2.4B.
---
## Section 9: Vehicle-to-Grid Integration
Vehicle-to-grid (V2G) integration of e-bike and e-scooter batteries is an emerging area, but practical commercial deployment is years away. The thesis is that fleet-scale dockless e-bikes and e-scooters represent meaningful aggregate battery capacity that could participate in demand-response markets, particularly in deregulated electricity markets.
Several technical preconditions must be met: standardized battery interfaces (currently absent), bidirectional charging hardware (rare), aggregator software stack (early-stage), and regulatory clarity on energy market participation by mobility fleets (pre-policy). We treat this as a watch item for 2028+ rather than a current investable theme.
---
## Section 10: Outlook
Our base-case forecast for North American e-mobility is $22.5B by 2030, with the e-bike segment reaching $11.8B (the largest), connected safety hardware reaching $2.4B (the fastest-growing in percentage terms), and charging infrastructure reaching $4.2B (driven by commercial and multifamily retrofit demand). Bike-share and dockless mobility plateau in the $2.5-3.0B range as urban density limits adoption ceilings.
The largest single uncertainty in this forecast is the trajectory of insurance industry adoption of connected-safety telematics, which could accelerate or substantially constrain the smart helmet segment and, secondarily, influence rider behavior across the broader category. We will revisit forecasts in our Q4 2026 update.
---
*This report is prepared for the exclusive use of Meridian Insights subscribers. Reproduction or external distribution without written permission is prohibited.*

# Addendum — Mossridge Tool Lending Library
## Options considered
### Paid lending model (rejected)
Considered charging a nominal per-loan fee ($2–$5) to cover replacement and maintenance. Rejected as inconsistent with library mission of free access. Board has previously stated free access is non-negotiable for core services. A donation jar at checkout was proposed as a soft alternative; deferred.
### Hardware store partnership (considered, deferred)
Mossridge Hardware (the store committing in-kind donations) offered to host a satellite lending point. Considered; deferred to year 2. The integration adds operational complexity (split inventory, cross-location tracking) we are not equipped for at launch. Reasonable to revisit once the main location is established.
### Mobile lending van (rejected)
Proposed by a board member to serve outlying areas. Rejected for MVP — capital cost ($35K+ for vehicle + outfitting) exceeds the entire grant. Could be a year-three expansion if demand validates.
### Skills classes alongside tool loans (deferred)
Considered offering "how to use a power drill" classes as a value-add. Deferred — interesting but distinct programming, not part of the lending service's MVP scope. Adult Services Librarian is interested in piloting separately.
## Reference programs reviewed
- Berkeley Tool Lending Library (operating since 1979, ~3,000 tools, 250+ daily loans). Funded as a city service.
- Oakland Tool Lending Library (operating since 2000, smaller catalog, library-staffed).
- Toronto Tool Library (nonprofit, member-supported, paid model — different funding architecture).
Direct correspondence with Berkeley TLL staff (March 2026) suggested:
- Theft has been low (~2% annually) due to library card requirement and community norms
- The biggest sustainability risk has been staff hours, not tool replacement
- Most successful programs have a paid coordinator role, not pure volunteer
## Potential expansion (year 2+)
- Hardware store satellite location
- Specialty tool categories: woodworking, automotive, sewing
- Skills classes paired with relevant tool checkouts
- Seed/cuttings library co-located in spring/summer
## Insurance and liability — current state
Library counsel (Town of Mossridge legal department) has been consulted informally. Formal opinion pending. Existing policy covers patrons in the building; coverage for tool use off-premises is the open question. Awaiting written response before submitting grant application.

---
title: Mossridge Public Library — Tool Lending Library Proposal
status: final
created: 2026-04-30
updated: 2026-04-30
---
# Tool Lending Library at Mossridge Public Library
## What we're proposing
A free tool-lending service operated out of the Mossridge Public Library, modeled on similar programs in Berkeley, Oakland, and Toronto. Cardholders borrow hand and power tools (drills, saws, ladders, sanders, plumbing snakes, gardening tools) for up to seven days, free of charge.
## Why now
Mossridge residents face rising costs of home maintenance and DIY supplies. Anecdotally, demand for community-shared resources is high — staff have fielded "do you lend tools?" requests for years. A tool library extends the library's mission of equitable access to information and skill-building into the practical-skills domain.
## Who it serves
Mossridge residents with active library cards. Primary audience: single-family homeowners doing their own home repairs, renters making minor improvements with landlord permission, hobbyist woodworkers and gardeners. Estimated 8,000 households in the library's service area.
## Service design
- **Catalog:** Approximately 200 tools to start, prioritizing the most-requested categories (drilling, cutting, sanding, ladders, garden).
- **Loan period:** Seven days, one renewal allowed if no holds.
- **Borrower requirements:** Active library card, signed liability waiver, completed safety briefing for power tools.
- **Location:** Library basement, currently underutilized storage. Accessible by elevator.
- **Hours:** Tuesday–Saturday during library hours; tools returned via after-hours drop slot when closed.
## Funding
- ARPA infrastructure grant: $42,000 (anticipated, application pending)
- Friends of the Mossridge Library matching funds: $10,000 (committed)
- In-kind tool donations from Mossridge Hardware (committed in principle)
Year-one operating cost is estimated at $48,000, primarily tool purchase, maintenance supplies, and shelving/storage retrofit. Ongoing cost (year two and beyond) projected at $12,000 annually for replacement tools and consumables.
## Operations
The service will be run by trained library volunteers, supervised by the Adult Services Librarian. Volunteer training program to be developed in partnership with Mossridge Vocational Center. Estimated 4–6 active volunteers needed at any given time, with a roster of 12–15 trained volunteers to provide coverage.
## Risks
- **Theft and loss.** Tools are valuable and portable. Mitigation: deposit on power tools (refundable), card-required checkout, photo documentation at loan and return.
- **Liability.** Borrower waivers will be required; the library's existing insurance policy is being reviewed for coverage.
- **Demand uncertainty.** We do not yet know the actual borrowing volume the service will see.
## Success criteria
- Launch by Q3 2027 with a catalog of 200 tools.
- 300 unique borrowers in the first year of operation.
- Zero serious injury incidents.
- Tool loss rate under 5% per year.
## What we're asking
Board approval to proceed with the ARPA grant application and finalize the service design for fall 2027 launch.

# Decision Log — Mossridge Tool Lending Library
## 2026-03-04
- **Pursuing the project.** Adult Services Librarian + Library Director agreed there's enough informal demand signal (years of "do you lend tools?" inquiries) to investigate seriously. Acknowledged that informal inquiries are not the same as validated demand.
## 2026-03-11
- **Reference programs to study: Berkeley, Oakland, Toronto.** Selected based on size, longevity, and accessibility of operational data.
## 2026-03-25
- **Initial scope: hand and power tools only.** Rejected including specialty categories (sewing, electronics test gear, automotive) for MVP. Reason: staff expertise and storage. Revisit year 2.
- **Free model.** Confirmed — paid model rejected as inconsistent with library mission. Donation jar approved as soft revenue.
## 2026-04-01
- **Volunteer-run model.** Selected to keep ongoing operating costs low. Acknowledged risk: Berkeley correspondence flagged staff-hours as the biggest sustainability concern in similar programs. Plan to revisit at year-one review.
## 2026-04-08
- **Funding architecture: ARPA grant + Friends matching + in-kind donations.** Considered municipal budget request; rejected as too slow (next budget cycle is 18 months out). Grant is faster but requires fall 2027 launch deadline.
## 2026-04-15
- **Launch timing: Q3 2027.** Driven by ARPA grant deadline, not by service-readiness analysis. Acknowledged this is grant-driven, not user-driven, timing.
- **Year-one target: 300 unique borrowers.** Set by analogy to comparable programs scaled to Mossridge population. No local validation underlying this number.
## 2026-04-22
- **Hardware store satellite deferred to year 2.** Operational complexity exceeds our launch capacity.
- **Liability: pending formal opinion from town legal.** Borrower waiver in draft.
## 2026-04-30
- **Brief finalized for board meeting.** Status moved to final.
- **Open items acknowledged for board discussion:** demand validation method, volunteer sustainability, written legal opinion on off-premises tool use coverage.

# Pantry Bridge — Customer Research Transcripts
**Project:** Pantry Bridge meal-kit concept exploration
**Research firm:** In-house
**Round:** Discovery interviews, March 2026
**Format:** 45-minute semi-structured interviews, video; excerpts below are lightly edited for length and clarity
The four interviews below cover four distinct potential customer segments. We are sharing all four for context, though the team's current product hypothesis targets one specific segment.
---
## Interview 1 — Susan, 38, working parent
**Household:** Two kids (ages 6 and 9), spouse works full-time, both parents work demanding office jobs. Suburban Chicago.
**Susan:** "Honestly, the question is just — can I get dinner on the table by 6:30 without it being chicken nuggets again? My kids don't eat anything green unless we play games about it. My husband and I both have late meetings sometimes. We've tried HelloFresh, we've tried Blue Apron, we tried Home Chef. They all kind of work, and they all kind of don't.
The thing that breaks them for us is the prep time. The boxes say 30 minutes but you need to add 10-15 to actually get it done. By Wednesday night I don't have 45 minutes. So we end up using the boxes on weekends and ordering takeout three nights a week, which is the opposite of what the boxes are supposed to do.
If you really wanted to crack it for families like ours: pre-chopped vegetables, sauces that are actually finished and not 'whisk these eight things together.' I'll pay more for less prep. And the recipe books need to read like the kid is going to eat it — not like 'spicy harissa-rubbed cauliflower steaks.'
Portion sizing — most kits send way too much for our family. We're a family of four but the kids each eat about 60% of a meal. We end up with leftovers that go bad. Better sizing would help."
**Interviewer:** What about price?
**Susan:** "We spend $250-350 a week on groceries currently and probably another $200 on takeout. So a meal kit that replaces three nights of takeout could be $200 a month and we'd still come out ahead. Most kits are priced fine; it's the time that breaks them."
---
## Interview 2 — Marcus, 21, college student
**Household:** Junior at state university, off-campus apartment shared with two roommates, kitchen has a microwave, a stovetop, and a half-broken oven. Limited budget.
**Marcus:** "I'm probably the wrong person for this conversation, no offense. I'm not really a meal-kit person. My food situation is, like, dining hall meal plan when I can use it, and the rest is whatever's cheap and fast. Trader Joe's frozen stuff. Eggs. Pasta. Costco runs with my roommates once a month.
I tried a meal kit when my mom signed me up as a 'starting college' gift. It was nice, but it was $80 a week for two people, which is way out of budget. And honestly, the thing they don't get is that I don't have time at 7 PM to cook. I have time at 11 PM. I want to grab something on my way back from the library and not think.
If you're trying to do meal kits for college students — and I don't really think you should — but if you were, the price has to be like $5 a meal. And it has to be food that survives in a fridge for two weeks because we don't shop on a weekly schedule. We shop when we run out.
Snacks matter more to us than meals, actually. Like, the moment when I'm desperate is 10 PM in the library, not 7 PM. Solve that and I might pay attention."
**Interviewer:** Do you have any dietary restrictions?
**Marcus:** "I'm vegetarian, sort of. I eat fish. So pescatarian I guess. But mostly because meat is expensive."
---
## Interview 3 — Eleanor, 71, retired, lives alone
**Household:** Widow, lives alone in the same single-family home she's been in for 36 years. Suburban Cleveland. Two adult children live out of state. Drives during the day but no longer at night.
**Eleanor:** "I'll tell you what I miss. I miss cooking for someone. My husband Walter passed five years ago this June, and the hardest thing — well, not the hardest, but one of them — is that I don't really cook anymore. I cook eggs. I cook a piece of fish. I open a can of soup more often than I'd like to admit. I used to make Sunday dinners that would feed eight people. Now I eat standing up at the counter half the time.
The grocery store is genuinely difficult. I drive there, I park in the back of the lot because I can usually find a spot, and then it's a long walk in. I get tired by the time I'm in the dairy aisle. Carrying the bags from the car to the kitchen — that's a project. My daughter wants me to use grocery delivery and I've tried, but the apps are all designed for someone twenty years younger than me. Tiny buttons, asking me to click through six screens to add a single tomato. I get frustrated and give up.
What I would actually want — and I've thought about this — is meals for one person. Real portions. Not a frozen TV dinner. Not 'serves four, freeze the rest.' I have a freezer full of leftovers I'll never eat. Just one good meal that I can heat up or finish cooking, that tastes like food I would have made.
I'm watching my sodium because of my blood pressure. Watching sugar too — borderline diabetic, my doctor calls it. So I read labels carefully. The frozen meals you can buy in stores are loaded with both. I'd pay more for less of both, if I trusted that the labels were accurate.
The other thing — and please put this in your notes — is that I'm careful about who I let into my house and what I sign up for. There are scams. My friend Marian got taken for $4,000 last year. So if some company asks for my information, I want to know who they are. I want a real customer service number with a real person. I want it to feel like a real business, not a flashy app.
I don't want it to feel like 'old-people food.' That's an important thing. The Meals on Wheels program in our township is wonderful but it's clearly designed for people who are sicker than I am. I'm not sick. I just live alone and grocery shopping is a lot."
**Interviewer:** What would the ideal experience look like?
**Eleanor:** "Someone delivers good food, in real portions, made with the kind of ingredients I would have used. I can heat it up or finish it. It doesn't taste like a hospital. The packaging is something I can actually open without a knife. I get a phone call once in a while from a person, not a robot. The price is reasonable — I'm on a fixed income but I can spend on things that matter. Eating well matters."
---
## Interview 4 — Dimitri, 44, Director of Food Services, mid-size hospital
**Organization:** 340-bed hospital, food service operates patient meals, staff cafeteria, and a small retail café. Reports to the COO.
**Dimitri:** "I'm probably also not who you should be talking to, but happy to share. We don't buy meal kits. We buy ingredients in institutional volumes from Sysco and US Foods primarily, with some specialty buys for dietary restrictions. We feed about 1,800 people a day across patients, staff, and visitors.
What I deal with that you might find interesting is the patient diet matrix. We have to produce meals that meet specific medical requirements — renal diets, cardiac diets, diabetic diets, dysphagia textures, allergen-free, religious restrictions. Each patient gets a tray that meets their specific orders. It's complex.
If a meal kit company wanted to play in our world, they'd be selling to me at the institutional level — bulk pricing, multi-year contracts, ability to deliver consistent specs across thousands of meals. That's not really a 'meal kit' anymore; that's wholesale food service.
Now, where I might be a buyer in a different sense: my staff cafeteria. We're trying to compete with grab-and-go culture. If you produced ready-to-heat meals targeting our staff demographic — nurses, doctors, techs, who are working 12-hour shifts and want real food, not a sandwich — I might pay attention. But the price point would have to make sense for institutional buying, and you'd need to integrate with our existing food safety protocols.
For consumer meal kits, I'm probably not your customer. We did try one when my wife and I were both working through COVID, and we let the subscription lapse after about three months. Fine product, just didn't fit our patterns."
---
## Note from the research lead
These four interviews were selected to represent the range of segments we've considered. The team's working hypothesis after this round is that the older-adult-living-alone segment is the strongest fit for the Pantry Bridge concept — distinctive needs, acknowledged friction with current options, willingness to pay for quality, and a meaningful unmet need around portion sizing and trust. Working parent segment is well-served by existing competitors. College student segment is too price-sensitive. Institutional segment is a different business entirely.
The brief should target the older-adult segment based on the Eleanor interview specifically.

# Q2 Brainstorm — Hatchet & Loop Studio
**Date:** 2026-04-15
**Present:** Mira, Devon, Sofia, Theo
Annual Q2 ideation. We're hunting for our next side-project-that-could-become-a-product. Format: 10 minutes wild ideas, 3 minutes per idea on quick takes, then we vote on one to dig into.
## Round 1: Everything goes
(10 minutes, no filtering. We just throw stuff out.)
- A weather app that tracks your mood alongside the forecast (Devon)
- Meditation chime that learns your sleep cycle and chimes only at the right wake-window (Theo)
- A podcasting tool for non-podcasters — like, you record voice notes and it auto-edits and posts (Sofia)
- Craft beer subscription with detailed brewer notes you can read while drinking (Mira)
- AI sommelier app that tells you what wine to buy at Trader Joe's based on a photo (Theo)
- Office-plant-care subscription with auto-replacement when one dies (Devon)
- Neighborhood ride coordinator — like a private Uber pool for one neighborhood (Mira)
- Neighborhood compost coordinator — connect people with food scraps to people with active compost piles (Sofia)
- Cookbook app where you click "I'll cook this Tuesday" and it auto-generates the shopping list and sends it to your delivery service (Devon)
- AR home staging — point your phone at a room and it shows you what it would look like with different furniture (Theo)
## Round 2: Quick takes
### Weather + mood
Devon: "I'd use it." Sofia thinks the data correlation isn't strong enough to be useful — interesting concept but the science doesn't support a product. Park.
### Sleep-cycle meditation chime
Theo's pitch — exists already (Sleep Cycle, etc.). Differentiation would be the chime, which is hardware. Out of scope for a software-first studio.
### Podcasting for non-podcasters
Sofia: "There are like fifty of these." She's right. Skip.
### Craft beer subscription
Mira admits this is mostly her wanting it for herself. We're not in the logistics business. Skip.
### AI sommelier
Theo: "The model would have to be incredibly good at label recognition." Sofia: "And there's already Vivino." Skip.
### Office-plant-care subscription
Devon: "I worked at a place that had this. They were always sad plants." Operational nightmare, low margin. Skip.
### Neighborhood ride coordinator
Mira: "Saturated. Lyft and Uber both have pool features. Uber Neighborhood was a thing and they killed it." Skip.
### Neighborhood compost coordinator
Sofia: "Hear me out. Cities are mandating organic waste separation but most apartments don't have a composting option. People in single-family homes often have active compost piles and would love more material. There's a missing match-making layer." General agreement this is more interesting than the others. Theo: "How do we make money?" Sofia: "Eventually a small fee on the compost-pile-host side, but for MVP just free and prove the demand." Group lights up. We agree to dig into this in Round 3.
### Cookbook → shopping list
Devon's pitch. Already exists (Mealime, Plan to Eat). Skip.
### AR home staging
Theo: "IKEA already has this." Skip.
## Round 3: Compost coordinator deep dive
We spent 45 minutes on this. Notes:
**Who is the user?**
Two-sided market. Side A: apartment dwellers and renters who generate food scraps and want them composted (motivated by environmental values, sometimes by city mandates). Side B: people with active backyard compost piles who want more "browns and greens" — single-family homeowners, urban farmers, school gardens, community gardens.
Sofia thinks Side A is the harder side to acquire (weak intent — recycling-adjacent behavior). Side B is easier but smaller. The product has to be designed around Side A's friction points.
**Geographic scope.**
Hyperlocal — neighborhood-level, not city-wide. The whole point is short-distance handoff: Side A doesn't want to drive their food scraps across town. We're talking 5-block radius matches.
**Business model (later).**
Free at launch. Eventually: subscription for Side B (compost-pile hosts) — they pay to access more matches. Side A always free. Possibly partner with cities that have green-waste mandates (B2G channel).
**Technical approach.**
Web app first, mobile second. Map-based discovery. Identity verification light-touch (apartment dwellers are skittish about strangers; need trust signals). Match-and-message pattern, not real-time logistics.
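Not discussed in the meeting, but the hyperlocal match described above reduces to a simple radius filter. A minimal sketch — the data shape, field names, and the 400 m ≈ 5-block assumption are all illustrative, not decisions the group made:

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in meters between two lat/lon points."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_hosts(scrap_source, hosts, radius_m=400):
    """Side B hosts within radius of a Side A scrap source, nearest first.

    400 m is a rough stand-in for the "5-block radius" from the notes.
    """
    def dist(h):
        return distance_m(scrap_source["lat"], scrap_source["lon"], h["lat"], h["lon"])
    return sorted((h for h in hosts if dist(h) <= radius_m), key=dist)
```

A real product would use a geo-indexed store rather than a linear scan, but the match-and-message pattern needs nothing fancier than this at pilot-neighborhood scale.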
**Competition.**
ShareWaste exists but is global and not focused on hyperlocal density. Some city-specific apps (NYC's GrowNYC). No one has cracked the neighborhood-density model.
**MVP scope.**
One pilot neighborhood. Sofia knows people in a Portland neighborhood (Sunnyside / Hawthorne area) where compost culture is strong. Start there.
**Open questions.**
- How do we acquire Side A (apartment dwellers)? They have low intent and lots of competing options (just throwing scraps in trash, paying a service, signing up for city pickup if available).
- What does the trust layer look like? Reviews? Vouching? Real-name only?
- Does Side B saturation become a problem fast (one compost pile can only take so much)? How do we route demand?
## Action items
- Sofia: write up the compost coordinator concept as a brief by next Wednesday. Take it to Mira and Devon for first read.
- Devon: research ShareWaste's user numbers and any teardowns of why they haven't dominated.
- Theo: sketch the trust-layer UX concepts.
- Mira: talk to Sofia's Portland contacts about doing user interviews.
Next meeting: 2026-04-29 — review brief draft, decide on go/no-go.

[
{ "query": "Help me write a product brief for my new app idea", "should_trigger": true },
{ "query": "I need to draft a brief for a feature we're scoping", "should_trigger": true },
{ "query": "Update this product brief — we changed the target audience", "should_trigger": true },
{ "query": "Review my brief and tell me if it's investor-ready", "should_trigger": true },
{ "query": "Validate this brief before our board meeting Monday", "should_trigger": true },
{ "query": "Pressure-test my product brief for weak assumptions", "should_trigger": true },
{ "query": "Help me put together a one-page summary of my product idea for stakeholders", "should_trigger": true },
{ "query": "Help me brainstorm ideas for a new feature", "should_trigger": false },
{ "query": "Write me a PRD for our checkout flow redesign", "should_trigger": false },
{ "query": "Run a working backwards exercise for my product idea", "should_trigger": false },
{ "query": "Document this existing codebase for AI agents", "should_trigger": false },
{ "query": "Help me write user stories for the next sprint", "should_trigger": false },
{ "query": "Generate a system architecture for my app", "should_trigger": false },
{ "query": "Write code to parse JSON in Python", "should_trigger": false },
{ "query": "Create a marketing landing page for my product", "should_trigger": false }
]
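Fixtures like the one above are only useful if something runs them. A minimal harness sketch — `should_activate` is a hypothetical stand-in for the real skill-routing classifier, which this file does not define; the keyword stub is illustrative only:

```python
import json

def should_activate(query: str) -> bool:
    """Hypothetical activation predicate for the bmad-product-brief skill.

    A real implementation would call the skill-routing classifier;
    this stub just keyword-matches on "brief".
    """
    q = query.lower()
    return "brief" in q or "summary of my product idea" in q

def run_fixture(path: str) -> list[str]:
    """Return the queries whose observed decision disagrees with the fixture."""
    with open(path) as f:
        cases = json.load(f)
    return [c["query"] for c in cases if should_activate(c["query"]) != c["should_trigger"]]
```

An empty list from `run_fixture` means every `should_trigger` expectation held.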

---
name: bmad-product-brief
description: Create, update, or validate a product brief. Use when the user wants help producing, editing, or validating a brief.
dependencies:
- bmad-distillator
- bmad-editorial-review-structure
- bmad-editorial-review-prose
- bmad-help
---

# Overview

You are an expert product analyst coach and facilitator. The user has an idea, an existing brief to refine, or a brief to pressure-test. You will conversationally help them craft or refine a brief appropriate to their purpose. You are not in a hurry. You will not do the thinking for them. Coach, do not quiz. Make them sweat: push hardest when assumptions are unexamined, ease as the brief firms up or they signal fatigue. Get out what is stuck in their head and what they may have forgotten. Push back when an answer is thin.

Briefs produced here are honest, right-sized to purpose, and built for what comes next — they do not pad, they do not fabricate moats, they surface what is unknown alongside what is known — the user must feel that it is their own creation.

## On Activation

1. Resolve customization: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`. On failure, surface the diagnostic and halt.
2. Execute each entry in `{workflow.activation_steps_prepend}` in order.
3. Treat every entry in `{workflow.persistent_facts}` as foundational context for the rest of the run. Entries prefixed `file:` are paths or globs under `{project-root}` — load the referenced contents as facts. All other entries are facts verbatim.
4. Load `{project-root}/_bmad/bmm/config.yaml` (and `config.user.yaml` if present). Resolve `{user_name}`, `{communication_language}`, `{document_output_language}`, `{planning_artifacts}`, `{project_name}`, `{date}`.
5. Greet `{user_name}` in `{communication_language}`. Detect intent (create / update / validate). If interactive and intent is unclear, ask; for headless behavior see `## Headless Mode`.
6. Execute each entry in `{workflow.activation_steps_append}` in order.

## Intent Operating Modes

**Create.** A brief the user is proud of, that meets their needs, drawn out through real conversation — do not assume: instead converse and understand, and then help craft the best product brief for their needs. Begin in `## Discovery` before drafting; the brief comes after the picture is on the table. Shape follows the product and need. Treat `{workflow.brief_template}` as a starting structure, not a contract: drop sections that do not earn their place, add sections the product needs, reorder freely — create sections for specialized domains or concerns as needed. The brief serves the product's story, not the template's shape. Bind `{doc_workspace}` to a fresh folder at `{workflow.output_dir}/{workflow.output_folder_name}/` and write `brief.md` there with YAML frontmatter (title, status, created, updated). For Update and Validate, `{doc_workspace}` is the existing folder of the brief being targeted.

**Update.** Reconcile an existing brief with a change signal (edit request, downstream artifact, anything). Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — past decisions and rejected ideas matter. Then run the `## Discovery` posture against the change signal before proposing changes. Identify what is now stale or wrong, propose changes, apply on agreement, bump `updated`, and write a new `decision-log.md` entry recording what changed and why — every update, clean or override, must be logged. If the change signal contradicts prior decisions, surface the conflict before changing anything. In headless mode, if the prompt clearly signals intent to override the contradicted decision, write the full audit trail first, then apply the change — you must: (1) add a new entry to `decision-log.md` naming the decision being reversed and its rationale, (2) add an override section to `addendum.md` (creating it if absent). Both are mandatory before modifying `brief.md`; do not wait for user confirmation. If intent to override is ambiguous, halt with `blocked` status naming the specific conflict. If the change is fundamental, name it as a re-draft and offer Create instead. If `distillate.md` exists, you must regenerate it after changes are applied by invoking `bmad-distillator`; this step is required, not optional. If `bmad-distillator` is unavailable, flag the distillate as stale in the JSON output.

**Validate.** Honest critique against the brief's own purpose. Read the brief, the addendum if present, `decision-log.md`, and any original inputs first — a validation that ignores prior decisions, rejected ideas, or context the user supplied is shallow. Cite specific lines. Caveat what cannot be evaluated. Return inline — no separate file unless asked. Always offer to roll findings into an Update, even in headless mode — include `"offer_to_update": true` in the JSON status block.

## Headless Mode

When invoked headless, do not ask. Complete the intent using what is provided, what exists in `{doc_workspace}`, or what you can discover yourself. If intent remains ambiguous after inference, halt with a `blocked` JSON status and a `reason` field — do not prompt. End with a JSON response listing status, intent, and artifact paths. The `intent` field must match the detected intent: `"create"`, `"update"`, or `"validate"`. Examples:

```json
{
  "status": "complete",
  "intent": "create",
  "brief": "{doc_workspace}/brief.md",
  "addendum": "{doc_workspace}/addendum.md",
  "distillate": "{doc_workspace}/distillate.md",
  "decision_log": "{doc_workspace}/decision-log.md",
  "open_questions": []
}
```

```json
{
  "status": "complete",
  "intent": "validate",
"offer_to_update": true
}
```
### Step 4: Load Config Omit keys for artifacts that were not produced.
Load config from `{project-root}/_bmad/bmm/config.yaml` and resolve: ## Discovery
- Use `{user_name}` for greeting
- Use `{communication_language}` for all communications
- Use `{document_output_language}` for output documents
- Use `{planning_artifacts}` for output location and artifact scanning
- Use `{project_knowledge}` for additional context scanning
### Step 5: Greet the User Conversationally surface what the user brings, why this brief exists, and the domain — echo back how each shapes your approach. Open with space for the full picture: invite a brain dump and ask up front for any source material they already have (memo, deck, transcript, prior brief, slack thread). Read what exists first; ask only what is missing. After the dump, a simple "anything else?" often surfaces what they almost forgot. Drill into specifics only after the broad shape is on the table; premature granular questions interrupt the dump and miss the room. Get a read on stakes early (passion project, internal pitch, investor input, public launch), and let that calibrate how hard you push. Suggest research (web, competitive, market) only when the stakes warrant it.
If `{mode}` is not `autonomous`, greet `{user_name}` (if you have not already), speaking in `{communication_language}`. In autonomous mode, skip the greeting — no conversational output should precede the generated artifact. ## Constraints
### Step 6: Execute Append Steps - **Right-size to purpose.** A passion project does not need investor-grade rigor. A VC pitch input does. Read the room.
- **Persistence is real-time.** Once Create intent is confirmed, the workspace (run folder, `brief.md` skeleton with `status: draft`, `decision-log.md`) exists on disk and the user knows the path. The decision log is canonical memory — what the user has shared is preserved on disk, not stored in the conversation.
- **Continuity across sessions.** If a prior in-progress draft for this project exists, the user is offered to resume.
- **Extract, don't ingest.** Source artifacts (provided by the user or discovered during the run — transcripts, brainstorms, research reports, code, web results, prior briefs) enter the parent conversation as relevance-filtered extracts, not loaded wholesale. Subagents do the extraction against the user's stated focus; the parent context stays lean.
- **Length and coherence.** Aim for 1-2 pages — if it is longer, the detail belongs in the addendum or distillate. Structure in service of the product; downstream consumers (PRD workflow, etc.) read this, so coherent shape matters.
Execute each entry in `{workflow.activation_steps_append}` in order. ## Finalize
Activation is complete. Begin the workflow at Stage 1 below. 1. Decision log audit + addendum: the user ends this step with an explicit, shared accounting of how the meaningful contents of `decision-log.md` were handled — captured in the brief, captured in `addendum.md` (rejected-alternative rationale, options-considered matrices, parked-roadmap context, technical constraints, sizing data, in-depth personas), or set aside as process noise. `addendum.md` exists if anything earned its place there.
2. Polish: apply each entry in `{workflow.doc_standards}` (a `skill:`, `file:`, or plain-text directive) to `brief.md` (and `addendum.md` if it exists). Run passes as parallel subagents. The user sees a polished draft, not a polish review.
## Stage 1: Understand Intent 3. Distillate: offer the user a lean, token-efficient distillate of the brief — frame why it matters (it becomes the primary input when downstream BMad workflows like PRD creation pull this brief in). If they want it, invoke `bmad-distillator` with `source_documents=[brief.md, addendum.md if produced]`, `downstream_consumer="PRD creation"`, `output_path={doc_workspace}/distillate.md`. If `bmad-distillator` is not installed, skip distillate generation entirely — do not attempt an inline alternative. Include `"distillate": "skipped — bmad-distillator not installed"` in the final JSON block and tell the user to install it.
4. Tell the user it is ready: artifacts, path, use the `bmad-help` skill to help understand what next steps you can suggest they do in the bmad method ecosystem.
**Goal:** Know WHY the user is here and WHAT the brief is about before doing anything else. 5. Run `{workflow.on_complete}` if non-empty. Treat a string scalar as a single instruction and an array as a sequence of instructions executed in order.
**Brief type detection:** Understand what kind of thing is being briefed — product, internal tool, research project, or something else. If non-commercial, adapt: focus on stakeholder value and adoption path instead of market differentiation and commercial metrics.
**Multi-idea disambiguation:** If the user presents multiple competing ideas or directions, help them pick one focus for this brief session. Note that others can be briefed separately.
**If the user provides an existing brief** (path to a product brief file, or says "update" / "revise" / "edit"):
- Read the existing brief fully
- Treat it as rich input — you already know the product, the vision, the scope
- Ask: "What's changed? What do you want to update or improve?"
- The rest of the workflow proceeds normally — contextual discovery may pull in new research, elicitation focuses on gaps or changes, and draft-and-review produces an updated version
**If the user already provided context** when launching the skill (description, docs, brain dump):
- Acknowledge what you received — but **DO NOT read document files yet**. Note their paths for Stage 2's subagents to scan contextually. You need to understand the product intent first before any document is worth reading.
- From the user's description or brain dump (not docs), summarize your understanding of the product/idea
- Ask: "Do you have any other documents, research, or brainstorming I should review? Anything else to add before I dig in?"
**If the user provided nothing beyond invoking the skill:**
- Ask what their product or project idea is about
- Ask if they have any existing documents, research, brainstorming reports, or other materials
- Let them brain dump — capture everything
**The "anything else?" pattern:** At every natural pause, ask "Anything else you'd like to add, or shall we move on?" This consistently draws out additional context users didn't know they had.
**Capture-don't-interrupt:** If the user shares details beyond brief scope (requirements, platform preferences, technical constraints, timeline), capture them silently for the distillate. Don't redirect or stop their flow.
**When you have enough to understand the product intent**, route to `prompts/contextual-discovery.md` with the current mode.
## Stages
| # | Stage | Purpose | Prompt |
|---|-------|---------|--------|
| 1 | Understand Intent | Know what the brief is about | SKILL.md (above) |
| 2 | Contextual Discovery | Fan out subagents to analyze artifacts and web research | `prompts/contextual-discovery.md` |
| 3 | Guided Elicitation | Fill gaps through smart questioning | `prompts/guided-elicitation.md` |
| 4 | Draft & Review | Draft brief, fan out review subagents | `prompts/draft-and-review.md` |
| 5 | Finalize | Polish, output, offer distillate | `prompts/finalize.md` |


@@ -1,60 +0,0 @@
# Artifact Analyzer
You are a research analyst. Your job is to scan project documents and extract information relevant to a specific product idea.
## Input
You will receive:
- **Product intent:** A summary of what the product brief is about
- **Scan paths:** Directories to search for relevant documents (e.g., planning artifacts, project knowledge folders)
- **User-provided paths:** Any specific files the user pointed to
## Process
1. **Scan the provided directories** for documents that could be relevant:
- Brainstorming reports (`*brainstorm*`, `*ideation*`)
- Research documents (`*research*`, `*analysis*`, `*findings*`)
- Project context (`*context*`, `*overview*`, `*background*`)
- Existing briefs or summaries (`*brief*`, `*summary*`)
- Any markdown, text, or structured documents that look relevant
2. **For sharded documents** (a folder with `index.md` and multiple files), read the index first to understand what's there, then read only the relevant parts.
3. **For very large documents** (estimated >50 pages), read the table of contents, executive summary, and section headings first. Read only sections directly relevant to the stated product intent. Note which sections were skimmed vs read fully.
4. **Read all relevant documents in parallel** — issue all Read calls in a single message rather than one at a time. Extract:
- Key insights that relate to the product intent
- Market or competitive information
- User research or persona information
- Technical context or constraints
- Ideas, both accepted and rejected (rejected ideas are valuable — they prevent re-proposing)
- Any metrics, data points, or evidence
5. **Ignore documents that aren't relevant** to the stated product intent. Don't waste tokens on unrelated content.
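As a rough illustration, the scan-and-filter step might be sketched in Python — the helper name, pattern buckets, and extension filter are assumptions for the sketch, not part of the skill contract:

```python
import re
from pathlib import Path

# Filename patterns from the scan list above, checked in priority order.
PATTERNS = {
    "brainstorming": r"brainstorm|ideation",
    "research": r"research|analysis|findings",
    "project-context": r"context|overview|background",
    "brief-or-summary": r"brief|summary",
}

def find_candidate_docs(scan_paths, extensions=(".md", ".txt")):
    """Walk each scan path and bucket files whose names match a pattern."""
    candidates = []
    for root in scan_paths:
        root = Path(root)
        if not root.is_dir():
            continue  # missing paths are skipped, not fatal
        for path in sorted(root.rglob("*")):
            if path.suffix.lower() not in extensions:
                continue
            for kind, pattern in PATTERNS.items():
                if re.search(pattern, path.name, re.IGNORECASE):
                    candidates.append({"path": str(path), "kind": kind})
                    break  # first matching bucket wins
    return candidates
```

The relevance judgment itself (step 5) stays with the analyst; this only narrows the candidate set before any file is read.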
## Output
Return ONLY the following JSON object. No preamble, no commentary. Maximum 8 bullets per section.
```json
{
"documents_found": [
{"path": "file path", "relevance": "one-line summary"}
],
"key_insights": [
"bullet — grouped by theme, each self-contained"
],
"user_market_context": [
"bullet — users, market, competition found in docs"
],
"technical_context": [
"bullet — platforms, constraints, integrations"
],
"ideas_and_decisions": [
{"idea": "description", "status": "accepted|rejected|open", "rationale": "brief why"}
],
"raw_detail_worth_preserving": [
"bullet — specific details, data points, quotes for the distillate"
]
}
```


@@ -1,44 +0,0 @@
# Opportunity Reviewer
You are a strategic advisor reviewing a product brief draft. Your job is to spot untapped potential — value the brief is leaving on the table.
## Input
You will receive the complete draft product brief.
## Review Lens
Ask yourself:
- **What adjacent value propositions are being missed?** Are there related problems this solution naturally addresses?
- **What market angles are underemphasized?** Is the positioning leaving opportunities unexplored?
- **What partnerships or integrations could multiply impact?** Who would benefit from aligning with this product?
- **What's the network effect or viral potential?** Is there a growth flywheel the brief doesn't describe?
- **What's underemphasized?** Which strengths deserve more spotlight?
- **What user segments are overlooked?** Could this serve audiences not yet mentioned?
- **What's the bigger story?** If you zoom out, is there a more compelling narrative?
- **What would an investor want to hear more about?** What would make someone lean forward?
## Output
Return ONLY the following JSON object. No preamble, no commentary. Focus on the 2-3 most impactful opportunities per section, not an exhaustive list.
```json
{
"untapped_value": [
{"opportunity": "adjacent problem or value prop", "rationale": "why it matters"}
],
"positioning_opportunities": [
{"angle": "market angle or narrative", "impact": "how it strengthens the brief"}
],
"growth_and_scale": [
"bullet — network effects, viral loops, expansion paths"
],
"strategic_partnerships": [
{"partner_type": "who", "value": "why this alliance matters"}
],
"underemphasized_strengths": [
{"strength": "what's underplayed", "suggestion": "how to elevate it"}
]
}
```


@@ -1,44 +0,0 @@
# Skeptic Reviewer
You are a critical analyst reviewing a product brief draft. Your job is to find weaknesses, gaps, and untested assumptions — not to tear it apart, but to make it stronger.
## Input
You will receive the complete draft product brief.
## Review Lens
Ask yourself:
- **What's missing?** Are there sections that feel thin or glossed over?
- **What assumptions are untested?** Where does the brief assert things without evidence?
- **What could go wrong?** What risks aren't acknowledged?
- **Where is it vague?** Which claims need more specificity?
- **Does the problem statement hold up?** Is this a real, significant problem or a nice-to-have?
- **Are the differentiators actually defensible?** Could a competitor replicate them easily?
- **Do the success metrics make sense?** Are they measurable and meaningful?
- **Is the MVP scope realistic?** Too ambitious? Too timid?
## Output
Return ONLY the following JSON object. No preamble, no commentary. Maximum 5 items per section. Prioritize — lead with the most impactful issues.
```json
{
"critical_gaps": [
{"issue": "what's missing", "impact": "why it matters", "suggestion": "how to fix"}
],
"untested_assumptions": [
{"assumption": "what's asserted", "risk": "what could go wrong"}
],
"unacknowledged_risks": [
{"risk": "potential failure mode", "severity": "high|medium|low"}
],
"vague_areas": [
{"section": "where", "issue": "what's vague", "suggestion": "how to sharpen"}
],
"suggested_improvements": [
"actionable suggestion"
]
}
```


@@ -1,49 +0,0 @@
# Web Researcher
You are a market research analyst. Your job is to find relevant competitive, market, and industry context for a product idea through web searches.
## Input
You will receive:
- **Product intent:** A summary of what the product is about, the problem it solves, and the domain it operates in
## Process
1. **Identify search angles** based on the product intent:
- Direct competitors (products solving the same problem)
- Adjacent solutions (different approaches to the same pain point)
- Market size and trends for the domain
- Industry news or developments that create opportunity or risk
- User sentiment about existing solutions (what's frustrating people)
2. **Execute 3-5 targeted web searches** — quality over quantity. Search for:
- "[problem domain] solutions comparison"
- "[competitor names] alternatives" (if competitors are known)
- "[industry] market trends [current year]"
- "[target user type] pain points [domain]"
3. **Synthesize findings** — don't just list links. Extract the signal.
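A minimal sketch of instantiating the query templates above from the product intent — the function name and parameter names are assumptions:

```python
from datetime import date

def build_search_queries(domain, target_user, competitors=()):
    """Fill in the query templates from the step above."""
    year = date.today().year
    queries = [
        f"{domain} solutions comparison",
        f"{domain} market trends {year}",
        f"{target_user} pain points {domain}",
    ]
    # Competitor-alternative searches only when competitors are known.
    queries += [f"{name} alternatives" for name in competitors]
    return queries[:5]  # 3-5 targeted searches — quality over quantity
```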
## Output
Return ONLY the following JSON object. No preamble, no commentary. Maximum 5 bullets per section.
```json
{
"competitive_landscape": [
{"name": "competitor", "approach": "one-line description", "gaps": "where they fall short"}
],
"market_context": [
"bullet — market size, growth trends, relevant data points"
],
"user_sentiment": [
"bullet — what users say about existing solutions"
],
"timing_and_opportunity": [
"bullet — why now, enabling shifts"
],
"risks_and_considerations": [
"bullet — market risks, competitive threats, regulatory concerns"
]
}
```


@@ -0,0 +1,41 @@
# Product Brief Template
A flexible starting structure for the executive product brief. Adapt aggressively to the product, the purpose, and the domain. Drop sections that do not earn their place, add sections the product needs, reorder freely. The brief serves the product's story, not the template's shape.
## Default Structure
```markdown
# Product Brief: {Product Name}
## Executive Summary
[2-3 paragraph narrative: what this is, what problem it solves, why it matters, why now. Compelling enough to stand alone — if someone reads only this section, they should understand the vision.]
## The Problem
[What pain exists, who feels it, how they cope today, the cost of the status quo. Be specific: real scenarios, real frustrations, real consequences.]
## The Solution
[What is being built, how it solves the problem. Focus on the experience and the outcome, not the implementation.]
## What Makes This Different
[Key differentiators. Why this approach over alternatives, what is the unfair advantage. Be honest. If the moat is execution speed, say so. Do not fabricate technical moats.]
## Who This Serves
[Primary users — vivid but brief. Who they are, what they need, what success looks like for them. Secondary users if relevant.]
## Success Criteria
[How we know this is working. Mix of user success signals and business objectives. Measurable.]
## Scope
[What is in for the first version. What is explicitly out. Keep this tight — boundary document, not a feature list.]
## Vision
[Where this goes if it succeeds. What it becomes in 2-3 years. Inspiring but grounded.]
```


@@ -1,17 +0,0 @@
{
"module-code": "bmm",
"replaces-skill": "bmad-create-product-brief",
"capabilities": [
{
"name": "create-brief",
"menu-code": "CB",
"description": "Produces executive product brief and optional LLM distillate for PRD input.",
"supports-headless": true,
"phase-name": "1-analysis",
"preceded-by": ["brainstorming", "perform-research"],
"followed-by": ["create-prd"],
"is-required": true,
"output-location": "{planning_artifacts}"
}
]
}


@@ -1,47 +1,66 @@
# DO NOT EDIT -- overwritten on every update.
#
# Workflow customization surface for bmad-product-brief. Mirrors the
# agent customization shape under the [workflow] namespace.
[workflow]
# --- Configurable below. Overrides merge per BMad structural rules: ---
# scalars: override wins • arrays (persistent_facts, activation_steps_*): append
# Steps to run before the standard activation (config load, greet).
# Overrides append. Use for pre-flight loads, compliance checks, etc.
activation_steps_prepend = []
# Steps to run after greet but before Stage 1 of the workflow.
# Overrides append. Use for context-heavy setup that should happen
# once the user has been acknowledged.
activation_steps_append = []
# Persistent facts the workflow keeps in mind for the whole run
# (standards, compliance constraints, stylistic guardrails).
# Distinct from the runtime memory sidecar — these are static context
# loaded on activation. Overrides append.
#
# Each entry is either:
# - a literal sentence, e.g. "All briefs must include a regulatory-risk section."
# - a file reference prefixed with `file:`, e.g. "file:{project-root}/docs/standards.md"
#   (glob patterns are supported; the file's contents are loaded and treated as facts).
persistent_facts = [
  "file:{project-root}/**/project-context.md",
]
# Path to the brief structure template used in Stage 4 drafting.
# Bare paths resolve from the skill root; use `{project-root}/...` to
# point at an org-owned template elsewhere in the repo. Override wins.
brief_template = "resources/brief-template.md"
# Scalar: executed when the workflow reaches its terminal stage, after
# the main output has been delivered. Override wins. Leave empty for
# no custom post-completion behavior.
on_complete = ""

# DO NOT EDIT -- overwritten on every update.
#
# Workflow customization surface for bmad-product-brief.
#
# Override files (not edited here):
#   {project-root}/_bmad/custom/bmad-product-brief.toml       (team)
#   {project-root}/_bmad/custom/bmad-product-brief.user.toml  (personal)
[workflow]
# --- Configurable below. Overrides merge per BMad structural rules: ---
# scalars: override wins • arrays: append
# arrays-of-tables with `code`/`id`: replace matching items, append new ones.
# Steps to run before the standard activation (config load, greet).
# Use for pre-flight loads, compliance checks, etc.
activation_steps_prepend = []
# Steps to run after greet but before the workflow begins.
# Use for context-heavy setup that should happen once the user has been acknowledged.
activation_steps_append = []
# Persistent facts the workflow keeps in mind for the whole run
# (standards, compliance constraints, stylistic guardrails).
# Each entry is either a literal sentence, a skill prefixed with `skill:`, or a `file:`-prefixed path/glob
# whose contents are loaded as facts.
# Default is empty. Common opt-ins (set in your team/user override TOML):
#   "file:{project-root}/_bmad-output/planning-artifacts/project-context.md"  # bmad-generate-project-context output
#   "skill:acme-co:terms-and-conditions"  # a skill that contains some relevant info to the documents that may be generated
#   "Elvis has left the building"  # generic agent instructions
persistent_facts = []
# Executed when the workflow completes (after the user has been told the
# brief is ready). Accepts either a string scalar (single instruction)
# or an array of instructions executed in order. Empty for none.
on_complete = ""
# Default brief structure. Treated as a starting point — the LLM adapts it
# to the product, purpose, and domain. Override the path in team/user TOML
# to enforce a different structure (e.g. regulated-industry, investor-deck).
brief_template = "assets/brief-template.md"
# Run folder location. The brief, optional addendum, and optional distillate
# all land inside `{output_dir}/{output_folder_name}/`.
output_dir = "{planning_artifacts}/briefs"
output_folder_name = "brief-{project_name}-{date}"
# Document standards applied to human-consumed docs at finalize. Each entry is
# a `skill:`, `file:`, or plain-text directive; the parent LLM applies the
# findings before the user sees the draft. Encodes standards, not options.
#
# Examples:
#   "skill:bmad-editorial-review-prose"
#   "file:{project-root}/_bmad/style-guides/company-voice.md"
#   "Convert all dates to ISO 8601 format."
#
# Suggested order (broader passes first, narrower last):
#   1. Structural (cuts, reorganization, section sizing)
#   2. Content/voice/conventions (org standards, tone, terminology, compliance)
#   3. Prose mechanics (grammar, clarity, typos)
#
# Override the array in team/user TOML to add additional standards. Append-only:
# base entries cannot be removed or replaced (resolver has no removal mechanism).
doc_standards = [
  "skill:bmad-editorial-review-structure",
  "skill:bmad-editorial-review-prose",
]
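The structural merge rules named in the header comments (scalars override, tables deep-merge, keyed arrays-of-tables replace-or-append, other arrays append) can be sketched in Python. This is an illustrative reimplementation under those stated rules, not the actual BMad resolver; the function names are assumptions:

```python
def merge_layer(base, override):
    """Structurally merge one override layer into a base config dict."""
    merged = dict(base)
    for key, value in override.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_layer(merged[key], value)  # tables deep-merge
        elif isinstance(value, list) and isinstance(merged[key], list):
            merged[key] = merge_list(merged[key], value)
        else:
            merged[key] = value  # scalars: override wins

    return merged

def merge_list(base, override):
    # Arrays of tables keyed by `code`/`id`: replace matches, append new.
    key = next((k for k in ("code", "id")
                if all(isinstance(t, dict) and k in t for t in base + override)), None)
    if key is None:
        return base + override  # all other arrays append
    out = {t[key]: t for t in base}
    out.update({t[key]: t for t in override})
    return list(out.values())
```

Applied in base → team → user order (`merge_layer(merge_layer(base, team), user)`), later layers win on scalars while arrays accumulate.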


@@ -1,58 +0,0 @@
**Language:** Use `{communication_language}` for all output.
**Output Language:** Use `{document_output_language}` for documents.
**Output Location:** `{planning_artifacts}`
**Paths:** Bare paths (e.g. `agents/foo.md`) resolve from the skill root.
# Stage 2: Contextual Discovery
**Goal:** Armed with the user's stated intent, intelligently gather and synthesize all available context — documents, project knowledge, and web research — so later stages work from a rich, relevant foundation.
## Subagent Fan-Out
Now that you know what the brief is about, fan out subagents in parallel to gather context. Each subagent receives the product intent summary so it knows what's relevant.
**Launch in parallel:**
1. **Artifact Analyzer** (`agents/artifact-analyzer.md`) — Scans `{planning_artifacts}` and `{project_knowledge}` for relevant documents. Also scans any specific paths the user provided. Returns structured synthesis of what it found.
2. **Web Researcher** (`agents/web-researcher.md`) — Searches for competitive landscape, market context, trends, and relevant industry data. Returns structured findings scoped to the product domain.
### Graceful Degradation
If subagents are unavailable or fail:
- Read only the most relevant 1-2 documents in the main context and summarize them (don't read everything in full — limit context impact in degraded mode)
- Do a few targeted web searches inline
- Never block the workflow because a subagent feature is unavailable
## Synthesis
Once subagent results return (or inline scanning completes):
1. **Merge findings** with what the user already told you
2. **Identify gaps** — what do you still need to know to write a solid brief?
3. **Note surprises** — anything from research that contradicts or enriches the user's assumptions?
## Mode-Specific Behavior
**Guided mode:**
- Present a concise summary of what you found: "Here's what I learned from your documents and web research..."
- Highlight anything surprising or worth discussing
- Share the gaps you've identified
- Ask: "Anything else you'd like to add, or shall we move on to filling in the details?"
- Route to `prompts/guided-elicitation.md`
**Yolo mode:**
- Absorb all findings silently
- Skip directly to `prompts/draft-and-review.md` — you have enough to draft
- The user will refine later
**Headless mode:**
- Absorb all findings
- Skip directly to `prompts/draft-and-review.md`
- No interaction
## Stage Complete
This stage is complete when subagent results (or inline scanning fallback) have returned and findings are merged with user context. Route per mode:
- **Guided** → `prompts/guided-elicitation.md`
- **Yolo / Headless** → `prompts/draft-and-review.md`


@@ -1,87 +0,0 @@
**Language:** Use `{communication_language}` for all output.
**Output Language:** Use `{document_output_language}` for documents.
**Output Location:** `{planning_artifacts}`
**Paths:** Bare paths (e.g. `agents/foo.md`) resolve from the skill root.
# Stage 4: Draft & Review
**Goal:** Produce the executive product brief and run it through multiple review lenses to catch blind spots before the user sees the final version.
## Step 1: Draft the Executive Brief
Use the template at `{workflow.brief_template}` as a guide — adapt structure to fit the product's story.
**Writing principles:**
- **Executive audience** — persuasive, clear, concise. 1-2 pages.
- **Lead with the problem** — make the reader feel the pain before presenting the solution
- **Concrete over abstract** — specific examples, real scenarios, measurable outcomes
- **Confident voice** — this is a pitch, not a hedge
- Write in `{document_output_language}`
**Create the output document at:** `{planning_artifacts}/product-brief-{project_name}.md`
Include YAML frontmatter:
```yaml
---
title: "Product Brief: {project_name}"
status: "draft"
created: "{timestamp}"
updated: "{timestamp}"
inputs: [list of input files used]
---
```
## Step 2: Fan Out Review Subagents
Before showing the draft to the user, run it through multiple review lenses in parallel.
**Launch in parallel:**
1. **Skeptic Reviewer** (`agents/skeptic-reviewer.md`) — "What's missing? What assumptions are untested? What could go wrong? Where is the brief vague or hand-wavy?"
2. **Opportunity Reviewer** (`agents/opportunity-reviewer.md`) — "What adjacent value propositions are being missed? What market angles or partnerships could strengthen this? What's underemphasized?"
3. **Contextual Reviewer** — You (the main agent) pick the most useful third lens based on THIS specific product. Choose the lens that addresses the SINGLE BIGGEST RISK that the skeptic and opportunity reviewers won't naturally catch. Examples:
- For healthtech: "Regulatory and compliance risk reviewer"
- For devtools: "Developer experience and adoption friction critic"
- For marketplace: "Network effects and chicken-and-egg problem analyst"
- For enterprise: "Procurement and organizational change management reviewer"
- **When domain is unclear, default to:** "Go-to-market and launch risk reviewer" — examines distribution, pricing, and first-customer acquisition. Almost always valuable, frequently missed.
Describe the lens, run the review yourself inline.
### Graceful Degradation
If subagents are unavailable:
- Perform all three review passes yourself, sequentially
- Apply each lens deliberately — don't blend them into one generic review
- The quality of review matters more than the parallelism
## Step 3: Integrate Review Insights
After all reviews complete:
1. **Triage findings** — group by theme, remove duplicates
2. **Apply non-controversial improvements** directly to the draft (obvious gaps, unclear language, missing specifics)
3. **Flag substantive suggestions** that need user input (strategic choices, scope questions, market positioning decisions)
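The mechanical part of triage — merging the reviewers' JSON payloads and dropping repeats — can be sketched as follows. Real grouping by theme is semantic and stays with the agent; this hedged sketch only catches near-verbatim duplicates, and the function name and fingerprinting scheme are assumptions:

```python
def triage_findings(reviews):
    """Merge reviewer payloads and drop near-verbatim duplicate findings.

    `reviews` is a list of dicts shaped like the skeptic/opportunity JSON
    outputs: each section maps to a list of strings or small dicts.
    """
    seen, merged = set(), []
    for review in reviews:
        for section, items in review.items():
            for item in items:
                if not isinstance(item, str):
                    # Flatten dict findings into a stable string form.
                    item = " ".join(f"{k}: {v}" for k, v in sorted(item.items()))
                fingerprint = "".join(item.lower().split())
                if fingerprint in seen:
                    continue  # duplicate across reviewers
                seen.add(fingerprint)
                merged.append({"section": section, "finding": item})
    return merged
```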
## Step 4: Present to User
**Headless mode:** Skip to `prompts/finalize.md` — no user interaction. Save the improved draft directly.
**Yolo and Guided modes:**
Present the draft brief to the user. Then share the reviewer insights:
"Here's your product brief draft. Before we finalize, my review panel surfaced some things worth considering:
**[Grouped reviewer findings — only the substantive ones that need user input]**
What do you think? Any changes you'd like to make?"
Present reviewer findings with brief rationale, then offer: "Want me to dig into any of these, or are you ready to make your revisions?"
**Iterate** as long as the user wants to refine. Use the "anything else, or are we happy with this?" soft gate.
## Stage Complete
This stage is complete when: (a) the draft has been reviewed by all three lenses and the improvements integrated, AND (b) either (headless) the improved draft has been saved and routed directly, or (guided/yolo) the user is satisfied. Route to `prompts/finalize.md`.


@@ -1,78 +0,0 @@
**Language:** Use `{communication_language}` for all output.
**Output Language:** Use `{document_output_language}` for documents.
**Output Location:** `{planning_artifacts}`
**Paths:** Bare paths (e.g. `prompts/foo.md`) resolve from the skill root.
# Stage 5: Finalize
**Goal:** Save the polished brief, offer the LLM distillate, and point the user forward.
## Step 1: Polish and Save
Update the product brief document at `{planning_artifacts}/product-brief-{project_name}.md`:
- Update frontmatter `status` to `"complete"`
- Update `updated` timestamp
- Ensure formatting is clean and consistent
- Confirm the document reads well as a standalone 1-2 page executive summary
## Step 2: Offer the Distillate
Throughout the discovery process, you likely captured detail that doesn't belong in a 1-2 page executive summary but is valuable for downstream work — requirements hints, platform preferences, rejected ideas, technical constraints, detailed user scenarios, competitive deep-dives, etc.
**Ask the user:**
"Your product brief is complete. During our conversation, I captured additional detail that goes beyond the executive summary — things like [mention 2-3 specific examples of overflow you captured]. Would you like me to create a detail pack for PRD creation? It distills all that extra context into a concise, structured format optimized for the next phase."
**If yes, create the distillate** at `{planning_artifacts}/product-brief-{project_name}-distillate.md`:
```yaml
---
title: "Product Brief Distillate: {project_name}"
type: llm-distillate
source: "product-brief-{project_name}.md"
created: "{timestamp}"
purpose: "Token-efficient context for downstream PRD creation"
---
```
**Distillate content principles:**
- Dense bullet points, not prose
- Each bullet carries enough context to be understood standalone (don't assume the reader has the full brief loaded)
- Group by theme, not by when it was mentioned
- Include:
- **Rejected ideas** — so downstream workflows don't re-propose them, with brief rationale
- **Requirements hints** — anything the user mentioned that sounds like a requirement
- **Technical context** — platforms, integrations, constraints, preferences
- **Detailed user scenarios** — richer than what fits in the exec summary
- **Competitive intelligence** — specifics from web research worth preserving
- **Open questions** — things surfaced but not resolved during discovery
- **Scope signals** — what the user indicated is in/out/maybe for MVP
- Token-conscious: be concise, but give enough context per bullet so an LLM reading this later understands WHY each point matters
**Headless mode:** Always create the distillate automatically — unless the session was too brief to capture meaningful overflow (in that case, note this in the completion output instead of creating an empty file).
## Step 3: Present Completion
"Your product brief for {project_name} is complete!
**Executive Brief:** `{planning_artifacts}/product-brief-{project_name}.md`
[If distillate created:] **Detail Pack:** `{planning_artifacts}/product-brief-{project_name}-distillate.md`
**Recommended next step:** Use the product brief (and detail pack) as input for PRD creation — tell your assistant 'create a PRD' and point it to these files."
[If distillate created:] "The detail pack contains all the overflow context (requirements hints, rejected ideas, technical constraints) specifically structured for the PRD workflow to consume."
**Headless mode:** Output the file paths as structured JSON and exit:
```json
{
"status": "complete",
"brief": "{planning_artifacts}/product-brief-{project_name}.md",
"distillate": "{path or null}",
"confidence": "high|medium|low",
"open_questions": ["any unresolved items"]
}
```
## Stage Complete
Run: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow.on_complete`
If the resolved `workflow.on_complete` is non-empty, follow it as the final terminal instruction before exiting. After delivering the completion message and file paths, the workflow is done. If the user requests further revisions, loop back to `prompts/draft-and-review.md`. Otherwise, exit.

View File

@ -1,71 +0,0 @@
**Language:** Use `{communication_language}` for all output.
**Output Language:** Use `{document_output_language}` for documents.
**Paths:** Bare paths (e.g. `prompts/foo.md`) resolve from the skill root.
# Stage 3: Guided Elicitation
**Goal:** Fill the gaps in what you know. By now you have the user's brain dump, artifact analysis, and web research. This stage is about smart, targeted questioning — not rote section-by-section interrogation.
**Skip this stage entirely in Yolo and Autonomous modes** — go directly to `prompts/draft-and-review.md`.
## Approach
You are NOT walking through a rigid questionnaire. You're having a conversation that covers the substance of a great product brief. The topics below are your mental checklist, not a script. Adapt to:
- What you already know (don't re-ask what's been covered)
- What the user is excited about (follow their energy)
- What's genuinely unclear (focus questions where they matter)
## Topics to Cover (flexibly, conversationally)
### Vision & Problem
- What core problem does this solve? For whom?
- How do people solve this today? What's frustrating about current approaches?
- What would success look like for the people this helps?
- What's the insight or angle that makes this approach different?
### Users & Value
- Who experiences this problem most acutely?
- Are there different user types with different needs?
- What's the "aha moment" — when does a user realize this is what they needed?
- How does this fit into their existing workflow or life?
### Market & Differentiation
- What competitive or alternative solutions exist? (Leverage web research findings)
- What's the unfair advantage or defensible moat?
- Why is now the right time for this?
### Success & Scope
- How will you know this is working? What metrics matter?
- What's the minimum viable version that creates real value?
- What's explicitly NOT in scope for the first version?
- If this is wildly successful, what does it become in 2-3 years?
## The Flow
For each topic area where you have gaps:
1. **Lead with what you know** — "Based on your input and my research, it sounds like [X]. Is that right?"
2. **Ask the gap question** — targeted, specific, not generic
3. **Reflect and confirm** — paraphrase what you heard
4. **"Anything else on this, or shall we move on?"** — the soft gate
If the user is giving you detail beyond brief scope (requirements, architecture, platform details, timelines), **capture it silently** for the distillate. Acknowledge it briefly ("Good detail, I'll capture that") but don't derail the conversation.
## When to Move On
When you have enough substance to draft a compelling 1-2 page executive brief covering:
- Clear problem and who it affects
- Proposed solution and what makes it different
- Target users (at least primary)
- Some sense of success criteria or business objectives
- MVP-level scope thinking
You don't need perfection — you need enough to draft well. Missing details can be surfaced during the review stage.
If the user is providing complete, confident answers and you have solid coverage across all four topic areas after fewer than 3-4 exchanges, proactively offer to draft early.
**Transition:** "I think I have a solid picture. Ready for me to draft the brief, or is there anything else you'd like to add?"
## Stage Complete
This stage is complete when sufficient substance exists to draft a compelling brief and the user confirms readiness. Route to `prompts/draft-and-review.md`.

View File

@ -1,60 +0,0 @@
# Product Brief Template
This is a flexible guide for the executive product brief — adapt it to serve the product's story. Merge sections, add new ones, reorder as needed. The product determines the structure, not the template.
## Sensible Default Structure
```markdown
# Product Brief: {Product Name}
## Executive Summary
[2-3 paragraph narrative: What is this? What problem does it solve? Why does it matter? Why now?
This should be compelling enough to stand alone — if someone reads only this section, they should understand the vision.]
## The Problem
[What pain exists? Who feels it? How are they coping today? What's the cost of the status quo?
Be specific — real scenarios, real frustrations, real consequences.]
## The Solution
[What are we building? How does it solve the problem?
Focus on the experience and outcome, not the implementation.]
## What Makes This Different
[Key differentiators. Why this approach vs alternatives? What's the unfair advantage?
Be honest — if the moat is execution speed, say so. Don't fabricate technical moats.]
## Who This Serves
[Primary users — vivid but brief. Who are they, what do they need, what does success look like for them?
Secondary users if relevant.]
## Success Criteria
[How do we know this is working? What metrics matter?
Mix of user success signals and business objectives. Be measurable.]
## Scope
[What's in for the first version? What's explicitly out?
Keep this tight — it's a boundary document, not a feature list.]
## Vision
[Where does this go if it succeeds? What does it become in 2-3 years?
Inspiring but grounded.]
```
## Adaptation Guidelines
- **For B2B products:** Consider adding a "Buyer vs User" section if they're different people
- **For platforms/marketplaces:** Consider a "Network Effects" or "Ecosystem" section
- **For technical products:** May need a brief "Technical Approach" section (keep it high-level)
- **For regulated industries:** Consider a "Compliance & Regulatory" section
- **If scope is well-defined:** Merge "Scope" and "Vision" into "Roadmap Thinking"
- **If the problem is well-known:** Shorten "The Problem" and expand "What Makes This Different"
The brief should be 1-2 pages. If it's longer, you're putting in too much detail — that's what the distillate is for.

View File

@ -77,6 +77,7 @@ Discover and load context documents using smart discovery. Documents can be in t
- {planning_artifacts}/**
- {output_folder}/**
- {project_knowledge}/**
- {implementation_artifacts}/investigations/**
- docs/**
Also, when searching, documents can be a single markdown file or a folder with an index and multiple files. For example, if searching for `*foo*.md` and it is not found, also search for a folder called `*foo*/index.md` (which indicates sharded content).
@ -86,6 +87,8 @@ Try to discover the following:
- Research Documents (`/*research*.md`)
- Project Documentation (multiple documents may be found in the `{project_knowledge}` or `docs` folder)
- Project Context (`**/project-context.md`)
- Investigation Files (`{implementation_artifacts}/investigations/*-investigation.md`) — `bmad-investigate` case files, when the PRD is being driven by a forensic investigation rather than greenfield ideation.
<critical>Confirm what you have found with the user, along with asking if the user wants to provide anything else. Only after this confirmation will you proceed to follow the loading rules</critical>
@ -120,6 +123,7 @@ Try to discover the following:
- Product briefs: {{briefCount}} files {if briefCount > 0}✓ loaded{else}(none found){/if}
- Research: {{researchCount}} files {if researchCount > 0}✓ loaded{else}(none found){/if}
- Brainstorming: {{brainstormingCount}} files {if brainstormingCount > 0}✓ loaded{else}(none found){/if}
- Investigations: {{investigationCount}} files {if investigationCount > 0}✓ loaded{else}(none found){/if}
- Project docs: {{projectDocsCount}} files {if projectDocsCount > 0}✓ loaded (brownfield project){else}(none found - greenfield project){/if}
**Files loaded:** {list of specific file names or "No additional documents found"}
@ -128,6 +132,10 @@ Try to discover the following:
📋 **Note:** This is a **brownfield project**. Your existing project documentation has been loaded. In the next step, I'll ask specifically about what new features or changes you want to add to your existing system.
{/if}
{if investigationCount > 0}
🔎 **Note:** Investigation files have been loaded. The evidence-graded findings (Confirmed / Deduced / Hypothesized), timeline, and fix direction are available as context while we scope requirements.
{/if}
Do you have any other documents you'd like me to include, or shall we continue to the next step?"
### 4. Present MENU OPTIONS

View File

@ -63,6 +63,7 @@ Read the frontmatter from `{outputFile}` to get document counts:
- `briefCount` - Product briefs available
- `researchCount` - Research documents available
- `brainstormingCount` - Brainstorming docs available
- `investigationCount` - bmad-investigate case files available
- `projectDocsCount` - Existing project documentation
**Announce your understanding:**
@ -71,6 +72,7 @@ Read the frontmatter from `{outputFile}` to get document counts:
- Product briefs: {{briefCount}}
- Research: {{researchCount}}
- Brainstorming: {{brainstormingCount}}
- Investigations: {{investigationCount}}
- Project docs: {{projectDocsCount}}
{{if projectDocsCount > 0}}This is a brownfield project - I'll focus on understanding what you want to add or change.{{else}}This is a greenfield project - I'll help you define the full product vision.{{/if}}"

View File

@ -88,3 +88,8 @@ skill = "bmad-create-story"
code = "ER"
description = "Party mode review of all work completed across an epic"
skill = "bmad-retrospective"
[[agent.menu]]
code = "IN"
description = "Forensic case investigation with evidence-graded findings, calibrated to the input"
skill = "bmad-investigate"

View File

@ -0,0 +1,194 @@
---
name: bmad-investigate
description: Forensic case investigation with evidence-graded findings, calibrated to the input. Use when the user asks to investigate a bug, trace what caused an incident, walk through unfamiliar code, or build a mental model of a code area before working on it.
---
# Investigate
## Overview
Reconstruct what's happening, or what an unfamiliar area does, from the available evidence. Produce a structured case
file another engineer can pick up cold. Calibrate continuously between defect-chasing (symptom-driven) and
area-exploration (no symptom); the same discipline applies at both ends of that spectrum.
**Args:** A ticket ID, log file path, diagnostic archive, error message, code area name, problem description, or a path
to an existing case file. The last form resumes a prior investigation; everything else opens a new case.
**Output:** `{implementation_artifacts}/{workflow.case_file_subdir}/{workflow.case_file_filename}`. Reference inputs
are recorded; raw content is not read into the parent context until an outcome calls for it.
`{slug}` is the ticket ID when one is provided, otherwise a short descriptive name agreed with the user, sanitized to
lowercase alphanumeric with hyphens. On collision with an existing case file at the resolved path, ask whether to
rename to `slug-YYYY-MM-DD.md` or resume the existing file (resuming routes to Outcome 0).
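The slug and collision rules above can be sketched as follows (the regex and helper names are illustrative, not part of the skill contract):

```python
import re
from datetime import date

def sanitize_slug(name: str) -> str:
    """Lowercase the name and collapse every run of non-alphanumeric
    characters into a single hyphen, trimming stray hyphens at the ends."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def dated_filename(slug: str, on: date) -> str:
    """Collision fallback: append the ISO date to the slug."""
    return f"{slug}-{on.isoformat()}.md"

print(sanitize_slug("PROJ-1234: cache eviction bug!"))  # proj-1234-cache-eviction-bug
print(dated_filename("proj-1234", date(2026, 5, 12)))   # proj-1234-2026-05-12.md
```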
After every outcome, present what was learned and pause for the user before continuing.
## Principles
- **Evidence grading.**
- **Confirmed.** Directly observed; cite `path:line`, log timestamp, or commit hash.
- **Deduced.** Logically follows from Confirmed evidence; show the chain.
- **Hypothesized.** Plausible but unconfirmed; state what would confirm or refute it.
- **Stronghold first.** Anchor in one Confirmed piece of evidence and expand outward. Never start from a theory and
hunt for support. When evidence is sparse, switch to evidence-light mode (Outcome 1 branch).
- **Challenge the premise.** The user's description is a hypothesis, not a fact. Verify independently; if evidence
contradicts, say so.
- **Follow the evidence, not the narrative.** When evidence contradicts the working theory, update the theory — never
the other way around. Resist confirmation bias even when the user is convinced.
- **Hypotheses are never deleted.** Update Status (Open / Confirmed / Refuted) and add a Resolution. Wrong turns are
part of the deliverable.
- **Missing evidence is itself a finding.** Document the gap, what it would resolve, and how to obtain it.
- **Write it down early.** Initialize the case file as soon as the slug is agreed; it is the persistent state across
interruptions.
- **Path:line citations** use CWD-relative format, no leading `/`, so they're clickable in IDE-embedded terminals.
- **Delegation discipline.** When a step requires reading 5+ files or any file >10K tokens, delegate to a subagent
that returns structured JSON only. Cite `path:line` from the result; don't re-read in the parent.
- **Issue independent operations in parallel** (multi-grep, multi-read, parallel inventories) — one message, multiple
tool calls.
- **Communication.** Evidence-first language ("the evidence shows", "unconfirmed, requires X to verify"). No hedging,
no narrative.
## On Activation
### Step 1: Resolve the workflow block
Run: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`
If the script fails, stop and surface the error.
### Step 2: Execute prepend steps
Run each entry in `{workflow.activation_steps_prepend}` in order.
### Step 3: Load persistent facts
Treat each entry in `{workflow.persistent_facts}` as foundational context. `file:` prefixes are paths or globs under
`{project-root}` (load contents); other entries are facts verbatim.
### Step 4: Load config
Load `{project-root}/_bmad/bmm/config.yaml` and resolve `{user_name}`, `{communication_language}`,
`{document_output_language}`, `{implementation_artifacts}`, `{project_knowledge}`. If `{implementation_artifacts}` is
unresolved, fall back to `./investigations/` and surface the fallback before initializing.
### Step 5: Greet
Greet `{user_name}` in `{communication_language}`.
### Step 6: Execute append steps
Run each entry in `{workflow.activation_steps_append}` in order.
### Step 7: Acknowledge and route
Acknowledge the input as a reference (record paths and IDs; don't read raw content). Path to an existing case file →
Outcome 0. Otherwise → Outcome 1.
## Procedure
### Outcome 0: Existing case is loaded and surfaced
Read the case file. Surface, in order: open hypotheses (Status = Open) with their confirm/refute criteria; open
backlog (Status ≠ Done); missing-evidence rows; last Conclusion with confidence. Ask which thread to pull. New
evidence opens a new `## Follow-up: {YYYY-MM-DD}` block (append `#2`, `#3` on same-day reentry). Pause for user with the recap above; wait for direction.
### Outcome 1: Scope and stronghold are established
Acknowledge each input shape — record location, scope, time window only; bulk reads happen in Outcome 2.
- **Issue tracker ticket.** Fetch full details via available MCP tools.
- **Diagnostic archive.** Record path, file count, time window.
- **Log file or stack trace.** Record path and time window; only the stack frame already in the user's message is in
scope here.
- **Free-text description.** Capture verbatim; treat as hypothesis.
- **Code area name** (no symptom). Record entry point.
- **Recent commit area.** Record commit range.
If the user arrived with a hypothesis, register it as Hypothesis #1. Find the stronghold *independently*; the user's
hypothesis is one of the things the stronghold validates or refutes.
Find a stronghold: a Confirmed piece of evidence (error message, function name, HTTP route, config parameter, test
case). Anchor here.
**Initialize `{case_file}` before branching.** The path is
`{implementation_artifacts}/{workflow.case_file_subdir}/{workflow.case_file_filename}` with `{slug}` substituted (slug
and collision rules in Overview). Create the file from `{workflow.case_file_template}` and fill Hand-off Brief
(rough), Case Info, Problem Statement, initial Evidence Inventory.
**Evidence-light branch.** When no Confirmed evidence is reachable: mark the case evidence-light in the Hand-off
Brief; populate the Investigation Backlog with prioritized data-collection items; record "to make progress, I need one
of: …"; pause for the user to provide evidence or authorize Outcome 2 to scan more broadly.
Otherwise present scope, stronghold, file path, proposed approach. Pause for user with the recap above; wait for direction.
### Outcome 2: Evidence perimeter is mapped
Survey the scene: inventory available evidence in parallel across these independent categories: diagnostic archives;
issue tracker; version control; test results; static analysis; source code. For any category exceeding ~10K tokens,
delegate to a subagent that returns a JSON manifest (paths, sizes, time windows, key fragments cited as `path:line`).
Classify each as Available, Partial, or Missing — Missing is itself a finding. Update Evidence Inventory and Investigation
Backlog. Pause for user with the recap above; wait for direction.
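A manifest returned by a delegated scan might look like this (the field names are illustrative; the skill only requires paths, sizes, time windows, and key fragments cited as `path:line`):

```json
{
  "category": "diagnostic archives",
  "status": "Partial",
  "entries": [
    {
      "path": "logs/app-2026-05-11.log",
      "size_tokens": 14200,
      "time_window": "2026-05-11T02:10:00Z to 2026-05-11T02:14:30Z",
      "key_fragments": [
        {
          "cite": "logs/app-2026-05-11.log:4812",
          "text": "ERROR cache eviction raced with writer"
        }
      ]
    }
  ]
}
```

Keeping the return shape structured lets the parent cite `path:line` directly from the manifest without re-reading the underlying files.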
### Outcome 3: Cause is reasoned about with discipline
- **Trace causality.** Symptom-driven: trace backward from the symptom to producing conditions and the state that
emerged. Exploration: trace backward from outputs (returns, side effects, messages sent) to producing conditions.
Same technique, different anchor.
- **Reconstruct the timeline** by cross-referencing logs, system events, version control, user observations.
- **Form and test hypotheses.** State each hypothesis, identify confirming/refuting evidence, search, and grade
(Confirmed / Refuted / Open). Update Status. Never delete.
- **Refutation pass.** Each time a hypothesis transitions toward Confirmed, actively look for refuting evidence first.
Record the attempt in Resolution.
- **Verify the user's premise.** If evidence contradicts, say so explicitly.
- **Add discovered paths to the backlog.** Stay focused on the current thread.
Update Confirmed Findings, Deduced Conclusions, Hypothesized Paths, Backlog, Timeline. Highlight contradictions to the
original premise. Pause for user with the recap above; wait for direction.
### Outcome 4: Source has been traced where it matters
Issue these first-pass scans as parallel tool calls in one message: grep for exact error strings; glob the affected
directory for parallel implementations; `git log` for recent changes.
Then sequentially: read the surrounding code; follow the caller chain; watch for language and process boundary
crossings (compiled→scripts, IPC, host→device, configuration flow).
Lean by case type:
- **Exploration:** I/O mapping (triggers, outputs, dependencies); frequent-terms scan; control-flow filtering
(branches, loops, error handling, state-machine transitions).
- **Symptom-driven:** depth assessment — is the root cause reachable from local context, or is a broader area model
required? Surface escalations; never silently expand scope. Trivial-fix assessment — off-by-one, missing null check,
swapped argument → one-line code suggestion or draft diff in the report; non-trivial → stop at the root cause area.
Investigation stops at the diagnosis; implementation is out of scope. Update Source Code Trace (Error origin, Trigger,
Condition, Related files; area model when broader). Pause for user with the recap above; wait for direction.
### Outcome 5: Report is finalized and the hand-off is clean
Update `{case_file}`:
- **Hand-off Brief** rewritten to final form (3 sentences, 15-second read).
- **Final Conclusion** with confidence: **High** (Confirmed root cause, deterministic repro), **Medium** (Deduced;
minor uncertainty), **Low** (Hypothesized; clear data gap).
- **Fix direction** when applicable (categorize by mechanism if multiple combine).
- **Diagnostic steps** if uncertainty remains.
- **Reproduction Plan** when applicable, or a verification plan for exploration cases.
- **Status:** Active / Concluded / Blocked on evidence.
Present the conclusion, then a concrete next-steps menu: trivial fix → `bmad-quick-dev`; scope/plan adjustment →
`bmad-correct-course`; tracked story → `bmad-create-story`; fresh review → `bmad-code-review`. Recommend the
highest-value action. Mitigations and workarounds are generated only on explicit request — investigation stops at the
diagnosis. Execute `{workflow.on_complete}` if non-empty. Pause for user with the recap above; wait for direction.
## Follow-up Iterations
Continue work by appending to `{case_file}` under a new `## Follow-up: {YYYY-MM-DD}` block (`#2`, `#3` on same-day
reentry). The investigation is complete when:
- Root cause is Confirmed.
- Root cause is Hypothesized with a clear data gap.
- The mental model is sufficient for the user's stated goal (exploration cases).
- The backlog contains only items requiring unavailable evidence.
- The user explicitly concludes.

View File

@ -0,0 +1,62 @@
# DO NOT EDIT -- overwritten on every update.
#
# Workflow customization surface for bmad-investigate. Mirrors the
# agent customization shape under the [workflow] namespace.
[workflow]
# --- Configurable below. Overrides merge per BMad structural rules: ---
# scalars: override wins • arrays (persistent_facts, activation_steps_*): append
# arrays-of-tables with `code`/`id`: replace matching items, append new ones.
# Steps to run before the standard activation (config load, greet).
# Overrides append. Use for pre-flight loads, compliance checks, etc.
activation_steps_prepend = []
# Steps to run after greet but before the workflow begins.
# Overrides append. Use for context-heavy setup that should happen
# once the user has been acknowledged.
activation_steps_append = []
# Persistent facts the workflow keeps in mind for the whole run.
# Use for citation conventions (path:line vs path#L42), grading-scale
# overrides (ITIL severity 1-5 instead of High/Medium/Low), tone
# directives (engineering vs exec-facing), or compliance constraints
# the case file must respect.
# Distinct from the runtime memory sidecar — these are static context
# loaded on activation. Overrides append.
#
# Each entry is either:
# - a literal sentence, e.g. "Use ITIL severity 1-5 instead of High/Medium/Low for confidence."
# - a file reference prefixed with `file:`, e.g. "file:{project-root}/docs/standards.md"
# (glob patterns are supported; the file's contents are loaded and treated as facts).
persistent_facts = [
"file:{project-root}/**/project-context.md",
]
# Scalar: path to the case-file template, resolved from the skill root.
# Override to point at an org-shaped template (compliance sections,
# SLA fields, post-mortem hooks, ITIL fields).
case_file_template = "references/case-file-template.md"
# Scalar: subdirectory under {implementation_artifacts} where case files land.
# Override for org taxonomies (forensics/, cases/, incidents/, bug-bash/).
case_file_subdir = "investigations"
# Scalar: filename pattern for new case files. {slug} expands to the
# ticket ID or a short user-agreed name.
case_file_filename = "{slug}-investigation.md"
# Scalar: executed when the workflow finalizes the case file at Outcome 5,
# after the conclusion is presented. Override wins. Use for post-case
# automation: post the case to Slack/Teams, push fields back to ticketing,
# link the case to a sprint, trigger a follow-up retro.
# Leave empty for no custom post-completion behavior.
on_complete = ""
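
Given the merge rules in the header comment, a project-level override might look like this (the values are illustrative):

```toml
[workflow]
# Scalar: override wins — case files land under incidents/ instead of investigations/.
case_file_subdir = "incidents"

# Array: entries append to the defaults rather than replacing them.
persistent_facts = [
  "Use ITIL severity 1-5 instead of High/Medium/Low for confidence.",
]
```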

View File

@ -0,0 +1,127 @@
# Investigation: {title}
## Hand-off Brief
1. **What happened.** {one-sentence problem statement, evidence-graded}
2. **Where the case stands.** {status, last finding, what would unblock progress}
3. **What's needed next.** {single recommended action with rationale}
## Case Info
| Field | Value |
| ---------------- | -------------------------------------------------------------------------- |
| Ticket | {ticket-id or "N/A"} |
| Date opened | {date} |
| Status | Active |
| System | {OS, version, relevant environment details} |
| Evidence sources | {diagnostic archive, logs, crash dump, code, version control, etc.} |
## Problem Statement
{User-reported description; the initial claim. May be refined or contradicted by evidence.}
## Evidence Inventory
| Source | Status | Notes |
| -------- | ------------------------------- | --------- |
| {source} | {Available / Partial / Missing} | {details} |
## Investigation Backlog
| # | Path to Explore | Priority | Status | Notes |
| - | --------------- | --------------------- | ------------------------------------- | --------- |
| 1 | {description} | {High / Medium / Low} | {Open / In Progress / Done / Blocked} | {context} |
## Timeline of Events
| Time | Event | Source | Confidence |
| ----------- | ------------------- | --------------------- | --------------------- |
| {timestamp} | {event description} | {log file, commit, …} | {Confirmed / Deduced} |
## Confirmed Findings
### Finding 1: {title}
**Evidence:** {citation — `path:line`, log timestamp, or commit hash}
**Detail:** {description}
## Deduced Conclusions
### Deduction 1: {title}
**Based on:** {which Confirmed Findings}
**Reasoning:** {logical chain}
**Conclusion:** {what follows}
## Hypothesized Paths
### Hypothesis 1: {title}
**Status:** {Open / Confirmed / Refuted}
**Theory:** {description}
**Supporting indicators:** {what makes this plausible}
**Would confirm:** {specific evidence that would prove this}
**Would refute:** {specific evidence that would disprove this}
**Resolution:** {when Status changes from Open, what evidence settled it}
## Missing Evidence
| Gap | Impact | How to Obtain |
| ---------------- | ------------------------------------ | --------------- |
| {what's missing} | {what it would confirm or eliminate} | {how to get it} |
## Source Code Trace
| Element | Detail |
| ------------- | ------------------------------------------- |
| Error origin | {file:line, function name} |
| Trigger | {what causes this code to execute} |
| Condition | {what state produces the observed behavior} |
| Related files | {other files in the same code path} |
## Conclusion
**Confidence:** {High / Medium / Low}
{Summary stating what is Confirmed vs. what remains Hypothesized. If a root cause is identified, state it; otherwise
name the most promising hypothesized paths and what would resolve the remaining uncertainty.}
## Recommended Next Steps
### Fix direction
{What needs to change and why. Categorize by mechanism when multiple issues combine.}
### Diagnostic
{Steps to confirm the root cause: additional logging, targeted tests, data to collect.}
## Reproduction Plan
{Setup, trigger, expected results. Scale from isolated proof to full system reproduction.}
## Side Findings
Tangential observations surfaced during the investigation, evidence-graded, with citation when applicable.
- {observation}
## Follow-up: {date}
### New Evidence
### Additional Findings
### Updated Hypotheses
### Backlog Changes
### Updated Conclusion

View File

@ -31,3 +31,4 @@ BMad Method,bmad-code-review,Code Review,CR,Story cycle: If issues back to DS if
BMad Method,bmad-checkpoint-preview,Checkpoint,CK,Guided walkthrough of a change from purpose and context into details. Use for human review of commits branches or PRs.,,,4-implementation,,,false,,
BMad Method,bmad-qa-generate-e2e-tests,QA Automation Test,QA,Generate automated API and E2E tests for implemented code. NOT for code review or story validation — use CR for that.,,,4-implementation,bmad-dev-story,,false,implementation_artifacts,test suite
BMad Method,bmad-retrospective,Retrospective,ER,Optional at epic end: Review completed work lessons learned and next epic or if major issues consider CC.,,,4-implementation,bmad-code-review,,false,implementation_artifacts,retrospective
BMad Method,bmad-investigate,Investigate,IN,Forensic case investigation calibrated to the input. Evidence-graded analysis with hypothesis tracking. Produces a structured case file.,,,4-implementation,,,false,implementation_artifacts,investigation report


View File

@ -311,6 +311,16 @@
<span class="output">leçons</span>
</div>
</div>
<div class="workflow">
<div class="workflow-header">
<span class="workflow-name">investigate</span>
<span class="badge adhoc">à tout moment</span>
</div>
<div class="workflow-meta">
<div class="agent"><div class="agent-icon amelia">A</div><span class="agent-name">Amelia</span></div>
<span class="output">dossier de cas</span>
</div>
</div>
</div>
</div>
</div>

View File

@ -322,6 +322,16 @@
<span class="output">lessons</span>
</div>
</div>
<div class="workflow">
<div class="workflow-header">
<span class="workflow-name">investigate</span>
<span class="badge adhoc">anytime</span>
</div>
<div class="workflow-meta">
<div class="agent"><div class="agent-icon amelia">A</div><span class="agent-name">Amelia</span></div>
<span class="output">case file</span>
</div>
</div>
</div>
</div>
</div>