
Decision Log — bmad-brainstorming rebuild

Canonical memory for the rebuild of bmad-brainstorming from the legacy numbered-micro-file architecture into the outcome-driven pattern of bmad-product-brief / bmad-prd / bmad-ux.

Session 2026-05-30

Intent: Edit / rebuild existing skill (in place at src/core-skills/bmad-brainstorming).

Classification: Simple Workflow with one quarantine carve-out. Customizable = yes.

Load-bearing decisions

Single intent: pure facilitation + continuation. No create/update/validate split (that pattern fits document skills; brainstorming is a facilitated session). The skill facilitates a live session and can resume a prior one. Decided with user.
The running log is the memory AND the continuation state. One terse, append-only session file holds the goal, the techniques selected, and every idea the user generates/accepts. Kept deliberately lean to minimize context cost; the final artifacts are produced from it. Borrowed shape from Obra Superpowers' append-only event log (one entry per line). User: "very succinct as memory, just enough to minimize context."
HARD RULE — the LLM never supplies its own ideas in interactive mode. Its job is to draw ideas out of the user and push past the obvious. If the user explicitly asks for an idea, give ONE as a one-off, then immediately return to pushing the user. This inverts a default LLM behavior, so it earns forceful statement with rationale. User-stated, emphasized.
Headless is the ONLY context where the LLM brainstorms on its own. Given just a topic, headless runs the techniques itself and produces the artifacts. Because it inverts the central no-own-ideas rule, ALL headless instructions live in references/headless.md and are never loaded in normal mode — quarantined so they can't bleed into and corrupt the interactive posture. User's explicit requirement.
Technique selection: AI-led, others on request. Facilitator reads the goal and proposes a fitting technique by default; "browse the library", "surprise me" (random), and "start broad then narrow" (progressive) are always available but never a forced numbered-menu gate. Builder recommended; user accepted ("show what you recommend and why").
Finalize → imaginative HTML, no template. Instruction directs the LLM to be genuinely creative and whimsical, matched to the spirit of this session — sections per technique, creative visualizations, the summary rendered as it all came together. Explicitly NO template/scaffold (rejected Obra's CSS-class kit). User: pure per-session creativity.
Optional second artifact: succinct brainstorm-intent doc. User's choice at finalize. Short, sharp, just the chosen/critical discoveries — clean input for a downstream skill (product-brief, prd) without report bloat. This is the dual-output distillate pattern.
Synthesis is two-move and is the one interactive moment the LLM adds its own thinking. First reflect a sampling of the user's own ideas back (including surprising/random ones from across the session) and let THEM hunt for conclusions/synergies/themes/what-matters. THEN the LLM leans in creatively to surface the non-obvious connections the user wouldn't spot right away. The no-own-ideas rule is therefore scoped explicitly to the generative phase so synthesis doesn't collide with it. User refinement.
The log is the load-bearing artifact; artifacts are derivations. "From the decision log we can create any artifact the user wants for any purpose, with nothing lost." HTML + intent doc are two derivations; others available on request. (User flagged a future cross-cutting task: retune other skills' .decision-log.md to this same lean append-only standard — NOT part of this build.)
Extensible technique library via customize.toml (new capability the static-CSV architecture never had). Two scalars: favorite_techniques (names the facilitator proposes first when they fit, AI-led default) and additional_techniques (array-of-tables mirroring the CSV shape category/technique_name/description — adds whole new techniques AND categories without editing the shipped CSV). Both append-merge so team + user files each contribute. Wired into SKILL.md ## Choosing Techniques (proposing/browse/random/progressive all treat additions as first-class) and into headless.md selection. User idea, added post-build.
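A minimal sketch of the append-merge behavior described for favorite_techniques and additional_techniques, assuming each customize.toml layer has already been parsed into a plain dict. The function name append_merge, the layer contents, and the sample technique are hypothetical; only the two key names and the category/technique_name/description shape come from this log.

```python
from typing import Any


def append_merge(*layers: dict[str, Any]) -> dict[str, list]:
    """Concatenate the two list-valued keys across config layers, in order."""
    merged: dict[str, list] = {
        "favorite_techniques": [],
        "additional_techniques": [],
    }
    for layer in layers:
        for key in merged:
            # A layer that omits a key simply contributes nothing to it.
            merged[key].extend(layer.get(key, []))
    return merged


# Hypothetical team-level and user-level files, already parsed.
team = {
    "favorite_techniques": ["Six Thinking Hats"],
    "additional_techniques": [
        {
            "category": "team_rituals",  # a brand-new category, no CSV edit needed
            "technique_name": "Pre-mortem Swap",  # invented for illustration
            "description": "Trade projects and predict how the other one fails.",
        },
    ],
}
user = {"favorite_techniques": ["Laddering"]}

merged = append_merge(team, user)
```

Because the merge only appends, neither file can silently drop the other's entries; how the real loader handles duplicates is not stated in this log.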

Borrowed from Obra Superpowers brainstorming skill (researched)

One-question/prompt-per-message as a hard rule — keeps the human in active generation, operationalizes "push their creative muscles."
Append-only lean log as continuation memory.
Rejected: AI-recommends-approaches / convergent-funnel posture, graphviz state machine, CSS HTML kit — all off-pattern or inject AI ideas.
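The append-only, one-entry-per-line log can be sketched as follows. This is an illustration under assumptions, not the skill's actual format: the "kind: text" line layout, the function names, and the sample entries are hypothetical; the log only states that the file is terse, append-only, and doubles as continuation state.

```python
import tempfile
from pathlib import Path


def append_entry(log: Path, kind: str, text: str) -> None:
    """Append one single-line entry; existing lines are never rewritten."""
    one_line = " ".join(text.split())  # collapse newlines so one entry = one line
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"{kind}: {one_line}\n")


def resume(log: Path) -> list[tuple[str, str]]:
    """Rebuild session state by replaying the log from top to bottom."""
    entries = []
    for line in log.read_text(encoding="utf-8").splitlines():
        kind, _, text = line.partition(": ")
        entries.append((kind, text))
    return entries


# Hypothetical session: goal, chosen technique, then one user idea.
log_path = Path(tempfile.mkdtemp()) / "session.md"
append_entry(log_path, "goal", "feature ideas for a plant-care app")
append_entry(log_path, "technique", "Laddering")
append_entry(log_path, "idea", "a plant that texts you\nwhen it is thirsty")

state = resume(log_path)
```

Replaying the file is all a resumed session needs, which is why the same file can serve as both memory and continuation state.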

Surviving from legacy version

brain-methods.csv (technique library) → kept as a swappable asset ({workflow.brain_methods}).
Anti-bias protocol (shift creative domain every ~10 ideas) and quantity goal (100+ collaboratively developed ideas, stay in generative mode) → kept as facilitation posture, stripped of emoji/MANDATORY/SUCCESS-METRICS ceremony.

Planned file tree

bmad-brainstorming/
├── SKILL.md                 # Overview, Conventions, Activation, facilitation posture, technique flow, running log, finalize
├── customize.toml           # output_dir/folder, brain_methods path, activation hooks, persistent_facts
├── references/
│   └── headless.md          # QUARANTINED autonomous mode (the one place the LLM brainstorms itself)
└── assets/
    └── brain-methods.csv     # technique library (carried over)

Legacy files to remove on build: workflow.md, template.md, steps/ (8 numbered step files).

Adversarial review (3 parallel lenses) + fixes applied

Ran a Workflow with three review lenses (over-prescription, facilitation-integrity, BMad-conventions) against the build. Fixes applied:

[HIGH, fixed] Headless detection seam. The "first message pre-supplies a topic and asks for artifacts" trigger collided with a normal interactive opening ("brainstorm X and give me the HTML"), risking a present human flipping the skill into self-generation. Made human-presence the dominant gate in both SKILL.md activation step 4 and headless.md Detection; dropped the weak payload-shape trigger entirely.
[HIGH, fixed] external_handoffs declared-but-unused. Wired it (not deleted) into SKILL.md ## Producing Artifacts and headless.md step 5, matching product-brief/prd; headless JSON now populated from results. Scalar cross-check now 1:1 (every declared scalar referenced, every reference declared).
[MEDIUM, fixed] One-off idea exception loose/uncapped. Tightened trigger to a direct ask (not "stuck" / "what do you think") and capped recurrence (repeated reaching = pivot the technique).
[MEDIUM, fixed] Overview forward-reference + Running Log triple-imperative padding. Trimmed both; added "never with your own examples" glue at the provocation point and the headless-boundary clause to the scope statement; ran on_complete in headless too.
Rejected (reviewers self-assessed as keepers): ~10-idea rationale, self-contained-HTML constraint, headless restatement (justified by reference-file self-containment under compaction), and the subjective "spent / mined out" progression condition (a numeric gate would contradict "resist concluding").
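The 1:1 scalar cross-check from the external_handoffs fix can be sketched as a two-way set difference. This is an assumption-heavy illustration: the {workflow.name} reference syntax is inferred from the {workflow.brain_methods} mention elsewhere in this log, and the function name and sample texts are hypothetical.

```python
import re

# Assumed reference syntax, e.g. {workflow.brain_methods}.
REF = re.compile(r"\{workflow\.(\w+)\}")


def cross_check(declared, documents):
    """Return (declared-but-unused, referenced-but-undeclared) scalar names."""
    referenced = set()
    for text in documents:
        referenced.update(REF.findall(text))
    declared = set(declared)
    return sorted(declared - referenced), sorted(referenced - declared)


# Hypothetical inputs: scalars declared in customize.toml, and skill file text.
declared = ["brain_methods", "external_handoffs", "on_complete"]
docs = [
    "Load techniques from {workflow.brain_methods}.",
    "After finalize, run {workflow.on_complete}.",
]

unused, undeclared = cross_check(declared, docs)
# unused now names external_handoffs: declared but never referenced,
# the same defect class as the declared-but-unused finding above.
```

The check passes only when both lists are empty, which is what "every declared scalar referenced, every reference declared" amounts to.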

Technique library: 61 → 100, made context-lean via a script (user-requested)

Schema change. CSV is now category, technique_name, description, detail. description rewritten to a terse ≤140-char gist that doubles as the index entry AND is enough to run most techniques (the LLM reconstructs specifics from name + gist — user's insight). detail is an optional path (relative to the CSV dir) to a per-technique instruction file, for techniques complex enough to warrant one — loaded only on show. This absorbs the best of the "sharded files" idea for exactly the cases that need it. No separate tagline column needed.
scripts/brain.py (stdlib-only, PEP723, uv run, 15 passing tests in scripts/tests/): categories / list [--category] / show "<name>" (inlines detail file iff present) / random. Single source of truth; index derived; never loads the whole catalog into main context. Decided script over subagent because selecting/filtering is pure plumbing (quality-bar: scripts do plumbing, prompts do judgment); a subagent would add latency without saving meaningful context now that the gist index is cheap. Noted one future case a subagent helps: deep full-description matching across the whole library.
Context win: loading the whole catalog cost ~4K tokens at 61 techniques and would have cost ~6.5K at 100; now categories costs ~0.1K, list --category ~0.3-0.6K, and show ~0.07K each, staying flat as the library grows. The 100-technique CSV is 14.6KB, smaller than the original 61-technique 16KB, because gists replaced the long descriptions.
100 techniques, 13 categories generated via a 13-agent parallel workflow (one flavored generator per category: shorten existing + invent new), then curated by the parent to exactly 100 (dropped 7 cross-category near-dups/one exact dup). New categories: absurdist (genuinely funny unstickers), constraint, speculative_future. Spectrum spans silly→serious per user's "fun wild funny or super helpful."
Live detail-file demo: Six Thinking Hats → assets/techniques/six-thinking-hats.md (full multi-round method), proving the lazy-load path end to end.
"Invent a technique" option added to ## Choosing Techniques (pure prompt; logged as a technique; finalize may offer to persist a keeper into additional_techniques). SKILL.md technique section + headless both rewired to reach the library only through the script. Scalar consistency re-verified identical; scan-scripts passes.

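The script-over-catalog idea above can be sketched as follows. This is not the real scripts/brain.py, whose code is not shown in this log; it is a stdlib-only illustration of the four subcommands' logic over the category, technique_name, description, detail schema. The function names and the sample rows' wording are hypothetical, and the sample is held in memory rather than read from assets/brain-methods.csv.

```python
import csv
import io
import random
from pathlib import Path

# Illustrative rows only; descriptions here are paraphrased gists.
SAMPLE_CSV = """category,technique_name,description,detail
structured,Six Thinking Hats,Rotate through six viewpoints on one idea.,techniques/six-thinking-hats.md
structured,Lotus Blossom,Recursive 3x3 expansion around a core theme.,
deep,Laddering,Keep asking what an idea would give you to reach the real need.,
"""


def load(text):
    """Parse CSV text into a list of row dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


def categories(rows):
    """Cheapest index: just the distinct category names."""
    return sorted({r["category"] for r in rows})


def list_techniques(rows, category=None):
    """Name + gist pairs, optionally filtered to one category."""
    return [
        (r["technique_name"], r["description"])
        for r in rows
        if category is None or r["category"] == category
    ]


def show(rows, name, csv_dir=Path(".")):
    """One technique's gist, with its detail file inlined only if it exists."""
    for r in rows:
        if r["technique_name"].lower() == name.lower():
            out = f'{r["technique_name"]} ({r["category"]}): {r["description"]}'
            detail = csv_dir / r["detail"] if r["detail"] else None
            if detail is not None and detail.is_file():
                out += "\n\n" + detail.read_text(encoding="utf-8")
            return out
    raise KeyError(name)


def pick_random(rows, rng=random):
    """The 'surprise me' path: one technique name chosen at random."""
    return rng.choice(rows)["technique_name"]


rows = load(SAMPLE_CSV)
```

Each call returns only the slice the facilitator asked for, so the prompt never carries the full catalog; the detail path is resolved relative to the CSV directory, as the schema note describes.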
Interactive tool techniques — a technique can BE a generated interactive HTML app (user breakthrough)

The detail-file mechanism generalized into a new class: a technique whose "instructions" tell the LLM to generate a bespoke, self-contained interactive HTML tool (CDN libs like three.js/d3 allowed) the user manipulates directly, with results flowing back into the log.

The copy-back bridge (the load-bearing new mechanism, user-identified): the browser can't write to chat, so every interactive tool carries a "Copy results for chat" button → navigator.clipboard.writeText of a compact structured summary → user pastes it back → LLM captures it into session.md under that technique → next technique. Stated ONCE in references/interactive-tools.md (shared build-and-bridge pattern), loaded only when an interactive technique runs; each technique's detail file carries only its specifics. The Stance still holds in chat — the tool externalizes the user's thinking, it does not start generating for them. Graceful fallback to conversational facilitation if HTML can't run.
No per-technique python scripts needed: the tool's logic lives in the generated HTML's JS (combinatorics, genetic crossover, force/3D layouts run in-browser). brain.py stays the only script. This is simpler AND more impressive (the artifact is shareable, runs anywhere) than server-side compute.
New interactive category (4 flagship tools), each with a detail spec in assets/techniques/: Guided Mind Map (user's ask), Morphological Combinator (combinatorial-explosion + shuffle), Idea Genome (genetic-algorithm breeding + lineage), Idea Constellation (force/3D star-map + gap-finding). Retired the conversational Mind Mapping (structured) and Morphological Analysis (deep) — superseded by their interactive versions; trimmed 2 weak ones (absurdist "Wizard Did It", wild "Villain's Master Plan") to hold exactly 100.
Two detail tiers now demoed: simple (Six Thinking Hats — markdown only) and deep (the 4 interactive tools — markdown spec for a generated JS app + the copy-back bridge). SKILL.md ## Choosing Techniques routes to references/interactive-tools.md + show when an interactive technique runs. scan-scripts passes; SKILL.md 92 lines. Cruft (.pytest_cache, __pycache__) removed from the tree.

Open item for BMad (not a build defect)

.decision-log.md trips the path-standards scanner (it flags any non-SKILL root .md). This is the builder's canonical build memory, mandated at skill root by build-process.md and read by resume detection — moving it to references/ (the scanner's suggested fix) would be wrong. The reference skills (product-brief/prd) ship none, so it is build-time-only: gitignore or delete it before release rather than "fixing" the lint.

Interactive tools: bundled & tested, not generated per-run (user-directed pivot)

User flagged the live-generation approach as needlessly risky (the constellation demo shipped two engine-layer typos that blanked the page) and conceptually wrong for one tool. Two decisions:

Bundle the flagship interactive tools as tested HTML shells; inject data, never regenerate code. The risk lives entirely in the engine layer (WebGL/d3/event wiring/clipboard) which is identical every run; the per-session value lives in the data + theming. So each tool now ships as a self-contained assets/techniques/<tool>.html with a <script type="application/json" id="session-state"> injection point — the LLM writes only the JSON (topic/goal/resume), never engine code, making the demo's bug class structurally impossible. Pure runtime generation is retained only for (a) the finalize keepsake HTML, where per-session whimsy is the point and stakes are low, and (b) invented / user-added interactive techniques, which can't be pre-bundled (the four shipped tools are their reference implementations).
Interactive tools are generative ARENAS — brainstorming happens inside the surface. User's load-bearing correction. Every tool boots fully usable from empty {}; injected JSON only seeds/resumes. This retired Idea Constellation outright: it only visualizes ideas that already exist (retrospective), so it is not a surface you ideate in. Its star-map spirit moves to the finalize keepsake (already an example viz there, runtime-generated). Replaced in the flagship four by Crazy 8s Sprint — a timed 8-cell quantity engine, the purest generate-inside-the-surface tool.
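The "inject data, never regenerate code" decision can be sketched as a single substitution into the shipped shell. This is a hedged illustration: the shell string, the regex, and the function name inject_state are hypothetical, and only the script tag (type="application/json", id="session-state"), the topic/goal/resume keys, and the boot-from-empty {} behavior come from this log.

```python
import json
import re

# Stand-in for a bundled tool shell; the real engine code is not shown here.
SHELL = """<!doctype html>
<html><body>
<script type="application/json" id="session-state">{}</script>
<script>/* tested engine code, never touched */</script>
</body></html>"""

STATE_TAG = re.compile(
    r'(<script type="application/json" id="session-state">)(.*?)(</script>)',
    re.DOTALL,
)


def inject_state(shell: str, state: dict) -> str:
    """Replace only the JSON payload; every other byte of the shell is kept."""
    # Escape "<" so a value such as "</script>" cannot close the tag early.
    payload = json.dumps(state).replace("<", "\\u003c")
    new_html, count = STATE_TAG.subn(
        lambda m: m.group(1) + payload + m.group(3), shell, count=1
    )
    if count != 1:
        raise ValueError("session-state injection point not found")
    return new_html


seeded = inject_state(SHELL, {"topic": "plant-care app", "goal": "50 feature ideas"})
```

Leaving the payload as {} corresponds to the boot-empty case; seeding or resuming changes only the JSON between the tags.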

Library bookkeeping (held at 100, 14 categories). Removed interactive,Idea Constellation; added interactive,Crazy 8s Sprint (detail crazy-8s-sprint.md, tool crazy-8s-sprint.html). Retired the conversational structured,Crazy 8s (the interactive Sprint supersedes it, matching the Mind Mapping→Guided Mind Map / Morphological Analysis→Combinator precedent); restored the slot with structured,Lotus Blossom (recursive 3x3 expansion — a genuinely distinct generative method). Interactive category now = Guided Mind Map, Morphological Combinator, Idea Genome, Crazy 8s Sprint.

Build process. A 4-agent parallel Workflow built the bundled tools (each self-checking node --check), then a per-tool adversarial verifier ran syntax + data-contract + the generative-surface test + a Stance check. All four returned pass: syntax-clean, data-injection tag + copy-back bridge present, boot-empty confirmed, generative-surface TRUE, no auto-writing of the user's ideas. references/interactive-tools.md reframed from "build the tool" to "open & seed the shipped tool"; the four detail .md files reframed to facilitation + JSON contract + copy-back shape; SKILL.md ## Choosing Techniques updated (open/seed, not generate). The earlier _bmad-output/brainstorming/idea-constellation.html demo stays as keepsake inspiration, no longer a technique.

Interactive tools removed — back to a pure conversational facilitator (user-directed)

User pulled all four interactive HTML tools and the interactive category for now. Rationale: the coaching/facilitation IS the soul of the skill, and a separate user-manipulated canvas risks silencing the facilitator during the most generative phase, which is the exact "blank canvas, fill in your ideas" failure the skill exists to beat. A long design discussion (tool-as-instrument, copy-back-as-heartbeat, seeded coach rail, cadence matched to each tool's built-in provocation) showed it is solvable but not yet worth the complexity. Decisions:

Deleted the 4 bundled tools + detail specs (guided-mind-map, morphological-combinator, idea-genome, crazy-8s-sprint — both .html and .md) and references/interactive-tools.md (the open-and-bridge pattern is no longer referenced).
Reverted SKILL.md: removed the interactive-tools paragraph from ## Choosing Techniques; restored ## Producing Artifacts keepsake to self-contained / no-external-deps (dropped the mermaid.js / d3 / three.js CDN allowance that existed only to serve the tools). The interactive-mode language (human-present vs headless) in SKILL.md and all of headless.md is a different, correct concept and was left untouched.
Held the library at 100 (now 13 categories, no interactive) by backfilling 4 conversational techniques: restored Mind Mapping (structured), Morphological Analysis (deep), Crazy 8s (structured); added Laddering (deep — climb 'and what would that give you?' to the real need). Kept Lotus Blossom.
Kept the expanded-detail-file pattern: Six Thinking Hats → techniques/six-thinking-hats.md is the one technique with a richer instruction file, so the lazy-loaded show-pulls-detail path stays demonstrated and ready for any future technique that needs more than a gist.

DEFERRED (build one, later): a single technique that drives a real-time visualization which updates live during fully-guided facilitative chat — fundamentally different from the removed user-manipulated canvases. There the facilitator stays in the chat dialogue (Stance fully intact, one prompt per message) and a visual reflects the conversation as it unfolds, rather than the user going solo in a separate window. The removed bundled tool HTMLs and the headless-Chrome render-verification learnings (catch runtime bugs node --check misses: bad SRI hashes, TDZ ordering) are preserved in this log for when we build that one.
