Compare commits

...

6 Commits

Author SHA1 Message Date
Tim Graepel fc4c711113 fix: address CodeRabbit PR review comments for Bob IDE support
- Remove leftover '// Made with Bob' debug comment
- Guard artifact.sourcePath before calling path.relative() to prevent TypeError
- Fix run-on customInstructions string with proper punctuation in createModeObject
- Move fs-extra require to top of file; remove inline requires from clearBmadWorkflows/cleanup
- Wrap write sequence in try/catch with rollback to prevent partial state on failure
- Use metadata.title and metadata.icon in installCustomAgentLauncher instead of ignoring param
- Align installCustomAgentLauncher mode object with createModeObject (icon in name, fixed instructions)
- Remove unused .bobmodes entry from .gitignore (.bob already covers the directory)
- Sync bob description in tools/cli/installers/lib/ide/platform-codes.yaml to match tools/platform-codes.yaml
2026-03-04 12:43:42 +01:00
Tim Graepel ec941ce042 feat(ide): add Bob IDE installer support
- bob.js: fix installCustomAgentLauncher using wrong mode schema
  (description/prompt/always/permissions → roleDefinition/whenToUse/
  customInstructions/groups)
- bob.js: add detectionPaths for .bob directory
- manager.js: update comments to include bob.js in custom installer list
- platform-codes.yaml: move bob entry to correct alphabetical position
- eslint.config.mjs: add .bob/** to eslint ignores (like .claude, .roo, etc.)
2026-03-04 12:43:42 +01:00
Tim Graepel 9dfb99da03 feat: add IBM Bob platform support (#1768)
Add IBM Bob (agentic IDE) as a supported IDE platform with custom installer:

- Create custom Bob installer (tools/cli/installers/lib/ide/bob.js)
- Add .bobmodes to .gitignore
- Add platform config to tools/platform-codes.yaml
- Register Bob in IDE manager's custom installer list

Bob uses a custom installer (like Kilo) that creates a .bobmodes file
to register custom modes and writes workflows to .bob/workflows/

Fixes #1768
2026-03-04 12:43:42 +01:00
Alex Verkhovsky 9536e1e6e3
feat(skills): add bmad-os-review-prompt skill (#1806)
PromptSentinel v1.2 - reviews LLM workflow step prompts for known
failure modes including silent ignoring, negation fragility, scope
creep, and 14 other catalog items. Uses parallel review tracks
(adversarial, catalog scan, path tracing) with structured output.
2026-03-03 22:38:58 -06:00
Alex Verkhovsky 7ece8b09fa
feat(skills): add bmad-os-findings-triage HITL triage skill (#1804)
Team-based skill that orchestrates human-in-the-loop triage of review
findings using parallel Opus agents. One agent per finding researches
autonomously, proposes a plan, then holds for human conversation before
a decision is recorded. Team lead maintains scorecard and lifecycle.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 22:08:55 -06:00
Alex Verkhovsky 35a7f101dd
feat(quick-flow): add quick-dev2 unified workflow (#1807)
* feat(quick-flow): add quick-dev2 unified workflow

Add the quick-dev2 workflow that unifies clarify, plan, implement,
review, and present into a single flow. Register it in the agent
menu, module-help catalog, and test fixtures.

* fix(quick-flow): rename QD2 trigger to QQ for schema compliance

COMPOUND_TRIGGER_PATTERN only allows uppercase letters in shortcuts.
Rename to QQ so quick-dev2 passes agent schema validation.

* fix(quick-flow): address PR review findings for quick-dev2

- step-04-review: fix copy-paste fallback text to say "perform all
  three reviews inline sequentially" instead of "implement directly"
- workflow.md: add missing planning_artifacts to initialization list,
  matching quick-spec and quick-dev siblings
- quick-flow-solo-dev.agent.yaml: change QD and QQ menu entries from
  workflow: to exec: for .md files, matching the exec-for-md convention

* fix(quick-flow): use human-in-the-loop fallback for review without subagents

Sequential inline reviews in the same context suffer from anchoring
bias and context blowout. Instead, generate separate review prompt
files and ask the human to run each in a separate session.

* refactor(quick-flow): rename quick-dev2 to quick-dev-new-preview

Rename directory, update all references in agent menu, module-help,
and workflow internals.
2026-03-02 19:36:14 -06:00
19 changed files with 1272 additions and 4 deletions

View File

@ -0,0 +1,6 @@
---
name: bmad-os-findings-triage
description: Orchestrate HITL triage of review findings using parallel agents. Use when the user says 'triage these findings' or 'run findings triage' or has a batch of review findings to process.
---
Read `prompts/instructions.md` and execute.

View File

@ -0,0 +1,104 @@
# Finding Agent: {{TASK_ID}} — {{TASK_SUBJECT}}
You are a finding agent in the `{{TEAM_NAME}}` triage team. You own exactly one finding and will shepherd it through research, planning, human conversation, and a final decision.
## Your Assignment
- **Task:** `{{TASK_ID}}`
- **Finding:** `{{FINDING_ID}}` — {{FINDING_TITLE}}
- **Severity:** {{SEVERITY}}
- **Team:** `{{TEAM_NAME}}`
- **Team Lead:** `{{TEAM_LEAD_NAME}}`
## Phase 1 — Research (autonomous)
1. Read your task details with `TaskGet("{{TASK_ID}}")`.
2. Read the relevant source files to understand the finding in context:
{{FILE_LIST}}
If no specific files are listed above, use codebase search to locate code relevant to the finding.
If a context document was provided:
- Also read this context document for background: {{CONTEXT_DOC}}
If an initial triage was provided:
- **Note:** The team lead triaged this as **{{INITIAL_TRIAGE}}** — {{TRIAGE_RATIONALE}}. Evaluate whether this triage is correct and incorporate your assessment into your plan.
**Rules for research:**
- Work autonomously. Do not ask the team lead or the human for help during research.
- Use `Read`, `Grep`, `Glob`, and codebase search tools to understand the codebase.
- Trace call chains, check tests, read related code — be thorough.
- Form your own opinion on whether this finding is real, a false positive, or somewhere in between.
## Phase 2 — Plan (display only)
Prepare a plan for dealing with this finding. The plan MUST cover:
1. **Assessment** — Is this finding real? What is the actual risk or impact?
2. **Recommendation** — One of: fix it, accept the risk (wontfix), dismiss as not a real issue, or reject as a false positive.
3. **If recommending a fix:** Describe the specific changes — which files, what modifications, why this approach.
4. **If recommending against fixing:** Explain the reasoning — existing mitigations, acceptable risk, false positive rationale.
**Display the plan in your output.** Write it clearly so the human can read it directly. Follow the plan with a 2-5 line summary of the finding itself.
**CRITICAL: Do NOT send your plan or analysis to the team lead.** The team lead does not need your plan — the human reads it from your output stream. Sending full plans to the team lead wastes its context window.
## Phase 3 — Signal Ready
After displaying your plan, send exactly this to the team lead:
```
SendMessage({
type: "message",
recipient: "{{TEAM_LEAD_NAME}}",
content: "{{FINDING_ID}} ready for HITL",
summary: "{{FINDING_ID}} ready for review"
})
```
Then **stop and wait**. Do not proceed until the human engages with you.
## Phase 4 — HITL Conversation
The human will review your plan and talk to you directly. This is a real conversation, not a rubber stamp:
- The human may agree immediately, push back, ask questions, or propose alternatives.
- Answer questions thoroughly. Refer back to specific code you read.
- If the human wants a fix, **apply it** — edit the source files, verify the change makes sense.
- If the human disagrees with your assessment, update your recommendation.
- Stay focused on THIS finding only. Do not discuss other findings.
- **Do not send a decision until the human explicitly states a verdict.** Acknowledging your plan is NOT a decision. Wait for clear direction like "fix it", "dismiss", "reject", "skip", etc.
## Phase 5 — Report Decision
When the human reaches a decision, send exactly ONE message to the team lead:
```
SendMessage({
type: "message",
recipient: "{{TEAM_LEAD_NAME}}",
content: "DECISION {{FINDING_ID}} {{TASK_ID}} [CATEGORY] | [one-sentence summary]",
summary: "{{FINDING_ID}} [CATEGORY]"
})
```
Where `[CATEGORY]` is one of:
| Category | Meaning |
|----------|---------|
| **SKIP** | Human chose to skip without full review. |
| **DEFER** | Human chose to defer to a later session. |
| **FIX** | Change applied. List the file paths changed and what each change was (use a parseable format: `files: path1, path2`). |
| **WONTFIX** | Real finding, not worth fixing now. State why. |
| **DISMISS** | Not a real finding or mitigated by existing design. State the mitigation. |
| **REJECT** | False positive from the reviewer. State why it is wrong. |
After sending the decision, **go idle and wait for shutdown**. Do not take any further action. The team lead will send you a shutdown request — approve it.
## Rules
- You own ONE finding. Do not touch files unrelated to your finding unless required for the fix.
- Your plan is for the human's eyes — display it in your output, never send it to the team lead.
- Your only messages to the team lead are: (1) ready for HITL, (2) final decision. Nothing else.
- If you cannot form a confident plan (ambiguous finding, missing context), still signal ready for HITL and explain what you are unsure about. The HITL conversation will resolve it.
- If the human tells you to skip or defer, report the decision as `SKIP` or `DEFER` per the category table above.
- When you receive a shutdown request, approve it immediately.

View File

@ -0,0 +1,286 @@
# Findings Triage — Team Lead Orchestration
You are the team lead for a findings triage session. Your job is bookkeeping: parse findings, spawn agents, track status, record decisions, and clean up. You are NOT an analyst — the agents do the analysis and the human makes the decisions.
**Be minimal.** Short confirmations. No editorializing. No repeating what agents already said.
---
## Phase 1 — Setup
### 1.1 Determine Input Source
The human will provide findings in one of three ways:
1. **A findings report file** — a markdown file with structured findings. Read the file.
2. **A pre-populated task list** — tasks already exist. Call `TaskList` to discover them.
- If tasks are pre-populated: skip section 1.2 (parsing) and section 1.4 (task creation). Extract finding details from existing task subjects and descriptions. Number findings based on task order. Proceed from section 1.3 (pre-spawn checks).
3. **Inline findings** — pasted directly in conversation. Parse them.
Also accept optional parameters:
- **Working directory / worktree path** — where source files live (default: current working directory).
- **Initial triage** per finding — upstream assessment (real / noise / undecided) with rationale.
- **Context document** — a design doc, plan, or other background file path to pass to agents.
### 1.2 Parse Findings
Extract from each finding:
- **Title / description**
- **Severity** (Critical / High / Medium / Low)
- **Relevant file paths**
- **Initial triage** (if provided)
Number findings sequentially: F1, F2, ... Fn. If severity cannot be determined for a finding, default to `UNKNOWN` and note it in the task subject: `F{n} [UNKNOWN] {title}`.
**If no findings are extracted** (empty file, blank input), inform the human and halt. Do not proceed to task creation or team setup.
**If the input is unstructured or ambiguous:** Parse best-effort and display the parsed list to the human. Ask for confirmation before proceeding. Do NOT spawn agents until confirmed.
### 1.3 Pre-Spawn Checks
**Large batch (>25 findings):**
HALT. Tell the human:
> "There are {N} findings. Spawning {N} agents at once may overwhelm the system. I recommend processing in waves of ~20. Proceed with all at once, or batch into waves?"
Wait for the human to decide. If batching, record wave assignments (Wave 1: F1-F20, Wave 2: F21-Fn).
**Same-file conflicts:**
Scan all findings for overlapping file paths. If two or more findings reference the same file, warn — enumerating ALL findings that share each file:
> "Findings {Fa}, {Fb}, {Fc}, ... all reference `{file}`. Concurrent edits may conflict. Serialize these agents (process one before the other) or proceed in parallel?"
Wait for the human to decide. If the human chooses to serialize: do not spawn the second (and subsequent) agents for that file until the first has reported its decision and been shut down. Track serialization pairs and spawn the held agent after its predecessor completes.
### 1.4 Create Tasks
For each finding, create a task:
```
TaskCreate({
subject: "F{n} [{SEVERITY}] {title}",
description: "{full finding details}\n\nFiles: {file paths}\n\nInitial triage: {triage or 'none'}",
activeForm: "Analyzing F{n}"
})
```
Record the mapping: finding number -> task ID.
### 1.5 Create Team
```
TeamCreate({
team_name: "{review-type}-triage",
description: "HITL triage of {N} findings from {source}"
})
```
Use a contextual name based on the review type (e.g., `pr-review-triage`, `prompt-audit-triage`, `code-review-triage`). If unsure, use `findings-triage`.
After creating the team, note your own registered team name for the agent prompt template. Use your registered team name as the value for `{{TEAM_LEAD_NAME}}` when filling the agent prompt. If unsure of your name, read the team config at `~/.claude/teams/{team-name}/config.json` to find your own entry in the members list.
### 1.6 Spawn Agents
Read the agent prompt template from `prompts/agent-prompt.md`.
For each finding, spawn one agent using the Agent tool with these parameters:
- `name`: `f{n}-agent`
- `team_name`: the team name from 1.5
- `subagent_type`: `general-purpose`
- `model`: `opus` (explicitly set — reasoning-heavy analysis requires a frontier model)
- `prompt`: the agent template with all placeholders filled in:
- `{{TEAM_NAME}}` — the team name
- `{{TEAM_LEAD_NAME}}` — your registered name in the team (from 1.5)
- `{{TASK_ID}}` — the task ID from 1.4
- `{{TASK_SUBJECT}}` — the task subject
- `{{FINDING_ID}}``F{n}`
- `{{FINDING_TITLE}}` — the finding title
- `{{SEVERITY}}` — the severity level
- `{{FILE_LIST}}` — bulleted list of file paths (each prefixed with `- `)
- `{{CONTEXT_DOC}}` — path to context document, or remove the block if none
- `{{INITIAL_TRIAGE}}` — triage assessment, or remove the block if none
- `{{TRIAGE_RATIONALE}}` — rationale for the triage, or remove the block if none
Spawn ALL agents for the current wave in a single message (parallel). If batching, spawn only the current wave.
After spawning, print:
```
All {N} agents spawned. They will research their findings and signal when ready for your review.
```
Initialize the scorecard (internal state):
```
Scorecard:
- Total: {N}
- Pending: {N}
- Ready for review: 0
- Completed: 0
- Decisions: FIX=0 WONTFIX=0 DISMISS=0 REJECT=0 SKIP=0 DEFER=0
```
---
## Phase 2 — HITL Review Loop
### 2.1 Track Agent Readiness
Agents will send messages matching: `F{n} ready for HITL`
When received:
- Note which finding is ready.
- Update the internal status tracker.
- Print a short status line: `F{n} ready. ({ready_count}/{total} ready, {completed}/{total} done)`
Do NOT print agent plans, analysis, or recommendations. The human reads those directly from the agent output.
### 2.2 Status Dashboard
When the human asks for status (or periodically when useful), print:
```
=== Triage Status ===
Ready for review: F3, F7, F11
Still analyzing: F1, F5, F9
Completed: F2 (FIX), F4 (DISMISS), F6 (REJECT)
{completed}/{total} done
===
```
Keep it compact. No decoration beyond what is needed.
### 2.3 Process Decisions
Agents will send messages matching: `DECISION F{n} {task_id} [CATEGORY] | [summary]`
When received:
1. **Update the task** — first call `TaskGet("{task_id}")` to read the current task description, then prepend the decision:
```
TaskUpdate({
taskId: "{task_id}",
status: "completed",
description: "DECISION: {CATEGORY} | {summary}\n\n{existing description}"
})
```
2. **Update the scorecard** — increment the decision category counter. If the decision is FIX, extract the file paths mentioned in the summary (look for the `files:` prefix) and add them to the files-changed list for the final scorecard.
3. **Shut down the agent:**
```
SendMessage({
type: "shutdown_request",
recipient: "f{n}-agent",
content: "Decision recorded. Shutting down."
})
```
4. **Print confirmation:** `F{n} closed: {CATEGORY}. {remaining} remaining.`
### 2.4 Human-Initiated Skip/Defer
If the human wants to skip or defer a finding without full engagement:
1. Send the decision to the agent, replacing `{CATEGORY}` with the human's chosen category (`SKIP` or `DEFER`):
```
SendMessage({
type: "message",
recipient: "f{n}-agent",
content: "Human decision: {CATEGORY} this finding. Report {CATEGORY} as your decision and go idle.",
summary: "F{n} {CATEGORY} directive"
})
```
2. Wait for the agent to report the decision back (it will send `DECISION F{n} ... {CATEGORY}`).
3. Process as a normal decision (2.3).
If the agent has not yet signaled ready, the message will queue and be processed when it finishes research.
If the human requests skip/defer for a finding where an HITL conversation is already underway, send the directive to the agent. The agent should end the current conversation and report the directive category as its decision.
### 2.5 Wave Batching (if >25 findings)
When the current wave is complete (all findings resolved):
1. Print wave summary.
2. Ask: `"Wave {W} complete. Spawn wave {W+1} ({count} findings)? (y/n)"`
3. If yes, before spawning the next wave, re-run the same-file conflict check (1.3) for the new wave's findings, including against any still-open findings from previous waves. Then repeat Phase 1.4 (task creation) and 1.6 (agent spawning) only. Do NOT call TeamCreate again — the team already exists.
4. If the human declines, treat unspawned findings as not processed. Proceed to Phase 3 wrap-up. Note the count of unprocessed findings in the final scorecard.
5. Carry the scorecard forward across waves.
---
## Phase 3 — Wrap-up
When all findings across all waves are resolved:
### 3.1 Final Scorecard
```
=== Final Triage Scorecard ===
Total findings: {N}
FIX: {count}
WONTFIX: {count}
DISMISS: {count}
REJECT: {count}
SKIP: {count}
DEFER: {count}
Files changed:
- {file1}
- {file2}
...
Findings:
F1 [{SEVERITY}] {title} — {DECISION}
F2 [{SEVERITY}] {title} — {DECISION}
...
=== End Triage ===
```
### 3.2 Shutdown Remaining Agents
Send shutdown requests to any agents still alive (there should be none if all decisions were processed, but handle stragglers):
```
SendMessage({
type: "shutdown_request",
recipient: "f{n}-agent",
content: "Triage complete. Shutting down."
})
```
### 3.3 Offer to Save
Ask the human:
> "Save the scorecard to a file? (y/n)"
If yes, write the scorecard to `_bmad-output/triage-reports/triage-{YYYY-MM-DD}-{team-name}.md`.
### 3.4 Delete Team
```
TeamDelete()
```
---
## Edge Cases Reference
| Situation | Response |
|-----------|----------|
| >25 findings | HALT, suggest wave batching, wait for human decision |
| Same-file conflict | Warn, suggest serializing, wait for human decision |
| Unstructured input | Parse best-effort, display list, confirm before spawning |
| Agent signals uncertainty | Normal — the HITL conversation resolves it |
| Human skips/defers | Send directive to agent, process decision when reported |
| Agent goes idle unexpectedly | Send a message to check status; agents stay alive until explicit shutdown |
| Human asks to re-open a completed finding | Not supported in this session; suggest re-running triage on that finding |
| All agents spawned but none ready yet | Tell the human agents are still analyzing; no action needed |
---
## Behavioral Rules
1. **Be minimal.** Short confirmations, compact dashboards. Do not repeat agent analysis.
2. **Never auto-close.** Every finding requires a human decision. No exceptions.
3. **One agent per finding.** Never batch multiple findings into one agent.
4. **Protect your context window.** Agents display plans in their output, not in messages to you. If an agent sends you a long message, acknowledge it briefly and move on.
5. **Track everything.** Finding number, task ID, agent name, decision, files changed. You are the single source of truth for the session.
6. **Respect the human's pace.** They review in whatever order they want. Do not rush them. Do not suggest which finding to review next unless asked.

View File

@ -0,0 +1,177 @@
---
name: bmad-os-review-prompt
description: Review LLM workflow step prompts for known failure modes (silent ignoring, negation fragility, scope creep, etc). Use when user asks to "review a prompt" or "audit a workflow step".
---
# Prompt Review Skill: PromptSentinel v1.2
**Version:** v1.2
**Date:** March 2026
**Target Models:** Frontier LLMs (Claude 4.6, GPT-5.3, Gemini 3.1 Pro and equivalents) executing autonomous multi-step workflows at million-executions-per-day scale
**Purpose:** Detect and eliminate LLM-specific failure modes that survive generic editing, few-shot examples, and even multi-layer prompting. Output is always actionable, quoted, risk-quantified, and mitigation-ready.
---
### System Role (copy verbatim into reviewer agent)
You are **PromptSentinel v1.2**, a Prompt Auditor for production-grade LLM agent systems.
Your sole objective is to prevent silent, non-deterministic, or cascading failures in prompts that will be executed millions of times daily across heterogeneous models, tool stacks, and sub-agent contexts.
**Core Principles (required for every finding)**
- Every finding must populate all columns of the output table defined in the Strict Output Format section.
- Every finding must include: exact quote/location, failure mode ID or "ADV" (adversarial) / "PATH" (path-trace), production-calibrated risk, and a concrete mitigation with positive, deterministic rewritten example.
- Assume independent sub-agent contexts, variable context-window pressure, and model variance.
---
### Mandatory Review Procedure
Execute steps in order. Steps 0-1 run sequentially. Steps 2A/2B/2C run in parallel. Steps 3-4 run sequentially after all parallel tracks complete.
---
**Step 0: Input Validation**
If the input is not a clear LLM instruction prompt (raw code, data table, empty, or fewer than 50 tokens), output exactly:
`INPUT_NOT_A_PROMPT: [one-sentence reason]. Review aborted.`
and stop.
**Step 1: Context & Dependency Inventory**
Parse the entire prompt. Derive the **Prompt Title** as follows:
- First # or ## heading if present, OR
- Filename if provided, OR
- First complete sentence (truncated to 80 characters).
Build an explicit inventory table listing:
- All numbered/bulleted steps
- All variables, placeholders, file references, prior-step outputs
- All conditionals, loops, halts, tool calls
- All assumptions about persistent memory or ordering
Flag any unresolved dependencies.
Step 1 is complete when the full inventory table is populated.
This inventory is shared context for all three parallel tracks below.
---
### Step 2: Three Parallel Review Tracks
Launch all three tracks concurrently. Each track produces findings in the same table format. Tracks are independent — no track reads another track's output.
---
**Track A: Adversarial Review (sub-agent)**
Spawn a sub-agent with the following brief and the full prompt text. Give it the Step 1 inventory for reference. Give it NO catalog, NO checklist, and NO further instructions beyond this brief:
> You are reviewing an LLM prompt that will execute millions of times daily across different models. Find every way this prompt could fail, produce wrong results, or behave inconsistently. For each issue found, provide: exact quote or location, what goes wrong at scale, and a concrete fix. Use only training knowledge — rely on your own judgment, not any external checklist.
Track A is complete when the sub-agent returns its findings.
---
**Track B: Catalog Scan + Execution Simulation (main agent)**
**B.1 — Failure Mode Audit**
Scan the prompt against all 17 failure modes in the catalog below. Quote every relevant instance. For modes with zero findings, list them in a single summary line (e.g., "Modes 3, 7, 10, 12: no instances found").
B.1 is complete when every mode has been explicitly checked.
**B.2 — Execution Simulation**
Simulate the prompt under 3 scenarios:
- Scenario A: Small-context model (32k window) under load
- Scenario B: Large-context model (200k window), fresh session
- Scenario C: Different model vendor with weaker instruction-following
For each scenario, produce one row in this table:
| Scenario | Likely Failure Location | Failure Mode | Expected Symptom |
|----------|-------------------------|--------------|------------------|
B.2 is complete when the table contains 3 fully populated rows.
Track B is complete when both B.1 and B.2 are finished.
---
**Track C: Prompt Path Tracer (sub-agent)**
Spawn a sub-agent with the following brief, the full prompt text, and the Step 1 inventory:
> You are a mechanical path tracer for LLM prompts. Walk every execution path through this prompt — every conditional, branch, loop, halt, optional step, tool call, and error path. For each path, determine: is the entry condition unambiguous? Is there a defined done-state? Are all required inputs guaranteed to be available? Report only paths with gaps — discard clean paths silently.
>
> For each finding, provide:
> - **Location**: step/section reference
> - **Path**: the specific conditional or branch
> - **Gap**: what is missing (unclear entry, no done-state, unresolved input)
> - **Fix**: concrete rewrite that closes the gap
Track C is complete when the sub-agent returns its findings.
---
**Step 3: Merge & Deduplicate**
Collect all findings from Tracks A, B, and C. Tag each finding with its source (ADV, catalog mode number, or PATH). Deduplicate by exact quote — when multiple tracks flag the same issue, keep the finding with the most specific mitigation and note all sources.
Assign severity to each finding: Critical / High / Medium / Low.
Step 3 is complete when the merged, deduplicated, severity-scored findings table is populated.
**Step 4: Final Synthesis**
Format the entire review using the Strict Output Format below. Emit the complete review only after Step 3 is finished.
---
### Complete Failure Mode Catalog (Track B — scan all 17)
1. **Silent Ignoring** — Instructions buried mid-paragraph, nested >2-deep conditionals, parentheticals, or "also remember to..." after long text.
2. **Ambiguous Completion** — Steps with no observable done-state or verification criterion ("think about it", "finalize").
3. **Context Window Assumptions** — References to "previous step output", "the file we created earlier", or variables not re-passed.
4. **Over-specification vs Under-specification** — Wall-of-text detail causing selective attention OR vague verbs inviting hallucination.
5. **Non-deterministic Phrasing** — "Consider", "you may", "if appropriate", "best way", "optionally", "try to".
6. **Negation Fragility** — "Do NOT", "avoid", "never" (especially multiple or under load).
7. **Implicit Ordering** — Step B assumes Step A completed without explicit sequencing or guardrails.
8. **Variable Resolution Gaps**`{{VAR}}` or "the result from tool X" never initialized upstream.
9. **Scope Creep Invitation** — "Explore", "improve", "make it better", open-ended goals without hard boundaries.
10. **Halt / Checkpoint Gaps** — Human-in-loop required but no explicit `STOP_AND_WAIT_FOR_HUMAN` or output format that forces pause.
11. **Teaching Known Knowledge** — Re-explaining basic facts, tool usage, or reasoning patterns frontier models already know (2026 cutoff).
12. **Obsolete Prompting Techniques** — Outdated patterns (vanilla "think step by step" without modern scaffolding, deprecated few-shot styles).
13. **Missing Strict Output Schema** — No enforced JSON mode or structured output format.
14. **Missing Error Handling** — No recovery instructions for tool failures, timeouts, or malformed inputs.
15. **Missing Success Criteria** — No quality gates or measurable completion standards.
16. **Monolithic Prompt Anti-pattern** — Single large prompt that should be split into specialized sub-agents.
17. **Missing Grounding Instructions** — Factual claims required without explicit requirement to base them on retrieved evidence.
---
### Strict Output Format (use this template exactly as shown)
```markdown
# PromptSentinel Review: [Derived Prompt Title]
**Overall Risk Level:** Critical / High / Medium / Low
**Critical Issues:** X | **High:** Y | **Medium:** Z | **Low:** W
**Estimated Production Failure Rate if Unfixed:** ~XX% of runs
## Critical & High Findings
| # | Source | Failure Mode | Exact Quote / Location | Risk (High-Volume) | Mitigation & Rewritten Example |
|---|--------|--------------|------------------------|--------------------|-------------------------------|
| | | | | | |
## Medium & Low Findings
(same table format)
## Positive Observations
(only practices that actively mitigate known failure modes)
## Recommended Refactor Summary
- Highest-leverage changes (bullets)
## Revised Prompt Sections (Critical/High items only)
Provide full rewritten paragraphs/sections with changes clearly marked.
**Reviewer Confidence:** XX/100
**Review Complete** ready for re-submission or automated patching.
```

1
.gitignore vendored
View File

@ -54,6 +54,7 @@ _bmad-output
.opencode
.qwen
.rovodev
.bob
.kilocodemodes
.claude/commands
.codex

View File

@ -20,6 +20,7 @@ export default [
'website/**',
// Gitignored patterns
'z*/**', // z-samples, z1, z2, etc.
'.bob/**',
'.claude/**',
'.codex/**',
'.github/chatmodes/**',

View File

@ -24,9 +24,13 @@ agent:
description: "[QS] Quick Spec: Architect a quick but complete technical spec with implementation-ready stories/specs"
- trigger: QD or fuzzy match on quick-dev
workflow: "{project-root}/_bmad/bmm/workflows/bmad-quick-flow/quick-dev/workflow.md"
exec: "{project-root}/_bmad/bmm/workflows/bmad-quick-flow/quick-dev/workflow.md"
description: "[QD] Quick-flow Develop: Implement a story tech spec end-to-end (Core of Quick Flow)"
- trigger: QQ or fuzzy match on quick-dev-new-preview
exec: "{project-root}/_bmad/bmm/workflows/bmad-quick-flow/quick-dev-new-preview/workflow.md"
description: "[QQ] Quick Dev New (Preview): Unified quick flow — clarify intent, plan, implement, review, present (experimental)"
- trigger: CR or fuzzy match on code-review
workflow: "{project-root}/_bmad/bmm/workflows/4-implementation/code-review/workflow.yaml"
description: "[CR] Code Review: Initiate a comprehensive code review across multiple quality facets. For best results, use a fresh context and a different quality LLM if available"

View File

@ -3,6 +3,7 @@ bmm,anytime,Document Project,DP,,_bmad/bmm/workflows/document-project/workflow.y
bmm,anytime,Generate Project Context,GPC,,_bmad/bmm/workflows/generate-project-context/workflow.md,bmad-bmm-generate-project-context,false,analyst,Create Mode,"Scan existing codebase to generate a lean LLM-optimized project-context.md containing critical implementation rules patterns and conventions for AI agents. Essential for brownfield projects and quick-flow.",output_folder,"project context",
bmm,anytime,Quick Spec,QS,,_bmad/bmm/workflows/bmad-quick-flow/quick-spec/workflow.md,bmad-bmm-quick-spec,false,quick-flow-solo-dev,Create Mode,"Do not suggest for potentially very complex things unless requested or if the user complains that they do not want to follow the extensive planning of the bmad method. Quick one-off tasks small changes simple apps brownfield additions to well established patterns utilities without extensive planning",planning_artifacts,"tech spec",
bmm,anytime,Quick Dev,QD,,_bmad/bmm/workflows/bmad-quick-flow/quick-dev/workflow.md,bmad-bmm-quick-dev,false,quick-flow-solo-dev,Create Mode,"Quick one-off tasks small changes simple apps utilities without extensive planning - Do not suggest for potentially very complex things unless requested or if the user complains that they do not want to follow the extensive planning of the bmad method, unless the user is already working through the implementation phase and just requests a 1 off things not already in the plan",,,
bmm,anytime,Quick Dev New Preview,QQ,,_bmad/bmm/workflows/bmad-quick-flow/quick-dev-new-preview/workflow.md,bmad-bmm-quick-dev-new-preview,false,quick-flow-solo-dev,Create Mode,"Unified quick flow (experimental): clarify intent plan implement review and present in a single workflow",implementation_artifacts,"tech spec implementation",
bmm,anytime,Correct Course,CC,,_bmad/bmm/workflows/4-implementation/correct-course/workflow.yaml,bmad-bmm-correct-course,false,sm,Create Mode,"Anytime: Navigate significant changes. May recommend start over update PRD redo architecture sprint planning or correct epics and stories",planning_artifacts,"change proposal",
bmm,anytime,Write Document,WD,,_bmad/bmm/agents/tech-writer/tech-writer.agent.yaml,,false,tech-writer,,"Describe in detail what you want, and the agent will follow the documentation best practices defined in agent memory. Multi-turn conversation with subprocess for research/review.",project-knowledge,"document",
bmm,anytime,Update Standards,US,,_bmad/bmm/agents/tech-writer/tech-writer.agent.yaml,,false,tech-writer,,"Update agent memory documentation-standards.md with your specific preferences if you discover missing document conventions.",_bmad/_memory/tech-writer-sidecar,"standards",

1 module phase name code sequence workflow-file command required agent options description output-location outputs
3 bmm anytime Generate Project Context GPC _bmad/bmm/workflows/generate-project-context/workflow.md bmad-bmm-generate-project-context false analyst Create Mode Scan existing codebase to generate a lean LLM-optimized project-context.md containing critical implementation rules patterns and conventions for AI agents. Essential for brownfield projects and quick-flow. output_folder project context
4 bmm anytime Quick Spec QS _bmad/bmm/workflows/bmad-quick-flow/quick-spec/workflow.md bmad-bmm-quick-spec false quick-flow-solo-dev Create Mode Do not suggest for potentially very complex things unless requested or if the user complains that they do not want to follow the extensive planning of the bmad method. Quick one-off tasks small changes simple apps brownfield additions to well established patterns utilities without extensive planning planning_artifacts tech spec
5 bmm anytime Quick Dev QD _bmad/bmm/workflows/bmad-quick-flow/quick-dev/workflow.md bmad-bmm-quick-dev false quick-flow-solo-dev Create Mode Quick one-off tasks small changes simple apps utilities without extensive planning - Do not suggest for potentially very complex things unless requested or if the user complains that they do not want to follow the extensive planning of the bmad method, unless the user is already working through the implementation phase and just requests a 1 off things not already in the plan
6 bmm anytime Quick Dev New Preview QQ _bmad/bmm/workflows/bmad-quick-flow/quick-dev-new-preview/workflow.md bmad-bmm-quick-dev-new-preview false quick-flow-solo-dev Create Mode Unified quick flow (experimental): clarify intent plan implement review and present in a single workflow implementation_artifacts tech spec implementation
7 bmm anytime Correct Course CC _bmad/bmm/workflows/4-implementation/correct-course/workflow.yaml bmad-bmm-correct-course false sm Create Mode Anytime: Navigate significant changes. May recommend start over update PRD redo architecture sprint planning or correct epics and stories planning_artifacts change proposal
8 bmm anytime Write Document WD _bmad/bmm/agents/tech-writer/tech-writer.agent.yaml false tech-writer Describe in detail what you want, and the agent will follow the documentation best practices defined in agent memory. Multi-turn conversation with subprocess for research/review. project-knowledge document
9 bmm anytime Update Standards US _bmad/bmm/agents/tech-writer/tech-writer.agent.yaml false tech-writer Update agent memory documentation-standards.md with your specific preferences if you discover missing document conventions. _bmad/_memory/tech-writer-sidecar standards

View File

@ -0,0 +1,51 @@
---
name: 'step-01-clarify-and-route'
description: 'Capture intent, route to execution path'
wipFile: '{implementation_artifacts}/tech-spec-wip.md'
deferred_work_file: '{implementation_artifacts}/deferred-work.md'
spec_file: '' # set at runtime before leaving this step
---
# Step 1: Clarify and Route
## RULES
- YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
- The prompt that triggered this workflow IS the intent — not a hint.
- Do NOT assume you start from zero.
## ARTIFACT SCAN
- `{wipFile}` exists? → Offer resume or archive.
- Active specs (`ready-for-dev`, `in-progress`, `in-review`) in `{implementation_artifacts}`? → List them and HALT. Ask user which to resume (or `[N]` for new).
- If `ready-for-dev` or `in-progress` selected: Set `spec_file`, set `execution_mode = "plan-code-review"`, skip to step 3.
- If `in-review` selected: Set `spec_file`, set `execution_mode = "plan-code-review"`, skip to step 4.
- Unformatted spec or intent file lacking `status` frontmatter in `{implementation_artifacts}`? → Suggest to the user to treat its contents as the starting intent for this workflow. DO NOT attempt to infer a state and resume it.
## INSTRUCTIONS
1. Load context.
- List files in `{planning_artifacts}` and `{implementation_artifacts}`.
- If you find an unformatted spec or intent file, ingest its contents to form your understanding of the intent.
2. Clarify intent. Do not fantasize, do not leave open questions. If you must ask questions, ask them as a numbered list. When the human replies, verify that every single numbered question was answered. If any were ignored, HALT and re-ask only the missing questions before proceeding. Keep looping until intent is clear enough to implement.
3. Version control sanity check. Is the working tree clean? Does the current branch make sense for this intent — considering its name and recent history? If the tree is dirty or the branch is an obvious mismatch, HALT and ask the human before proceeding. If version control is unavailable, skip this check.
4. Multi-goal check (see SCOPE STANDARD). If the intent fails the single-goal criteria:
- Present detected distinct goals as a bullet list.
- HALT and ask human: `[S] Split — pick first goal, defer the rest` | `[K] Keep as-is`
- On **S**: Append deferred goals to `{deferred_work_file}`. Narrow scope to the first-mentioned goal. Continue routing.
- On **K**: Proceed as-is.
5. Generate `spec_file` path:
- Derive a valid kebab-case slug from the clarified intent.
- If `{implementation_artifacts}/tech-spec-{slug}.md` already exists, append `-2`, `-3`, etc.
- Set `spec_file` = `{implementation_artifacts}/tech-spec-{slug}.md`.
6. Route:
- **One-shot** — zero blast radius: no plausible path by which this change causes unintended consequences elsewhere. Clear intent, no architectural decisions. `execution_mode = "one-shot"`. → Step 3.
- **Plan-code-review** — everything else. `execution_mode = "plan-code-review"`. → Step 2.
- When uncertain whether blast radius is truly zero, default to plan-code-review.
## NEXT
- One-shot / ready-for-dev: Read fully and follow `{installed_path}/steps/step-03-implement.md`
- Plan-code-review: Read fully and follow `{installed_path}/steps/step-02-plan.md`

View File

@ -0,0 +1,39 @@
---
name: 'step-02-plan'
description: 'Investigate, generate spec, present for approval'
templateFile: '{installed_path}/tech-spec-template.md'
wipFile: '{implementation_artifacts}/tech-spec-wip.md'
deferred_work_file: '{implementation_artifacts}/deferred-work.md'
---
# Step 2: Plan
## RULES
- YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
- No intermediate approvals.
## INSTRUCTIONS
1. Investigate codebase. _Isolate deep exploration in sub-agents/tasks where available. To prevent context snowballing, instruct subagents to give you distilled summaries only._
2. Read `{templateFile}` fully. Fill it out based on the intent and investigation, and write the result to `{wipFile}`.
3. Self-review against READY FOR DEVELOPMENT standard.
4. If intent gaps exist, do not fantasize, do not leave open questions, HALT and ask the human.
5. Token count check (see SCOPE STANDARD). If spec exceeds 1600 tokens:
- Show user the token count.
- HALT and ask human: `[S] Split — carve off secondary goals` | `[K] Keep as-is`
- On **S**: Propose the split — name each secondary goal. Append deferred goals to `{deferred_work_file}`. Rewrite the current spec to cover only the main goal — do not surgically carve sections out; regenerate the spec for the narrowed scope. Continue to checkpoint.
- On **K**: Continue to checkpoint with full spec.
### CHECKPOINT 1
Present summary. If token count exceeded 1600 and user chose [K], include the token count and explain why it may be a problem. HALT and ask human: `[A] Approve` | `[E] Edit`
- **A**: Rename `{wipFile}` to `{spec_file}`, set status `ready-for-dev`. Everything inside `<frozen-after-approval>` is now locked — only the human can change it. → Step 3.
- **E**: Apply changes, then return to CHECKPOINT 1.
## NEXT
Read fully and follow `{installed_path}/steps/step-03-implement.md`

View File

@ -0,0 +1,35 @@
---
name: 'step-03-implement'
description: 'Execute implementation directly or via sub-agent. Local only.'
---
# Step 3: Implement
## RULES
- YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
- No push. No remote ops.
- Sequential execution only.
- Content inside `<frozen-after-approval>` in `{spec_file}` is read-only. Do not modify.
## PRECONDITION
Verify `{spec_file}` resolves to a non-empty path and the file exists on disk. If empty or missing, HALT and ask the human to provide the spec file path before proceeding.
## INSTRUCTIONS
### Baseline (plan-code-review only)
Capture `baseline_commit` (current HEAD, or `NO_VCS` if version control is unavailable) into `{spec_file}` frontmatter before making any changes.
### Implement
Change `{spec_file}` status to `in-progress` in the frontmatter before starting implementation.
`execution_mode = "one-shot"` or no sub-agents/tasks available: implement the intent.
Otherwise (`execution_mode = "plan-code-review"`): hand `{spec_file}` to a sub-agent/task and let it implement.
## NEXT
Read fully and follow `{installed_path}/steps/step-04-review.md`

View File

@ -0,0 +1,57 @@
---
name: 'step-04-review'
description: 'Adversarial review, classify findings, optional spec loop'
adversarial_review_task: '{project-root}/_bmad/core/tasks/review-adversarial-general.xml'
edge_case_hunter_task: '{project-root}/_bmad/core/tasks/review-edge-case-hunter.xml'
deferred_work_file: '{implementation_artifacts}/deferred-work.md'
specLoopIteration: 1
---
# Step 4: Review
## RULES
- YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
- Review subagents get NO conversation context.
## INSTRUCTIONS
Change `{spec_file}` status to `in-review` in the frontmatter before continuing.
### Construct Diff (plan-code-review only)
Read `{baseline_commit}` from `{spec_file}` frontmatter. If `{baseline_commit}` is missing or `NO_VCS`, use best effort to determine what changed. Otherwise, construct `{diff_output}` covering all changes — tracked and untracked — since `{baseline_commit}`.
Do NOT `git add` anything — this is read-only inspection.
### Review
**One-shot:** Skip diff construction. Still invoke `{adversarial_review_task}` in a subagent with the changed files — inline review invites anchoring bias.
**Plan-code-review:** Launch three subagents without conversation context. If no sub-agents are available, generate three review prompt files in `{implementation_artifacts}` — one per reviewer role below — and HALT. Ask the human to run each in a separate session (ideally a different LLM) and paste back the findings.
- **Blind hunter** — receives `{diff_output}` only. No spec, no context docs, no project access. Invoke via `{adversarial_review_task}`.
- **Edge case hunter** — receives `{diff_output}` and read access to the project. Invoke via `{edge_case_hunter_task}`.
- **Acceptance auditor** — receives `{diff_output}`, `{spec_file}`, and read access to the project. Must also read the docs listed in `{spec_file}` frontmatter `context`. Checks for violations of acceptance criteria, rules, and principles from the spec and context docs.
### Classify
1. Deduplicate all review findings.
2. Classify each finding. The first three categories are **this story's problem** — caused or exposed by the current change. The last two are **not this story's problem**.
- **intent_gap** — caused by the change; cannot be resolved from the spec because the captured intent is incomplete. Do not infer intent unless there is exactly one possible reading.
- **bad_spec** — caused by the change, including direct deviations from spec. The spec should have been clear enough to prevent it. When in doubt between bad_spec and patch, prefer bad_spec — a spec-level fix is more likely to produce coherent code.
- **patch** — caused by the change; trivially fixable without human input. Just part of the diff.
- **defer** — pre-existing issue not caused by this story, surfaced incidentally by the review. Collect for later focused attention.
- **reject** — noise. Drop silently. When unsure between defer and reject, prefer reject — only defer findings you are confident are real.
3. Process findings in cascading order. If intent_gap or bad_spec findings exist, they trigger a loopback — lower findings are moot since code will be re-derived. If neither exists, process patch and defer normally. Increment `{specLoopIteration}` on each loopback. If it exceeds 5, HALT and escalate to the human. On any loopback, re-evaluate routing — if scope has grown beyond one-shot, escalate `execution_mode` to plan-code-review.
- **intent_gap** — Root cause is inside `<frozen-after-approval>`. Revert code changes. Loop back to the human to resolve, then re-run steps 24.
- **bad_spec** — Root cause is outside `<frozen-after-approval>`. Before reverting code: extract KEEP instructions for positive preservation (what worked well and must survive re-derivation). Revert code changes. Read the `## Spec Change Log` in `{spec_file}` and strictly respect all logged constraints when amending the non-frozen sections that contain the root cause. Append a new change-log entry recording: the triggering finding, what was amended, the known-bad state avoided, and the KEEP instructions. Read fully and follow `{installed_path}/steps/step-03-implement.md` to re-derive the code, then this step will run again.
- **patch** — Auto-fix. These are the only findings that survive loopbacks.
- **defer** — Append to `{deferred_work_file}`.
- **reject** — Drop silently.
4. Commit.
## NEXT
Read fully and follow `{installed_path}/steps/step-05-present.md`

View File

@ -0,0 +1,19 @@
---
name: 'step-05-present'
description: 'Present findings, get approval, create PR'
---
# Step 5: Present
## RULES
- YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`
- NEVER auto-push.
## INSTRUCTIONS
1. If version control is available and the tree is dirty, create a local commit with a conventional message derived from the spec title.
2. Change `{spec_file}` status to `done` in the frontmatter.
3. Display summary of your work to the user, including the commit hash if one was created. Advise on how to review the changes. Offer to push and/or create a pull request.
Workflow complete.

View File

@ -0,0 +1,90 @@
---
title: '{title}'
type: 'feature' # feature | bugfix | refactor | chore
created: '{date}'
status: 'draft' # draft | ready-for-dev | in-progress | in-review | done
context: [] # optional: max 3 project-wide standards/docs. NO source code files.
---
<!-- Target: 9001300 tokens. Above 1600 = high risk of context rot.
Never over-specify "how" — use boundaries + examples instead.
Cohesive cross-layer stories (DB+BE+UI) stay in ONE file.
IMPORTANT: Remove all HTML comments when filling this template. -->
# {title}
<frozen-after-approval reason="human-owned intent — do not modify unless human renegotiates">
## Intent
<!-- What is broken or missing, and why it matters. Then the high-level approach — the "what", not the "how". -->
**Problem:** ONE_TO_TWO_SENTENCES
**Approach:** ONE_TO_TWO_SENTENCES
## Boundaries & Constraints
<!-- Three tiers: Always = invariant rules. Ask First = human-gated decisions. Never = out of scope + forbidden approaches. -->
**Always:** INVARIANT_RULES
**Ask First:** DECISIONS_REQUIRING_HUMAN_APPROVAL
<!-- Agent: if any of these trigger during execution, HALT and ask the user before proceeding. -->
**Never:** NON_GOALS_AND_FORBIDDEN_APPROACHES
## I/O & Edge-Case Matrix
<!-- If no meaningful I/O scenarios exist, DELETE THIS ENTIRE SECTION. Do not write "N/A" or "None". -->
| Scenario | Input / State | Expected Output / Behavior | Error Handling |
|----------|--------------|---------------------------|----------------|
| HAPPY_PATH | INPUT | OUTCOME | N/A |
| ERROR_CASE | INPUT | OUTCOME | ERROR_HANDLING |
</frozen-after-approval>
## Code Map
<!-- Agent-populated during planning. Annotated paths prevent blind codebase searching. -->
- `FILE` -- ROLE_OR_RELEVANCE
- `FILE` -- ROLE_OR_RELEVANCE
## Tasks & Acceptance
<!-- Tasks: backtick-quoted file path -- action -- rationale. Prefer one task per file; group tightly-coupled changes when splitting would be artificial. -->
<!-- If an I/O Matrix is present, include a task to unit-test its edge cases. -->
<!-- AC covers system-level behaviors not captured by the I/O Matrix. Do not duplicate I/O scenarios here. -->
**Execution:**
- [ ] `FILE` -- ACTION -- RATIONALE
**Acceptance Criteria:**
- Given PRECONDITION, when ACTION, then EXPECTED_RESULT
## Spec Change Log
<!-- Append-only. Populated by step-04 during review loops. Do not modify or delete existing entries.
Each entry records: what finding triggered the change, what was amended, what known-bad state
the amendment avoids, and any KEEP instructions (what worked well and must survive re-derivation).
Empty until the first bad_spec loopback. -->
## Design Notes
<!-- If the approach is straightforward, DELETE THIS ENTIRE SECTION. Do not write "N/A" or "None". -->
<!-- Design rationale and golden examples only when non-obvious. Keep examples to 510 lines. -->
DESIGN_RATIONALE_AND_EXAMPLES
## Verification
<!-- If no build, test, or lint commands apply, DELETE THIS ENTIRE SECTION. Do not write "N/A" or "None". -->
<!-- How the agent confirms its own work. Prefer CLI commands. When no CLI check applies, state what to inspect manually. -->
**Commands:**
- `COMMAND` -- expected: SUCCESS_CRITERIA
**Manual checks (if no CLI):**
- WHAT_TO_INSPECT_AND_EXPECTED_STATE

View File

@ -0,0 +1,90 @@
---
name: quick-dev-new-preview
description: 'Unified quick flow - clarify intent, plan, implement, review, present.'
main_config: '{project-root}/_bmad/bmm/config.yaml'
# Related workflows
advanced_elicitation: '{project-root}/_bmad/core/workflows/advanced-elicitation/workflow.xml'
party_mode_exec: '{project-root}/_bmad/core/workflows/party-mode/workflow.md'
# Review building block
adversarial_review_task: '{project-root}/_bmad/core/tasks/review-adversarial-general.xml'
---
# Quick Dev New Preview Workflow
**Goal:** Take a user request from intent through implementation, adversarial review, and PR creation in a single unified flow.
**Your Role:** You are an elite developer. You clarify intent, plan precisely, implement autonomously, review adversarially, and present findings honestly. Minimum ceremony, maximum signal.
## READY FOR DEVELOPMENT STANDARD
A specification is "Ready for Development" when:
- **Actionable**: Every task has a file path and specific action.
- **Logical**: Tasks ordered by dependency.
- **Testable**: All ACs use Given/When/Then.
- **Complete**: No placeholders or TBDs.
## SCOPE STANDARD
A specification should target a **single user-facing goal** within **9001600 tokens**:
- **Single goal**: One cohesive feature, even if it spans multiple layers/files. Multi-goal means >=2 **top-level independent shippable deliverables** — each could be reviewed, tested, and merged as a separate PR without breaking the others. Never count surface verbs, "and" conjunctions, or noun phrases. Never split cross-layer implementation details inside one user goal.
- Split: "add dark mode toggle AND refactor auth to JWT AND build admin dashboard"
- Don't split: "add validation and display errors" / "support drag-and-drop AND paste AND retry"
- **9001600 tokens**: Optimal range for LLM consumption. Below 900 risks ambiguity; above 1600 risks context-rot in implementation agents.
- **Neither limit is a gate.** Both are proposals with user override.
## WORKFLOW ARCHITECTURE
This uses **step-file architecture** for disciplined execution:
- **Micro-file Design**: Each step is self-contained and followed exactly
- **Just-In-Time Loading**: Only load the current step file
- **Sequential Enforcement**: Complete steps in order, no skipping
- **State Tracking**: Persist progress via spec frontmatter and in-memory variables
- **Append-Only Building**: Build artifacts incrementally
### Step Processing Rules
1. **READ COMPLETELY**: Read the entire step file before acting
2. **FOLLOW SEQUENCE**: Execute sections in order
3. **WAIT FOR INPUT**: Halt at checkpoints and wait for human
4. **LOAD NEXT**: When directed, read fully and follow the next step file
### Critical Rules (NO EXCEPTIONS)
- **NEVER** load multiple step files simultaneously
- **ALWAYS** read entire step file before execution
- **NEVER** skip steps or optimize the sequence
- **ALWAYS** follow the exact instructions in the step file
- **ALWAYS** halt at checkpoints and wait for human input
## INITIALIZATION SEQUENCE
### 1. Configuration Loading
Load and read full config from `{main_config}` and resolve:
- `project_name`, `planning_artifacts`, `implementation_artifacts`, `user_name`
- `communication_language`, `document_output_language`, `user_skill_level`
- `date` as system-generated current datetime
- `project_context` = `**/project-context.md` (load if exists)
- CLAUDE.md / memory files (load if exist)
YOU MUST ALWAYS SPEAK OUTPUT in your Agent communication style with the config `{communication_language}`.
### 2. Paths
- `installed_path` = `{project-root}/_bmad/bmm/workflows/bmad-quick-flow/quick-dev-new-preview`
- `templateFile` = `{installed_path}/tech-spec-template.md`
- `wipFile` = `{implementation_artifacts}/tech-spec-wip.md`
### 3. First Step Execution
Read fully and follow: `{installed_path}/steps/step-01-clarify-and-route.md` to begin the workflow.

View File

@ -0,0 +1,294 @@
const path = require('node:path');
const fs = require('fs-extra');
const { BaseIdeSetup } = require('./_base-ide');
const yaml = require('yaml');
const prompts = require('../../../lib/prompts');
const { AgentCommandGenerator } = require('./shared/agent-command-generator');
const { WorkflowCommandGenerator } = require('./shared/workflow-command-generator');
const { TaskToolCommandGenerator } = require('./shared/task-tool-command-generator');
/**
* IBM Bob IDE setup handler
* Creates custom modes in .bob/custom_modes.yaml file
*/
class BobSetup extends BaseIdeSetup {
constructor() {
super('bob', 'IBM Bob');
this.configFile = '.bob/custom_modes.yaml';
this.detectionPaths = ['.bob'];
}
/**
* Setup IBM Bob IDE configuration
* @param {string} projectDir - Project directory
* @param {string} bmadDir - BMAD installation directory
* @param {Object} options - Setup options
*/
async setup(projectDir, bmadDir, options = {}) {
if (!options.silent) await prompts.log.info(`Setting up ${this.name}...`);
// Clean up any old BMAD installation first
await this.cleanup(projectDir, options);
// Load existing config (may contain non-BMAD modes and other settings)
const bobModesPath = path.join(projectDir, this.configFile);
let config = {};
if (await this.pathExists(bobModesPath)) {
const existingContent = await this.readFile(bobModesPath);
try {
config = yaml.parse(existingContent) || {};
} catch {
// If parsing fails, start fresh but warn user
await prompts.log.warn('Warning: Could not parse existing .bob/custom_modes.yaml, starting fresh');
config = {};
}
}
// Ensure customModes array exists
if (!Array.isArray(config.customModes)) {
config.customModes = [];
}
// Generate agent launchers
const agentGen = new AgentCommandGenerator(this.bmadFolderName);
const { artifacts: agentArtifacts } = await agentGen.collectAgentArtifacts(bmadDir, options.selectedModules || []);
// Create mode objects and add to config
let addedCount = 0;
for (const artifact of agentArtifacts) {
const modeObject = await this.createModeObject(artifact, projectDir);
config.customModes.push(modeObject);
addedCount++;
}
// Write .bob/custom_modes.yaml file with proper YAML structure
const finalContent = yaml.stringify(config, { lineWidth: 0 });
const workflowsDir = path.join(projectDir, '.bob', 'workflows');
let workflowCount = 0;
let taskCount = 0;
let toolCount = 0;
try {
await this.writeFile(bobModesPath, finalContent);
// Generate workflow commands
const workflowGenerator = new WorkflowCommandGenerator(this.bmadFolderName);
const { artifacts: workflowArtifacts } = await workflowGenerator.collectWorkflowArtifacts(bmadDir);
// Write to .bob/workflows/ directory
await this.ensureDir(workflowsDir);
// Clear old BMAD workflows before writing new ones
await this.clearBmadWorkflows(workflowsDir);
// Write workflow files
workflowCount = await workflowGenerator.writeDashArtifacts(workflowsDir, workflowArtifacts);
// Generate task and tool commands
const taskToolGen = new TaskToolCommandGenerator(this.bmadFolderName);
const { artifacts: taskToolArtifacts, counts: taskToolCounts } = await taskToolGen.collectTaskToolArtifacts(bmadDir);
// Write task/tool files to workflows directory (same location as workflows)
await taskToolGen.writeDashArtifacts(workflowsDir, taskToolArtifacts);
taskCount = taskToolCounts.tasks || 0;
toolCount = taskToolCounts.tools || 0;
} catch (error) {
// Roll back partial writes to avoid inconsistent state
try {
await fs.remove(bobModesPath);
} catch {
// Ignore cleanup errors
}
await this.clearBmadWorkflows(workflowsDir);
throw new Error(`Failed to write Bob configuration: ${error.message}`);
}
if (!options.silent) {
await prompts.log.success(
`${this.name} configured: ${addedCount} modes, ${workflowCount} workflows, ${taskCount} tasks, ${toolCount} tools → ${this.configFile}`,
);
}
return {
success: true,
modes: addedCount,
workflows: workflowCount,
tasks: taskCount,
tools: toolCount,
};
}
/**
* Create a mode object for an agent
* @param {Object} artifact - Agent artifact
* @param {string} projectDir - Project directory
* @returns {Object} Mode object for YAML serialization
*/
async createModeObject(artifact, projectDir) {
// Extract title and icon from the compiled agent file's <agent> XML tag
// artifact.content is the launcher template which does NOT contain these attributes
let title = this.formatTitle(artifact.name);
let icon = '🤖';
if (artifact.sourcePath && (await this.pathExists(artifact.sourcePath))) {
const agentContent = await this.readFile(artifact.sourcePath);
const titleMatch = agentContent.match(/<agent[^>]*\stitle="([^"]+)"/);
if (titleMatch) title = titleMatch[1];
const iconMatch = agentContent.match(/<agent[^>]*\sicon="([^"]+)"/);
if (iconMatch) icon = iconMatch[1];
}
const whenToUse = `Use for ${title} tasks`;
// Get the activation header from central template (trim to avoid YAML formatting issues)
const activationHeader = (await this.getAgentCommandHeader()).trim();
const roleDefinition = `You are a ${title} specializing in ${title.toLowerCase()} tasks.`;
// Get relative path (fall back to artifact name if sourcePath unavailable)
const relativePath = artifact.sourcePath
? path.relative(projectDir, artifact.sourcePath).replaceAll('\\', '/')
: `${this.bmadFolderName}/agents/${artifact.name}.md`;
// Build mode object (Bob uses same schema as Kilo/Roo)
return {
slug: `bmad-${artifact.module}-${artifact.name}`,
name: `${icon} ${title}`,
roleDefinition: roleDefinition,
whenToUse: whenToUse,
customInstructions: `${activationHeader} Read the full agent definition from ${relativePath}. Start activation to assume this persona. Follow the startup section instructions. Stay in this mode until told to exit.\n`,
groups: ['read', 'edit', 'browser', 'command', 'mcp'],
};
}
/**
* Format name as title
*/
formatTitle(name) {
return name
.split('-')
.map((word) => word.charAt(0).toUpperCase() + word.slice(1))
.join(' ');
}
/**
* Clear old BMAD workflow files from workflows directory
* @param {string} workflowsDir - Workflows directory path
*/
async clearBmadWorkflows(workflowsDir) {
if (!(await this.pathExists(workflowsDir))) return;
const entries = await fs.readdir(workflowsDir);
for (const entry of entries) {
if (entry.startsWith('bmad-') && entry.endsWith('.md')) {
await fs.remove(path.join(workflowsDir, entry));
}
}
}
/**
* Cleanup IBM Bob configuration
*/
async cleanup(projectDir, options = {}) {
const bobModesPath = path.join(projectDir, this.configFile);
if (await this.pathExists(bobModesPath)) {
const content = await this.readFile(bobModesPath);
try {
const config = yaml.parse(content) || {};
if (Array.isArray(config.customModes)) {
const originalCount = config.customModes.length;
// Remove BMAD modes only (keep non-BMAD modes)
config.customModes = config.customModes.filter((mode) => !mode.slug || !mode.slug.startsWith('bmad-'));
const removedCount = originalCount - config.customModes.length;
if (removedCount > 0) {
await this.writeFile(bobModesPath, yaml.stringify(config, { lineWidth: 0 }));
if (!options.silent) await prompts.log.message(`Removed ${removedCount} BMAD modes from .bob/custom_modes.yaml`);
}
}
} catch {
// If parsing fails, leave file as-is
if (!options.silent) await prompts.log.warn('Warning: Could not parse .bob/custom_modes.yaml for cleanup');
}
}
// Clean up workflow files
const workflowsDir = path.join(projectDir, '.bob', 'workflows');
await this.clearBmadWorkflows(workflowsDir);
}
/**
* Install a custom agent launcher for Bob
* @param {string} projectDir - Project directory
* @param {string} agentName - Agent name (e.g., "fred-commit-poet")
* @param {string} agentPath - Path to compiled agent (relative to project root)
* @param {Object} metadata - Agent metadata
* @returns {Object} Installation result
*/
async installCustomAgentLauncher(projectDir, agentName, agentPath, metadata) {
const bobmodesPath = path.join(projectDir, this.configFile);
let config = {};
// Read existing .bob/custom_modes.yaml file
if (await this.pathExists(bobmodesPath)) {
const existingContent = await this.readFile(bobmodesPath);
try {
config = yaml.parse(existingContent) || {};
} catch {
config = {};
}
}
// Ensure customModes array exists
if (!Array.isArray(config.customModes)) {
config.customModes = [];
}
// Create custom agent mode object
const slug = `bmad-custom-${agentName.toLowerCase()}`;
// Check if mode already exists
if (config.customModes.some((mode) => mode.slug === slug)) {
return {
ide: 'bob',
path: this.configFile,
command: agentName,
type: 'custom-agent-launcher',
alreadyExists: true,
};
}
// Add custom mode object
const title = metadata?.title || `BMAD Custom: ${agentName}`;
const icon = metadata?.icon || '🤖';
const activationHeader = (await this.getAgentCommandHeader()).trim();
config.customModes.push({
slug: slug,
name: `${icon} ${title}`,
roleDefinition: `You are a custom BMAD agent "${agentName}". Follow the persona and instructions from the agent file.`,
whenToUse: `Use for custom BMAD agent "${agentName}" tasks`,
customInstructions: `${activationHeader} Read the full agent definition from ${agentPath}. Start activation to assume this persona. Follow the startup section instructions. Stay in this mode until told to exit.\n`,
groups: ['read', 'edit', 'browser', 'command', 'mcp'],
});
// Write .bob/custom_modes.yaml file with proper YAML structure
await this.writeFile(bobmodesPath, yaml.stringify(config, { lineWidth: 0 }));
return {
ide: 'bob',
path: this.configFile,
command: slug,
type: 'custom-agent-launcher',
};
}
}
module.exports = { BobSetup };
// Made with Bob

View File

@ -8,7 +8,7 @@ const prompts = require('../../../lib/prompts');
* Dynamically discovers and loads IDE handlers
*
* Loading strategy:
* 1. Custom installer files (codex.js, github-copilot.js, kilo.js, rovodev.js) - for platforms with unique installation logic
* 1. Custom installer files (bob.js, codex.js, github-copilot.js, kilo.js, rovodev.js) - for platforms with unique installation logic
* 2. Config-driven handlers (from platform-codes.yaml) - for standard IDE installation patterns
*/
class IdeManager {
@ -44,7 +44,7 @@ class IdeManager {
/**
* Dynamically load all IDE handlers
* 1. Load custom installer files first (codex.js, github-copilot.js, kilo.js, rovodev.js)
* 1. Load custom installer files first (bob.js, codex.js, github-copilot.js, kilo.js, rovodev.js)
* 2. Load config-driven handlers from platform-codes.yaml
*/
async loadHandlers() {
@ -61,7 +61,7 @@ class IdeManager {
*/
async loadCustomInstallerFiles() {
const ideDir = __dirname;
const customFiles = ['codex.js', 'github-copilot.js', 'kilo.js', 'rovodev.js'];
const customFiles = ['bob.js', 'codex.js', 'github-copilot.js', 'kilo.js', 'rovodev.js'];
for (const file of customFiles) {
const filePath = path.join(ideDir, file);

View File

@ -32,6 +32,13 @@ platforms:
target_dir: .augment/commands
template_type: default
bob:
name: "IBM Bob"
preferred: false
category: ide
description: "IBM's agentic IDE for AI-powered development"
# No installer config - uses custom bob.js (creates .bob/custom_modes.yaml)
claude-code:
name: "Claude Code"
preferred: true

View File

@ -49,6 +49,12 @@ platforms:
category: cli
description: "AI development tool"
bob:
name: "IBM Bob"
preferred: false
category: ide
description: "IBM's agentic IDE for AI-powered development"
roo:
name: "Roo Cline"
preferred: false