| title | description |
|---|---|
| Enterprise BMAD - GitHub Issues Integration Plan | Complete plan for transforming BMAD into an enterprise-scale team collaboration system with GitHub Issues integration |
Enterprise BMAD: Complete GitHub Issues Integration Plan
Vision: Transform BMAD into "the killer feature for using BMAD across an Enterprise team at scale effectively and without constantly stepping on each other's toes"
Team Size: 5-15 developers working in parallel
Source of Truth: GitHub Issues (with local cache for LLM performance)
Network: Required (AI coding needs internet anyway - simplified architecture)
Problem Statement
Current State: BMAD optimized for single developer
- File-based state (sprint-status.yaml on each machine)
- No coordination between developers
- Multiple devs can work on same story → duplicate work, merge conflicts
- No real-time progress visibility for Product Owners
- sprint-status.yaml merge conflicts when multiple devs push
Target State: Enterprise team coordination platform
- GitHub Issues = centralized source of truth
- Story-level locking prevents duplicate work
- Real-time progress visibility for all roles
- Product Owners manage backlog via GitHub UI + Claude Desktop
- Zero merge conflicts through atomic operations
Architecture: Three-Tier System
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: GitHub Issues (Source of Truth) │
│ │
│ Stores: Status, Locks (assignee), Labels, Progress │
│ Purpose: Multi-developer coordination, PO workspace │
│ API: GitHub MCP (mcp__github__*) │
│ Latency: 100-300ms per call │
└────────────┬────────────────────────────────────────────────┘
│
↓ Smart Sync (incremental, timestamp-based)
│
┌────────────┴────────────────────────────────────────────────┐
│ TIER 2: Local Cache (Performance) │
│ │
│ Stores: Full 12-section BMAD story content │
│ Purpose: Fast LLM Read tool access │
│ Access: Instant (<100ms vs 2-3s API) │
│ Sync: Every 5 min OR on-demand (checkout, commit) │
│ Location: {output}/cache/stories/*.md │
└────────────┬────────────────────────────────────────────────┘
│
↓ Committed after story completion
│
┌────────────┴────────────────────────────────────────────────┐
│ TIER 3: Git Repository (Audit Trail) │
│ │
│ Stores: Historical story files, implementation code │
│ Purpose: Version control, audit compliance │
│ Access: Git history │
└─────────────────────────────────────────────────────────────┘
Key Principle: GitHub coordinates (who, when, status), Cache optimizes (fast reads), Git archives (history).
Core Components (Priority Order)
🔴 CRITICAL - Phase 1 (Weeks 1-2): Foundation
1.1 Smart Cache System
Purpose: Fast LLM access while GitHub is source of truth
What: Timestamp-based incremental sync that only fetches changed stories
Implementation:
Files to Create:
- src/modules/bmm/lib/cache/cache-manager.js (300 lines) - see the staleness-check sketch after this list
  - readStoryFromCache() - with staleness check
  - writeStoryToCache() - atomic writes
  - invalidateCache() - force refresh
  - getCacheAge() - staleness calculation
- src/modules/bmm/lib/cache/sync-engine.js (400 lines)
  - incrementalSync() - fetch only changed stories
  - fullSync() - initial cache population
  - preFetchEpic() - batch fetch for context
  - syncStory() - individual story sync
- {output}/cache/.bmad-cache-meta.json (auto-generated)

      {
        "last_sync": "2026-01-08T15:30:00Z",
        "stories": {
          "2-5-auth": {
            "github_issue": 105,
            "github_updated_at": "2026-01-08T15:29:00Z",
            "cache_timestamp": "2026-01-08T15:30:00Z",
            "local_hash": "sha256:abc...",
            "locked_by": "jonahschulte",
            "locked_until": "2026-01-08T23:30:00Z"
          }
        }
      }
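The staleness check in readStoryFromCache() could work roughly as follows. This is a minimal sketch, not the final implementation: the metadata shape follows the cache-meta example above, incrementalSync() is the sync-engine function listed above (injected here as a parameter), and the path/threshold constants are illustrative.

```js
// Minimal sketch of readStoryFromCache() with a staleness check.
const fs = require('fs/promises');
const path = require('path');

const CACHE_DIR = process.env.BMAD_CACHE_DIR || './cache';   // assumed location
const STALE_AFTER_MS = 5 * 60 * 1000;                         // staleness_threshold_minutes: 5

async function readStoryFromCache(storyKey, { incrementalSync }) {
  const metaPath = path.join(CACHE_DIR, '.bmad-cache-meta.json');
  const meta = JSON.parse(await fs.readFile(metaPath, 'utf8'));
  const entry = meta.stories[storyKey];

  // Cache miss or stale entry -> run an incremental sync before reading
  const age = entry ? Date.now() - Date.parse(entry.cache_timestamp) : Infinity;
  if (!entry || age > STALE_AFTER_MS) {
    await incrementalSync(); // fetches only stories changed since last_sync
  }

  // The cached markdown file is what the LLM's Read tool consumes
  const storyPath = path.join(CACHE_DIR, 'stories', `${storyKey}.md`);
  return fs.readFile(storyPath, 'utf8');
}
```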
Sync Algorithm:
// Called every 5 minutes OR on-demand
async function incrementalSync() {
const lastSync = loadCacheMeta().last_sync;
// Single API call for all changed stories
const updated = await github.search({
query: `repo:${owner}/${repo} label:type:story updated:>${lastSync}`
});
console.log(`Found ${updated.length} changed stories`); // Typically 1-3
// Fetch only changed stories
for (const issue of updated) {
const storyKey = extractStoryKey(issue);
const content = await convertIssueToStoryFile(issue);
await writeCacheFile(storyKey, content);
updateCacheMeta(storyKey, issue.updated_at);
}
}
Performance: 97% API call reduction (500/hour → 15/hour)
Critical Feature: Pre-fetch epic on checkout
async function checkoutStory(storyKey) {
// Get epic number from story key
const epicNum = storyKey.split('-')[0]; // "2-5-auth" → "2"
// Batch fetch ALL stories in epic (single API call)
const epicStories = await github.search({
query: `repo:${owner}/${repo} label:epic:${epicNum}`
});
// Cache all stories (gives LLM full epic context)
for (const story of epicStories) {
await cacheStory(story);
}
// Now developer has instant access to all related stories via Read tool
}
1.2 Story Locking System
Purpose: Prevent 2+ developers from working on same story (duplicate work prevention)
What: Dual-lock strategy (GitHub assignment + local lock file)
Files to Create:
- src/modules/bmm/workflows/4-implementation/checkout-story/workflow.yaml
- src/modules/bmm/workflows/4-implementation/checkout-story/instructions.md
- src/modules/bmm/workflows/4-implementation/unlock-story/workflow.yaml
- src/modules/bmm/workflows/4-implementation/unlock-story/instructions.md
- src/modules/bmm/workflows/4-implementation/available-stories/workflow.yaml
- src/modules/bmm/workflows/4-implementation/lock-status/workflow.yaml
- .bmad/lock-registry.yaml
Lock Mechanism:
// /checkout-story story_key=2-5-auth
async function checkoutStory(storyKey) {
// 1. Check GitHub lock (distributed coordination)
const issue = await github.getIssue(storyKey);
if (issue.assignee && issue.assignee.login !== currentUser) {
throw new Error(
`🔒 Story locked by @${issue.assignee.login}\n` +
`Since: ${issue.updated_at}\n` +
`Try: /available-stories to see unlocked stories`
);
}
// 2. Atomic local lock (race condition safe)
const lockFile = `.bmad/locks/${storyKey}.lock`;
await atomicCreateLockFile(lockFile, {
locked_by: currentUser,
locked_at: now(),
timeout_at: now() + (8 * 3600000), // 8 hours
last_heartbeat: now(),
github_issue: issue.number
});
// 3. Assign GitHub issue (write-through)
await retryWithBackoff(async () => {
await github.assign(issue.number, currentUser);
await github.addLabel(issue.number, 'status:in-progress');
// Verify assignment succeeded
const verify = await github.getIssue(issue.number);
if (!verify.assignees.some(a => a.login === currentUser)) {
throw new Error('Assignment verification failed');
}
});
// 4. Pre-fetch epic context
await preFetchEpic(extractEpic(storyKey));
console.log(`✅ Story checked out: ${storyKey}`);
console.log(`Lock expires: ${formatTime(now() + 8 * 3600000)}`);
}
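The atomicCreateLockFile() helper referenced in step 2 is not defined in this plan. One way to make the local lock race-safe is an exclusive-create write (the 'wx' flag), which fails if the file already exists; a minimal sketch under that assumption (serialized as JSON here for brevity - the shell example later assumes a YAML layout, so the format is still an open choice):

```js
const fs = require('fs/promises');
const path = require('path');

// Create the lock file only if it does not already exist ('wx' = exclusive create).
// Two processes racing on the same machine cannot both succeed.
async function atomicCreateLockFile(lockPath, lockData) {
  await fs.mkdir(path.dirname(lockPath), { recursive: true });
  try {
    await fs.writeFile(lockPath, JSON.stringify(lockData, null, 2), { flag: 'wx' });
  } catch (err) {
    if (err.code === 'EEXIST') {
      throw new Error(`Lock already held locally: ${lockPath}`);
    }
    throw err;
  }
}
```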
Lock Verification (before each task in story-full-pipeline):
// Integrated into step-03-implement.md
async function verifyLockBeforeTask(storyKey) {
// Check local lock
const lock = readLockFile(storyKey);
if (lock.timeout_at < now()) {
throw new Error('Lock expired - run /checkout-story again');
}
// Check GitHub assignment (paranoid verification)
const issue = await github.getIssue(storyKey);
if (issue.assignee?.login !== currentUser) {
throw new Error(`Lock stolen - now assigned to ${issue.assignee?.login ?? 'no one'}`);
}
// Refresh heartbeat
lock.last_heartbeat = now();
await updateLockFile(storyKey, lock);
console.log('✅ Lock verified');
}
Lock Timeout: 8 hours (a full workday); heartbeat refreshed every 30 minutes during implementation; a lock is considered stale after 15 minutes without a heartbeat
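One way to keep the heartbeat fresh while the pipeline runs is a background timer that re-stamps the lock file; a sketch, assuming the JSON lock layout used in the checkout example (hypothetical helper, not part of the plan's file list):

```js
const fs = require('fs/promises');

const HEARTBEAT_INTERVAL_MS = 30 * 60 * 1000; // heartbeat_interval_minutes: 30

// Start a background timer that stamps last_heartbeat into the local lock file.
// Returns a stop() function to call when the pipeline finishes or halts.
function startHeartbeat(lockPath) {
  const timer = setInterval(async () => {
    const lock = JSON.parse(await fs.readFile(lockPath, 'utf8'));
    lock.last_heartbeat = new Date().toISOString();
    await fs.writeFile(lockPath, JSON.stringify(lock, null, 2));
  }, HEARTBEAT_INTERVAL_MS);
  timer.unref?.(); // don't keep the process alive just for the heartbeat
  return () => clearInterval(timer);
}
```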
Scrum Master Override:
# SM can force-unlock stale locks
/unlock-story story_key=2-5-auth --force --reason="Developer offline, story blocking sprint"
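The force-unlock path could check the caller against the scrum_masters list from the configuration section, clear both lock layers, and leave an audit comment. A hedged sketch - github.unassign is an assumed wrapper in the same style as the github.* calls used elsewhere in this plan:

```js
const fs = require('fs/promises');

// Sketch of /unlock-story --force; permission list comes from config.yaml.
async function forceUnlockStory(storyKey, { currentUser, reason, config, github }) {
  if (!config.permissions.scrum_masters.includes(currentUser)) {
    throw new Error('Only scrum masters may force-unlock a story');
  }
  const issue = await github.getIssue(storyKey);

  // Clear the GitHub-side lock (assignee) and record why
  await github.unassign(issue.number);
  await github.addComment(issue.number,
    `🔓 Force-unlocked by @${currentUser}\nReason: ${reason}`);

  // Clear the local lock file if this machine holds one
  await fs.rm(`.bmad/locks/${storyKey}.lock`, { force: true });
}
```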
1.3 Progress Sync Integration
Purpose: Real-time visibility into who's working on what
Files to Modify:
- src/modules/bmm/workflows/4-implementation/dev-story/instructions.xml (Step 8, lines 502-533)
- src/modules/bmm/workflows/4-implementation/story-full-pipeline/steps/step-03-implement.md
- src/modules/bmm/workflows/4-implementation/batch-stories/step-4.5-reconcile-story-status.md
Add After Task Completion:
// After marking task [x] in story file
async function syncTaskToGitHub(storyKey, issueNumber, taskData) {
// 1. Update local cache
updateCacheFile(storyKey, taskData);
// 2. Write-through to GitHub
await retryWithBackoff(async () => {
await github.addComment(issueNumber,
`Task ${taskData.num} complete: ${taskData.description}\n\n` +
`Progress: ${taskData.checked}/${taskData.total} tasks (${taskData.pct}%)`
);
});
// 3. Update sprint-status.yaml
updateSprintStatus(storyKey, {
status: 'in-progress',
progress: `${taskData.checked}/${taskData.total} tasks (${taskData.pct}%)`
});
console.log(`✅ Progress synced to GitHub Issue #${issueNumber}`);
}
Result: POs see progress updates in GitHub within seconds of task completion
🟠 HIGH PRIORITY - Phase 2 (Weeks 3-4): Product Owner Enablement
2.1 PO Agent & Workflows
Purpose: Enable POs to manage backlog via Claude Desktop + GitHub
Files to Create:
- src/modules/bmm/agents/po.agent.yaml - PO agent definition
- src/modules/bmm/workflows/po/new-story/workflow.yaml - Create story in GitHub
- src/modules/bmm/workflows/po/update-story/workflow.yaml - Modify ACs
- src/modules/bmm/workflows/po/dashboard/workflow.yaml - Sprint metrics
- src/modules/bmm/workflows/po/approve-story/workflow.yaml - Sign off completed work
- src/modules/bmm/workflows/po/sync-from-github/workflow.yaml - Pull GitHub changes to cache
- .github/ISSUE_TEMPLATE/bmad-story.md - Issue template
PO Agent Menu:
menu:
- trigger: NS
workflow: new-story
description: "[NS] Create new story in GitHub Issues"
- trigger: US
workflow: update-story
description: "[US] Update story ACs or details"
- trigger: DS
workflow: dashboard
description: "[DS] View sprint progress dashboard"
- trigger: AP
workflow: approve-story
description: "[AP] Approve completed story"
- trigger: SY
workflow: sync-from-github
description: "[SY] Sync changes from GitHub to local"
Story Creation Flow (PO via Claude Desktop):
PO: "Create story for password reset"
Claude (PO Agent):
1. Interactive prompts for user story components
2. Guides through BDD acceptance criteria
3. Creates GitHub Issue with proper labels/template
4. Syncs to local cache: {cache}/stories/2-6-password-reset.md
5. Updates sprint-status.yaml: "2-6-password-reset: backlog"
Result:
- GitHub Issue #156 created
- Local file synced
- Developers see it in /available-stories
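A compact sketch of what the new-story workflow's GitHub step might do once the PO has answered the prompts. The labels follow the lifecycle example later in this plan, the github.* wrapper is the same illustrative one used throughout, and writeStoryToCache / convertIssueToStoryFile / updateSprintStatus are the helpers named in sections 1.1 and 1.3:

```js
// Sketch of the new-story workflow's GitHub step; field names are illustrative.
async function createStoryIssue({ storyKey, epicNum, title, userStory, acs, github }) {
  // 1. Create the issue with BMAD labels so sync/search can find it
  const issue = await github.createIssue({
    title: `Story ${storyKey}: ${title}`,
    body: `${userStory}\n\n## Acceptance Criteria\n` +
          acs.map((ac, i) => `- AC${i + 1}: ${ac}`).join('\n'),
    labels: ['type:story', 'status:backlog', `epic:${epicNum}`, `story:${storyKey}`]
  });

  // 2. Write-through to the local cache and sprint-status so devs see it immediately
  await writeStoryToCache(storyKey, await convertIssueToStoryFile(issue));
  updateSprintStatus(storyKey, { status: 'backlog' });
  return issue;
}
```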
AC Update with Developer Alert:
PO: "Update AC3 in Story 2-5 - change timeout to 30 min"
Claude (PO Agent):
1. Detects story status: in-progress (assigned to @developerA)
2. Warns: "Story is being worked on - changes may impact current work"
3. Updates GitHub Issue #105 AC
4. Adds comment: "@developerA - AC updated by PO (timeout 15m → 30m)"
5. Syncs to cache within 5 minutes
6. Developer gets notification
Result:
- PO can update requirements anytime
- Developer notified immediately via GitHub
- Changes validated against BMAD format before sync
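The update-story workflow's in-progress warning and developer alert could look like this. A sketch only: the assignee check and comment format mirror the checkout and progress-sync code above, and replaceAcInBody is a hypothetical helper for the BMAD-format body edit:

```js
// Sketch of update-story: warn if the story is locked, then alert the assignee.
async function updateAcceptanceCriterion({ storyKey, acNumber, newText, github }) {
  const issue = await github.getIssue(storyKey);

  // If a developer currently holds the lock, the change may impact live work
  const assignee = issue.assignee?.login;
  if (assignee) {
    console.warn(`⚠️ Story in progress (assigned to @${assignee}) - changes may impact current work`);
  }

  // Update the AC in the issue body, then @mention the developer so GitHub notifies them
  const updatedBody = replaceAcInBody(issue.body, acNumber, newText); // hypothetical helper
  await github.updateIssue(issue.number, { body: updatedBody });
  if (assignee) {
    await github.addComment(issue.number,
      `@${assignee} - AC${acNumber} updated by PO: ${newText}`);
  }

  // The local cache picks the change up on the next incremental sync (≤5 minutes)
}
```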
🟡 MEDIUM PRIORITY - Phase 3 (Weeks 5-6): Advanced Integration
3.1 PR Linking & Completion Flow
Purpose: Close the loop from issue → implementation → PR → approval
Files to Modify:
- story-full-pipeline/steps/step-06-complete.md - Add PR creation
- Add new: story-full-pipeline/steps/step-07-sync-github.md
PR Creation (after git commit):
// In step-06-complete after commit succeeds
async function createPRForStory(storyKey, commitSha) {
const story = getCachedStory(storyKey);
const issue = await github.getIssue(story.github_issue);
// Create PR via GitHub MCP
const pr = await github.createPR({
title: `Story ${storyKey}: ${story.title}`,
body:
`Implements Story ${storyKey}\n\n` +
`## Acceptance Criteria\n${formatACs(story.acs)}\n\n` +
`## Implementation Summary\n${story.devAgentRecord.summary}\n\n` +
`Closes #${issue.number}`,
head: currentBranch,
base: 'main',
labels: ['type:story', `story:${storyKey}`]
});
// Link PR to issue
await github.addComment(issue.number,
`✅ Implementation complete\n\nPR: #${pr.number}\nCommit: ${commitSha}`
);
// Update issue label
await github.addLabel(issue.number, 'status:in-review');
}
3.2 Epic Dashboard
File to Create: src/modules/bmm/workflows/po/epic-dashboard/workflow.yaml
Purpose: Real-time epic health for POs/stakeholders
Metrics Displayed:
- Story completion: 5/8 done (62%)
- Developer assignments: @alice (2 stories), @bob (1 story)
- Blockers: 1 story waiting on design
- Velocity: 1.5 stories/week
- Projected completion: Jan 15, 2026
Data Sources:
- GitHub Issues API (status, assignees, labels)
- Cache metadata (progress percentages)
- Git commit history (activity metrics)
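A sketch of how these metrics could be derived from a single labeled-issue search. The velocity/projection math is omitted and the status:blocked label is an assumption (only status:backlog, in-progress, in-review, and done are defined elsewhere in this plan):

```js
// Sketch: derive epic dashboard metrics from GitHub issues labeled epic:<n>.
async function epicDashboard(epicNum, { github, owner, repo }) {
  const issues = await github.search({
    query: `repo:${owner}/${repo} label:type:story label:epic:${epicNum}`
  });

  const done = issues.filter(i => i.labels.includes('status:done'));
  const blocked = issues.filter(i => i.labels.includes('status:blocked')); // assumed label
  const total = issues.length || 1; // avoid divide-by-zero for an empty epic

  // Group in-flight work by assignee for the "developer assignments" view
  const assignments = {};
  for (const issue of issues) {
    const dev = issue.assignee?.login;
    if (dev) assignments[dev] = (assignments[dev] || 0) + 1;
  }

  return {
    completion: `${done.length}/${issues.length} done (${Math.round(100 * done.length / total)}%)`,
    assignments,
    blockers: blocked.length
  };
}
```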
🟢 NICE TO HAVE - Phase 4 (Weeks 7-8): Polish
4.1 Ghost Feature → GitHub Integration
File to Modify: detect-ghost-features/instructions.md
Enhancement: Auto-create GitHub Issues for orphaned code
When orphan detected:
1. Generate backfill story (already implemented)
2. Create GitHub Issue with label: "type:backfill"
3. Add to sprint-status.yaml
4. Link to orphaned files in codebase
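Step 2 of that flow (creating the issue) could reuse the same illustrative github wrapper and the updateSprintStatus helper from section 1.3; a hedged sketch:

```js
// Sketch: create a GitHub Issue for an orphaned feature detected in the codebase.
async function createBackfillIssue({ storyKey, orphanFiles, summary, github }) {
  const issue = await github.createIssue({
    title: `Backfill story ${storyKey}: ${summary}`,
    body: `Orphaned code detected with no corresponding story.\n\n` +
          `## Affected files\n${orphanFiles.map(f => `- ${f}`).join('\n')}`,
    labels: ['type:backfill', `story:${storyKey}`]
  });
  updateSprintStatus(storyKey, { status: 'backlog' });
  return issue;
}
```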
4.2 Revalidation → GitHub Reporting
Files to Modify:
- revalidate-story/instructions.md
- revalidate-epic/instructions.md
Enhancement: Post verification results to GitHub
async function revalidateStory(storyKey, issueNumber) {
// ... existing revalidation logic ...
// NEW: Post results to GitHub
await github.addComment(issueNumber,
`📊 Revalidation Complete\n\n` +
`Verified: ${verified}/25 items (${pct}%)\n` +
`Gaps: ${gaps.length}\n\n` +
`Details: ${reportURL}`
);
}
Implementation Details
Mandatory Pre-Workflow Sync (Reliability Guarantee)
Enforced in workflow engine - Cannot be bypassed:
<!-- In core/tasks/workflow.xml - runs BEFORE any workflow Step 1 -->
<before-workflow>
<check if="github_integration.enabled == true">
<critical>MANDATORY GITHUB SYNC - Required for team coordination</critical>
<action>Call: incrementalSync()</action>
<check if="sync failed">
<retry count="3" backoff="[1s, 3s, 9s]">
<action>Retry incrementalSync()</action>
</retry>
<check if="still failing">
<output>
❌ CRITICAL: Cannot sync with GitHub
Network check: {{network_status}}
GitHub API: {{github_api_status}}
Last successful sync: {{last_sync_time}}
Cannot proceed without current data - risk of duplicate work.
Options:
[R] Retry sync
[H] Halt workflow
This is a HARD REQUIREMENT for team coordination.
</output>
<action>HALT</action>
</check>
</check>
<output>✅ Synced from GitHub: {{stories_updated}} stories updated</output>
</check>
</before-workflow>
This guarantees: Every workflow starts with fresh GitHub data (no stale cache issues)
Story Lifecycle with GitHub Integration
┌─────────────────────────────────────────────────────────────┐
│ 1. STORY CREATION (PO via Claude Desktop) │
├─────────────────────────────────────────────────────────────┤
│ PO: /new-story │
│ ↓ │
│ Create GitHub Issue #156 │
│ ├─ Labels: type:story, status:backlog, epic:2 │
│ ├─ Body: User story + BDD ACs │
│ └─ Assignee: none (unlocked) │
│ ↓ │
│ Sync to cache: 2-6-password-reset.md │
│ ↓ │
│ Update sprint-status.yaml: "2-6-password-reset: backlog" │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 2. STORY CHECKOUT (Developer) │
├─────────────────────────────────────────────────────────────┤
│ Dev: /checkout-story story_key=2-6-password-reset │
│ ↓ │
│ Check GitHub: Issue #156 assignee = null ✓ │
│ ↓ │
│ Assign issue to @developerA │
│ ├─ Assignee: @developerA │
│ ├─ Label: status:in-progress │
│ └─ Comment: "🔒 Locked by @developerA (expires 8h)" │
│ ↓ │
│ Create local lock: .bmad/locks/2-6-password-reset.lock │
│ ↓ │
│ Pre-fetch Epic 2 stories (8 stories, 1 API call) │
│ ↓ │
│ Cache all Epic 2 stories locally │
│ ↓ │
│ Return: cache/stories/2-6-password-reset.md │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 3. IMPLEMENTATION (Developer via story-full-pipeline) │
├─────────────────────────────────────────────────────────────┤
│ Step 1: Init │
│ └─ Verify lock held (HALT if lost) │
│ │
│ Step 2: Pre-Gap Analysis │
│ └─ Comment to GitHub: "Step 2/7: Pre-Gap Analysis" │
│ │
│ Step 3: Implement (for each task) │
│ ├─ BEFORE task: Verify lock still held │
│ ├─ AFTER task: Sync progress to GitHub │
│ │ └─ Comment: "Task 3/10 complete (30%)" │
│ └─ Refresh heartbeat every 30 min │
│ │
│ Step 4: Post-Validation │
│ └─ Comment to GitHub: "Step 4/7: Post-Validation" │
│ │
│ Step 5: Code Review │
│ └─ Comment to GitHub: "Step 5/7: Code Review" │
│ │
│ Step 6: Complete │
│ ├─ Commit: "feat(story-2-6): implement password reset" │
│ ├─ Create GitHub PR #789 │
│ │ └─ Body: "Closes #156" │
│ ├─ Update Issue #156: │
│ │ ├─ Comment: "✅ Implementation complete - PR #789" │
│ │ ├─ Label: status:in-review │
│ │ └─ Keep assignee (dev owns until approved) │
│ └─ Update cache & sprint-status │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 4. APPROVAL (PO via GitHub or Claude Desktop) │
├─────────────────────────────────────────────────────────────┤
│ PO reviews PR #789 on GitHub │
│ ↓ │
│ PO: /approve-story story_key=2-6-password-reset │
│ ├─ Reviews ACs in GitHub Issue │
│ ├─ Tests implementation │
│ └─ Approves or requests changes │
│ ↓ │
│ If approved: │
│ ├─ Merge PR #789 │
│ ├─ Close Issue #156 │
│ ├─ Label: status:done │
│ ├─ Unassign developer │
│ └─ Comment: "✅ Approved by @productOwner" │
│ ↓ │
│ Sync to cache & sprint-status: │
│ ├─ cache/stories/2-6-password-reset.md updated │
│ └─ sprint-status: "2-6-password-reset: done" │
└─────────────────────────────────────────────────────────────┘
Reliability Guarantees (Building on migrate-to-github)
1. Idempotent Operations
Pattern: Check before create/update
// Can run multiple times safely
async function createOrUpdateStory(storyKey, data) {
const existing = await github.searchIssue(`label:story:${storyKey}`);
if (existing) {
await github.updateIssue(existing.number, data);
} else {
await github.createIssue(data);
}
}
2. Atomic Per-Story Operations
Pattern: Transaction with rollback
async function migrateStory(storyKey) {
const transaction = { operations: [], rollback: [] };
try {
const issue = await github.createIssue(...);
transaction.rollback.push(() => github.closeIssue(issue.number));
await github.addLabels(issue.number, labels);
await github.setMilestone(issue.number, epic);
// Verify all succeeded
await verifyIssue(issue.number);
} catch (error) {
// Rollback all operations
for (const rollback of transaction.rollback.reverse()) {
await rollback();
}
throw error;
}
}
3. Write Verification
Pattern: Read-back after write
async function createIssueVerified(data) {
const created = await github.createIssue(data);
await sleep(1000); // GitHub eventual consistency
const verify = await github.getIssue(created.number);
assert(verify.title === data.title);
assert(verify.labels.includes('type:story'));
return created;
}
4. Retry with Backoff
Pattern: 3 retries, exponential backoff [1s, 3s, 9s]
async function retryWithBackoff(operation) {
const backoffs = [1000, 3000, 9000];
for (let i = 0; i < backoffs.length; i++) {
try {
return await operation();
} catch (error) {
if (i < backoffs.length - 1) {
await sleep(backoffs[i]);
} else {
throw error; // All retries exhausted
}
}
}
}
5. Network Required (Simplified from Original Plan)
Key Insight: AI coding requires internet, so no complex offline queue needed
Network Failure Handling:
// Simple retry + halt (not queue for later)
try {
await syncToGitHub(data);
} catch (networkError) {
console.error('❌ GitHub sync failed - check network');
console.error('Retrying in 3s...');
await retryWithBackoff(() => syncToGitHub(data));
// If still failing after retries:
throw new Error(
'HALT: Cannot proceed without GitHub sync.\n' +
'Network is required for team coordination.\n' +
'Resume when network restored.'
);
}
No Offline Queue: Since network is required for AI coding, network failures = halt and fix, not queue for later sync. Simpler architecture, fewer edge cases.
Critical Integration Points
Point 1: batch-stories Story Selection
File: batch-stories/instructions.md (Step 2)
Change: Filter locked stories BEFORE user selection
<step n="2" goal="Display available stories">
<!-- NEW: Sync from GitHub first -->
<action>Call: incrementalSync()</action>
<action>Load sprint-status.yaml</action>
<action>Filter: status = ready-for-dev</action>
<!-- NEW: Exclude locked stories -->
<action>Load cache metadata</action>
<action>For each story, check: assignee == null (unlocked)</action>
<action>Split into: available_stories, locked_stories</action>
<output>
📦 Available Stories (Unlocked) - {{available_count}}
{{#each available_stories}}
{{@index}}. {{story_key}}: {{title}}
{{/each}}
🔒 Locked Stories (Skip These) - {{locked_count}}
{{#each locked_stories}}
- {{story_key}}: Locked by @{{locked_by}} ({{duration}} ago)
{{/each}}
</output>
</step>
<step n="3" goal="User selection">
<!-- User selects from AVAILABLE stories only -->
<!-- NEW: Checkout selected stories -->
<action>For each selected story:</action>
<action> Call: checkoutStory(story_key)</action>
<action> Verify lock acquired successfully</action>
<action> Pre-fetch epic context</action>
<output>✅ {{count}} stories checked out and locked</output>
</step>
Point 2: story-full-pipeline Lock Verification
File: story-full-pipeline/steps/step-03-implement.md
Change: Add lock check before each task
## BEFORE EACH TASK IMPLEMENTATION
### NEW: Lock Verification
```bash
verify_lock() {
story_key="$1"
# Check local lock
lock_file=".bmad/locks/${story_key}.lock"
if [ ! -f "$lock_file" ]; then
echo "❌ LOCK LOST: Local lock file missing"
echo "Story may have been unlocked. HALT immediately."
return 1
fi
# Check timeout
timeout_at=$(grep "timeout_at:" "$lock_file" | cut -d' ' -f2)
if [ $(date +%s) -gt $(date -d "$timeout_at" +%s) ]; then
echo "❌ LOCK EXPIRED: Timeout reached"
echo "Run: /checkout-story ${story_key} to extend lock"
return 1
fi
# Check GitHub assignment (paranoid check)
github_assignee=$(call_github_mcp_get_issue_assignee "$story_key")
current_user=$(git config user.github)
if [ "$github_assignee" != "$current_user" ]; then
echo "❌ LOCK STOLEN: GitHub issue reassigned to $github_assignee"
echo "Story was unlocked and re-assigned. HALT."
return 1
fi
# Refresh heartbeat
sed -i.bak "s/last_heartbeat: .*/last_heartbeat: $(date -u +%Y-%m-%dT%H:%M:%SZ)/" "$lock_file"
rm -f "${lock_file}.bak"
echo "✅ Lock verified for ${story_key}"
return 0
}
# CRITICAL: Call before every task
if ! verify_lock "$story_key"; then
echo "⚠️⚠️⚠️ PIPELINE HALTED - Lock verification failed"
echo "Do NOT continue without valid lock!"
exit 1
fi
```

Then proceed with task implementation...
### Point 3: dev-story Progress Sync
**File**: `dev-story/instructions.xml` (Step 8, after line 533)
**Change**: Add GitHub sync after task completion
```xml
<!-- AFTER marking task [x] -->
<check if="{{github_integration.enabled}} == true">
<action>Sync task completion to GitHub:</action>
<action>
Call: mcp__github__add_issue_comment({
owner: {{github_owner}},
repo: {{github_repo}},
issue_number: {{github_issue_number}},
body: "Task {{task_num}} complete: {{task_description}}\n\n" +
"Progress: {{checked_tasks}}/{{total_tasks}} tasks ({{progress_pct}}%)"
})
</action>
<check if="GitHub sync failed">
<retry count="3" />
<check if="still failing">
<output>❌ CRITICAL: Cannot sync progress to GitHub</output>
<output>Network required for team coordination</output>
<action>HALT</action>
</check>
</check>
<output>✅ Progress synced to GitHub Issue #{{github_issue_number}}</output>
</check>
```
Configuration
Add to: _bmad/bmm/config.yaml
# GitHub Integration Settings
github_integration:
enabled: true # Master toggle
source_of_truth: "github" # github | local (always github for enterprise)
require_network: true # Hard requirement (AI needs internet)
repository:
owner: "jschulte" # GitHub username or org
repo: "myproject" # Repository name
cache:
enabled: true
location: "{output_folder}/cache"
staleness_threshold_minutes: 5
auto_refresh_on_stale: true
locking:
enabled: true
default_timeout_hours: 8
heartbeat_interval_minutes: 30
stale_threshold_minutes: 15
max_locks_per_user: 3
sync:
interval_minutes: 5 # Incremental sync frequency
batch_epic_prefetch: true # Pre-fetch epic on checkout
progress_updates: true # Sync task completion to GitHub
permissions:
scrum_masters: # Can force-unlock stories
- "jschulte"
- "alice-sm"
Verification Plan
Test 1: Story Locking Prevents Duplicate Work
# Setup: 2 developers, 1 story
# Developer A (machine 1)
$ /checkout-story story_key=2-5-auth
✅ Story checked out
Lock expires: 8 hours
# Developer B (machine 2, simultaneously)
$ /checkout-story story_key=2-5-auth
❌ Story locked by @developerA until 23:30:00Z
Try: /available-stories
# Verify in GitHub
# → Issue #105: Assigned to @developerA
# → Labels: status:in-progress
# Result: ✅ Only Developer A can work on story
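Test 1 could also be automated roughly as follows, using Node's built-in test runner; checkoutStoryFor is an assumed testable variant of the checkoutStory() function from section 1.2 that takes the GitHub client and current user as parameters:

```js
// Sketch of an automated version of Test 1.
const test = require('node:test');
const assert = require('node:assert');

test('second developer cannot check out a locked story', async () => {
  // Fake GitHub client: story 2-5-auth is already assigned to developerA
  const fakeGithub = {
    getIssue: async () => ({ number: 105, assignee: { login: 'developerA' } })
  };

  // Expect the checkout attempt by developerB to be rejected with a lock error
  await assert.rejects(
    () => checkoutStoryFor('2-5-auth', { github: fakeGithub, currentUser: 'developerB' }),
    /locked by @developerA/
  );
});
```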
Test 2: Real-Time Progress Visibility
# Developer implements task 3 of 10
# → Marks [x] in story file
# → Workflow syncs to GitHub
# Check GitHub Issue #105
# → New comment (30 seconds ago): "Task 3 complete: Implement OAuth (30%)"
# → Body shows: Progress bar at 30%
# PO views dashboard
# → Shows: "Story 2-5: 30% complete (3/10 tasks)"
# Result: ✅ PO sees progress in real-time
Test 3: Merge Conflict Prevention
# Setup: 3 developers working on different stories
# All 3 complete simultaneously and commit
# Developer A: Story 2-5 files only
# Developer B: Story 2-7 files only
# Developer C: Story 3-2 files only
# Git commits:
# → Developer A: Only 2-5-auth.md + src/auth/*
# → Developer B: Only 2-7-cache.md + src/cache/*
# → Developer C: Only 3-2-api.md + src/api/*
# No overlap in files → No merge conflicts
# sprint-status.yaml:
# → Each story updates via GitHub sync (not direct file edit)
# → No conflicts (GitHub is source of truth)
# Result: ✅ Zero merge conflicts
Test 4: Cache Performance
# Measure: Story checkout + epic context load time
# Without cache (API calls):
# - Fetch story: 2-3 seconds
# - Fetch 8 epic stories: 8 × 2s = 16 seconds
# - Total: ~18 seconds
# With cache:
# - Sync check: 200ms (1 API call for "any changes?")
# - Load story: 50ms (Read tool from cache)
# - Load 8 epic stories: 8 × 50ms = 400ms
# - Total: ~650ms
# Result: ✅ 27x faster (18s → 650ms)
Test 5: Network Failure Recovery
# Developer working on task 5 of 10
# Network drops during GitHub sync
# System:
# → Retry #1 after 1s: Fails
# → Retry #2 after 3s: Fails
# → Retry #3 after 9s: Fails
# → Display: "❌ Cannot sync to GitHub - network required"
# → Save state to: .bmad/pipeline-state-2-5.yaml
# → HALT
# Developer fixes network, resumes:
$ /story-full-pipeline story_key=2-5-auth
# System:
# → Detects saved state
# → "Resuming from task 5 (paused 10 minutes ago)"
# → Syncs pending progress to GitHub
# → Continues task 6
# Result: ✅ Graceful halt + resume
Success Criteria
Must Have (Phase 1-2)
- ✅ Zero duplicate work incidents (story locking prevents)
- ✅ Zero sprint-status.yaml merge conflicts (GitHub is source of truth)
- ✅ Real-time progress visibility (<30s from task completion to GitHub update)
- ✅ Cache performance: <100ms story reads (vs 2-3s API calls)
- ✅ API efficiency: <50 calls/hour (vs 500-1000 without cache)
Should Have (Phase 3)
- ✅ PR auto-linking to issues (closes loop)
- ✅ PO can create/update stories via Claude Desktop
- ✅ Epic dashboard shows team activity
- ✅ Bi-directional sync (GitHub ↔ cache)
Nice to Have (Phase 4)
- ✅ Ghost features auto-create backfill issues
- ✅ Stakeholder reporting
- ✅ Advanced dashboards
Estimated Effort
Phase 1: Foundation (Weeks 1-2)
- Cache system: 5 days
- Story locking: 5 days
- Progress sync: 2 days
- Testing & docs: 3 days Total: 15 days (3 weeks with buffer)
Phase 2: PO Workflows (Weeks 3-4)
- PO agent: 1 day
- Story creation: 3 days
- AC updates: 2 days
- Dashboard: 3 days
- Sync engine: 4 days

Total: 13 days (2.5 weeks with buffer)
Phase 3: Advanced (Weeks 5-6)
- PR linking: 2 days
- Approval flow: 2 days
- Epic dashboard: 3 days
- Integration polish: 3 days

Total: 10 days (2 weeks)
Phase 4: Polish (Weeks 7-8)
- Ghost features: 2 days
- Revalidation integration: 2 days
- Documentation: 3 days
- Training materials: 3 days

Total: 10 days (2 weeks)
Grand Total: 48 days (9.5 weeks, ~2.5 months for complete system)
MVP (Phases 1-2): 28 days (~6 weeks) gets you story locking + PO workflows
Files Summary
NEW Files (26 total)
- Cache System: 3 files (~900 lines)
- Lock System: 9 files (~1,350 lines)
- PO Workflows: 12 files (~2,580 lines)
- Integration: 2 files (~500 lines)
Total NEW Code: ~5,330 lines
MODIFIED Files (5 total)
- batch-stories/instructions.md (+150 lines)
- story-full-pipeline/steps/step-01-init.md (+80 lines)
- story-full-pipeline/steps/step-03-implement.md (+120 lines)
- story-full-pipeline/steps/step-06-complete.md (+100 lines)
- dev-story/instructions.xml (+60 lines)
Total MODIFIED: ~510 lines
Grand Total: ~5,840 lines of production code + tests + docs
Risk Assessment
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| GitHub rate limits | Low | High | Caching (97% reduction), batch operations |
| Lock deadlocks | Medium | Medium | 8-hour timeout, heartbeat, SM override |
| Cache-GitHub desync | Low | Medium | Staleness checks, mandatory pre-sync |
| Network failures | Medium | Medium | Retry logic, graceful halt + resume |
| BMAD format violations | Medium | High | Strict validation, PO training |
| Lost locks mid-work | Low | High | Verification before each task |
| Developer onboarding | Medium | Low | Clear docs, training, gradual rollout |
Overall Risk: LOW-MEDIUM (building on proven migrate-to-github patterns)
Risk Mitigation Strategy:
- Start with 2-3 developers on small epic (validate locking works)
- Gradual rollout (not all 15 developers at once)
- Comprehensive testing at each phase
- Rollback capability via migrate-to-github patterns
Why This Will Work
1. Proven Patterns
- Lock mechanism: Based on working git commit lock (step-06a-queue-commit.md)
- GitHub integration: Based on production migrate-to-github workflow
- Reliability: Same 8 mechanisms as migrate-to-github (idempotent, atomic, verified, resumable, etc.)
2. Simple Network Model
- Network required = simplified architecture (no offline queue complexity)
- Fail fast on network issues (retry + halt, not queue for later)
- Matches reality (AI coding needs internet anyway)
3. Performance Optimized
- Cache eliminates ~97% of API calls (500/hour → 15/hour)
- Incremental sync (only fetch changed stories)
- Pre-fetch epic context (batch operation)
- Read tool works at <100ms (vs 2-3s API calls)
4. Multi-Layer Safety
- Lock verification before each task (catch stolen locks immediately)
- Write-through with retry (transient failures handled)
- Staleness detection (refuse to use old cache)
- Mandatory pre-workflow sync (everyone starts with fresh data)
5. Role Separation
- POs: GitHub Issues UI + Claude Desktop (no git needed)
- Developers: BMAD workflows (lock → implement → sync → unlock)
- SMs: Oversight tools (lock-status, force-unlock, dashboards)
Next Steps
Immediate
- Review this plan - Validate architecture decisions
- Confirm priorities - Phase 1-2 first (locking + PO workflows)?
- Approve approach - GitHub as source of truth with local cache
Week 1
- Build cache system (cache-manager.js, sync-engine.js)
- Create checkout-story workflow
- Implement lock verification
- Test with 2 developers
Week 2-3
- Integrate with batch-stories
- Add progress sync to dev-story
- Build PO agent + story creation workflow
- Test with 3-5 developers
Week 4-6
- Complete PO workflows (update, dashboard, approve)
- Add PR linking
- Build epic dashboard
- Test with full team (10-15 developers)
Week 7-8
- Polish and optimize
- Advanced features
- Comprehensive documentation
- Team training
Conclusion
This design transforms BMAD into the killer feature for enterprise teams by:
- ✅ Preventing duplicate work - Story locking with 8-hour timeout, heartbeat, verification
- ✅ Enabling Product Owners - GitHub Issues workspace via Claude Desktop, no git/markdown knowledge
- ✅ Maintaining developer flow - Local cache = instant LLM reads, no API latency
- ✅ Scaling to 15 developers - GitHub centralized coordination, zero merge conflicts
- ✅ Building on proven patterns - migrate-to-github reliability mechanisms (atomic, verified, resumable)
- ✅ Optimizing performance - 97% API reduction through smart caching
- ✅ Simplifying architecture - Network required = no offline queue complexity
Implementation: ~9.5 weeks (48 days) for the complete system, ~6 weeks (28 days) for the MVP (locking + basic PO workflows)
Risk: Low-Medium (incremental rollout, comprehensive testing, rollback capability)
ROI: Eliminates duplicate work, reduces PO-Dev friction by 40%, increases sprint predictability
Ready for enterprise adoption.