---
title: Enterprise BMAD - GitHub Issues Integration Plan
description: Complete plan for transforming BMAD into an enterprise-scale team collaboration system with GitHub Issues integration
---

# Enterprise BMAD: Complete GitHub Issues Integration Plan

**Vision**: Transform BMAD into "the killer feature for using BMAD across an Enterprise team at scale effectively and without constantly stepping on each other's toes"

**Team Size**: 5-15 developers working in parallel

**Source of Truth**: GitHub Issues (with local cache for LLM performance)

**Network**: Required (AI coding needs internet anyway - simplified architecture)

---

## Problem Statement

**Current State**: BMAD is optimized for a single developer

- File-based state (sprint-status.yaml on each machine)
- No coordination between developers
- Multiple devs can work on the same story → duplicate work, merge conflicts
- No real-time progress visibility for Product Owners
- sprint-status.yaml merge conflicts when multiple devs push

**Target State**: Enterprise team coordination platform

- GitHub Issues = centralized source of truth
- Story-level locking prevents duplicate work
- Real-time progress visibility for all roles
- Product Owners manage backlog via GitHub UI + Claude Desktop
- Zero sprint-status.yaml merge conflicts, because status changes go through GitHub rather than a shared file

---

## Architecture: Three-Tier System

```
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: GitHub Issues (Source of Truth)                     │
│                                                              │
│ Stores: Status, Locks (assignee), Labels, Progress          │
│ Purpose: Multi-developer coordination, PO workspace          │
│ API: GitHub MCP (mcp__github__*)                            │
│ Latency: 100-300ms per call                                 │
└────────────┬────────────────────────────────────────────────┘
             │
             ↓ Smart Sync (incremental, timestamp-based)
             │
┌────────────┴────────────────────────────────────────────────┐
│ TIER 2: Local Cache (Performance)                           │
│                                                              │
│ Stores: Full 12-section BMAD story content                  │
│ Purpose: Fast LLM Read tool access                          │
│ Access: Instant (<100ms vs 2-3s API)                       │
│ Sync: Every 5 min OR on-demand (checkout, commit)          │
│ Location: {output}/cache/stories/*.md                       │
└────────────┬────────────────────────────────────────────────┘
             │
             ↓ Committed after story completion
             │
┌────────────┴────────────────────────────────────────────────┐
│ TIER 3: Git Repository (Audit Trail)                        │
│                                                              │
│ Stores: Historical story files, implementation code          │
│ Purpose: Version control, audit compliance                   │
│ Access: Git history                                          │
└─────────────────────────────────────────────────────────────┘
```

**Key Principle**: GitHub coordinates (who, when, status), Cache optimizes (fast reads), Git archives (history).

---

## Core Components (Priority Order)

### 🔴 CRITICAL - Phase 1 (Weeks 1-2): Foundation

#### 1.1 Smart Cache System

**Purpose**: Fast LLM access while GitHub is source of truth

**What**: Timestamp-based incremental sync that only fetches changed stories

**Implementation**:

**Files to Create**:

1. `src/modules/bmm/lib/cache/cache-manager.js` (300 lines)
   - readStoryFromCache() - With staleness check
   - writeStoryToCache() - Atomic writes
   - invalidateCache() - Force refresh
   - getCacheAge() - Staleness calculation

2. `src/modules/bmm/lib/cache/sync-engine.js` (400 lines)
   - incrementalSync() - Fetch only changed stories
   - fullSync() - Initial cache population
   - preFetchEpic() - Batch fetch for context
   - syncStory() - Individual story sync

3. `{output}/cache/.bmad-cache-meta.json` (auto-generated)

   ```json
   {
     "last_sync": "2026-01-08T15:30:00Z",
     "stories": {
       "2-5-auth": {
         "github_issue": 105,
         "github_updated_at": "2026-01-08T15:29:00Z",
         "cache_timestamp": "2026-01-08T15:30:00Z",
         "local_hash": "sha256:abc...",
         "locked_by": "jonahschulte",
         "locked_until": "2026-01-08T23:30:00Z"
       }
     }
   }
   ```
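
The staleness check behind `readStoryFromCache()` can be sketched as a pure function over one metadata entry. This is a minimal illustration, not the planned implementation: `cacheAgeMs` and `isStale` are hypothetical names, the 5-minute threshold is assumed from the sync interval, and the field names follow the sample `.bmad-cache-meta.json` above.

```javascript
// Hypothetical helpers; field names follow the sample .bmad-cache-meta.json.
const STALENESS_THRESHOLD_MS = 5 * 60 * 1000; // assumed: matches the 5-minute sync interval

// Age of a cached story in milliseconds, relative to `nowMs`.
function cacheAgeMs(entry, nowMs = Date.now()) {
  return nowMs - Date.parse(entry.cache_timestamp);
}

// A cached story is stale when it is older than the threshold, or when
// GitHub reports an update newer than the cached copy.
function isStale(entry, nowMs = Date.now(), thresholdMs = STALENESS_THRESHOLD_MS) {
  const tooOld = cacheAgeMs(entry, nowMs) > thresholdMs;
  const changedUpstream =
    Date.parse(entry.github_updated_at) > Date.parse(entry.cache_timestamp);
  return tooOld || changedUpstream;
}

const entry = {
  github_issue: 105,
  github_updated_at: '2026-01-08T15:29:00Z',
  cache_timestamp: '2026-01-08T15:30:00Z',
};
const twoMinutesLater = Date.parse('2026-01-08T15:32:00Z');
const tenMinutesLater = Date.parse('2026-01-08T15:40:00Z');
console.log(isStale(entry, twoMinutesLater)); // false
console.log(isStale(entry, tenMinutesLater)); // true
```

Passing `nowMs` explicitly keeps the check deterministic, which makes it easy to unit test.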

**Sync Algorithm**:

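```javascript
// Called every 5 minutes OR on-demand
async function incrementalSync() {
  const lastSync = loadCacheMeta().last_sync;
  // Capture the start time before querying so updates made during the sync are not missed
  const syncStartedAt = new Date().toISOString();

  // Single API call for all changed stories
  const updated = await github.search({
    query: `repo:${owner}/${repo} label:type:story updated:>${lastSync}`
  });

  console.log(`Found ${updated.length} changed stories`); // Typically 1-3

  // Fetch only changed stories
  for (const issue of updated) {
    const storyKey = extractStoryKey(issue);
    const content = await convertIssueToStoryFile(issue);
    await writeCacheFile(storyKey, content);
    updateCacheMeta(storyKey, issue.updated_at);
  }

  // Advance the watermark; otherwise every sync re-fetches the same stories.
  // saveCacheMeta() is an assumed counterpart to loadCacheMeta().
  saveCacheMeta({ ...loadCacheMeta(), last_sync: syncStartedAt });
}
```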
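
`extractStoryKey(issue)` is not defined in this plan. One possible sketch, assuming each story issue carries a `story:<key>` label (the scheme used by the PR labels and the idempotency check elsewhere in this document) and that labels arrive as objects with a `name` field, as in the GitHub REST API:

```javascript
// Assumption: story issues carry a label such as "story:2-5-auth".
function extractStoryKey(issue) {
  for (const label of issue.labels ?? []) {
    // Labels may be objects ({ name }) or plain strings.
    const name = typeof label === 'string' ? label : label.name;
    if (name && name.startsWith('story:')) {
      return name.slice('story:'.length);
    }
  }
  throw new Error(`Issue #${issue.number} has no story:<key> label`);
}

const sampleIssue = {
  number: 105,
  labels: [{ name: 'type:story' }, { name: 'epic:2' }, { name: 'story:2-5-auth' }],
};
console.log(extractStoryKey(sampleIssue)); // 2-5-auth
```

Throwing on a missing label makes a mislabeled issue visible during sync instead of silently writing a cache file under the wrong name.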

**Performance**: 97% API call reduction (500/hour → 15/hour)

**Critical Feature**: Pre-fetch epic on checkout

```javascript
async function checkoutStory(storyKey) {
  // Get epic number from story key
  const epicNum = storyKey.split('-')[0]; // "2-5-auth" → "2"

  // Batch fetch ALL stories in epic (single API call)
  const epicStories = await github.search({
    query: `repo:${owner}/${repo} label:epic:${epicNum}`
  });

  // Cache all stories (gives LLM full epic context)
  for (const story of epicStories) {
    await cacheStory(story);
  }

  // Now developer has instant access to all related stories via Read tool
}
```
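
The epic lookup relies on the story key layout. A small sketch of that parsing, assuming keys always look like `<epic>-<story>-<slug>` (for example `2-5-auth`); `parseStoryKey` and `epicQuery` are hypothetical helper names, and the `acme/shop` repository is a placeholder:

```javascript
// Hypothetical helper: split "<epic>-<story>-<slug>" into its parts.
function parseStoryKey(storyKey) {
  const match = /^(\d+)-(\d+)-(.+)$/.exec(storyKey);
  if (!match) {
    throw new Error(`Unrecognized story key: ${storyKey}`);
  }
  const [, epic, story, slug] = match;
  return { epic: Number(epic), story: Number(story), slug };
}

// Search query for every story in the same epic (label scheme from this plan).
function epicQuery(owner, repo, storyKey) {
  const { epic } = parseStoryKey(storyKey);
  return `repo:${owner}/${repo} label:epic:${epic}`;
}

console.log(parseStoryKey('2-6-password-reset'));
console.log(epicQuery('acme', 'shop', '2-5-auth')); // repo:acme/shop label:epic:2
```

Validating the key up front avoids building a query such as `label:epic:undefined` from a malformed key.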

---

#### 1.2 Story Locking System

**Purpose**: Prevent two or more developers from working on the same story

**What**: Dual-lock strategy (GitHub assignment + local lock file)

**Files to Create**:

1. `src/modules/bmm/workflows/4-implementation/checkout-story/workflow.yaml`
2. `src/modules/bmm/workflows/4-implementation/checkout-story/instructions.md`
3. `src/modules/bmm/workflows/4-implementation/unlock-story/workflow.yaml`
4. `src/modules/bmm/workflows/4-implementation/unlock-story/instructions.md`
5. `src/modules/bmm/workflows/4-implementation/available-stories/workflow.yaml`
6. `src/modules/bmm/workflows/4-implementation/lock-status/workflow.yaml`
7. `.bmad/lock-registry.yaml`

**Lock Mechanism**:

```javascript
// /checkout-story story_key=2-5-auth

const LOCK_TIMEOUT_MS = 8 * 3600000; // 8 hours

async function checkoutStory(storyKey) {
  // 1. Check GitHub lock (distributed coordination)
  const issue = await github.getIssue(storyKey);
  // assignee is a user object, so compare its login with the current user
  if (issue.assignee && issue.assignee.login !== currentUser) {
    throw new Error(
      `🔒 Story locked by @${issue.assignee.login}\n` +
      `Since: ${issue.updated_at}\n` +
      `Try: /available-stories to see unlocked stories`
    );
  }

  // 2. Atomic local lock (race condition safe)
  const lockFile = `.bmad/locks/${storyKey}.lock`;
  const expiresAt = now() + LOCK_TIMEOUT_MS;
  await atomicCreateLockFile(lockFile, {
    locked_by: currentUser,
    locked_at: now(),
    timeout_at: expiresAt,
    last_heartbeat: now(),
    github_issue: issue.number
  });

  // 3. Assign GitHub issue (write-through)
  await retryWithBackoff(async () => {
    await github.assign(issue.number, currentUser);
    await github.addLabel(issue.number, 'status:in-progress');

    // Verify assignment succeeded (assignees are user objects, so match on login)
    const verify = await github.getIssue(issue.number);
    if (!verify.assignees.some((assignee) => assignee.login === currentUser)) {
      throw new Error('Assignment verification failed');
    }
  });

  // 4. Pre-fetch epic context
  await preFetchEpic(extractEpic(storyKey));

  console.log(`✅ Story checked out: ${storyKey}`);
  console.log(`Lock expires: ${formatTime(expiresAt)}`);
}
```
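
`atomicCreateLockFile` is named above but not specified. A minimal sketch, assuming Node.js and a JSON lock payload (the on-disk format is an assumption here): opening the file with the `wx` flag fails with `EEXIST` when the file already exists, so only one caller on a machine can create the lock.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Sketch of atomicCreateLockFile: returns true if this caller created the
// lock, false if a lock file was already present.
function atomicCreateLockFile(lockFile, payload) {
  fs.mkdirSync(path.dirname(lockFile), { recursive: true });
  try {
    // 'wx' = create for writing, fail if the path exists
    fs.writeFileSync(lockFile, JSON.stringify(payload, null, 2), { flag: 'wx' });
    return true;
  } catch (error) {
    if (error.code === 'EEXIST') {
      return false; // someone else holds the lock
    }
    throw error;
  }
}

// Usage in a throwaway directory
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), 'bmad-locks-'));
const lockFile = path.join(lockDir, '2-5-auth.lock');
const first = atomicCreateLockFile(lockFile, { locked_by: 'developerA' });
const second = atomicCreateLockFile(lockFile, { locked_by: 'developerB' });
console.log(first, second); // true false
```

A local lock file only guards against concurrent processes on the same machine; coordination between developers still depends on the GitHub assignment.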

**Lock Verification** (before each task in super-dev-pipeline):

```javascript
// Integrated into step-03-implement.md
async function verifyLockBeforeTask(storyKey) {
  // Check local lock (assumes readLockFile returns null when the file is missing)
  const lock = readLockFile(storyKey);
  if (!lock) {
    throw new Error('Lock missing - run /checkout-story again');
  }
  if (lock.timeout_at < now()) {
    throw new Error('Lock expired - run /checkout-story again');
  }

  // Check GitHub assignment (paranoid verification)
  const issue = await github.getIssue(storyKey);
  if (issue.assignee?.login !== currentUser) {
    // assignee is null when the issue was unassigned rather than reassigned
    const holder = issue.assignee?.login ?? 'nobody';
    throw new Error(`Lock stolen - now assigned to ${holder}`);
  }

  // Refresh heartbeat
  lock.last_heartbeat = now();
  await updateLockFile(storyKey, lock);

  console.log('✅ Lock verified');
}
```

**Lock Timeout**: 8 hours (full workday), with a heartbeat every 30 min during implementation; a lock counts as stale after 15 min without a heartbeat
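
How these timers interact can be sketched as a small classifier. This is an illustration under stated assumptions: `lockState` is a hypothetical name, timestamps are epoch milliseconds, and the stale threshold is a parameter. The default below is twice the heartbeat interval rather than the 15 minutes listed above, because a threshold shorter than the heartbeat interval would flag a healthy lock as stale between heartbeats.

```javascript
// Assumed values; the stale threshold must exceed the heartbeat interval,
// otherwise every healthy lock looks stale between heartbeats.
const HEARTBEAT_INTERVAL_MS = 30 * 60 * 1000;
const STALE_AFTER_MS = 2 * HEARTBEAT_INTERVAL_MS; // illustrative choice, not from the plan

// Classify a lock as 'active', 'stale' (no recent heartbeat), or 'expired'.
function lockState(lock, nowMs = Date.now(), staleAfterMs = STALE_AFTER_MS) {
  if (nowMs >= lock.timeout_at) return 'expired';
  if (nowMs - lock.last_heartbeat > staleAfterMs) return 'stale';
  return 'active';
}

const HOUR = 3600000;
const lock = { locked_at: 0, timeout_at: 8 * HOUR, last_heartbeat: 1 * HOUR };
console.log(lockState(lock, 1.5 * HOUR)); // active
console.log(lockState(lock, 3 * HOUR));   // stale
console.log(lockState(lock, 9 * HOUR));   // expired
```

A stale lock is a candidate for the Scrum Master override; an expired lock can simply be re-acquired.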

**Scrum Master Override**:

```bash
# SM can force-unlock stale locks
/unlock-story story_key=2-5-auth --force --reason="Developer offline, story blocking sprint"
```

---

#### 1.3 Progress Sync Integration

**Purpose**: Real-time visibility into who's working on what

**Files to Modify**:

1. `src/modules/bmm/workflows/4-implementation/dev-story/instructions.xml` (Step 8, lines 502-533)
2. `src/modules/bmm/workflows/4-implementation/super-dev-pipeline/steps/step-03-implement.md`
3. `src/modules/bmm/workflows/4-implementation/batch-super-dev/step-4.5-reconcile-story-status.md`

**Add After Task Completion**:

```javascript
// After marking task [x] in story file
async function syncTaskToGitHub(storyKey, taskData) {
  // Issue number comes from the cache metadata written during sync
  const issueNumber = loadCacheMeta().stories[storyKey].github_issue;

  // 1. Update local cache
  updateCacheFile(storyKey, taskData);

  // 2. Write-through to GitHub
  await retryWithBackoff(async () => {
    await github.addComment(issueNumber,
      `Task ${taskData.num} complete: ${taskData.description}\n\n` +
      `Progress: ${taskData.checked}/${taskData.total} tasks (${taskData.pct}%)`
    );
  });

  // 3. Update sprint-status.yaml
  updateSprintStatus(storyKey, {
    status: 'in-progress',
    progress: `${taskData.checked}/${taskData.total} tasks (${taskData.pct}%)`
  });

  console.log(`✅ Progress synced to GitHub Issue #${issueNumber}`);
}
```

**Result**: POs see progress updates in GitHub within seconds of task completion
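
The `checked`, `total`, and `pct` fields of `taskData` have to be derived from the story file. A sketch of one way to do that, assuming tasks are Markdown checkboxes (`- [ ]` and `- [x]`); `computeProgress` and `progressComment` are hypothetical helpers:

```javascript
// Count markdown task checkboxes ("- [ ]" / "- [x]") in a story file.
function computeProgress(storyMarkdown) {
  const boxes = storyMarkdown.match(/^\s*[-*] \[[ xX]\]/gm) ?? [];
  const checked = boxes.filter((box) => /\[[xX]\]/.test(box)).length;
  const total = boxes.length;
  const pct = total === 0 ? 0 : Math.round((checked / total) * 100);
  return { checked, total, pct };
}

// Build the comment body used in the progress sync.
function progressComment(taskNum, description, progress) {
  return (
    `Task ${taskNum} complete: ${description}\n\n` +
    `Progress: ${progress.checked}/${progress.total} tasks (${progress.pct}%)`
  );
}

const story = [
  '- [x] Add login form',
  '- [x] Hash passwords',
  '- [ ] Add rate limiting',
].join('\n');
const progress = computeProgress(story);
console.log(progress); // { checked: 2, total: 3, pct: 67 }
console.log(progressComment(2, 'Hash passwords', progress));
```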

---

### 🟠 HIGH PRIORITY - Phase 2 (Weeks 3-4): Product Owner Enablement

#### 2.1 PO Agent & Workflows

**Purpose**: Enable POs to manage backlog via Claude Desktop + GitHub

**Files to Create**:

1. `src/modules/bmm/agents/po.agent.yaml` - PO agent definition
2. `src/modules/bmm/workflows/po/new-story/workflow.yaml` - Create story in GitHub
3. `src/modules/bmm/workflows/po/update-story/workflow.yaml` - Modify ACs
4. `src/modules/bmm/workflows/po/dashboard/workflow.yaml` - Sprint metrics
5. `src/modules/bmm/workflows/po/approve-story/workflow.yaml` - Sign-off completed work
6. `src/modules/bmm/workflows/po/sync-from-github/workflow.yaml` - Pull GitHub changes to cache
7. `.github/ISSUE_TEMPLATE/bmad-story.md` - Issue template

**PO Agent Menu**:

```yaml
menu:
  - trigger: NS
    workflow: new-story
    description: "[NS] Create new story in GitHub Issues"

  - trigger: US
    workflow: update-story
    description: "[US] Update story ACs or details"

  - trigger: DS
    workflow: dashboard
    description: "[DS] View sprint progress dashboard"

  - trigger: AP
    workflow: approve-story
    description: "[AP] Approve completed story"

  - trigger: SY
    workflow: sync-from-github
    description: "[SY] Sync changes from GitHub to local"
```

**Story Creation Flow** (PO via Claude Desktop):

```
PO: "Create story for password reset"

Claude (PO Agent):
1. Interactive prompts for user story components
2. Guides through BDD acceptance criteria
3. Creates GitHub Issue with proper labels/template
4. Syncs to local cache: {cache}/stories/2-6-password-reset.md
5. Updates sprint-status.yaml: "2-6-password-reset: backlog"

Result:
- GitHub Issue #156 created
- Local file synced
- Developers see it in /available-stories
```
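
Step 3 of this flow creates an issue with the labels that the locking and sync logic rely on. A sketch of the payload, assuming the label names used elsewhere in this plan (`type:story`, `status:backlog`, `epic:<n>`, `story:<key>`); `buildStoryIssue` is a hypothetical helper and the body layout is illustrative, not the BMAD story template:

```javascript
// Hypothetical payload builder; label names follow the lifecycle in this plan.
function buildStoryIssue({ storyKey, title, userStory, acceptanceCriteria }) {
  const epic = storyKey.split('-')[0]; // "2-6-password-reset" → "2"
  const acs = acceptanceCriteria.map((ac, i) => `${i + 1}. ${ac}`).join('\n');
  return {
    title: `Story ${storyKey}: ${title}`,
    body: `## User Story\n${userStory}\n\n## Acceptance Criteria\n${acs}`,
    labels: ['type:story', 'status:backlog', `epic:${epic}`, `story:${storyKey}`],
    assignees: [], // no assignee means the story is unlocked
  };
}

const payload = buildStoryIssue({
  storyKey: '2-6-password-reset',
  title: 'Password reset',
  userStory: 'As a user, I want to reset my password so that I can regain access.',
  acceptanceCriteria: ['Given a valid email, when I request a reset, then a link is sent'],
});
console.log(payload.labels);
```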

**AC Update with Developer Alert**:

```
PO: "Update AC3 in Story 2-5 - change timeout to 30 min"

Claude (PO Agent):
1. Detects story status: in-progress (assigned to @developerA)
2. Warns: "Story is being worked on - changes may impact current work"
3. Updates GitHub Issue #105 AC
4. Adds comment: "@developerA - AC updated by PO (timeout 15m → 30m)"
5. Syncs to cache within 5 minutes
6. Developer gets notification

Result:
- PO can update requirements anytime
- Developer notified immediately via GitHub
- Changes validated against BMAD format before sync
```

---

### 🟡 MEDIUM PRIORITY - Phase 3 (Weeks 5-6): Advanced Integration

#### 3.1 PR Linking & Completion Flow

**Purpose**: Close the loop from issue → implementation → PR → approval

**Files to Modify**:

1. `super-dev-pipeline/steps/step-06-complete.md` - Add PR creation
2. Add new: `super-dev-pipeline/steps/step-07-sync-github.md`

**PR Creation** (after git commit):

```javascript
// In step-06-complete after commit succeeds
async function createPRForStory(storyKey, commitSha) {
  const story = getCachedStory(storyKey);
  const issue = await github.getIssue(story.github_issue);

  // Create PR via GitHub MCP
  const pr = await github.createPR({
    title: `Story ${storyKey}: ${story.title}`,
    body:
      `Implements Story ${storyKey}\n\n` +
      `## Acceptance Criteria\n${formatACs(story.acs)}\n\n` +
      `## Implementation Summary\n${story.devAgentRecord.summary}\n\n` +
      `Closes #${issue.number}`,
    head: currentBranch,
    base: 'main',
    labels: ['type:story', `story:${storyKey}`]
  });

  // Link PR to issue
  await github.addComment(issue.number,
    `✅ Implementation complete\n\nPR: #${pr.number}\nCommit: ${commitSha}`
  );

  // Update issue label
  await github.addLabel(issue.number, 'status:in-review');
}
```
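
If `github.addLabel` only appends, the issue would carry both `status:in-progress` and `status:in-review` after this step. A sketch of computing the replacement label set instead; `withStatus` is a hypothetical helper that works on plain label names:

```javascript
// Return the label set after moving an issue to a new status:
// drop every existing "status:*" label, then add the new one.
function withStatus(labelNames, newStatus) {
  const kept = labelNames.filter((name) => !name.startsWith('status:'));
  return [...kept, `status:${newStatus}`];
}

const before = ['type:story', 'epic:2', 'status:in-progress'];
console.log(withStatus(before, 'in-review'));
// [ 'type:story', 'epic:2', 'status:in-review' ]
```

Writing the whole label set in one update keeps exactly one status label on the issue, which dashboards and filters can then rely on.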

#### 3.2 Epic Dashboard

**File to Create**: `src/modules/bmm/workflows/po/epic-dashboard/workflow.yaml`

**Purpose**: Real-time epic health for POs/stakeholders

**Metrics Displayed**:

- Story completion: 5/8 done (62%)
- Developer assignments: @alice (2 stories), @bob (1 story)
- Blockers: 1 story waiting on design
- Velocity: 1.5 stories/week
- Projected completion: Jan 15, 2026

**Data Sources**:

- GitHub Issues API (status, assignees, labels)
- Cache metadata (progress percentages)
- Git commit history (activity metrics)
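
The completion and assignment metrics can be derived from the issue list alone. A sketch, assuming each issue exposes `state` (`'open'` or `'closed'`) and an `assignee` object or `null`; `epicSummary` is a hypothetical name, and the percentage is rounded down to match the 5/8 → 62% example above:

```javascript
// Summarize epic health from a list of issues.
function epicSummary(issues) {
  const done = issues.filter((issue) => issue.state === 'closed').length;
  const total = issues.length;
  const assignments = {};
  for (const issue of issues) {
    if (issue.state === 'open' && issue.assignee) {
      const login = issue.assignee.login;
      assignments[login] = (assignments[login] ?? 0) + 1;
    }
  }
  return {
    done,
    total,
    pct: total === 0 ? 0 : Math.floor((done / total) * 100),
    assignments,
  };
}

const issues = [
  ...Array.from({ length: 5 }, () => ({ state: 'closed', assignee: null })),
  { state: 'open', assignee: { login: 'alice' } },
  { state: 'open', assignee: { login: 'alice' } },
  { state: 'open', assignee: { login: 'bob' } },
];
console.log(epicSummary(issues));
// { done: 5, total: 8, pct: 62, assignments: { alice: 2, bob: 1 } }
```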

---

### 🟢 NICE TO HAVE - Phase 4 (Weeks 7-8): Polish

#### 4.1 Ghost Feature → GitHub Integration

**File to Modify**: `detect-ghost-features/instructions.md`

**Enhancement**: Auto-create GitHub Issues for orphaned code

```markdown
When orphan detected:
1. Generate backfill story (already implemented)
2. Create GitHub Issue with label: "type:backfill"
3. Add to sprint-status.yaml
4. Link to orphaned files in codebase
```

#### 4.2 Revalidation → GitHub Reporting

**Files to Modify**:

- `revalidate-story/instructions.md`
- `revalidate-epic/instructions.md`

**Enhancement**: Post verification results to GitHub

```javascript
async function revalidateStory(storyKey) {
  // ... existing revalidation logic ...

  // NEW: Post results to GitHub
  await github.addComment(issue,
    `📊 Revalidation Complete\n\n` +
    `Verified: ${verified}/25 items (${pct}%)\n` +
    `Gaps: ${gaps.length}\n\n` +
    `Details: ${reportURL}`
  );
}
```

---

## Implementation Details

### Mandatory Pre-Workflow Sync (Reliability Guarantee)

**Enforced in workflow engine** - Cannot be bypassed:

```xml
<!-- In core/tasks/workflow.xml - runs BEFORE any workflow Step 1 -->
<before-workflow>
  <check if="github_integration.enabled == true">
    <critical>MANDATORY GITHUB SYNC - Required for team coordination</critical>

    <action>Call: incrementalSync()</action>

    <check if="sync failed">
      <retry count="3" backoff="[1s, 3s, 9s]">
        <action>Retry incrementalSync()</action>
      </retry>

      <check if="still failing">
        <output>
❌ CRITICAL: Cannot sync with GitHub

Network check: {{network_status}}
GitHub API: {{github_api_status}}
Last successful sync: {{last_sync_time}}

Cannot proceed without current data - risk of duplicate work.

Options:
[R] Retry sync
[H] Halt workflow

This is a HARD REQUIREMENT for team coordination.
        </output>
        <action>HALT</action>
      </check>
    </check>

    <output>✅ Synced from GitHub: {{stories_updated}} stories updated</output>
  </check>
</before-workflow>
```

**This guarantees**: Every workflow starts with fresh GitHub data (no stale cache issues)

---

### Story Lifecycle with GitHub Integration

```
┌─────────────────────────────────────────────────────────────┐
│ 1. STORY CREATION (PO via Claude Desktop)                   │
├─────────────────────────────────────────────────────────────┤
│ PO: /new-story                                              │
│  ↓                                                           │
│ Create GitHub Issue #156                                    │
│  ├─ Labels: type:story, status:backlog, epic:2             │
│  ├─ Body: User story + BDD ACs                             │
│  └─ Assignee: none (unlocked)                              │
│  ↓                                                           │
│ Sync to cache: 2-6-password-reset.md                       │
│  ↓                                                           │
│ Update sprint-status.yaml: "2-6-password-reset: backlog"   │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ 2. STORY CHECKOUT (Developer)                               │
├─────────────────────────────────────────────────────────────┤
│ Dev: /checkout-story story_key=2-6-password-reset          │
│  ↓                                                           │
│ Check GitHub: Issue #156 assignee = null ✓                 │
│  ↓                                                           │
│ Assign issue to @developerA                                │
│  ├─ Assignee: @developerA                                  │
│  ├─ Label: status:in-progress                              │
│  └─ Comment: "🔒 Locked by @developerA (expires 8h)"      │
│  ↓                                                           │
│ Create local lock: .bmad/locks/2-6-password-reset.lock     │
│  ↓                                                           │
│ Pre-fetch Epic 2 stories (8 stories, 1 API call)           │
│  ↓                                                           │
│ Cache all Epic 2 stories locally                           │
│  ↓                                                           │
│ Return: cache/stories/2-6-password-reset.md                │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ 3. IMPLEMENTATION (Developer via super-dev-pipeline)         │
├─────────────────────────────────────────────────────────────┤
│ Step 1: Init                                                │
│  └─ Verify lock held (HALT if lost)                        │
│                                                             │
│ Step 2: Pre-Gap Analysis                                   │
│  └─ Comment to GitHub: "Step 2/7: Pre-Gap Analysis"       │
│                                                             │
│ Step 3: Implement (for each task)                          │
│  ├─ BEFORE task: Verify lock still held                   │
│  ├─ AFTER task: Sync progress to GitHub                   │
│  │   └─ Comment: "Task 3/10 complete (30%)"              │
│  └─ Refresh heartbeat every 30 min                        │
│                                                             │
│ Step 4: Post-Validation                                    │
│  └─ Comment to GitHub: "Step 4/7: Post-Validation"        │
│                                                             │
│ Step 5: Code Review                                        │
│  └─ Comment to GitHub: "Step 5/7: Code Review"            │
│                                                             │
│ Step 6: Complete                                           │
│  ├─ Commit: "feat(story-2-6): implement password reset"   │
│  ├─ Create GitHub PR #789                                 │
│  │   └─ Body: "Closes #156"                               │
│  ├─ Update Issue #156:                                    │
│  │   ├─ Comment: "✅ Implementation complete - PR #789"   │
│  │   ├─ Label: status:in-review                           │
│  │   └─ Keep assignee (dev owns until approved)           │
│  └─ Update cache & sprint-status                          │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ 4. APPROVAL (PO via GitHub or Claude Desktop)               │
├─────────────────────────────────────────────────────────────┤
│ PO reviews PR #789 on GitHub                                │
│  ↓                                                           │
│ PO: /approve-story story_key=2-6-password-reset            │
│  ├─ Reviews ACs in GitHub Issue                            │
│  ├─ Tests implementation                                   │
│  └─ Approves or requests changes                           │
│  ↓                                                           │
│ If approved:                                                │
│  ├─ Merge PR #789                                          │
│  ├─ Close Issue #156                                       │
│  ├─ Label: status:done                                     │
│  ├─ Unassign developer                                     │
│  └─ Comment: "✅ Approved by @productOwner"               │
│  ↓                                                           │
│ Sync to cache & sprint-status:                             │
│  ├─ cache/stories/2-6-password-reset.md updated            │
│  └─ sprint-status: "2-6-password-reset: done"             │
└─────────────────────────────────────────────────────────────┘
```

---

## Reliability Guarantees (Building on migrate-to-github)

### 1. Idempotent Operations

**Pattern**: Check before create/update

```javascript
// Can run multiple times safely
async function createOrUpdateStory(storyKey, data) {
  const existing = await github.searchIssue(`label:story:${storyKey}`);

  if (existing) {
    await github.updateIssue(existing.number, data);
  } else {
    await github.createIssue(data);
  }
}
```

### 2. Atomic Per-Story Operations

**Pattern**: Transaction with rollback

```javascript
// issueData, labels, and epic are passed in by the caller
async function migrateStory(storyKey, { issueData, labels, epic }) {
  const rollbacks = []; // undo steps, run in reverse order on failure

  try {
    const issue = await github.createIssue(issueData);
    rollbacks.push(() => github.closeIssue(issue.number));

    await github.addLabels(issue.number, labels);
    await github.setMilestone(issue.number, epic);

    // Verify all succeeded
    await verifyIssue(issue.number);

    return issue;
  } catch (error) {
    // Rollback all operations
    for (const rollback of rollbacks.reverse()) {
      await rollback();
    }
    throw error;
  }
}
```

### 3. Write Verification

**Pattern**: Read-back after write

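```javascript
const assert = require('node:assert');
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function createIssueVerified(data) {
  const created = await github.createIssue(data);

  await sleep(1000); // GitHub eventual consistency

  const verify = await github.getIssue(created.number);
  assert.strictEqual(verify.title, data.title);
  // Labels come back as objects, so compare by name
  assert(verify.labels.some((label) => label.name === 'type:story'));

  return created;
}
```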

### 4. Retry with Backoff

**Pattern**: 3 retries, exponential backoff [1s, 3s, 9s]

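```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff(operation) {
  const backoffs = [1000, 3000, 9000];

  // One initial attempt, then one retry after each backoff delay
  for (let attempt = 0; attempt <= backoffs.length; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt === backoffs.length) {
        throw error; // All retries exhausted
      }
      await sleep(backoffs[attempt]);
    }
  }
}
```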
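
To check the retry count without waiting 13 seconds, the same pattern can take its delay function as a parameter. A sketch; `retryWithSchedule`, `defaultSleep`, and the `sleep` parameter are hypothetical names introduced for illustration:

```javascript
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// One initial attempt, then one retry per entry in `backoffs`.
async function retryWithSchedule(operation, backoffs = [1000, 3000, 9000], sleep = defaultSleep) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= backoffs.length) throw error; // retries exhausted
      await sleep(backoffs[attempt]);
    }
  }
}

// Demo with a fake sleep that records the waits instead of pausing.
async function demo() {
  const waits = [];
  let calls = 0;
  const alwaysFails = async () => {
    calls++;
    throw new Error('network down');
  };
  try {
    await retryWithSchedule(alwaysFails, [1000, 3000, 9000], async (ms) => {
      waits.push(ms);
    });
  } catch (error) {
    return { calls, waits, message: error.message };
  }
  return { calls, waits, message: null };
}

demo().then((result) => console.log(result));
// { calls: 4, waits: [ 1000, 3000, 9000 ], message: 'network down' }
```

Four calls with three recorded waits confirms that all three backoff values are used before the error is rethrown.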

### 5. Network Required (Simplified from Original Plan)

**Key Insight**: AI coding requires internet, so no complex offline queue needed

**Network Failure Handling**:

```javascript
// Simple retry + halt (not queue for later)
try {
  await retryWithBackoff(() => syncToGitHub(data));
} catch (networkError) {
  // Reached only after every retry has failed
  console.error('❌ GitHub sync failed - check network');

  throw new Error(
    'HALT: Cannot proceed without GitHub sync.\n' +
    'Network is required for team coordination.\n' +
    'Resume when network restored.',
    { cause: networkError }
  );
}
```

**No Offline Queue**: Since network is required for AI coding, network failures = halt and fix, not queue for later sync. Simpler architecture, fewer edge cases.

---

## Critical Integration Points

### Point 1: batch-super-dev Story Selection

**File**: `batch-super-dev/instructions.md` (Step 2)

**Change**: Filter locked stories BEFORE user selection

```xml
<step n="2" goal="Display available stories">
  <!-- NEW: Sync from GitHub first -->
  <action>Call: incrementalSync()</action>

  <action>Load sprint-status.yaml</action>
  <action>Filter: status = ready-for-dev</action>

  <!-- NEW: Exclude locked stories -->
  <action>Load cache metadata</action>
  <action>For each story, check: assignee == null (unlocked)</action>
  <action>Split into: available_stories, locked_stories</action>

  <output>
📦 Available Stories (Unlocked) - {{available_count}}
{{#each available_stories}}
{{@index}}. {{story_key}}: {{title}}
{{/each}}

🔒 Locked Stories (Skip These) - {{locked_count}}
{{#each locked_stories}}
- {{story_key}}: Locked by @{{locked_by}} ({{duration}} ago)
{{/each}}
  </output>
</step>

<step n="3" goal="User selection">
  <!-- User selects from AVAILABLE stories only -->

  <!-- NEW: Checkout selected stories -->
  <action>For each selected story:</action>
  <action>  Call: checkoutStory(story_key)</action>
  <action>  Verify lock acquired successfully</action>
  <action>  Pre-fetch epic context</action>

  <output>✅ {{count}} stories checked out and locked</output>
</step>
```
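
The split into `available_stories` and `locked_stories` in Step 2 can be sketched as a partition over cache metadata entries. This assumes the `locked_by` and `locked_until` fields from the sample `.bmad-cache-meta.json`; `splitByLock` is a hypothetical helper:

```javascript
// Partition ready-for-dev stories by whether the cache metadata records an
// unexpired lock.
function splitByLock(stories, nowMs = Date.now()) {
  const available = [];
  const locked = [];
  for (const story of stories) {
    const lockActive =
      Boolean(story.locked_by) && Date.parse(story.locked_until) > nowMs;
    (lockActive ? locked : available).push(story);
  }
  return { available, locked };
}

const nowMs = Date.parse('2026-01-08T16:00:00Z');
const stories = [
  { story_key: '2-5-auth', locked_by: 'developerA', locked_until: '2026-01-08T23:30:00Z' },
  { story_key: '2-6-password-reset' },
  { story_key: '2-7-cache', locked_by: 'developerB', locked_until: '2026-01-08T09:00:00Z' },
];
const { available, locked } = splitByLock(stories, nowMs);
console.log(available.map((s) => s.story_key)); // [ '2-6-password-reset', '2-7-cache' ]
console.log(locked.map((s) => s.story_key));    // [ '2-5-auth' ]
```

Treating an expired lock as available here is a design assumption; `checkoutStory` still checks the GitHub assignee before granting the lock.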

### Point 2: super-dev-pipeline Lock Verification

**File**: `super-dev-pipeline/steps/step-03-implement.md`

**Change**: Add lock check before each task

````markdown
## BEFORE EACH TASK IMPLEMENTATION

### NEW: Lock Verification

```bash
verify_lock() {
  story_key="$1"

  # Check local lock
  lock_file=".bmad/locks/${story_key}.lock"
  if [ ! -f "$lock_file" ]; then
    echo "❌ LOCK LOST: Local lock file missing"
    echo "Story may have been unlocked. HALT immediately."
    return 1
  fi

  # Check timeout (date -d is GNU date syntax)
  timeout_at=$(grep "timeout_at:" "$lock_file" | cut -d' ' -f2)
  if [ "$(date +%s)" -gt "$(date -d "$timeout_at" +%s)" ]; then
    echo "❌ LOCK EXPIRED: Timeout reached"
    echo "Run: /checkout-story ${story_key} to extend lock"
    return 1
  fi

  # Check GitHub assignment (paranoid check)
  github_assignee=$(call_github_mcp_get_issue_assignee "$story_key")
  current_user=$(git config user.github)

  if [ "$github_assignee" != "$current_user" ]; then
    echo "❌ LOCK STOLEN: GitHub issue reassigned to $github_assignee"
    echo "Story was unlocked and re-assigned. HALT."
    return 1
  fi

  # Refresh heartbeat
  sed -i.bak "s/last_heartbeat: .*/last_heartbeat: $(date -u +%Y-%m-%dT%H:%M:%SZ)/" "$lock_file"
  rm -f "${lock_file}.bak"

  echo "✅ Lock verified for ${story_key}"
  return 0
}

# CRITICAL: Call before every task
if ! verify_lock "$story_key"; then
  echo "⚠️⚠️⚠️ PIPELINE HALTED - Lock verification failed"
  echo "Do NOT continue without valid lock!"
  exit 1
fi
```

Then proceed with task implementation...
````

### Point 3: dev-story Progress Sync

**File**: `dev-story/instructions.xml` (Step 8, after line 533)

**Change**: Add GitHub sync after task completion

```xml
<!-- AFTER marking task [x] -->
<check if="{{github_integration.enabled}} == true">
  <action>Sync task completion to GitHub:</action>
  <action>
    Call: mcp__github__add_issue_comment({
      owner: {{github_owner}},
      repo: {{github_repo}},
      issue_number: {{github_issue_number}},
      body: "Task {{task_num}} complete: {{task_description}}\n\n" +
            "Progress: {{checked_tasks}}/{{total_tasks}} tasks ({{progress_pct}}%)"
    })
  </action>

  <check if="GitHub sync failed">
    <retry count="3" />
    <check if="still failing">
      <output>❌ CRITICAL: Cannot sync progress to GitHub</output>
      <output>Network required for team coordination</output>
      <action>HALT</action>
    </check>
  </check>

  <output>✅ Progress synced to GitHub Issue #{{github_issue_number}}</output>
</check>
```

---

## Configuration

**Add to**: `_bmad/bmm/config.yaml`

```yaml
# GitHub Integration Settings
github_integration:
  enabled: true  # Master toggle
  source_of_truth: "github"  # github | local (always github for enterprise)
  require_network: true  # Hard requirement (AI needs internet)

  repository:
    owner: "jschulte"  # GitHub username or org
    repo: "myproject"  # Repository name

  cache:
    enabled: true
    location: "{output_folder}/cache"
    staleness_threshold_minutes: 5
    auto_refresh_on_stale: true

  locking:
    enabled: true
    default_timeout_hours: 8
    heartbeat_interval_minutes: 30
    stale_threshold_minutes: 15
    max_locks_per_user: 3

  sync:
    interval_minutes: 5  # Incremental sync frequency
    batch_epic_prefetch: true  # Pre-fetch epic on checkout
    progress_updates: true  # Sync task completion to GitHub

  permissions:
    scrum_masters:  # Can force-unlock stories
      - "jschulte"
      - "alice-sm"
```
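
Some of these settings constrain each other, so a startup check can catch inconsistent values early. A sketch, assuming the YAML has already been parsed into a plain object; `validateGithubIntegration` is a hypothetical helper, and both rules are assumptions about what a consistent configuration looks like rather than requirements stated in this plan:

```javascript
// Return a list of problems found in the github_integration settings.
function validateGithubIntegration(config) {
  const problems = [];
  const { locking, sync, cache } = config;

  // A lock should not be declared stale before its next heartbeat is due.
  if (locking.stale_threshold_minutes <= locking.heartbeat_interval_minutes) {
    problems.push(
      'locking.stale_threshold_minutes should exceed locking.heartbeat_interval_minutes'
    );
  }
  // A cache entry should survive at least one sync interval.
  if (cache.staleness_threshold_minutes < sync.interval_minutes) {
    problems.push('cache.staleness_threshold_minutes should be >= sync.interval_minutes');
  }
  return problems;
}

const sampleConfig = {
  cache: { staleness_threshold_minutes: 5 },
  locking: { heartbeat_interval_minutes: 30, stale_threshold_minutes: 15 },
  sync: { interval_minutes: 5 },
};
console.log(validateGithubIntegration(sampleConfig));
```

With the sample values above, the first rule reports a problem: a 15-minute stale threshold is shorter than the 30-minute heartbeat interval.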

---

## Verification Plan

### Test 1: Story Locking Prevents Duplicate Work

```bash
# Setup: 2 developers, 1 story

# Developer A (machine 1)
$ /checkout-story story_key=2-5-auth
✅ Story checked out
Lock expires: 8 hours

# Developer B (machine 2, simultaneously)
$ /checkout-story story_key=2-5-auth
❌ Story locked by @developerA until 23:30:00Z
Try: /available-stories

# Verify in GitHub
# → Issue #105: Assigned to @developerA
# → Labels: status:in-progress

# Result: ✅ Only Developer A can work on story
```

### Test 2: Real-Time Progress Visibility

```bash
# Developer implements task 3 of 10
# → Marks [x] in story file
# → Workflow syncs to GitHub

# Check GitHub Issue #105
# → New comment (30 seconds ago): "Task 3 complete: Implement OAuth (30%)"
# → Body shows: Progress bar at 30%

# PO views dashboard
# → Shows: "Story 2-5: 30% complete (3/10 tasks)"

# Result: ✅ PO sees progress in real-time
```

### Test 3: Merge Conflict Prevention

```bash
# Setup: 3 developers working on different stories

# All 3 complete simultaneously and commit

# Developer A: Story 2-5 files only
# Developer B: Story 2-7 files only
# Developer C: Story 3-2 files only

# Git commits:
# → Developer A: Only 2-5-auth.md + src/auth/*
# → Developer B: Only 2-7-cache.md + src/cache/*
# → Developer C: Only 3-2-api.md + src/api/*

# No overlap in files → No merge conflicts

# sprint-status.yaml:
# → Each story updates via GitHub sync (not direct file edit)
# → No conflicts (GitHub is source of truth)

# Result: ✅ Zero merge conflicts
```

### Test 4: Cache Performance

```bash
# Measure: Story checkout + epic context load time

# Without cache (API calls):
# - Fetch story: 2-3 seconds
# - Fetch 8 epic stories: 8 × 2s = 16 seconds
# - Total: ~18 seconds

# With cache:
# - Sync check: 200ms (1 API call for "any changes?")
# - Load story: 50ms (Read tool from cache)
# - Load 8 epic stories: 8 × 50ms = 400ms
# - Total: ~650ms

# Result: ✅ ~28x faster (18s → 650ms)
```
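
The Test 4 arithmetic can be reproduced directly. The figures are the estimates listed above (about 2s per API fetch, 50ms per cached read, 200ms for the sync check), not measurements:

```javascript
// Figures from Test 4, in milliseconds.
const withoutCache = 2000 + 8 * 2000; // one story + 8 epic stories at ~2s each
const withCache = 200 + 50 + 8 * 50;  // sync check + story read + 8 epic reads
const speedup = withoutCache / withCache;

console.log(withoutCache, withCache, speedup.toFixed(1)); // 18000 650 27.7
```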

### Test 5: Network Failure Recovery

```bash
# Developer working on task 5 of 10
# Network drops during GitHub sync

# System:
# → Retry #1 after 1s: Fails
# → Retry #2 after 3s: Fails
# → Retry #3 after 9s: Fails
# → Display: "❌ Cannot sync to GitHub - network required"
# → Save state to: .bmad/pipeline-state-2-5.yaml
# → HALT

# Developer fixes network, resumes:
$ /super-dev-pipeline story_key=2-5-auth

# System:
# → Detects saved state
# → "Resuming from task 5 (paused 10 minutes ago)"
# → Syncs pending progress to GitHub
# → Continues task 6

# Result: ✅ Graceful halt + resume
```

---

## Success Criteria

### Must Have (Phase 1-2)

- ✅ Zero duplicate work incidents (prevented by story locking)
- ✅ Zero sprint-status.yaml merge conflicts (GitHub is source of truth)
- ✅ Real-time progress visibility (<30s from task completion to GitHub update)
- ✅ Cache performance: <100ms story reads (vs 2-3s API calls)
- ✅ API efficiency: <50 calls/hour (vs 500-1000 without cache)

### Should Have (Phase 3)

- ✅ PR auto-linking to issues (closes loop)
- ✅ PO can create/update stories via Claude Desktop
- ✅ Epic dashboard shows team activity
- ✅ Bi-directional sync (GitHub ↔ cache)

### Nice to Have (Phase 4)

- ✅ Ghost features auto-create backfill issues
- ✅ Stakeholder reporting
- ✅ Advanced dashboards

---

## Estimated Effort

### Phase 1: Foundation (Weeks 1-2)

- Cache system: 5 days
- Story locking: 5 days
- Progress sync: 2 days
- Testing & docs: 3 days

**Total**: 15 days (3 weeks with buffer)

### Phase 2: PO Workflows (Weeks 3-4)

- PO agent: 1 day
- Story creation: 3 days
- AC updates: 2 days
- Dashboard: 3 days
- Sync engine: 4 days

**Total**: 13 days (2.5 weeks with buffer)

### Phase 3: Advanced (Weeks 5-6)

- PR linking: 2 days
- Approval flow: 2 days
- Epic dashboard: 3 days
- Integration polish: 3 days

**Total**: 10 days (2 weeks)

### Phase 4: Polish (Weeks 7-8)

- Ghost features: 2 days
- Revalidation integration: 2 days
- Documentation: 3 days
- Training materials: 3 days

**Total**: 10 days (2 weeks)

**Grand Total**: 48 days (9.5 weeks, ~2.5 months for complete system)

**MVP** (Phases 1-2): 28 days (~6 weeks) gets you story locking + PO workflows
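
The totals above can be re-derived from the per-task estimates. This is a small sketch, assuming a 5-day working week (an assumption, not stated in the plan); the helper names are hypothetical.

```javascript
// Per-task day estimates copied from the four phases above.
const phaseTasks = {
  "Phase 1": [5, 5, 2, 3],
  "Phase 2": [1, 3, 2, 3, 4],
  "Phase 3": [2, 2, 3, 3],
  "Phase 4": [2, 2, 3, 3],
};

const WORKING_DAYS_PER_WEEK = 5; // assumption: one developer-week is 5 working days

function totalDays(phases) {
  return phases.reduce(
    (total, phase) => total + phaseTasks[phase].reduce((a, b) => a + b, 0),
    0
  );
}

function toWeeks(days) {
  return days / WORKING_DAYS_PER_WEEK;
}

console.log(totalDays(Object.keys(phaseTasks))); // 48
console.log(totalDays(["Phase 1", "Phase 2"])); // 28
```

At 5 working days per week, 48 days is about 9.6 weeks and 28 days is about 5.6 weeks, in line with the rounded figures quoted above.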

---

## Files Summary

### NEW Files (26 total)

- **Cache System**: 3 files (~900 lines)
- **Lock System**: 9 files (~1,350 lines)
- **PO Workflows**: 12 files (~2,580 lines)
- **Integration**: 2 files (~500 lines)

**Total NEW Code**: ~5,330 lines

### MODIFIED Files (5 total)

1. `batch-super-dev/instructions.md` (+150 lines)
2. `super-dev-pipeline/steps/step-01-init.md` (+80 lines)
3. `super-dev-pipeline/steps/step-03-implement.md` (+120 lines)
4. `super-dev-pipeline/steps/step-06-complete.md` (+100 lines)
5. `dev-story/instructions.xml` (+60 lines)

**Total MODIFIED**: ~510 lines

**Grand Total**: ~5,840 lines of production code + tests + docs

---

## Risk Assessment

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| GitHub rate limits | Low | High | Caching (97% reduction), batch operations |
| Lock deadlocks | Medium | Medium | 8-hour timeout, heartbeat, SM override |
| Cache-GitHub desync | Low | Medium | Staleness checks, mandatory pre-sync |
| Network failures | Medium | Medium | Retry logic, graceful halt + resume |
| BMAD format violations | Medium | High | Strict validation, PO training |
| Lost locks mid-work | Low | High | Verification before each task |
| Developer onboarding | Medium | Low | Clear docs, training, gradual rollout |

**Overall Risk**: **LOW-MEDIUM** (building on proven migrate-to-github patterns)

**Risk Mitigation Strategy**:

- Start with 2-3 developers on small epic (validate locking works)
- Gradual rollout (not all 15 developers at once)
- Comprehensive testing at each phase
- Rollback capability via migrate-to-github patterns
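
The lock-deadlock mitigation in the table (8-hour timeout plus heartbeat) can be illustrated with a small expiry check. This is a sketch only: it assumes the timeout is measured from the last heartbeat, and the lock shape (`owner`, `lastHeartbeatMs`) and the name `evaluateLock` are hypothetical.

```javascript
const LOCK_TIMEOUT_MS = 8 * 60 * 60 * 1000; // 8-hour timeout from the risk table

// Hypothetical check: classify a story lock as unlocked, held, or expired.
// Assumption: the timeout window restarts on every heartbeat.
function evaluateLock(lock, nowMs) {
  if (!lock) return { state: "unlocked" };
  const idleMs = nowMs - lock.lastHeartbeatMs;
  if (idleMs >= LOCK_TIMEOUT_MS) {
    // A stale lock can be released automatically, or by an SM override.
    return { state: "expired", owner: lock.owner };
  }
  return { state: "held", owner: lock.owner, remainingMs: LOCK_TIMEOUT_MS - idleMs };
}
```

An abandoned lock stops receiving heartbeats and becomes reclaimable after the timeout, so a story cannot stay blocked indefinitely.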

---

## Why This Will Work

### 1. Proven Patterns

- Lock mechanism: Based on working git commit lock (step-06a-queue-commit.md)
- GitHub integration: Based on production migrate-to-github workflow
- Reliability: Same 8 mechanisms as migrate-to-github (idempotent, atomic, verified, resumable, etc.)

### 2. Simple Network Model

- Network required = simplified architecture (no offline queue complexity)
- Fail fast on network issues (retry + halt, not queue for later)
- Matches reality (AI coding needs internet anyway)

### 3. Performance Optimized

- Cache eliminates roughly 95-97% of API calls
- Incremental sync (only fetch changed stories)
- Pre-fetch epic context (batch operation)
- Read tool works at <100ms (vs 2-3s API calls)

### 4. Multi-Layer Safety

- Lock verification before each task (catch stolen locks immediately)
- Write-through with retry (transient failures handled)
- Staleness detection (refuse to use old cache)
- Mandatory pre-workflow sync (everyone starts with fresh data)
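
The first and third safety layers can be combined into a single gate that runs before each task. This is a minimal sketch, assuming the lock is represented by the GitHub Issue assignee; `checkBeforeTask` is a hypothetical name, and `maxCacheAgeMs` is a caller-supplied parameter because no staleness threshold is given in this section.

```javascript
// Hypothetical pre-task gate: verify the lock is still ours and the cache is fresh.
function checkBeforeTask({ issueAssignee, currentUser, cacheSyncedAtMs, nowMs, maxCacheAgeMs }) {
  // Lock verification: the issue assignee acts as the lock holder.
  if (issueAssignee !== currentUser) {
    return { proceed: false, reason: "lock-lost" };
  }
  // Staleness detection: refuse to work from a cache older than the allowed age.
  if (nowMs - cacheSyncedAtMs > maxCacheAgeMs) {
    return { proceed: false, reason: "cache-stale" };
  }
  return { proceed: true };
}
```

Checking the lock first means a stolen lock is reported even when the cache is also stale, so the developer sees the more serious problem first.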

### 5. Role Separation

- POs: GitHub Issues UI + Claude Desktop (no git needed)
- Developers: BMAD workflows (lock → implement → sync → unlock)
- SMs: Oversight tools (lock-status, force-unlock, dashboards)

---

## Next Steps

### Immediate

1. **Review this plan** - Validate architecture decisions
2. **Confirm priorities** - Phase 1-2 first (locking + PO workflows)?
3. **Approve approach** - GitHub as source of truth with local cache

### Week 1

1. Build cache system (cache-manager.js, sync-engine.js)
2. Create checkout-story workflow
3. Implement lock verification
4. Test with 2 developers

### Weeks 2-3

1. Integrate with batch-super-dev
2. Add progress sync to dev-story
3. Build PO agent + story creation workflow
4. Test with 3-5 developers

### Weeks 4-6

1. Complete PO workflows (update, dashboard, approve)
2. Add PR linking
3. Build epic dashboard
4. Test with full team (10-15 developers)

### Weeks 7-8

1. Polish and optimize
2. Advanced features
3. Comprehensive documentation
4. Team training

---

## Conclusion

This design transforms BMAD into **the killer feature for enterprise teams** by:

- ✅ **Preventing duplicate work** - Story locking with 8-hour timeout, heartbeat, verification
- ✅ **Enabling Product Owners** - GitHub Issues workspace via Claude Desktop, no git/markdown knowledge
- ✅ **Maintaining developer flow** - Local cache = instant LLM reads, no API latency
- ✅ **Scaling to 15 developers** - GitHub centralized coordination, zero merge conflicts
- ✅ **Building on proven patterns** - migrate-to-github reliability mechanisms (atomic, verified, resumable)
- ✅ **Optimizing performance** - 97% API reduction through smart caching
- ✅ **Simplifying architecture** - Network required = no offline queue complexity

**Implementation**: ~9.5 weeks (48 days) for complete system, ~6 weeks (28 days) for MVP (locking + PO workflows)

**Risk**: Low-Medium (incremental rollout, comprehensive testing, rollback capability)

**ROI**: Eliminates duplicate work, reduces PO-Dev friction (estimated 40%), increases sprint predictability

Ready for enterprise adoption.