docs: add party-mode integration and UAT implementation planning docs
- Add party-mode integration planning documentation:
  - README overview with benefits analysis
  - Context management architecture spec
  - File modifications specification
- Add UAT workflow implementation plan covering P0/P1/P2 gaps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent d561f9d9ad
commit 2326f72803
|
|
@ -0,0 +1,598 @@
|
|||
# Context Management Deep Dive
|
||||
|
||||
**Document**: 02-context-management.md
|
||||
**Version**: 1.0.0
|
||||
**Date**: 2026-01-03
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This document explains how context isolation works in the epic-execute workflow and how Party Mode integration maintains this architecture while enabling multi-agent collaboration.
|
||||
|
||||
---
|
||||
|
||||
## Current Context Architecture
|
||||
|
||||
### The Shell as Orchestrator
|
||||
|
||||
The epic-execute workflow uses **shell orchestration** to create context isolation between phases. The shell script (`epic-execute.sh`) is the central coordinator that:
|
||||
|
||||
1. Reads story files from disk
|
||||
2. Builds prompt strings with story contents embedded
|
||||
3. Invokes Claude in isolated sessions
|
||||
4. Captures output and parses for completion signals
|
||||
5. Manages git staging between phases
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ SHELL ORCHESTRATION MODEL │
|
||||
├─────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ epic-execute.sh (Shell - The "Memory") │
|
||||
│ │ │
|
||||
│ ├── Reads: story files, epic files, config │
|
||||
│ ├── Writes: logs, status updates │
|
||||
│ ├── Manages: git staging area │
|
||||
│ │ │
|
||||
│ └── For each phase: │
|
||||
│ ├── Build prompt string (inject file contents) │
|
||||
│ ├── Invoke: claude --dangerously-skip-permissions -p "$prompt" │
|
||||
│ ├── Capture stdout/stderr │
|
||||
│ ├── Parse for signals (COMPLETE, BLOCKED, PASSED, FAILED) │
|
||||
│ └── Decide next action based on result │
|
||||
│ │
|
||||
│ Each Claude invocation = FRESH CONTEXT (no conversation history) │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
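The per-phase loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `epic-execute.sh`: the names `run_phase`, `default_runner`, and the `PHASE_RUNNER` hook are assumptions introduced here so the function can be exercised without the real CLI. Only the `claude --dangerously-skip-permissions -p` invocation comes from the diagram.

```bash
# Hypothetical default runner: the CLI call shown in the diagram above.
default_runner() {
    claude --dangerously-skip-permissions -p "$1"
}

# Run one phase in a new process, log the full response, and echo it back
# so the caller can parse it for signals.
run_phase() {
    local phase="$1" prompt="$2" log_file="$3"
    local runner="${PHASE_RUNNER:-default_runner}"
    local output status

    # Each call starts a new process, so no conversation history carries over.
    output="$("$runner" "$prompt" 2>&1)"
    status=$?

    # Persist the response; the log is the only record of this context.
    printf '=== %s (exit %s) ===\n%s\n' "$phase" "$status" "$output" >> "$log_file"

    printf '%s\n' "$output"
    return "$status"
}
```

Because the runner is looked up at call time, a dry run or a test can set `PHASE_RUNNER` to a stub function and leave the rest of the orchestration unchanged.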
|
||||
|
||||
### Why Context Isolation Matters
|
||||
|
||||
**Problem it solves**: Reviewer bias
|
||||
|
||||
If the reviewer (Context B) could see the developer's (Context A) struggles, dead-ends, and thought process, they would:
|
||||
- Be biased toward the implementation approach taken
|
||||
- Miss issues because they "understand" why shortcuts were made
|
||||
- Fail to simulate a real code review, where reviewers see the code "cold"
|
||||
|
||||
**Solution**: Each phase runs in a completely fresh Claude session with no shared conversation history.
|
||||
|
||||
---
|
||||
|
||||
## Context Transfer Mechanisms
|
||||
|
||||
Since contexts are isolated, information must flow through **persistent storage**:
|
||||
|
||||
| Mechanism | What It Carries | Direction |
|
||||
|-----------|-----------------|-----------|
|
||||
| **Git staging** | Actual code changes | Dev → Review |
|
||||
| **Story file** | Dev Agent Record, Code Review Record, Status | All phases |
|
||||
| **Prompt injection** | Story contents, context, instructions | Shell → Claude |
|
||||
| **Output parsing** | Success/failure signals | Claude → Shell |
|
||||
| **Log file** | Full Claude responses (optional) | Claude → Disk |
|
||||
|
||||
### Transfer Flow Diagram
|
||||
|
||||
```
|
||||
┌─────────────┐
|
||||
│ Shell │
|
||||
│ Orchestrator│
|
||||
└──────┬──────┘
|
||||
│
|
||||
│ 1. Read story file
|
||||
│ 2. Build prompt with contents
|
||||
│ 3. Invoke Claude
|
||||
▼
|
||||
┌─────────────┐
|
||||
│ Context A │
|
||||
│ (Dev) │
|
||||
│ │
|
||||
│ - Reads story from prompt
|
||||
│ - Writes code
|
||||
│ - Runs: git add -A
|
||||
│ - Updates story file (Dev Agent Record)
|
||||
│ - Outputs: IMPLEMENTATION COMPLETE
|
||||
│ │
|
||||
└──────┬──────┘
|
||||
│
|
||||
│ ┌─────────────────────────────────┐
|
||||
│ │ Transfer via: │
|
||||
│ │ - Git staging (code) │
|
||||
│ │ - Story file (Dev Agent Record) │
|
||||
│ └─────────────────────────────────┘
|
||||
│
|
||||
│ 4. Shell reads story file again
|
||||
│ 5. Builds new prompt
|
||||
│ 6. Invokes NEW Claude session
|
||||
▼
|
||||
┌─────────────┐
|
||||
│ Context B │
|
||||
│ (Review) │
|
||||
│ │
|
||||
│ - Reads story from prompt (includes Dev Agent Record)
|
||||
│ - Runs: git diff --staged (sees code)
|
||||
│ - Has NO memory of dev phase
|
||||
│ - Reviews "cold"
|
||||
│ - Updates story file (Code Review Record)
|
||||
│ - Outputs: REVIEW PASSED/FAILED
|
||||
│ │
|
||||
└──────┬──────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────┐
|
||||
│ Shell │
|
||||
│ (Commit) │
|
||||
│ │
|
||||
│ git commit -m "..."
|
||||
└─────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## How Party Mode Extends This
|
||||
|
||||
Party Mode adds **additional isolated contexts** at specific workflow points without breaking the isolation model.
|
||||
|
||||
### Enhanced Context Flow
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ ENHANCED CONTEXT FLOW │
|
||||
├─────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ epic-execute.sh (Shell - Orchestrator) │
|
||||
│ │
|
||||
│ Per Story: │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT 0 │ ← NEW: Party Kickoff (--party-kickoff) │
|
||||
│ │ (Kickoff) │ │
|
||||
│ │ │ Input: Story file │
|
||||
│ │ │ Output: Insights → APPENDED to story file │
|
||||
│ │ │ Signal: KICKOFF COMPLETE │
|
||||
│ └──────┬──────┘ │
|
||||
│ │ │
|
||||
│ │ Transfer: Story file now contains Kickoff Insights section │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT A │ ← EXISTING: Dev phase │
|
||||
│ │ (Dev) │ │
|
||||
│ │ │ Input: Story file (NOW includes kickoff insights) │
|
||||
│ │ │ Output: Code staged, Dev Agent Record │
|
||||
│ │ │ Signal: IMPLEMENTATION COMPLETE/BLOCKED │
|
||||
│ └──────┬──────┘ │
|
||||
│ │ │
|
||||
│ │ Transfer: Git staging + Story file updated │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT B │ ← MODIFIED: Standard review OR Party Review │
|
||||
│ │ (Review) │ │
|
||||
│ │ │ Input: Story file + git diff --staged │
|
||||
│ │ │ Output: Code Review Record, fixes staged │
|
||||
│ │ │ Signal: REVIEW PASSED/FAILED │
|
||||
│ └──────┬──────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ Shell │ ← EXISTING: Commit │
|
||||
│ │ (Commit) │ git commit │
|
||||
│ └─────────────┘ │
|
||||
│ │
|
||||
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ FAILURE PATH ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
|
||||
│ │
|
||||
│ If Dev or Review outputs BLOCKED/FAILED: │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT F │ ← NEW: Failure Analysis (--party-failure) │
|
||||
│ │ (Failure) │ │
|
||||
│ │ │ Input: Story file + failure message │
|
||||
│ │ │ Output: Analysis Record → appended to story │
|
||||
│ │ │ Signal: ANALYSIS COMPLETE + recommendation │
|
||||
│ └─────────────┘ │
|
||||
│ │
|
||||
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ POST-EPIC ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
|
||||
│ │
|
||||
│ After all stories: │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT C │ ← EXISTING: UAT Generation │
|
||||
│ │ (UAT) │ │
|
||||
│ └─────────────┘ │
|
||||
│ │
|
||||
│ ┌─────────────┐ │
|
||||
│ │ CONTEXT R │ ← NEW: Party Retrospective (--party-retro) │
|
||||
│ │ (Retro) │ │
|
||||
│ │ │ Input: ALL story files + epic file (read-only) │
|
||||
│ │ │ Output: Retro doc + handoff doc (new files) │
|
||||
│ │ │ Signal: RETRO COMPLETE │
|
||||
│ └─────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Detailed Transfer Specifications
|
||||
|
||||
### A. Kickoff → Dev Transfer
|
||||
|
||||
**What gets transferred**: Kickoff Insights (architectural notes, implementation strategy, testing approach, identified risks)
|
||||
|
||||
**Transfer mechanism**: Kickoff context appends a new section to the story file
|
||||
|
||||
```markdown
|
||||
## Story Kickoff Insights
|
||||
|
||||
**Discussion Date**: 2026-01-03
|
||||
**Participants**: Winston (Architect), Amelia (Developer), Murat (Test Architect)
|
||||
|
||||
### Architectural Notes
|
||||
- Consider using existing auth middleware pattern from lib/auth/
|
||||
- Integration point: /api/v1/users endpoint
|
||||
- Watch for rate limiting constraints on external API
|
||||
|
||||
### Implementation Strategy
|
||||
- Extend UserService class rather than creating new
|
||||
- Reuse validation utilities from lib/validators
|
||||
- Follow repository pattern established in src/repositories/
|
||||
|
||||
### Testing Approach
|
||||
- Unit tests for service methods (Jest)
|
||||
- Integration test for full user flow
|
||||
- Mock external API calls using existing fixtures
|
||||
|
||||
### Identified Risks
|
||||
- Rate limiting not yet implemented for external API
|
||||
- Database migration needed for new user fields
|
||||
```
|
||||
|
||||
**Dev context sees**: Story file with the above section included. The prompt says "Read the story file completely before writing any code", so the dev agent has access to the kickoff insights.
|
||||
|
||||
**Why this works**: The dev agent can use the results of the multi-agent discussion without those agents occupying its context window.
|
||||
|
||||
---
|
||||
|
||||
### B. Dev → Review Transfer (Unchanged)
|
||||
|
||||
**What gets transferred**:
|
||||
1. Code changes (via git staging)
|
||||
2. Dev Agent Record (via story file)
|
||||
|
||||
**Transfer mechanism**:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Dev Context writes: │
|
||||
│ │
|
||||
│ 1. Code files → git add -A (staged, not committed) │
|
||||
│ │
|
||||
│ 2. Story file updated with: │
|
||||
│ ## Dev Agent Record │
|
||||
│ │
|
||||
│ ### Implementation Summary │
|
||||
│ Added user registration endpoint with email verification │
|
||||
│ │
|
||||
│ ### Files Created │
|
||||
│ - src/services/UserService.ts - User registration logic │
|
||||
│ - src/routes/users.ts - REST endpoints │
|
||||
│ │
|
||||
│ ### Files Modified │
|
||||
│ - src/app.ts - Added user routes │
|
||||
│ │
|
||||
│ ### Key Decisions │
|
||||
│ - Used bcrypt for password hashing (industry standard) │
|
||||
│ - Async email verification (non-blocking) │
|
||||
│ │
|
||||
│ ### Tests Added │
|
||||
│ - test/services/UserService.test.ts │
|
||||
│ │
|
||||
│ ### Notes for Reviewer │
|
||||
│ - Email templates need design review │
|
||||
│ - Rate limiting deferred to next story │
|
||||
│ │
|
||||
│ 3. Outputs: IMPLEMENTATION COMPLETE: story-42-1 │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Review Context receives (via prompt injection): │
|
||||
│ │
|
||||
│ 1. Story file contents (includes Dev Agent Record) │
|
||||
│ 2. Instructions to run: git diff --staged │
|
||||
│ │
|
||||
│ Review Context has NO knowledge of: │
|
||||
│ - Dead-ends the dev tried │
|
||||
│ - Time spent debugging │
|
||||
│ - Alternative approaches considered │
|
||||
│ - Frustrations or workarounds │
|
||||
│ │
|
||||
│ This is intentional - reviewer sees code "cold" │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### C. Party Review vs Standard Review
|
||||
|
||||
The difference is in **prompt content**, not transfer mechanism.
|
||||
|
||||
**Standard Review Prompt** (current):
|
||||
```
|
||||
You are a Senior Code Reviewer performing a BMAD code review.
|
||||
|
||||
## Your Task
|
||||
Review the implementation of story: {story_id}
|
||||
You are seeing this code for the first time...
|
||||
```
|
||||
|
||||
**Party Review Prompt** (new):
|
||||
```
|
||||
You are orchestrating a Party Code Review with multiple BMAD agents.
|
||||
|
||||
## Participating Agents
|
||||
|
||||
### Winston (Architect)
|
||||
- Focus: Pattern adherence, scalability, API design consistency
|
||||
- Communication style: Calm, pragmatic, balances "what could be" with "what should be"
|
||||
|
||||
### Murat (Test Architect)
|
||||
- Focus: Test coverage, security vulnerabilities, edge cases
|
||||
- Communication style: Data-driven, "strong opinions weakly held", risk calculations
|
||||
|
||||
### Amelia (Developer)
|
||||
- Focus: Code quality, readability, maintainability, error handling
|
||||
- Communication style: [from agent definition]
|
||||
|
||||
## Your Task
|
||||
|
||||
1. Run: git diff --staged
|
||||
2. For each agent, generate their review perspective in character
|
||||
3. Facilitate cross-discussion where agents reference each other
|
||||
4. Build consensus on issues and fixes
|
||||
5. Apply the same severity-based fix policy
|
||||
6. Generate unified Party Review Record
|
||||
|
||||
## Output Format
|
||||
|
||||
Each agent reviews, then they discuss:
|
||||
|
||||
🏗️ **Winston**: "Looking at the architecture, I see..."
|
||||
|
||||
🧪 **Murat**: "From a testing perspective, Winston raises a good point about..."
|
||||
|
||||
💻 **Amelia**: "I agree with Murat on test coverage. Additionally..."
|
||||
|
||||
### Consensus
|
||||
[Unified findings after discussion]
|
||||
```
|
||||
|
||||
**Same inputs**: Story file + git diff
|
||||
**Same outputs**: Code Review Record in story file, PASSED/FAILED signal
|
||||
**Different process**: Multi-perspective analysis within the prompt
|
||||
|
||||
---
|
||||
|
||||
### D. Failure Analysis Context
|
||||
|
||||
**Trigger**: Dev or Review outputs BLOCKED/FAILED signal
|
||||
|
||||
**What gets transferred**:
|
||||
1. Story file (current state, may have partial Dev Agent Record)
|
||||
2. Failure type ("dev" or "review")
|
||||
3. Failure message extracted from output
|
||||
|
||||
```bash
|
||||
# In shell script
|
||||
if echo "$result" | grep -q "IMPLEMENTATION BLOCKED"; then
    # Extract failure reason from the first BLOCKED line only
    failure_msg=$(echo "$result" | grep "IMPLEMENTATION BLOCKED" | head -n 1 | sed 's/.*BLOCKED: [^ ]* - //')

    if [ "$PARTY_FAILURE" = true ]; then
        execute_party_failure_analysis "$story_file" "dev" "$failure_msg"
    fi
fi
|
||||
```
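The snippet above covers only the dev phase. A sketch that derives both the failure type ("dev" or "review") and the message from one captured output might look like this. The function name `extract_failure`, the `type|message` output format, and the requirement that the signal starts its own line are assumptions made for illustration.

```bash
# Print "<type>|<message>" for the first failure signal in the output,
# or return 1 when no failure signal is present.
extract_failure() {
    local result="$1" line

    line="$(printf '%s\n' "$result" \
        | grep -E '^(IMPLEMENTATION BLOCKED|REVIEW FAILED): ' \
        | head -n 1)"
    [ -n "$line" ] || return 1

    case "$line" in
        "IMPLEMENTATION BLOCKED: "*) printf 'dev|' ;;
        "REVIEW FAILED: "*)          printf 'review|' ;;
    esac

    # Drop "<SIGNAL>: <story-id> - " and keep the free-text reason.
    printf '%s\n' "$line" | sed -E 's/^[A-Z ]+: [^ ]+ - //'
}
```

The two fields could then be split with `${var%%|*}` and `${var#*|}` before being passed on to the failure analysis call.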
|
||||
|
||||
**Failure Analysis Context receives**:
|
||||
```
|
||||
## Failed Story
|
||||
<story>
|
||||
{story_file_contents}
|
||||
</story>
|
||||
|
||||
## Failure Information
|
||||
- **Type**: dev phase
|
||||
- **Signal**: IMPLEMENTATION BLOCKED
|
||||
- **Message**: "Cannot resolve circular dependency between UserService and AuthService"
|
||||
|
||||
## Participating Agents
|
||||
- Winston (Architect): Assess if this is an architectural issue
|
||||
- Amelia (Developer): Assess if this is an implementation issue
|
||||
- Bob (Scrum Master): Assess if this is a requirements/process issue
|
||||
```
|
||||
|
||||
**Output**: Failure Analysis Record appended to story + recommendation (Retry | Skip | Escalate)
|
||||
|
||||
---
|
||||
|
||||
### E. Retrospective Context (Post-Epic)
|
||||
|
||||
**Trigger**: All stories completed (or `--party-retro` flag with completed epic)
|
||||
|
||||
**What gets transferred** (read-only, aggregated):
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Retro Context receives: │
|
||||
│ │
|
||||
│ 1. Epic file │
|
||||
│ - Epic description, goals, scope │
|
||||
│ │
|
||||
│ 2. ALL story files, each containing: │
|
||||
│ - Original specification │
|
||||
│ - Kickoff Insights (if --party-kickoff was used) │
|
||||
│ - Dev Agent Record │
|
||||
│ - Code Review Record (standard or party) │
|
||||
│ - Failure Analysis Record (if any failures occurred) │
|
||||
│ │
|
||||
│ 3. Execution summary │
|
||||
│ - Stories completed: 8 │
|
||||
│ - Stories failed: 1 │
|
||||
│ - Total duration: 45 minutes │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Output** (creates new files):
|
||||
|
||||
```
|
||||
docs/sprints/epic-42-retro.md # Retrospective insights
|
||||
docs/handoffs/epic-42-handoff.md # Context for next epic (used by epic-chain)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Context Window Considerations
|
||||
|
||||
### Current Approach: Prompt Injection
|
||||
|
||||
Each context receives its input via prompt injection - the shell script reads files and embeds their contents in the prompt string:
|
||||
|
||||
```bash
|
||||
# Declare and assign separately so a failed read is not masked by `local`
local story_contents
story_contents=$(cat "$story_file")
|
||||
|
||||
local dev_prompt="You are the Dev agent...
|
||||
|
||||
## Story Specification
|
||||
|
||||
<story>
|
||||
$story_contents
|
||||
</story>
|
||||
|
||||
## Implementation Requirements
|
||||
..."
|
||||
```
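A fuller version of the fragment above could be written as a small function. This is a sketch under assumptions: `build_phase_prompt` and its argument order are illustrative names, and only the `<story>` wrapper and the section headings are taken from the fragment.

```bash
# Assemble a phase prompt with the story file embedded verbatim.
build_phase_prompt() {
    local role_line="$1" story_file="$2" instructions="$3"
    local story_contents

    # Fail early rather than send a prompt with an empty <story> block.
    story_contents="$(cat "$story_file")" || return 1

    # Expanded values are inserted as-is; they are not re-evaluated by the shell.
    cat <<EOF
$role_line

## Story Specification

<story>
$story_contents
</story>

## Implementation Requirements

$instructions
EOF
}
```

Keeping prompt construction in one function makes it easier to see exactly what a given context receives, which is the property the isolation model depends on.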
|
||||
|
||||
**Advantage**: Full control over what each context sees
|
||||
**Limitation**: Large stories or many files can consume a significant share of the context window
|
||||
|
||||
### Party Mode Implications
|
||||
|
||||
Party phases add context window usage:
|
||||
|
||||
| Phase | Additional Context Load |
|
||||
|-------|------------------------|
|
||||
| Kickoff | Agent personas (~500 tokens) + discussion instructions |
|
||||
| Party Review | 3x agent personas + cross-talk instructions |
|
||||
| Failure Analysis | Agent personas + failure context |
|
||||
| Retrospective | ALL story files aggregated (potentially large) |
|
||||
|
||||
**Mitigation strategies**:
|
||||
|
||||
1. **Selective loading**: Only include relevant agent personas, not full manifests
|
||||
2. **Summary injection**: For retro, summarize stories rather than full contents
|
||||
3. **Token budget configuration**: Allow a configurable maximum token count per phase
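One way the summary and budget ideas could fit together is sketched below. It is not part of the planned script: `estimate_tokens` and `inject_within_budget` are hypothetical helpers, and the four-bytes-per-token ratio is a coarse rule of thumb, not a real tokenizer.

```bash
# Rough size estimate: bytes divided by 4, rounded up.
estimate_tokens() {
    local bytes
    bytes="$(wc -c < "$1")"
    echo $(( (bytes + 3) / 4 ))
}

# Emit the file in full if it fits the budget; otherwise emit only its
# headings plus a marker noting that the body was omitted.
inject_within_budget() {
    local file="$1" max_tokens="$2"
    if [ "$(estimate_tokens "$file")" -le "$max_tokens" ]; then
        cat "$file"
    else
        grep -E '^#{1,3} ' "$file"
        echo "[body omitted: exceeds ${max_tokens}-token budget]"
    fi
}
```

For the retrospective, the shell could call `inject_within_budget` once per story file, so that a single oversized story degrades to an outline instead of crowding out the others.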
|
||||
|
||||
---
|
||||
|
||||
## Data Flow Summary
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ DATA FLOW SUMMARY │
|
||||
├─────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ PERSISTENT STORAGE (survives across contexts) │
|
||||
│ ├── Story files (.md) │
|
||||
│ │ ├── Original spec │
|
||||
│ │ ├── Kickoff Insights (written by Context 0) │
|
||||
│ │ ├── Dev Agent Record (written by Context A) │
|
||||
│ │ ├── Code Review Record (written by Context B) │
|
||||
│ │ └── Failure Analysis Record (written by Context F) │
|
||||
│ │ │
|
||||
│ ├── Git staging area │
|
||||
│ │ └── Code changes (written by Context A, read by Context B) │
|
||||
│ │ │
|
||||
│ ├── Git commits │
|
||||
│ │ └── Committed code (written by Shell after Context B passes) │
|
||||
│ │ │
|
||||
│ └── Output files │
|
||||
│ ├── docs/uat/epic-{id}-uat.md (written by Context C) │
|
||||
│ ├── docs/sprints/epic-{id}-retro.md (written by Context R) │
|
||||
│ └── docs/handoffs/epic-{id}-handoff.md (written by Context R) │
|
||||
│ │
|
||||
│ EPHEMERAL (exists only during context execution) │
|
||||
│ ├── Conversation history (per context, not shared) │
|
||||
│ ├── Tool call results (per context) │
|
||||
│ └── Working memory (per context) │
|
||||
│ │
|
||||
│ SHELL VARIABLES (orchestrator state) │
|
||||
│ ├── STORIES array │
|
||||
│ ├── COMPLETED/FAILED counters │
|
||||
│ ├── Flag states (PARTY_KICKOFF, etc.) │
|
||||
│ └── Current story pointer │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Design Principles
|
||||
|
||||
### 1. File System as the Bridge
|
||||
|
||||
All context transfer happens via the file system. This is intentional:
|
||||
- Git staging for code
|
||||
- Markdown files for documentation/records
|
||||
- No shared memory or conversation history
|
||||
|
||||
### 2. Append-Only Story Files
|
||||
|
||||
Each phase appends to the story file rather than replacing existing content, which creates an audit trail:
|
||||
```
|
||||
Story File
|
||||
├── Original Spec (created during planning)
|
||||
├── Kickoff Insights (appended by kickoff party)
|
||||
├── Dev Agent Record (appended by dev phase)
|
||||
├── Code Review Record (appended by review phase)
|
||||
└── Failure Analysis (appended if failure occurred)
|
||||
```
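The append-only rule can be illustrated with a short helper. In the design the phase contexts write their own records, so this is only a sketch of the semantics: `append_story_section` is a hypothetical name, and refusing duplicate headings is an assumed policy.

```bash
# Append a record section to a story file without touching earlier content.
append_story_section() {
    local story_file="$1" heading="$2" body="$3"

    # Refuse a heading that already exists so reruns do not duplicate records.
    if grep -qxF "## $heading" "$story_file"; then
        echo "section already present: $heading" >&2
        return 1
    fi

    # '>>' only ever adds to the end of the file.
    printf '\n## %s\n\n%s\n' "$heading" "$body" >> "$story_file"
}
```

For example, `append_story_section "$story_file" "Dev Agent Record" "$record"` would add the section once and report an error on a second attempt.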
|
||||
|
||||
### 3. Signals for Flow Control
|
||||
|
||||
Each context outputs a specific signal that the shell parses:
|
||||
- `KICKOFF COMPLETE: story-id`
|
||||
- `IMPLEMENTATION COMPLETE: story-id`
|
||||
- `IMPLEMENTATION BLOCKED: story-id - reason`
|
||||
- `REVIEW PASSED: story-id`
|
||||
- `REVIEW PASSED WITH FIXES: story-id - Fixed N issues`
|
||||
- `REVIEW FAILED: story-id - reason`
|
||||
- `ANALYSIS COMPLETE: story-id - Retry|Skip|Escalate`
|
||||
- `RETRO COMPLETE: Epic epic-id`
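A dispatcher over the signals listed above might look like the following sketch. The function name `parse_signal` and its `<phase> <status> <detail>` output are assumptions. The sketch also assumes each signal starts its own line and that the last signal in the output is the one that counts.

```bash
# Map the last recognized signal line to "<phase> <status> <detail>".
parse_signal() {
    local line
    line="$(printf '%s\n' "$1" \
        | grep -E '^(KICKOFF|IMPLEMENTATION|REVIEW|ANALYSIS|RETRO) [A-Z ]+: ' \
        | tail -n 1)"
    [ -n "$line" ] || { echo "none unknown"; return 1; }

    # ${line#*: } strips everything up to and including the first ": ".
    case "$line" in
        "KICKOFF COMPLETE: "*)         echo "kickoff complete ${line#*: }" ;;
        "IMPLEMENTATION COMPLETE: "*)  echo "dev complete ${line#*: }" ;;
        "IMPLEMENTATION BLOCKED: "*)   echo "dev blocked ${line#*: }" ;;
        "REVIEW PASSED WITH FIXES: "*) echo "review passed-with-fixes ${line#*: }" ;;
        "REVIEW PASSED: "*)            echo "review passed ${line#*: }" ;;
        "REVIEW FAILED: "*)            echo "review failed ${line#*: }" ;;
        "ANALYSIS COMPLETE: "*)        echo "analysis complete ${line#*: }" ;;
        "RETRO COMPLETE: "*)           echo "retro complete ${line#*: }" ;;
        *)                             echo "none unknown"; return 1 ;;
    esac
}
```

Including the trailing `: ` in each pattern keeps `REVIEW PASSED` from being confused with `REVIEW PASSED WITH FIXES`.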
|
||||
|
||||
### 4. Non-Blocking Optional Phases
|
||||
|
||||
Party phases are designed to be non-blocking:
|
||||
- Kickoff failure → Continue to dev (insights are helpful but not required)
|
||||
- Failure analysis → Informational (its recommendation is recorded but doesn't change the retry/skip decision)
|
||||
- Retro failure → Log warning, epic still considered complete
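The non-blocking behaviour can be expressed as a thin wrapper. This is a sketch only: `run_optional_phase` is a hypothetical name, and the command it wraps stands in for whichever party phase is being run.

```bash
# Run an optional party phase so that its failure never aborts the story.
run_optional_phase() {
    local name="$1"
    shift

    if "$@"; then
        echo "[ok] $name"
    else
        # Downgrade the failure to a warning and report success to the caller.
        echo "[warn] $name failed (exit $?); continuing" >&2
    fi
    return 0
}
```

Because the wrapper always returns 0, it stays safe to call even if the surrounding script were to run under `set -e`.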
|
||||
|
||||
---
|
||||
|
||||
## Testing Context Isolation
|
||||
|
||||
To verify context isolation is maintained:
|
||||
|
||||
```bash
|
||||
# Test 1: Verify dev context doesn't see review instructions
|
||||
./epic-execute.sh test-epic --dry-run --verbose 2>&1 | grep -A 50 "DEV PHASE"
|
||||
# Should NOT contain "Code Review" or "severity" language
|
||||
|
||||
# Test 2: Verify review context doesn't see dev struggles
|
||||
./epic-execute.sh test-epic --dry-run --verbose 2>&1 | grep -A 50 "REVIEW PHASE"
|
||||
# Should contain "You are seeing this code for the first time"
|
||||
|
||||
# Test 3: Verify party kickoff writes to story file
|
||||
./epic-execute.sh test-epic --party-kickoff --dry-run
|
||||
grep "Story Kickoff Insights" docs/stories/test-story.md
|
||||
# Should find the section
|
||||
```
|
||||
File diff suppressed because it is too large
|
|
@ -0,0 +1,137 @@
|
|||
# Party Mode Integration with Epic Execute
|
||||
|
||||
**Status**: Planning
|
||||
**Version**: 1.0.0
|
||||
**Date**: 2026-01-03
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This folder contains the complete documentation for integrating Party Mode's multi-agent collaboration capabilities into the Epic Execute and Epic Chain workflows.
|
||||
|
||||
## Documents
|
||||
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [01-implementation-plan.md](./01-implementation-plan.md) | High-level implementation plan with CLI flags, configuration, and phases |
|
||||
| [02-context-management.md](./02-context-management.md) | Deep dive into context isolation architecture and data transfer mechanisms |
|
||||
| [03-file-modifications.md](./03-file-modifications.md) | Detailed specification of all file changes required |
|
||||
| [04-prompt-engineering.md](./04-prompt-engineering.md) | Prompt templates for each party phase (future) |
|
||||
|
||||
## Quick Links
|
||||
|
||||
- **Why Party Mode?** See [Benefits Analysis](#benefits-of-integration)
|
||||
- **How does context work?** See [02-context-management.md](./02-context-management.md)
|
||||
- **What files change?** See [03-file-modifications.md](./03-file-modifications.md)
|
||||
|
||||
---
|
||||
|
||||
## Benefits of Integration
|
||||
|
||||
### Current Pain Points
|
||||
|
||||
| Pain Point | Impact |
|
||||
|------------|--------|
|
||||
| Architectural issues found mid-implementation | Costly rework, context loss |
|
||||
| Single-perspective code review | Missed issues in blind spots |
|
||||
| Shallow context handoffs in epic-chain | Next epic starts without learnings |
|
||||
| Silent failures with only logged errors | No actionable remediation guidance |
|
||||
|
||||
### Party Mode Solutions
|
||||
|
||||
| Party Phase | Addresses |
|
||||
|-------------|-----------|
|
||||
| **Story Kickoff Party** | Surfaces architectural/implementation/testing concerns before coding |
|
||||
| **Party Review** | Multi-perspective code review catches more issue categories |
|
||||
| **Failure Analysis Party** | Root cause analysis with actionable remediation |
|
||||
| **Post-Epic Retrospective** | Rich context handoffs, documented patterns and lessons |
|
||||
|
||||
---
|
||||
|
||||
## Architecture Summary
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────┐
|
||||
│ ENHANCED EPIC EXECUTE FLOW │
|
||||
├─────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Per Story: │
|
||||
│ ┌──────────────┐ │
|
||||
│ │ PARTY: │ ← --party-kickoff (optional) │
|
||||
│ │ Kickoff │ Agents: Winston + Amelia + Murat │
|
||||
│ └──────┬───────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ Dev Phase │───►│ Review Phase │───►│ Commit │ │
|
||||
│ │ (Context A) │ │ Standard OR │ │ (Shell) │ │
|
||||
│ │ │ │ PARTY Review │ │ │ │
|
||||
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ (on failure) │
|
||||
│ ┌──────────────┐ │
|
||||
│ │ PARTY: │ ← --party-failure (optional) │
|
||||
│ │ Failure │ │
|
||||
│ │ Analysis │ │
|
||||
│ └──────────────┘ │
|
||||
│ │
|
||||
│ Post-Epic: │
|
||||
│ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ UAT │ │ PARTY: │ ← --party-retro (optional) │
|
||||
│ │ Generation │ │ Retrospective│ │
|
||||
│ └──────────────┘ └──────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## CLI Usage
|
||||
|
||||
```bash
|
||||
# Enable individual party phases
|
||||
./epic-execute.sh 42 --party-kickoff
|
||||
./epic-execute.sh 42 --party-review
|
||||
./epic-execute.sh 42 --party-failure
|
||||
./epic-execute.sh 42 --party-retro
|
||||
|
||||
# Enable all party phases
|
||||
./epic-execute.sh 42 --party-all
|
||||
|
||||
# Custom agents for a phase
|
||||
./epic-execute.sh 42 --party-review --party-agents "Winston,Murat"
|
||||
|
||||
# Combine with existing flags
|
||||
./epic-execute.sh 42 --party-all --skip-done --verbose
|
||||
```
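The flags above could be parsed along these lines. The variable names `PARTY_KICKOFF` and `PARTY_FAILURE` appear in the context management doc. `PARTY_REVIEW`, `PARTY_RETRO`, `PARTY_AGENTS`, and the function `parse_party_flags` are assumed names that follow the same pattern.

```bash
# Defaults: every party phase is opt-in.
PARTY_KICKOFF=false
PARTY_REVIEW=false
PARTY_FAILURE=false
PARTY_RETRO=false
PARTY_AGENTS=""

# Set the party variables from the command line; unrelated arguments
# (epic id, --verbose, ...) are ignored here and handled elsewhere.
parse_party_flags() {
    while [ $# -gt 0 ]; do
        case "$1" in
            --party-kickoff) PARTY_KICKOFF=true ;;
            --party-review)  PARTY_REVIEW=true ;;
            --party-failure) PARTY_FAILURE=true ;;
            --party-retro)   PARTY_RETRO=true ;;
            --party-all)
                PARTY_KICKOFF=true; PARTY_REVIEW=true
                PARTY_FAILURE=true; PARTY_RETRO=true ;;
            --party-agents)
                [ $# -ge 2 ] || { echo "--party-agents needs a value" >&2; return 1; }
                PARTY_AGENTS="$2"; shift ;;
            *) : ;;
        esac
        shift
    done
}
```

Defaulting every variable to `false` matches the opt-in decision recorded in the decision log: a run without party flags behaves as it does today.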
|
||||
|
||||
---
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
| Phase | Priority | Scope |
|
||||
|-------|----------|-------|
|
||||
| **Phase 1** | High | CLI flags + Story Kickoff Party |
|
||||
| **Phase 2** | High | Party Review (multi-agent code review) |
|
||||
| **Phase 3** | Medium | Failure Analysis Party |
|
||||
| **Phase 4** | Medium | Retrospective + epic-chain handoff integration |
|
||||
| **Phase 5** | Low | Polish (TTS, metrics, comprehensive docs) |
|
||||
|
||||
---
|
||||
|
||||
## Related Documents
|
||||
|
||||
- [Epic Workflows v1 Improvements](../epic-workflows-v1.md) - General workflow improvements
|
||||
- [Party Mode Core Docs](../../../src/core/workflows/party-mode/workflow.md) - Party Mode workflow definition
|
||||
- [Epic Execute Workflow](../../../src/modules/bmm/workflows/4-implementation/epic-execute/workflow.md) - Current workflow
|
||||
|
||||
---
|
||||
|
||||
## Decision Log
|
||||
|
||||
| Date | Decision | Rationale |
|
||||
|------|----------|-----------|
|
||||
| 2026-01-03 | Option B: Configurable flags | Opt-in approach preserves existing behavior, allows gradual adoption |
|
||||
| 2026-01-03 | File-based context transfer | Maintains context isolation while enabling information flow |
|
||||
| 2026-01-03 | Non-blocking kickoff | Kickoff insights are helpful but not critical path |
|
||||
|
|
@ -0,0 +1,709 @@
|
|||
# UAT Workflow Implementation Plan
|
||||
|
||||
**Date:** 2026-01-05
|
||||
**Source:** `docs/improvements/uat-workflow-implementation-gaps.md`
|
||||
**Scope:** All gaps (P0, P1, P2)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Close all gaps identified in the UAT workflow implementation gaps analysis so that the UAT validation workflow and epic chain report generator are production-ready.
|
||||
|
||||
**Current State:** Workflow definitions, templates, and agent triggers exist
|
||||
**Missing:** Shell orchestration, metrics collection, step files, and integration points
|
||||
|
||||
---
|
||||
|
||||
## Files to Create
|
||||
|
||||
| File | Priority | Lines (est) |
|
||||
|------|----------|-------------|
|
||||
| `scripts/uat-validate.sh` | P0 | ~400 |
|
||||
| `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-01-load-uat.md` | P2 | ~60 |
|
||||
| `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-02-classify-scenarios.md` | P2 | ~50 |
|
||||
| `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-03-execute-scenarios.md` | P2 | ~70 |
|
||||
| `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-04-evaluate-gate.md` | P2 | ~60 |
|
||||
| `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-05-report-results.md` | P2 | ~50 |
|
||||
|
||||
## Files to Modify
|
||||
|
||||
| File | Priority | Changes |
|
||||
|------|----------|---------|
|
||||
| `scripts/epic-execute.sh` | P0 | Add metrics collection (~60 lines) |
|
||||
| `scripts/epic-chain.sh` | P1 | Add UAT gate + report generation (~100 lines) |
|
||||
|
||||
---
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Step 1: Create `scripts/uat-validate.sh` [P0]
|
||||
|
||||
**Purpose:** Shell orchestration for UAT validation with self-healing loop
|
||||
|
||||
**Structure (following epic-execute.sh patterns):**
|
||||
|
||||
```
|
||||
Section 1: Configuration (lines 1-50)
|
||||
- Script/project paths
|
||||
- Color codes
|
||||
- Default values: UAT_GATE_MODE=quick, MAX_RETRIES=2
|
||||
|
||||
Section 2: Helper Functions (lines 51-90)
|
||||
- log(), log_success(), log_error(), log_warn()
|
||||
- Log to /tmp/bmad-uat-validate-$$.log
|
||||
|
||||
Section 3: Argument Parsing (lines 91-140)
|
||||
- Required: <epic_id>
|
||||
- Flags: --gate-mode=quick|full|skip, --max-retries=N, --skip-manual, --verbose, --dry-run
|
||||
|
||||
Section 4: UAT Document Loading (lines 141-180)
|
||||
- Find: docs/uat/epic-{id}-uat.md
|
||||
- Parse scenario blocks
|
||||
|
||||
Section 5: Scenario Classification (lines 181-230)
|
||||
- Automatable: contains npx, npm run, curl, pytest, etc.
|
||||
- Semi-auto: requires setup then command
|
||||
- Manual: no detectable command
|
||||
|
||||
Section 6: Scenario Execution (lines 231-280)
|
||||
- Execute automatable scenarios with timeout
|
||||
- Capture exit code + output
|
||||
- Record pass/fail results
|
||||
|
||||
Section 7: Gate Evaluation (lines 281-320)
|
||||
- If all passed: output UAT_GATE_RESULT: PASS, exit 0
|
||||
- If failed: generate fix context, attempt self-healing
|
||||
|
||||
Section 8: Self-Healing Loop (lines 321-380)
|
||||
- Generate fix context doc at: docs/sprint-artifacts/uat-fixes/epic-{id}-fix-context-{attempt}.md
|
||||
- Spawn fresh Claude for quick-dev fixes
|
||||
- Re-validate in new iteration
|
||||
- Loop until pass or max_retries
|
||||
|
||||
Section 9: Output Signals (lines 381-400)
|
||||
- UAT_GATE_RESULT: PASS|FAIL
|
||||
- UAT_FIX_ATTEMPTS: N
|
||||
- UAT_SCENARIOS_PASSED: X/Y
|
||||
- Exit codes: 0=pass, 1=fail-fixable, 2=max-retries-exceeded
|
||||
```
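The control flow of Sections 7 and 8 might be sketched as the loop below. The hooks `validate_cmd` and `fix_cmd` are hypothetical stand-ins for scenario execution and the fresh quick-dev session. The sketch covers only exit codes 0 and 2, because the plan does not spell out when exit code 1 applies.

```bash
# Validate, attempt a fix on failure, and re-validate until the gate
# passes or the retry budget is spent.
self_healing_loop() {
    local validate_cmd="$1" fix_cmd="$2" max_retries="${3:-2}"
    local attempt=0

    while :; do
        if "$validate_cmd"; then
            echo "UAT_GATE_RESULT: PASS"
            echo "UAT_FIX_ATTEMPTS: $attempt"
            return 0
        fi
        if [ "$attempt" -ge "$max_retries" ]; then
            echo "UAT_GATE_RESULT: FAIL"
            echo "UAT_FIX_ATTEMPTS: $attempt"
            return 2
        fi
        attempt=$((attempt + 1))
        # A failed fix attempt is not fatal; the next validation decides.
        "$fix_cmd" "$attempt" || true
    done
}
```

With the default of two retries, a gate that never passes is validated three times in total: once initially and once after each fix attempt.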
|
||||
|
||||
**Key Functions:**
|
||||
|
||||
```bash
|
||||
load_uat_document() # Parse UAT doc, extract scenarios
|
||||
classify_scenario() # Return: automatable|semi-auto|manual
|
||||
execute_scenario() # Run command, capture result
|
||||
evaluate_gate() # Determine pass/fail
|
||||
generate_fix_context() # Create fix context doc from template
|
||||
run_quick_dev_fix() # Spawn fresh Claude session for fixes
|
||||
```
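As an illustration of `classify_scenario()`, the sketch below applies the rules from Section 5. The command list comes from the plan. Treating a leading `Setup:` line as the marker for semi-automated scenarios is an assumption, since the plan does not define how setup steps are detected.

```bash
# Classify a scenario block as automatable, semi-auto, or manual.
classify_scenario() {
    local block="$1"

    # No recognizable command anywhere in the block: manual.
    if ! printf '%s\n' "$block" \
        | grep -qE '(^|[^[:alnum:]_-])(npx|npm run|curl|pytest)( |$)'; then
        echo "manual"
    # A command plus an explicit setup step: semi-automated.
    elif printf '%s\n' "$block" | grep -qiE '^ *setup:'; then
        echo "semi-auto"
    else
        echo "automatable"
    fi
}
```

The word-boundary guard around the command names avoids matching them inside longer words.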
|
||||
|
||||
**Interface:**
|
||||
|
||||
```bash
|
||||
# Usage
|
||||
./scripts/uat-validate.sh <epic_id> [options]
|
||||
|
||||
# Options
|
||||
--gate-mode=quick|full|skip # Which scenarios to run
|
||||
--max-retries=2 # Fix attempts before halt
|
||||
--skip-manual # Skip manual-only scenarios
|
||||
--verbose # Detailed output
|
||||
--dry-run # Show what would run
|
||||
|
||||
# Output signals (for epic-chain.sh to parse)
|
||||
UAT_GATE_RESULT: PASS|FAIL
|
||||
UAT_FIX_ATTEMPTS: N
|
||||
UAT_SCENARIOS_PASSED: X/Y
|
||||
```
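
The interface above could be parsed along the following lines. This is a minimal sketch, not the planned implementation: the function name `parse_uat_args` and the variable names are assumptions, and usage errors return 64 (the `EX_USAGE` convention) only to avoid colliding with the documented exit codes 0 to 2.

```bash
#!/usr/bin/env bash
# Hypothetical option parser for uat-validate.sh.
# Defaults mirror the interface listing; all names here are illustrative.
parse_uat_args() {
    EPIC_ID=""
    GATE_MODE="quick"
    MAX_RETRIES=2
    SKIP_MANUAL=false
    VERBOSE=false
    DRY_RUN=false

    while [ $# -gt 0 ]; do
        case "$1" in
            --gate-mode=*)
                GATE_MODE="${1#*=}"
                case "$GATE_MODE" in
                    quick|full|skip) ;;
                    *) echo "Invalid --gate-mode: $GATE_MODE" >&2; return 64 ;;
                esac
                ;;
            --max-retries=*)
                MAX_RETRIES="${1#*=}"
                case "$MAX_RETRIES" in
                    ''|*[!0-9]*) echo "Invalid --max-retries: $MAX_RETRIES" >&2; return 64 ;;
                esac
                ;;
            --skip-manual) SKIP_MANUAL=true ;;
            --verbose)     VERBOSE=true ;;
            --dry-run)     DRY_RUN=true ;;
            -*) echo "Unknown option: $1" >&2; return 64 ;;
            *)
                # First positional argument is the epic id; reject extras.
                if [ -z "$EPIC_ID" ]; then
                    EPIC_ID="$1"
                else
                    echo "Unexpected argument: $1" >&2; return 64
                fi
                ;;
        esac
        shift
    done

    if [ -z "$EPIC_ID" ]; then
        echo "Usage: uat-validate.sh <epic_id> [options]" >&2
        return 64
    fi
}
```

Validating `--gate-mode` and `--max-retries` at parse time keeps a typo from silently falling through to a default later in the script.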

---

### Step 2: Add Metrics Instrumentation to `scripts/epic-execute.sh` [P0]

**Location:** Modify existing file

**Add at epic start (~line 145, after directory setup):**

```bash
# Initialize metrics file
METRICS_DIR="$SPRINT_ARTIFACTS_DIR/metrics"
METRICS_FILE="$METRICS_DIR/epic-${EPIC_ID}-metrics.yaml"
mkdir -p "$METRICS_DIR"
EPIC_START_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
EPIC_START_SECONDS=$(date +%s)

cat > "$METRICS_FILE" << EOF
epic_id: "$EPIC_ID"
execution:
  start_time: "$EPIC_START_TIME"
  end_time: ""
  duration_seconds: 0
stories:
  total: 0
  completed: 0
  failed: 0
  skipped: 0
validation:
  gate_executed: false
  gate_status: "PENDING"
  fix_attempts: 0
issues: []
EOF
```

**Add helper function (~line 65):**

```bash
update_story_metrics() {
    local status="$1"  # completed|failed|skipped
    case "$status" in
        completed) yq -i '.stories.completed += 1' "$METRICS_FILE" ;;
        failed)    yq -i '.stories.failed += 1' "$METRICS_FILE" ;;
        skipped)   yq -i '.stories.skipped += 1' "$METRICS_FILE" ;;
    esac
}
```
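
The Dependencies section lists `yq` as recommended with a fallback, while the helper above calls `yq` unconditionally. One possible fallback is sketched below. It is an assumption, not part of the plan: the function name `increment_story_counter` is hypothetical, and it only works if the metrics file keeps the simple layout written at epic start, with each counter on its own indented `name: <integer>` line under `stories:`.

```bash
#!/usr/bin/env bash
# Hypothetical yq-free fallback: bump one integer counter under "stories:".
# Assumes a flat, two-level YAML file with one "key: <int>" pair per line.
increment_story_counter() {
    local key="$1" file="$2" tmp
    tmp="$(mktemp)" || return 1
    awk -v key="$key" '
        # Track which top-level section the current line belongs to.
        /^[^[:space:]]/ { in_stories = ($0 ~ /^stories:/) }
        # Inside "stories:", replace the trailing integer of the matching key.
        in_stories && $1 == (key ":") { sub(/[0-9]+$/, $2 + 1) }
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy back instead of mv so the original file permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

A caller could test `command -v yq` once at startup and route `update_story_metrics` through this function only when `yq` is missing.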

**Call after each story completion in main loop:**

```bash
update_story_metrics "completed"  # or failed/skipped based on result
```

**Add at epic end (~line 400, before summary):**

```bash
# Finalize metrics
EPIC_END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
DURATION=$(($(date +%s) - EPIC_START_SECONDS))
yq -i ".execution.end_time = \"$EPIC_END_TIME\"" "$METRICS_FILE"
yq -i ".execution.duration_seconds = $DURATION" "$METRICS_FILE"
yq -i ".stories.total = ${#STORIES[@]}" "$METRICS_FILE"
```

---

### Step 3: Integrate UAT Gate into `scripts/epic-chain.sh` [P1]

**Add configuration variables (~line 42):**

```bash
# UAT Gate Configuration
UAT_GATE_ENABLED="${UAT_GATE_ENABLED:-true}"
UAT_GATE_MODE="${UAT_GATE_MODE:-quick}"
UAT_MAX_RETRIES="${UAT_MAX_RETRIES:-2}"
UAT_BLOCKING="${UAT_BLOCKING:-false}"
```

**Add CLI flags (~line 120):**

```bash
--uat-gate=*)
    UAT_GATE_MODE="${1#*=}"
    shift
    ;;
--uat-blocking)
    UAT_BLOCKING=true
    shift
    ;;
--no-uat)
    UAT_GATE_ENABLED=false
    shift
    ;;
```

**Add UAT gate phase after epic completion (~line 320, after epic-execute succeeds):**

```bash
# Run UAT validation if enabled
if [ "$UAT_GATE_ENABLED" = true ]; then
    log_section "UAT Validation Gate: Epic $epic_id"

    # METRICS_FILE is set in epic-execute.sh (Step 2), not in this script,
    # so derive the per-epic metrics path here using the same layout.
    metrics_file="$SPRINT_ARTIFACTS_DIR/metrics/epic-${epic_id}-metrics.yaml"

    # "|| true": a failing gate must not abort the chain at this point;
    # the result is read from the output signal below.
    uat_result=$("$SCRIPT_DIR/uat-validate.sh" "$epic_id" \
        --gate-mode="$UAT_GATE_MODE" \
        --max-retries="$UAT_MAX_RETRIES" 2>&1) || true

    # Parse result
    if echo "$uat_result" | grep -q "UAT_GATE_RESULT: PASS"; then
        log_success "UAT validation passed for Epic $epic_id"
        # Update metrics
        yq -i '.validation.gate_executed = true' "$metrics_file"
        yq -i '.validation.gate_status = "PASS"' "$metrics_file"
    else
        log_error "UAT validation failed for Epic $epic_id"
        yq -i '.validation.gate_executed = true' "$metrics_file"
        yq -i '.validation.gate_status = "FAIL"' "$metrics_file"

        if [ "$UAT_BLOCKING" = true ]; then
            log_error "UAT blocking enabled - halting chain"
            exit 1
        else
            log_warn "UAT blocking disabled - continuing to next epic"
        fi
    fi
fi
```
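
The gate above only inspects `UAT_GATE_RESULT`. If the chain also wants `UAT_FIX_ATTEMPTS` and `UAT_SCENARIOS_PASSED` (for example to record them in metrics), a small extractor along these lines could be used. This is a sketch under assumptions: `parse_uat_signal` is a hypothetical helper name, and it assumes each signal is printed at the start of its own line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of a "KEY: value" signal line.
# If a signal appears more than once (e.g. once per retry), the last one wins.
parse_uat_signal() {
    local output="$1" key="$2"
    printf '%s\n' "$output" | sed -n "s/^${key}: *//p" | tail -n 1
}

# Example use inside the gate phase (variable names are illustrative):
#   fix_attempts=$(parse_uat_signal "$uat_result" "UAT_FIX_ATTEMPTS")
#   passed=$(parse_uat_signal "$uat_result" "UAT_SCENARIOS_PASSED")
```

Anchoring the match to the start of the line is stricter than a plain substring `grep`, which would also match a signal quoted in the middle of other captured output.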

---

### Step 4: Integrate Report Generation into `scripts/epic-chain.sh` [P1]

**Add configuration (~line 45):**

```bash
GENERATE_REPORT="${GENERATE_REPORT:-true}"
CHAIN_REPORT_FILE="$SPRINT_ARTIFACTS_DIR/chain-execution-report.md"
METRICS_DIR="$SPRINT_ARTIFACTS_DIR/metrics"
```

**Add CLI flag (~line 130):**

```bash
--no-report)
    GENERATE_REPORT=false
    shift
    ;;
```

**Add report generation after all epics (~line 400, before final summary):**

```bash
# Generate chain execution report
if [ "$GENERATE_REPORT" = true ] && [ "$DRY_RUN" = false ]; then
    log_section "Generating Chain Execution Report"

    INSTALLED_PATH="$BMAD_DIR/bmm/workflows/4-implementation/epic-chain"

    report_prompt="You are Bob, the Scrum Master.

Execute the chain report generation workflow:
- Step file: $INSTALLED_PATH/steps/step-10-generate-report.md
- Metrics folder: $METRICS_DIR
- Chain plan: $CHAIN_PLAN_FILE
- Output to: $CHAIN_REPORT_FILE

Generate the complete execution report."

    claude --dangerously-skip-permissions -p "$report_prompt" || true

    if [ -f "$CHAIN_REPORT_FILE" ]; then
        log_success "Report generated: $CHAIN_REPORT_FILE"
        git add "$CHAIN_REPORT_FILE" 2>/dev/null || true
    else
        log_warn "Report generation did not produce output"
    fi
fi
```

---

### Step 5: Create UAT Validation Step Files [P2]

**Directory:** `src/modules/bmm/workflows/5-validation/uat-validate/steps/`

#### step-01-load-uat.md

```markdown
# Step 1: Load UAT Document

## Purpose
Load and validate the UAT document for the specified epic.

## Inputs
| Input | Source | Required |
|-------|--------|----------|
| epic_id | CLI argument | Yes |
| uat_dir | Configuration | Yes |

## Process

### 1.1 Locate UAT Document
Search for UAT document at: `{uat_dir}/epic-{epic_id}-uat.md`

### 1.2 Validate Structure
Confirm document contains:
- `## Acceptance Criteria` or `## Scenarios` section
- At least one scenario block

### 1.3 Parse Scenarios
Extract scenario blocks with:
- Scenario ID/Title
- Given/When/Then steps
- Verification command (if present)
- Expected result

## Outputs
| Output | Location | Description |
|--------|----------|-------------|
| scenario_list | Memory | Parsed scenario objects |
| scenario_count | Console | Number of scenarios found |

## Completion Signal
UAT_LOADED: {scenario_count}

## Error Handling
| Error | Action |
|-------|--------|
| File not found | Exit 1 with clear message |
| Invalid structure | Exit 1 with parsing error |
```
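
The locate and validate steps (1.1 and 1.2) might look roughly like this in `uat-validate.sh`. Treat it as a sketch: the plan does not define how a scenario block is delimited, so the `### Scenario` heading pattern is purely an assumption, as is the function name `validate_uat_document`.

```bash
#!/usr/bin/env bash
# Hypothetical structure check for a UAT document.
# ASSUMPTION: each scenario block starts with a "### Scenario" heading.
validate_uat_document() {
    local file="$1" count

    if [ ! -f "$file" ]; then
        echo "UAT document not found: $file" >&2
        return 1
    fi
    if ! grep -Eq '^## (Acceptance Criteria|Scenarios)' "$file"; then
        echo "No '## Acceptance Criteria' or '## Scenarios' section in $file" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is 0, so tolerate that status.
    count="$(grep -Ec '^### Scenario' "$file" || true)"
    if [ "$count" -eq 0 ]; then
        echo "No scenario blocks found in $file" >&2
        return 1
    fi
    echo "UAT_LOADED: $count"
}
```

Both failure modes from the Error Handling table return 1 with a message on stderr, so stdout carries only the completion signal.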

#### step-02-classify-scenarios.md

```markdown
# Step 2: Classify Scenarios

## Purpose
Categorize scenarios by their executability level.

## Inputs
| Input | Source | Required |
|-------|--------|----------|
| scenario_list | Step 1 | Yes |

## Process

### 2.1 Detect Automatable Scenarios
Keywords that indicate automatability:
- npx, npm run, yarn
- curl, wget, http
- pytest, jest, vitest
- /health, /api/
- exit code, returns

### 2.2 Detect Semi-Automated
Scenarios with commands that require setup:
- "Start the server first"
- "Ensure database is running"
- Manual setup + automated verification

### 2.3 Classify as Manual
No detectable command or automation path.

## Outputs
| Output | Location | Description |
|--------|----------|-------------|
| automatable | Array | Scenarios to execute |
| semi_auto | Array | Scenarios needing setup |
| manual | Array | Human verification required |

## Completion Signal
SCENARIOS_CLASSIFIED: {auto}/{semi}/{manual}
```
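
The classification rules could be approximated with keyword matching, as in the sketch below. The matching rules are assumptions rather than the planned logic: only the command-like keywords from 2.1 are checked (the looser hints such as `http` or `returns` are left out to reduce false positives), and the setup phrases are a small illustrative set.

```bash
#!/usr/bin/env bash
# Hypothetical classify_scenario(): prints automatable, semi-auto, or manual
# for the scenario text passed as $1. Keyword lists are illustrative.
classify_scenario() {
    local text="$1"
    # Command keywords must appear as whole words (so "majestic" != "jest").
    local cmd_re='(^|[^[:alnum:]_])(npx|npm run|yarn|curl|wget|pytest|jest|vitest)([^[:alnum:]_]|$)'
    # Phrases suggesting manual setup is needed before the command can run.
    local setup_re='start the server|ensure .* is running|manual setup'

    if ! printf '%s\n' "$text" | grep -Eiq "$cmd_re"; then
        echo "manual"
    elif printf '%s\n' "$text" | grep -Eiq "$setup_re"; then
        echo "semi-auto"
    else
        echo "automatable"
    fi
}
```

Checking for a command first means a scenario that mentions setup but has no runnable command still falls through to `manual`, which matches 2.3.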

#### step-03-execute-scenarios.md

```markdown
# Step 3: Execute Scenarios

## Purpose
Run automatable scenarios via shell commands.

## Inputs
| Input | Source | Required |
|-------|--------|----------|
| automatable | Step 2 | Yes |
| gate_mode | CLI | Yes (quick/full/skip) |
| timeout | Config | No (default: 30s) |

## Process

### 3.1 Filter by Gate Mode
- quick: Execute only critical/blocking scenarios
- full: Execute all automatable scenarios
- skip: Return success without execution

### 3.2 Execute Each Scenario
For each scenario:
1. Extract command from verification step
2. Execute with timeout: `timeout {seconds} {command}`
3. Capture exit code and output
4. Record result: PASS (exit 0) or FAIL (exit non-zero)

### 3.3 Handle Execution Errors
- Command not found: Record as FAIL with clear message
- Timeout exceeded: Record as FAIL with timeout note
- Unexpected error: Record as FAIL with stderr

## Outputs
| Output | Location | Description |
|--------|----------|-------------|
| results | Array | {scenario_id, status, output, exit_code} |
| passed_count | Console | Scenarios that passed |
| failed_count | Console | Scenarios that failed |

## Completion Signal
SCENARIOS_EXECUTED: {passed}/{total}
```
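
Steps 3.2 and 3.3 can be sketched as a single function. This is an illustration under assumptions: the globals `SCENARIO_STATUS` and `SCENARIO_OUTPUT` are hypothetical names, and it relies on GNU coreutils `timeout`, which exits with 124 when the time limit is hit, and on the shell's exit status 127 for a command that cannot be found.

```bash
#!/usr/bin/env bash
# Hypothetical execute_scenario(): runs one verification command with a
# time limit, stores the outcome in two globals, and returns the exit code.
execute_scenario() {
    local cmd="$1" secs="${2:-30}" rc=0

    # Run through bash -c so pipelines and quoting in the command still work.
    SCENARIO_OUTPUT="$(timeout "$secs" bash -c "$cmd" 2>&1)" || rc=$?

    case "$rc" in
        0)   SCENARIO_STATUS="PASS" ;;
        124) SCENARIO_STATUS="FAIL: timed out after ${secs}s" ;;
        127) SCENARIO_STATUS="FAIL: command not found" ;;
        *)   SCENARIO_STATUS="FAIL: exit code $rc" ;;
    esac
    return "$rc"
}
```

Call it directly rather than inside `$(...)`, since a command substitution runs in a subshell and the two globals would be lost.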

#### step-04-evaluate-gate.md

````markdown
# Step 4: Evaluate Gate

## Purpose
Determine pass/fail status and trigger self-healing if needed.

## Inputs
| Input | Source | Required |
|-------|--------|----------|
| results | Step 3 | Yes |
| max_retries | CLI | Yes |
| current_attempt | State | Yes |

## Process

### 4.1 Check All Results
If all automatable scenarios passed:
- Set gate_status = PASS
- Skip to Step 5

### 4.2 Handle Failures
If any scenario failed:
- Collect failed scenario details
- Check if current_attempt < max_retries
- If no retries remain, set gate_status = FAIL and skip to Step 5

### 4.3 Generate Fix Context
If retries available:
1. Load fix context template
2. Populate with failed scenarios
3. Write to: `{sprint_artifacts}/uat-fixes/epic-{id}-fix-context-{attempt}.md`

### 4.4 Trigger Quick-Dev Fix
Spawn fresh Claude session:
```
claude --dangerously-skip-permissions -p "Load fix context, implement fixes..."
```

### 4.5 Increment and Retry
- Increment attempt counter
- Return to Step 3 for re-validation

## Outputs
| Output | Location | Description |
|--------|----------|-------------|
| gate_status | State | PASS or FAIL |
| fix_context_file | Path | Generated fix context (if failed) |

## Completion Signal
GATE_EVALUATED: PASS|FAIL
````
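
The evaluate, fix, and retry cycle reduces to a bounded loop. The sketch below shows only the control flow: `run_scenarios` and `attempt_fix` are placeholder callbacks (not names from the plan) that stand in for Step 3 and for the fix-context plus quick-dev session, and the return codes follow the documented 0 and 2.

```bash
#!/usr/bin/env bash
# Hypothetical gate loop. Expects two functions to be defined by the caller:
#   run_scenarios  -> returns 0 only if every automatable scenario passed
#   attempt_fix N  -> generates fix context and runs the fix session
run_gate_with_retries() {
    local max_retries="$1" attempt=0

    while true; do
        if run_scenarios; then
            echo "UAT_GATE_RESULT: PASS"
            echo "UAT_FIX_ATTEMPTS: $attempt"
            return 0
        fi
        if [ "$attempt" -ge "$max_retries" ]; then
            echo "UAT_GATE_RESULT: FAIL"
            echo "UAT_FIX_ATTEMPTS: $attempt"
            return 2   # max retries exceeded
        fi
        attempt=$((attempt + 1))
        # A fix session that fails still consumes an attempt.
        attempt_fix "$attempt" || true
    done
}
```

With `max_retries=2` the scenarios run at most three times: once initially and once after each fix attempt, which is what keeps the loop from running indefinitely.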

#### step-05-report-results.md

````markdown
# Step 5: Report Results

## Purpose
Update metrics and output parseable signals.

## Inputs
| Input | Source | Required |
|-------|--------|----------|
| gate_status | Step 4 | Yes |
| results | Step 3 | Yes |
| fix_attempts | State | Yes |

## Process

### 5.1 Update Metrics File
Update `{metrics_dir}/epic-{id}-metrics.yaml`:
```yaml
validation:
  gate_executed: true
  gate_status: "PASS|FAIL"
  fix_attempts: N
  scenarios_passed: X
  scenarios_failed: Y
```

### 5.2 Output Signals
Print to stdout (for parent script parsing):
```
UAT_GATE_RESULT: PASS|FAIL
UAT_FIX_ATTEMPTS: N
UAT_SCENARIOS_PASSED: X/Y
```

### 5.3 Set Exit Code
- 0: PASS
- 1: FAIL (fixable, retries remain)
- 2: FAIL (max retries exceeded)

## Outputs
| Output | Location | Description |
|--------|----------|-------------|
| Updated metrics | YAML file | Validation results |
| Signals | stdout | Parseable output |

## Completion Signal
RESULTS_REPORTED: {metrics_path}
````

---

### Step 6: Integrate Fix Context with Handoff Pattern [P2]

**Directory structure:**

```
docs/sprint-artifacts/
├── handoffs/
│   ├── epic-1-to-2-handoff.md
│   └── epic-2-to-3-handoff.md
├── uat-fixes/                    # NEW
│   ├── epic-1-fix-context-1.md
│   └── epic-2-fix-context-1.md
└── metrics/
    ├── epic-1-metrics.yaml
    └── epic-2-metrics.yaml
```

**In uat-validate.sh, implement generate_fix_context():**

```bash
generate_fix_context() {
    local epic_id="$1"
    local attempt="$2"
    local failed_scenarios="$3"

    local fix_dir="$SPRINT_ARTIFACTS_DIR/uat-fixes"
    mkdir -p "$fix_dir"

    local fix_file="$fix_dir/epic-${epic_id}-fix-context-${attempt}.md"
    local template="$PROJECT_ROOT/src/modules/bmm/workflows/5-validation/uat-validate/uat-fix-context-template.md"

    # Fail early with a clear message instead of writing an empty fix file
    if [ ! -f "$template" ]; then
        echo "Fix context template not found: $template" >&2
        return 1
    fi

    # Render template with variables
    sed -e "s/{epic_id}/$epic_id/g" \
        -e "s/{attempt}/$attempt/g" \
        -e "s/{timestamp}/$(date -u +"%Y-%m-%dT%H:%M:%SZ")/g" \
        "$template" > "$fix_file"

    # Append failed scenarios
    echo "" >> "$fix_file"
    echo "## Failed Scenarios" >> "$fix_file"
    echo "$failed_scenarios" >> "$fix_file"

    # Print the generated path on stdout for the caller to capture
    echo "$fix_file"
}
```
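
The `sed` rendering above interpolates values straight into the replacement text, so an epic id containing `/`, `&`, or `\` would break the substitution or be misread. One way to guard against that is to escape values first. The helper name `escape_sed_replacement` is hypothetical, and the sketch assumes single-line values.

```bash
#!/usr/bin/env bash
# Hypothetical helper: escape a single-line value so it is taken literally
# on the replacement side of sed's s/pattern/replacement/ command.
escape_sed_replacement() {
    # Prefix "/", "&", and "\" with a backslash. "|" is used as the delimiter
    # here so "/" can appear inside the bracket expression unescaped.
    printf '%s' "$1" | sed -e 's|[/&\\]|\\&|g'
}

# Example: render a template line with an id that contains special characters.
#   safe_id="$(escape_sed_replacement "$epic_id")"
#   sed -e "s/{epic_id}/$safe_id/g" "$template" > "$fix_file"
```

Numeric epic ids never hit this case, so the escape only matters if ids can be free-form strings.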

**In quick-dev fix session, load context:**

```bash
run_quick_dev_fix() {
    local fix_context_file="$1"
    local epic_id="$2"
    local attempt="$3"

    local fix_prompt="You are Barry, the Quick Flow Solo Dev.

Load and process this fix context document:
$fix_context_file

Your task:
1. Read the failed scenarios and error details
2. Analyze root cause for each failure
3. Implement targeted fixes
4. Run the failing commands to verify fixes
5. Stage changes: git add -A
6. Commit with message: fix(epic-${epic_id}): UAT fix #${attempt}

Constraints:
- Only fix the identified failures
- Do not refactor unrelated code
- Run tests after fixes

When done, output:
FIX_COMPLETE: {number_fixed}/{total_failures}"

    # Fresh Claude context for fixes
    claude --dangerously-skip-permissions -p "$fix_prompt"
}
```

---

## Execution Order

1. **`scripts/uat-validate.sh`** - Core orchestration (enables self-healing)
2. **`scripts/epic-execute.sh`** modifications - Metrics collection (enables reporting)
3. **`scripts/epic-chain.sh`** modifications - UAT gate + report integration
4. **Step files** - Documentation and maintainability

---

## Testing Plan

### Unit Tests

1. `uat-validate.sh --dry-run` - Verify argument parsing and flow
2. Metrics YAML structure matches template at `src/modules/bmm/workflows/4-implementation/epic-chain/templates/epic-metrics-template.yaml`
3. Signal output format matches spec

### Integration Tests

1. Run epic-execute with metrics collection, verify `docs/sprint-artifacts/metrics/epic-{id}-metrics.yaml` created
2. Run uat-validate against known-passing UAT doc, verify `UAT_GATE_RESULT: PASS`
3. Run uat-validate against known-failing UAT doc, verify fix loop triggers
4. Run epic-chain with `--uat-gate=quick`, verify gate runs after each epic

### Manual Verification

1. Run `epic-chain 1-3` on test project
2. Verify `docs/sprint-artifacts/metrics/` populated with per-epic metrics
3. Verify `docs/sprint-artifacts/uat-fixes/` created on UAT failure
4. Verify `chain-execution-report.md` generated with accurate aggregated data

---

## Dependencies

| Dependency | Purpose | Required |
|------------|---------|----------|
| `yq` | YAML manipulation | Recommended (fallback: inline append) |
| `timeout` | Command timeout | Yes (GNU coreutils) |
| `claude` CLI | Isolated context spawning | Yes |
| `sed` | Template rendering | Yes (POSIX) |
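
A preflight check for the tools in this table could run before any scenario executes. The sketch below is an assumption about how that might be wired up; `check_dependencies` and `HAVE_YQ` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical preflight: report every missing command, then return 1 if any
# were missing so the caller can stop before doing real work.
check_dependencies() {
    local missing=0 dep
    for dep in "$@"; do
        if ! command -v "$dep" >/dev/null 2>&1; then
            echo "Missing required dependency: $dep" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example wiring: hard requirements stop the script, yq only sets a flag.
#   check_dependencies timeout claude sed || exit 1
#   if command -v yq >/dev/null 2>&1; then HAVE_YQ=true; else HAVE_YQ=false; fi
```

Reporting all missing tools in one pass saves the user from fixing them one failed run at a time.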

---

## Risks & Mitigations

| Risk | Mitigation |
|------|------------|
| `yq` not installed | Detect and fall back to inline YAML append via echo/cat |
| UAT document malformed | Validate structure before processing, clear error messages |
| Claude session fails | Capture exit code, log output, allow retry |
| Infinite fix loop | Hard limit via `--max-retries`, default 2 |
| Scenario command not found | Record as FAIL with clear "command not found" message |
| Timeout exceeded | Record as FAIL, include timeout duration in output |

---

## Reference Files

### Existing (to read for patterns)

- `scripts/epic-execute.sh` - Context isolation, logging, argument parsing
- `scripts/epic-chain.sh` - Orchestration, CLI flags, integration points
- `src/modules/bmm/workflows/5-validation/uat-validate/instructions.md` - Validation logic
- `src/modules/bmm/workflows/5-validation/uat-validate/uat-fix-context-template.md` - Fix context template
- `src/modules/bmm/workflows/4-implementation/epic-chain/templates/epic-metrics-template.yaml` - Metrics schema
- `src/modules/bmm/workflows/4-implementation/epic-chain/steps/step-10-generate-report.md` - Report generation

### To Create

- `scripts/uat-validate.sh`
- `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-01-load-uat.md`
- `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-02-classify-scenarios.md`
- `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-03-execute-scenarios.md`
- `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-04-evaluate-gate.md`
- `src/modules/bmm/workflows/5-validation/uat-validate/steps/step-05-report-results.md`

### To Modify

- `scripts/epic-execute.sh` - Add metrics collection
- `scripts/epic-chain.sh` - Add UAT gate and report generation