diff --git a/bmad-core/agent-teams/team-fullstack.yaml b/bmad-core/agent-teams/team-fullstack.yaml index 531f5e7e..f5a22359 100644 --- a/bmad-core/agent-teams/team-fullstack.yaml +++ b/bmad-core/agent-teams/team-fullstack.yaml @@ -10,6 +10,7 @@ agents: - ux-expert - architect - po + - debug workflows: - brownfield-fullstack.yaml - brownfield-service.yaml diff --git a/bmad-core/agents/debug.md b/bmad-core/agents/debug.md new file mode 100644 index 00000000..2b11cb56 --- /dev/null +++ b/bmad-core/agents/debug.md @@ -0,0 +1,131 @@ + + +# debug + +ACTIVATION-NOTICE: This file contains your full agent operating guidelines. DO NOT load any external agent files as the complete configuration is in the YAML block below. + +CRITICAL: Read the full YAML BLOCK that FOLLOWS IN THIS FILE to understand your operating params, start and follow exactly your activation-instructions to alter your state of being, stay in this being until told to exit this mode: + +## COMPLETE AGENT DEFINITION FOLLOWS - NO EXTERNAL FILES NEEDED + +```yaml +IDE-FILE-RESOLUTION: + - FOR LATER USE ONLY - NOT FOR ACTIVATION, when executing commands that reference dependencies + - Dependencies map to {root}/{type}/{name} + - type=folder (tasks|templates|checklists|data|utils|etc...), name=file-name + - Example: create-doc.md → {root}/tasks/create-doc.md + - IMPORTANT: Only load these files when user requests specific command execution +REQUEST-RESOLUTION: Match user requests to your commands/dependencies flexibly (e.g., "analyze bug"→*inspect→fagan-inspection task, "root cause" would be dependencies->tasks->root-cause-analysis), ALWAYS ask for clarification if no clear match.
+activation-instructions: + - STEP 1: Read THIS ENTIRE FILE - it contains your complete persona definition + - STEP 2: Adopt the persona defined in the 'agent' and 'persona' sections below + - STEP 3: Load and read `bmad-core/core-config.yaml` (project configuration) before any greeting + - STEP 4: Greet user with your name/role and immediately run `*help` to display available commands + - DO NOT: Load any other agent files during activation + - ONLY load dependency files when user selects them for execution via command or request of a task + - The agent.customization field ALWAYS takes precedence over any conflicting instructions + - CRITICAL WORKFLOW RULE: When executing tasks from dependencies, follow task instructions exactly as written - they are executable workflows, not reference material + - MANDATORY INTERACTION RULE: Tasks with elicit=true require user interaction using exact specified format - never skip elicitation for efficiency + - CRITICAL RULE: When executing formal task workflows from dependencies, ALL task instructions override any conflicting base behavioral constraints. Interactive workflows with elicit=true REQUIRE user interaction and cannot be bypassed for efficiency. + - When listing tasks/templates or presenting options during conversations, always show as numbered options list, allowing the user to type a number to select or execute + - STAY IN CHARACTER! + - CRITICAL: On activation, ONLY greet user, auto-run `*help`, and then HALT to await user requested assistance or given commands. ONLY deviance from this is if the activation included commands also in the arguments. +agent: + name: Diana + id: debug + title: Debug Specialist & Root Cause Analyst + icon: 🔍 + whenToUse: | + Use for systematic bug analysis, root cause investigation, and defect resolution. + Specializes in multiple debugging methodologies including Fagan inspection, binary search, + delta debugging, and static analysis.
Provides autonomous defect detection with minimal + user interaction required. + customization: null +persona: + role: Expert Debug Specialist & Software Inspector + style: Systematic, methodical, analytical, thorough, detail-oriented + identity: Debug specialist who uses formal inspection methodologies to achieve high defect detection rates + focus: Systematic defect detection, root cause analysis, and resolution recommendations + core_principles: + - Systematic Inspection - Use proven methodologies like Fagan inspection (60-90% defect detection rate) + - Root Cause Focus - Don't just fix symptoms, identify and address underlying causes + - Pattern Recognition - Identify recurring defects and systemic issues + - Documentation Trail - Maintain comprehensive debug reports and findings + - Prevention Oriented - Recommend changes to prevent similar defects + - Impact Analysis - Assess severity, scope, and risk of defects + - Verification Focus - Ensure fixes are validated and don't introduce new issues +debug-permissions: + - CRITICAL: When analyzing bugs in stories, you may update the "Debug Log" section in Dev Agent Record + - CRITICAL: Create debug reports in designated debug directory when specified + - CRITICAL: DO NOT modify source code directly unless explicitly requested by user +# All commands require * prefix when used (e.g., *help) +commands: + - help: Show numbered list of the following commands to allow selection + - inspect {bug-description}: | + Execute comprehensive Fagan inspection workflow. + Performs 6-phase systematic defect analysis: Planning → Overview → + Preparation → Inspection → Rework → Follow-up. + Produces detailed debug report with root cause and fix recommendations. + - quick-debug {issue}: | + Rapid triage and initial analysis for simple issues. + Provides immediate assessment and suggested next steps. + - pattern-analysis: | + Analyze recent commits and code changes for defect patterns.
+ Identifies systemic issues and recurring problems. + - root-cause {symptom}: | + Execute focused root cause analysis using fishbone methodology. + Maps symptoms to underlying causes with evidence trail. + - validate-fix {fix-description}: | + Verify proposed fix addresses root cause without side effects. + Includes regression risk assessment and test recommendations. + - debug-report: | + Generate comprehensive debug report from current session. + Includes findings, root causes, fixes, and prevention strategies. + - wolf-fence {issue}: | + Execute binary search debugging to isolate bug location. + Systematically narrows down problem area by dividing search space. + Highly efficient for large codebases and runtime errors. + - delta-minimize {test-case}: | + Automatically reduce failing test case to minimal reproduction. + Isolates the smallest input that still triggers the bug. + Essential for complex input-dependent failures. + - assert-analyze {code-section}: | + Analyze code for missing assertions and invariants. + Suggests defensive programming improvements. + Generates assertion placement recommendations. + - static-scan {target}: | + Perform comprehensive static analysis for common defects. + Identifies anti-patterns, security issues, and code smells. + Generates prioritized fix recommendations. + - instrument {component}: | + Design strategic logging and monitoring points. + Creates instrumentation plan for production debugging. + Optimizes observability without performance impact. + - walkthrough-prep {feature}: | + Generate materials for code walkthrough session. + Creates review checklist and presentation outline. + Prepares defect tracking documentation. 
+ - exit: Say goodbye as the Debug Specialist, and then abandon inhabiting this persona +dependencies: + checklists: + - debug-inspection-checklist.md + - root-cause-checklist.md + data: + - debug-patterns.md + - common-defects.md + tasks: + - fagan-inspection.md + - root-cause-analysis.md + - pattern-detection.md + - debug-report-generation.md + - wolf-fence-search.md + - delta-minimization.md + - assertion-analysis.md + - static-analysis.md + - instrumentation-analysis.md + - walkthrough-prep.md + templates: + - debug-report-tmpl.yaml + - root-cause-tmpl.yaml + - defect-analysis-tmpl.yaml +``` diff --git a/bmad-core/checklists/debug-inspection-checklist.md b/bmad-core/checklists/debug-inspection-checklist.md new file mode 100644 index 00000000..dcb2f723 --- /dev/null +++ b/bmad-core/checklists/debug-inspection-checklist.md @@ -0,0 +1,118 @@ +# debug-inspection-checklist + +Comprehensive checklist for Fagan inspection methodology. + +## Phase 1: Planning Checklist + +- [ ] Bug description clearly documented +- [ ] Inspection scope defined (code, tests, config, docs) +- [ ] Affected components identified +- [ ] Stakeholders notified +- [ ] Success criteria established +- [ ] Time allocated for inspection + +## Phase 2: Overview Checklist + +- [ ] Recent commits reviewed (last 20-50) +- [ ] Feature specifications reviewed +- [ ] Related documentation gathered +- [ ] Environment details captured +- [ ] Previous similar issues researched +- [ ] Impact scope assessed + +## Phase 3: Preparation Checklist + +### Code Analysis + +- [ ] Static analysis performed +- [ ] Code complexity measured +- [ ] Anti-patterns identified +- [ ] Security vulnerabilities checked +- [ ] Performance bottlenecks assessed + +### Test Analysis + +- [ ] Test coverage reviewed +- [ ] Failed tests analyzed +- [ ] Missing test scenarios identified +- [ ] Test quality assessed +- [ ] Edge cases evaluated + +### Configuration Analysis + +- [ ] Environment settings reviewed +- [ ] Configuration 
drift checked +- [ ] Dependencies verified +- [ ] Version compatibility confirmed +- [ ] Resource limits checked + +## Phase 4: Inspection Meeting Checklist + +### Defect Categories Reviewed + +- [ ] Logic defects (algorithms, control flow) +- [ ] Interface defects (API, parameters) +- [ ] Data defects (types, validation) +- [ ] Documentation defects (outdated, incorrect) +- [ ] Performance defects (inefficiencies) +- [ ] Security defects (vulnerabilities) + +### Analysis Completed + +- [ ] Root cause identified +- [ ] Evidence documented +- [ ] Impact severity assessed +- [ ] Defects categorized by priority +- [ ] Pattern analysis performed + +## Phase 5: Rework Planning Checklist + +- [ ] Fix proposals generated +- [ ] Trade-offs analyzed +- [ ] Test strategy designed +- [ ] Risk assessment completed +- [ ] Implementation timeline created +- [ ] Regression test plan defined +- [ ] Rollback plan prepared + +## Phase 6: Follow-up Checklist + +- [ ] Fix effectiveness validated +- [ ] All tests passing +- [ ] Documentation updated +- [ ] Lessons learned captured +- [ ] Debug report completed +- [ ] Prevention measures identified +- [ ] Knowledge shared with team + +## Quality Gates + +### Inspection Completeness + +- [ ] All 6 phases executed +- [ ] All checklists completed +- [ ] Evidence trail documented +- [ ] Peer review conducted + +### Fix Validation + +- [ ] Fix addresses root cause +- [ ] No side effects introduced +- [ ] Performance acceptable +- [ ] Security maintained +- [ ] Tests comprehensive + +### Documentation + +- [ ] Debug report generated +- [ ] Code comments updated +- [ ] README updated if needed +- [ ] Runbook updated if needed +- [ ] Team wiki updated + +## Sign-off + +- [ ] Developer reviewed +- [ ] QA validated +- [ ] Team lead approved +- [ ] Stakeholders informed diff --git a/bmad-core/checklists/root-cause-checklist.md b/bmad-core/checklists/root-cause-checklist.md new file mode 100644 index 00000000..45aef885 --- /dev/null +++ 
b/bmad-core/checklists/root-cause-checklist.md @@ -0,0 +1,118 @@ +# root-cause-checklist + +Systematic checklist for root cause analysis. + +## Problem Definition + +- [ ] Problem clearly stated +- [ ] Symptoms documented +- [ ] Timeline established +- [ ] Affected components identified +- [ ] Impact quantified +- [ ] Success criteria defined + +## Fishbone Analysis Categories + +### People Factors + +- [ ] Knowledge gaps assessed +- [ ] Communication issues reviewed +- [ ] Training needs identified +- [ ] User behavior analyzed +- [ ] Team dynamics considered + +### Process Factors + +- [ ] Development process reviewed +- [ ] Deployment procedures checked +- [ ] Code review practices assessed +- [ ] Testing processes evaluated +- [ ] Documentation processes reviewed + +### Technology Factors + +- [ ] Framework limitations identified +- [ ] Library issues checked +- [ ] Tool configurations reviewed +- [ ] Infrastructure problems assessed +- [ ] Integration issues evaluated + +### Environment Factors + +- [ ] Environment differences documented +- [ ] Resource constraints checked +- [ ] External dependencies reviewed +- [ ] Network issues assessed +- [ ] Configuration drift analyzed + +### Data Factors + +- [ ] Input validation reviewed +- [ ] Data integrity checked +- [ ] State management assessed +- [ ] Race conditions evaluated +- [ ] Data flow analyzed + +### Method Factors + +- [ ] Algorithm correctness verified +- [ ] Design patterns reviewed +- [ ] Architecture decisions assessed +- [ ] Performance strategies evaluated +- [ ] Security measures reviewed + +## 5-Whys Analysis + +- [ ] Initial problem stated +- [ ] First why answered +- [ ] Second why answered +- [ ] Third why answered +- [ ] Fourth why answered +- [ ] Fifth why answered (root cause) +- [ ] Additional whys if needed +- [ ] Causation chain documented + +## Evidence Collection + +- [ ] Logs collected +- [ ] Metrics gathered +- [ ] Code examined +- [ ] Tests reviewed +- [ ] Documentation checked +- 
[ ] User reports compiled +- [ ] Monitoring data analyzed + +## Validation + +- [ ] Root cause reproducible +- [ ] Alternative causes eliminated +- [ ] Evidence supports conclusion +- [ ] Peer review conducted +- [ ] Confidence level assessed + +## Action Planning + +- [ ] Immediate actions defined +- [ ] Short-term solutions planned +- [ ] Long-term prevention designed +- [ ] Process improvements identified +- [ ] Responsibilities assigned +- [ ] Timeline established + +## Documentation + +- [ ] Analysis documented +- [ ] Evidence archived +- [ ] Recommendations clear +- [ ] Lessons learned captured +- [ ] Report generated +- [ ] Stakeholders informed + +## Follow-up + +- [ ] Fix implemented +- [ ] Effectiveness verified +- [ ] Monitoring in place +- [ ] Recurrence prevented +- [ ] Knowledge transferred +- [ ] Process updated diff --git a/bmad-core/data/common-defects.md b/bmad-core/data/common-defects.md new file mode 100644 index 00000000..69d71772 --- /dev/null +++ b/bmad-core/data/common-defects.md @@ -0,0 +1,206 @@ +# common-defects + +Reference guide for common software defects and their characteristics. + +## Defect Classification System + +### By Origin + +1. **Requirements Defects** - Ambiguous, incomplete, or incorrect requirements +2. **Design Defects** - Architectural flaws, poor design decisions +3. **Coding Defects** - Implementation errors, logic mistakes +4. **Testing Defects** - Inadequate test coverage, wrong test assumptions +5. **Deployment Defects** - Configuration errors, environment issues +6. 
**Documentation Defects** - Outdated, incorrect, or missing documentation + +### By Type + +#### Logic Defects + +- **Algorithm Errors:** Incorrect implementation of business logic +- **Control Flow Issues:** Wrong branching, loop errors +- **Boundary Violations:** Off-by-one, overflow, underflow +- **State Management:** Invalid state transitions, race conditions + +#### Data Defects + +- **Input Validation:** Missing or incorrect validation +- **Data Corruption:** Incorrect data manipulation +- **Type Errors:** Wrong data types, failed conversions +- **Persistence Issues:** Failed saves, data loss + +#### Interface Defects + +- **API Misuse:** Incorrect parameter passing, wrong method calls +- **Integration Errors:** Component communication failures +- **Protocol Violations:** Incorrect message formats +- **Version Incompatibility:** Breaking changes not handled + +#### Performance Defects + +- **Memory Leaks:** Unreleased resources +- **Inefficient Algorithms:** O(n²) where O(n) possible +- **Database Issues:** N+1 queries, missing indexes +- **Resource Contention:** Deadlocks, bottlenecks + +#### Security Defects + +- **Injection Flaws:** SQL, XSS, command injection +- **Authentication Issues:** Weak auth, session problems +- **Authorization Flaws:** Privilege escalation, IDOR +- **Data Exposure:** Sensitive data leaks, weak encryption + +## Severity Classification + +### Critical (P0) + +- **Definition:** System unusable, data loss, security breach +- **Response Time:** Immediate +- **Examples:** + - Application crash on startup + - Data corruption or loss + - Security vulnerability actively exploited + - Complete feature failure + +### High (P1) + +- **Definition:** Major feature broken, significant impact +- **Response Time:** Within 24 hours +- **Examples:** + - Core functionality impaired + - Performance severely degraded + - Workaround exists but difficult + - Affects many users + +### Medium (P2) + +- **Definition:** Feature impaired, moderate impact +-
**Response Time:** Within sprint +- **Examples:** + - Non-core feature broken + - Easy workaround available + - Cosmetic issues with functional impact + - Affects some users + +### Low (P3) + +- **Definition:** Minor issue, minimal impact +- **Response Time:** Next release +- **Examples:** + - Cosmetic issues + - Minor inconvenience + - Edge case scenarios + - Documentation errors + +## Root Cause Categories + +### Development Process + +1. **Inadequate Requirements:** Missing acceptance criteria +2. **Poor Communication:** Misunderstood requirements +3. **Insufficient Review:** Code review missed issues +4. **Time Pressure:** Rushed implementation + +### Technical Factors + +1. **Complexity:** System too complex to understand fully +2. **Technical Debt:** Accumulated shortcuts causing issues +3. **Tool Limitations:** Development tools inadequate +4. **Knowledge Gap:** Team lacks necessary expertise + +### Testing Gaps + +1. **Missing Tests:** Scenario not covered +2. **Wrong Assumptions:** Tests based on incorrect understanding +3. **Environment Differences:** Works in test, fails in production +4. **Data Issues:** Test data not representative + +### Organizational Issues + +1. **Process Failures:** Procedures not followed +2. **Resource Constraints:** Insufficient time/people +3. **Training Gaps:** Team not properly trained +4. 
**Culture Issues:** Quality not prioritized + +## Detection Methods + +### Static Analysis + +- **Code Review:** Manual inspection by peers +- **Linting:** Automated style and error checking +- **Security Scanning:** SAST tools +- **Complexity Analysis:** Cyclomatic complexity metrics + +### Dynamic Analysis + +- **Unit Testing:** Component-level testing +- **Integration Testing:** Component interaction testing +- **System Testing:** End-to-end testing +- **Performance Testing:** Load and stress testing + +### Runtime Monitoring + +- **Error Tracking:** Sentry, Rollbar +- **APM Tools:** Application performance monitoring +- **Log Analysis:** Centralized logging +- **User Reports:** Bug reports from users + +### Formal Methods + +- **Fagan Inspection:** Systematic peer review +- **Code Walkthroughs:** Step-by-step review +- **Pair Programming:** Real-time review +- **Test-Driven Development:** Test-first approach + +## Prevention Strategies + +### Process Improvements + +1. **Clear Requirements:** Use user stories with acceptance criteria +2. **Design Reviews:** Architecture review before coding +3. **Code Standards:** Enforce coding guidelines +4. **Automated Testing:** CI/CD with comprehensive tests + +### Technical Practices + +1. **Defensive Programming:** Validate inputs, handle errors +2. **Design Patterns:** Use proven solutions +3. **Refactoring:** Regular code improvement +4. **Documentation:** Keep docs current + +### Team Practices + +1. **Knowledge Sharing:** Regular tech talks, documentation +2. **Pair Programming:** Collaborative development +3. **Code Reviews:** Mandatory peer review +4. **Retrospectives:** Learn from mistakes + +### Tool Support + +1. **Static Analyzers:** SonarQube, ESLint +2. **Test Frameworks:** Jest, Pytest, JUnit +3. **CI/CD Pipelines:** Jenkins, GitHub Actions +4. 
**Monitoring Tools:** Datadog, New Relic + +## Defect Metrics + +### Detection Metrics + +- **Defect Density:** Defects per KLOC +- **Detection Rate:** Defects found per time period +- **Escape Rate:** Defects reaching production +- **Mean Time to Detect:** Average detection time + +### Resolution Metrics + +- **Fix Rate:** Defects fixed per time period +- **Mean Time to Fix:** Average fix time +- **Reopen Rate:** Defects reopened after fix +- **Fix Effectiveness:** First-time fix success rate + +### Quality Metrics + +- **Test Coverage:** Percentage of code tested +- **Code Complexity:** Average cyclomatic complexity +- **Technical Debt:** Estimated remediation effort +- **Customer Satisfaction:** User-reported issues diff --git a/bmad-core/data/debug-patterns.md b/bmad-core/data/debug-patterns.md new file mode 100644 index 00000000..ad238ea2 --- /dev/null +++ b/bmad-core/data/debug-patterns.md @@ -0,0 +1,303 @@ +# debug-patterns + +Common defect patterns and debugging strategies. + +## Common Defect Patterns + +### 1. Null/Undefined Reference Errors + +**Pattern:** Accessing properties or methods on null/undefined objects +**Indicators:** + +- TypeError: Cannot read property 'X' of undefined +- NullPointerException +- Segmentation fault + +**Common Causes:** + +- Missing null checks +- Asynchronous data not yet loaded +- Optional dependencies not injected +- Incorrect initialization order + +**Detection Strategy:** + +- Add defensive null checks +- Use optional chaining (?.) +- Initialize with safe defaults +- Validate inputs at boundaries + +### 2. 
Race Conditions + +**Pattern:** Multiple threads/processes accessing shared resources +**Indicators:** + +- Intermittent failures +- Works in debug but fails in production +- Order-dependent behavior +- Data corruption + +**Common Causes:** + +- Missing synchronization +- Incorrect lock ordering +- Shared mutable state +- Async operations without proper await + +**Detection Strategy:** + +- Add logging with timestamps +- Use thread-safe data structures +- Implement proper locking mechanisms +- Review async/await usage + +### 3. Memory Leaks + +**Pattern:** Memory usage grows over time without release +**Indicators:** + +- Increasing memory consumption +- Out of memory errors +- Performance degradation over time +- GC pressure + +**Common Causes:** + +- Event listeners not removed +- Circular references +- Large objects in closures +- Cache without eviction + +**Detection Strategy:** + +- Profile memory usage +- Review object lifecycle +- Check event listener cleanup +- Implement cache limits + +### 4. Off-by-One Errors + +**Pattern:** Incorrect loop boundaries or array indexing +**Indicators:** + +- ArrayIndexOutOfBounds +- Missing first/last element +- Infinite loops +- Fence post errors + +**Common Causes:** + +- Confusion between length and last index +- Inclusive vs exclusive ranges +- Loop condition errors +- Zero-based vs one-based indexing + +**Detection Strategy:** + +- Review loop conditions carefully +- Test boundary cases +- Use forEach/map when possible +- Add assertions for array bounds + +### 5. 
Type Mismatches + +**Pattern:** Incorrect data types passed or compared +**Indicators:** + +- Type errors at runtime +- Unexpected coercion behavior +- Failed validations +- Serialization errors + +**Common Causes:** + +- Weak typing assumptions +- Missing type validation +- Incorrect type conversions +- API contract violations + +**Detection Strategy:** + +- Add runtime type checking +- Use TypeScript/type hints +- Validate at API boundaries +- Review type coercion rules + +### 6. Resource Exhaustion + +**Pattern:** Running out of system resources +**Indicators:** + +- Too many open files +- Connection pool exhaustion +- Thread pool starvation +- Disk space errors + +**Common Causes:** + +- Resources not properly closed +- Missing connection pooling +- Unbounded growth +- Inadequate limits + +**Detection Strategy:** + +- Implement try-finally blocks +- Use connection pooling +- Set resource limits +- Monitor resource usage + +### 7. Concurrency Deadlocks + +**Pattern:** Threads waiting for each other indefinitely +**Indicators:** + +- Application hangs +- Threads in BLOCKED state +- No progress being made +- Timeout errors + +**Common Causes:** + +- Circular wait conditions +- Lock ordering violations +- Nested synchronized blocks +- Resource starvation + +**Detection Strategy:** + +- Always acquire locks in same order +- Use lock-free data structures +- Implement timeout mechanisms +- Avoid nested locks + +### 8. SQL Injection Vulnerabilities + +**Pattern:** Unvalidated input in SQL queries +**Indicators:** + +- Unexpected database errors +- Data breaches +- Malformed query errors +- Authorization bypasses + +**Common Causes:** + +- String concatenation for queries +- Missing input validation +- Inadequate escaping +- Dynamic query construction + +**Detection Strategy:** + +- Use parameterized queries +- Validate all inputs +- Review dynamic SQL +- Implement least privilege + +### 9. 
Infinite Recursion + +**Pattern:** Function calling itself without termination +**Indicators:** + +- Stack overflow errors +- Maximum call stack exceeded +- Application crashes +- Memory exhaustion + +**Common Causes:** + +- Missing base case +- Incorrect termination condition +- Circular dependencies +- Mutual recursion errors + +**Detection Strategy:** + +- Review base cases +- Add recursion depth limits +- Test edge cases +- Use iteration when possible + +### 10. Cache Invalidation Issues + +**Pattern:** Stale data served from cache +**Indicators:** + +- Outdated information displayed +- Inconsistent state +- Changes not reflected +- Data synchronization issues + +**Common Causes:** + +- Missing invalidation logic +- Incorrect cache keys +- Race conditions in updates +- TTL too long + +**Detection Strategy:** + +- Review invalidation triggers +- Implement cache versioning +- Use appropriate TTLs +- Add cache bypass for testing + +## Anti-Patterns to Avoid + +### 1. Shotgun Debugging + +Making random changes hoping something works + +### 2. Blame the Compiler + +Assuming the problem is in the framework/language + +### 3. Programming by Coincidence + +Not understanding why a fix works + +### 4. Copy-Paste Solutions + +Using solutions without understanding them + +### 5. Ignoring Warnings + +Dismissing compiler/linter warnings + +## Debugging Best Practices + +### 1. Systematic Approach + +- Reproduce consistently +- Isolate the problem +- Form hypotheses +- Test systematically + +### 2. Use Scientific Method + +- Observe symptoms +- Form hypothesis +- Design experiment +- Test and validate + +### 3. Maintain Debug Log + +- Document what you tried +- Record what worked/failed +- Note patterns observed +- Track time spent + +### 4. Leverage Tools + +- Debuggers +- Profilers +- Static analyzers +- Log aggregators + +### 5. 
Collaborate + +- Pair debugging +- Code reviews +- Knowledge sharing +- Post-mortems diff --git a/bmad-core/tasks/assertion-analysis.md b/bmad-core/tasks/assertion-analysis.md new file mode 100644 index 00000000..bf23e10c --- /dev/null +++ b/bmad-core/tasks/assertion-analysis.md @@ -0,0 +1,333 @@ +# assertion-analysis + +Analyze code for missing assertions and defensive programming opportunities. + +## Context + +This task systematically identifies locations where assertions, preconditions, postconditions, and invariants should be added to catch bugs early and make code self-documenting. Assertions act as executable documentation and early warning systems for violations of expected behavior. + +## Task Execution + +### Phase 1: Code Analysis + +#### Identify Assertion Candidates + +**Function Boundaries:** + +1. **Preconditions** (Entry assertions): + - Parameter validation (null, range, type) + - Required state before execution + - Resource availability + - Permission/authorization checks + +2. **Postconditions** (Exit assertions): + - Return value constraints + - State changes completed + - Side effects occurred + - Resources properly managed + +3. 
**Invariants** (Always true): + - Class/object state consistency + - Data structure integrity + - Relationship maintenance + - Business rule enforcement + +**Critical Code Sections:** + +- Before/after state mutations +- Around external system calls +- At algorithm checkpoints +- After complex calculations +- Before resource usage + +### Phase 2: Assertion Category Analysis + +#### Type 1: Safety Assertions + +Prevent dangerous operations: + +``` +- Null/undefined checks before dereference +- Array bounds before access +- Division by zero prevention +- Type safety before operations +- Resource availability before use +``` + +#### Type 2: Correctness Assertions + +Verify algorithmic correctness: + +``` +- Loop invariants maintained +- Sorted order preserved +- Tree balance maintained +- Graph properties held +- Mathematical properties true +``` + +#### Type 3: Contract Assertions + +Enforce API contracts: + +``` +- Method preconditions met +- Return values valid +- State transitions legal +- Callbacks invoked correctly +- Events fired appropriately +``` + +#### Type 4: Security Assertions + +Validate security constraints: + +``` +- Input sanitization complete +- Authorization verified +- Rate limits enforced +- Encryption applied +- Audit trail updated +``` + +### Phase 3: Automated Detection + +#### Static Analysis Patterns + +**Missing Null Checks:** + +1. Identify all dereferences (obj.prop, obj->member) +2. Trace back to find validation +3. Flag unvalidated accesses + +**Missing Range Checks:** + +1. Find array/collection accesses +2. Identify index sources +3. Verify bounds checking exists + +**Missing State Validation:** + +1. Identify state-dependent operations +2. Check for state verification +3. Flag unverified state usage + +**Missing Return Validation:** + +1. Find function calls that can fail +2. Check if return values are validated +3. 
Flag unchecked returns + +### Phase 4: Assertion Generation + +#### Generate Appropriate Assertions + +**For Different Languages:** + +**JavaScript/TypeScript:** + +```javascript +console.assert(condition, 'message'); +if (!condition) throw new Error('message'); +``` + +**Python:** + +```python +assert condition, "message" +if not condition: raise AssertionError("message") +``` + +**Java:** + +```java +assert condition : "message"; +if (!condition) throw new AssertionError("message"); +``` + +**C/C++:** + +```c +assert(condition); +if (!condition) { /* handle error */ } +``` + +#### Assertion Templates + +**Null/Undefined Check:** + +``` +assert(param != null, "Parameter 'param' cannot be null"); +``` + +**Range Check:** + +``` +assert(index >= 0 && index < array.length, + `Index ${index} out of bounds [0, ${array.length})`); +``` + +**State Check:** + +``` +assert(this.isInitialized, "Object must be initialized before use"); +``` + +**Type Check:** + +``` +assert(typeof value === 'number', `Expected number, got ${typeof value}`); +``` + +**Invariant Check:** + +``` +assert(this.checkInvariant(), "Class invariant violated"); +``` + +## Output Format + +````markdown +# Assertion Analysis Report + +## Summary + +**Files Analyzed:** [count] +**Current Assertions:** [count] +**Recommended Additions:** [count] +**Critical Missing:** [count] +**Coverage Improvement:** [before]% → [after]% + +## Critical Assertions Needed + +### Priority 1: Safety Critical + +Location: [file:line] + +```[language] +// Current code +[code without assertion] + +// Recommended addition +[assertion to add] +[protected code] +``` +```` + +**Reason:** [Why this assertion is critical] +**Risk Without:** [What could go wrong] + +### Priority 2: Correctness Verification + +[Similar format for each recommendation] + +### Priority 3: Contract Enforcement + +[Similar format for each recommendation] + +## Assertion Coverage by Component + +| Component | Current | Recommended | Priority | +|
---------- | ------- | ----------- | -------------- | +| [Module A] | [count] | [count] | [High/Med/Low] | +| [Module B] | [count] | [count] | [High/Med/Low] | + +## Detailed Recommendations + +### File: [path/to/file] + +#### Function: [functionName] + +**Missing Preconditions:** + +```[language] +// Add at function entry: +assert(param1 != null, "param1 required"); +assert(param2 > 0, "param2 must be positive"); +``` + +**Missing Postconditions:** + +```[language] +// Add before return: +assert(result.isValid(), "Result must be valid"); +``` + +**Missing Invariants:** + +```[language] +// Add after state changes: +assert(this.items.length <= this.maxSize, "Size limit exceeded"); +``` + +## Implementation Strategy + +### Phase 1: Critical Safety (Immediate) + +1. Add null checks for all pointer dereferences +2. Add bounds checks for array accesses +3. Add division by zero prevention + +### Phase 2: Correctness (This Sprint) + +1. Add algorithm invariants +2. Add state validation +3. Add return value checks + +### Phase 3: Comprehensive (Next Sprint) + +1. Add contract assertions +2. Add security validations +3. 
Add performance assertions + +## Configuration Recommendations + +### Development Mode + +```[language] +// Enable all assertions +ASSERT_LEVEL = "all" +ASSERT_THROW = true +ASSERT_LOG = true +``` + +### Production Mode + +```[language] +// Keep only critical assertions +ASSERT_LEVEL = "critical" +ASSERT_THROW = false +ASSERT_LOG = true +``` + +## Benefits Analysis + +### Bug Prevention + +- Catch [X]% more bugs in development +- Reduce production incidents by [Y]% +- Decrease debugging time by [Z]% + +### Documentation Value + +- Self-documenting code contracts +- Clear API expectations +- Explicit invariants + +### Testing Support + +- Faster test failure identification +- Better test coverage visibility +- Clearer failure messages + +``` + +## Completion Criteria +- [ ] Code analysis completed +- [ ] Assertion candidates identified +- [ ] Priority levels assigned +- [ ] Assertions generated with proper messages +- [ ] Implementation plan created +- [ ] Configuration strategy defined +- [ ] Benefits quantified +``` diff --git a/bmad-core/tasks/debug-report-generation.md b/bmad-core/tasks/debug-report-generation.md new file mode 100644 index 00000000..c7150bf6 --- /dev/null +++ b/bmad-core/tasks/debug-report-generation.md @@ -0,0 +1,305 @@ +# debug-report-generation + +Generate comprehensive debug report from analysis session. + +## Context + +This task consolidates all debugging findings, analyses, and recommendations into a comprehensive report for stakeholders and future reference. + +## Task Execution + +### Step 1: Gather Session Data + +Collect all relevant information: + +1. Original bug description and symptoms +2. Analysis performed (inspections, root cause, patterns) +3. Evidence collected (logs, code, metrics) +4. Findings and conclusions +5. Fix attempts and results +6. Recommendations made + +### Step 2: Structure Report + +Organize information hierarchically: + +1. Executive Summary (1 page max) +2. Detailed Findings +3. Technical Analysis +4. 
Recommendations +5. Appendices + +### Step 3: Generate Report Sections + +#### Executive Summary + +- Problem statement (1-2 sentences) +- Impact assessment (users, systems, business) +- Root cause (brief) +- Recommended fix (high-level) +- Estimated effort and risk + +#### Detailed Findings + +- Symptoms observed +- Reproduction steps +- Environmental factors +- Timeline of issue + +#### Technical Analysis + +- Code examination results +- Root cause analysis +- Pattern detection findings +- Test coverage gaps +- Performance impacts + +#### Recommendations + +- Immediate fixes +- Short-term improvements +- Long-term prevention +- Process enhancements + +### Step 4: Add Supporting Evidence + +Include relevant: + +- Code snippets +- Log excerpts +- Stack traces +- Performance metrics +- Test results +- Screenshots (if applicable) + +### Step 5: Quality Review + +Ensure report: + +- Is technically accurate +- Uses clear, concise language +- Includes all critical information +- Provides actionable recommendations +- Is appropriately formatted + +## Output Format + +````markdown +# Debug Analysis Report + +**Report ID:** DBG-[timestamp] +**Date:** [current date] +**Analyst:** Debug Agent (Diana) +**Severity:** [Critical/High/Medium/Low] +**Status:** [Resolved/In Progress/Pending] + +--- + +## Executive Summary + +**Problem:** [1-2 sentence problem statement] + +**Impact:** [Quantified impact on users/system] + +**Root Cause:** [Brief root cause description] + +**Solution:** [High-level fix description] + +**Effort Required:** [Hours/Days estimate] +**Risk Level:** [High/Medium/Low] + +--- + +## 1. Problem Description + +### Symptoms + +[Detailed symptoms observed] + +### Reproduction + +1. [Step 1] +2. [Step 2] +3. 
[Expected vs Actual] + +### Environment + +- **System:** [OS, version] +- **Application:** [Version, build] +- **Dependencies:** [Relevant versions] +- **Configuration:** [Key settings] + +### Timeline + +- **First Observed:** [Date/time] +- **Frequency:** [How often] +- **Last Occurrence:** [Date/time] + +--- + +## 2. Technical Analysis + +### Root Cause Analysis + +[Detailed root cause with evidence] + +### Code Analysis + +```[language] +// Problematic code +[code snippet] +``` +```` + +**Issue:** [What's wrong with the code] + +### Pattern Analysis + +[Any patterns detected] + +### Test Coverage + +- **Current Coverage:** [percentage] +- **Gap Identified:** [What's not tested] +- **Risk Areas:** [Untested critical paths] + +--- + +## 3. Impact Assessment + +### Severity Matrix + +| Aspect | Impact | Severity | +| -------------- | ------------------- | -------------- | +| Users Affected | [number/percentage] | [High/Med/Low] | +| Data Integrity | [description] | [High/Med/Low] | +| Performance | [metrics] | [High/Med/Low] | +| Security | [assessment] | [High/Med/Low] | + +### Business Impact + +[Business consequences of the issue] + +--- + +## 4. Solution & Recommendations + +### Immediate Fix + +```[language] +// Corrected code +[code snippet] +``` + +**Validation:** [How to verify fix works] + +### Short-term Improvements + +1. [Improvement 1] +2. [Improvement 2] + +### Long-term Prevention + +1. [Strategy 1] +2. [Strategy 2] + +### Process Enhancements + +1. [Process improvement] +2. [Tool/automation suggestion] + +--- + +## 5. Implementation Plan + +### Phase 1: Immediate (0-2 days) + +- [ ] Apply code fix +- [ ] Add regression test +- [ ] Deploy to staging + +### Phase 2: Short-term (1 week) + +- [ ] Improve test coverage +- [ ] Add monitoring +- [ ] Update documentation + +### Phase 3: Long-term (1 month) + +- [ ] Refactor problematic area +- [ ] Implement prevention measures +- [ ] Team training on issue + +--- + +## 6. 
Verification & Testing + +### Test Cases + +1. **Test:** [Name] + **Steps:** [How to test] + **Expected:** [Result] + +### Regression Testing + +[Areas requiring regression testing] + +### Monitoring + +[Metrics to monitor post-fix] + +--- + +## 7. Lessons Learned + +### What Went Wrong + +[Root causes beyond the code] + +### What Could Improve + +[Process/tool improvements] + +### Knowledge Sharing + +[Information to share with team] + +--- + +## Appendices + +### A. Full Stack Traces + +[Complete error traces] + +### B. Log Excerpts + +[Relevant log entries] + +### C. Performance Metrics + +[Before/after metrics] + +### D. Related Issues + +[Links to similar problems] + +### E. References + +[Documentation, articles, tools used] + +--- + +**Report Generated:** [timestamp] +**Next Review:** [date for follow-up] + +``` + +## Completion Criteria +- [ ] All sections completed +- [ ] Evidence included +- [ ] Recommendations actionable +- [ ] Report reviewed for accuracy +- [ ] Formatted for readability +- [ ] Ready for distribution +``` diff --git a/bmad-core/tasks/delta-minimization.md b/bmad-core/tasks/delta-minimization.md new file mode 100644 index 00000000..f431f1d9 --- /dev/null +++ b/bmad-core/tasks/delta-minimization.md @@ -0,0 +1,228 @@ +# delta-minimization + +Automatically reduce failing test cases to minimal reproduction. + +## Context + +Delta debugging systematically minimizes failure-inducing inputs to find the smallest test case that still triggers a bug. This dramatically simplifies debugging by removing irrelevant complexity and isolating the essential trigger conditions. + +## Task Execution + +### Phase 1: Initial Setup + +#### Capture Failing State + +1. Record original failing test case: + - Input data + - Configuration settings + - Environment state + - Execution parameters +2. Verify bug reproduction +3. 
Measure initial complexity metrics: + - Input size + - Number of operations + - Data structure depth + - Configuration parameters + +### Phase 2: Minimization Strategy + +#### Select Minimization Approach + +**For Data Inputs:** + +1. **Binary reduction**: Remove half of input, test if still fails +2. **Line-by-line**: For text/config files +3. **Field elimination**: For structured data (JSON, XML) +4. **Value simplification**: Replace complex values with simple ones + +**For Code/Test Cases:** + +1. **Statement removal**: Delete non-essential lines +2. **Function inlining**: Replace calls with minimal implementations +3. **Loop unrolling**: Convert loops to minimal iterations +4. **Conditional simplification**: Remove unnecessary branches + +**For Configuration:** + +1. **Parameter elimination**: Remove non-essential settings +2. **Default substitution**: Replace with default values +3. **Range reduction**: Minimize numeric ranges + +### Phase 3: Delta Algorithm Implementation + +#### Core Algorithm + +``` +1. Start with failing test case T +2. While reduction is possible: + a. Generate smaller candidate C from T + b. Test if C still triggers bug + c. If yes: T = C (accept reduction) + d. If no: Try different reduction +3. Return minimal T +``` + +#### Automated Reduction Process + +**Step 1: Coarse-Grained Reduction** + +1. Try removing large chunks (50%) +2. Binary search for largest removable section +3. Continue until no large removals possible + +**Step 2: Fine-Grained Reduction** + +1. Try removing individual elements +2. Test each element for necessity +3. Build minimal required set + +**Step 3: Simplification Pass** + +1. Replace complex values with simpler equivalents: + - Long strings → "a" + - Large numbers → 0 or 1 + - Complex objects → empty objects +2. Maintain bug reproduction + +### Phase 4: Validation + +#### Verify Minimality + +1. Confirm bug still reproduces +2. Verify no further reduction possible +3. 
Test that adding any removed element doesn't affect bug +4. Document reduction ratio achieved + +#### Create Clean Reproduction + +1. Format minimal test case +2. Remove all comments/documentation +3. Standardize naming (var1, var2, etc.) +4. Ensure standalone execution + +## Intelligent Reduction Strategies + +### Pattern-Based Reduction + +Recognize common patterns and apply targeted reductions: + +- **Array operations**: Reduce to 2-3 elements +- **Nested structures**: Flatten where possible +- **Async operations**: Convert to synchronous +- **External dependencies**: Mock with minimal stubs + +### Semantic-Aware Reduction + +Maintain semantic validity while reducing: + +- Preserve type constraints +- Maintain referential integrity +- Keep required relationships +- Honor invariants + +### Parallel Exploration + +Test multiple reduction paths simultaneously: + +- Try different reduction strategies +- Explore various simplification orders +- Combine successful reductions + +## Output Format + +````markdown +# Delta Debugging Minimization Report + +## Original Test Case + +**Size:** [original size/complexity] +**Components:** [number of elements/lines/fields] +**Execution Time:** [duration] + +```[format] +[original test case - abbreviated if too long] +``` +```` + +## Minimization Process + +**Iterations:** [number] +**Time Taken:** [duration] +**Reduction Achieved:** [percentage] + +### Reduction Path + +1. [First major reduction] - Removed [what], Size: [new size] +2. [Second reduction] - Simplified [what], Size: [new size] +3. [Continue for significant reductions...] 
+ +## Minimal Reproduction + +### Test Case + +```[language] +// Minimal test case that reproduces bug +[minimized code/data] +``` + +### Requirements + +- **Environment:** [minimal environment needed] +- **Dependencies:** [only essential dependencies] +- **Configuration:** [minimal config] + +### Execution + +```bash +# Command to reproduce +[exact command] +``` + +### Expected vs Actual + +**Expected:** [what should happen] +**Actual:** [what happens (the bug)] + +## Analysis + +### Essential Elements + +These elements are required for reproduction: + +1. [Critical element 1] - Remove this and bug disappears +2. [Critical element 2] - Essential for triggering condition +3. [Continue for all essential elements] + +### Removed Elements + +These were safely removed without affecting the bug: + +- [Category]: [what was removed and why it's non-essential] +- [Continue for major categories] + +### Insights Gained + +[What the minimization reveals about the bug's nature] + +## Root Cause Hypothesis + +Based on minimal reproduction: +[What the essential elements suggest about root cause] + +## Next Steps + +1. Debug the minimal case using other techniques +2. Focus on interaction between essential elements +3. Test fix against both minimal and original cases + +``` + +## Completion Criteria +- [ ] Original failing case captured +- [ ] Minimization algorithm executed +- [ ] Minimal reproduction achieved +- [ ] Bug still reproduces with minimal case +- [ ] No further reduction possible +- [ ] Essential elements identified +- [ ] Clean reproduction documented +``` diff --git a/bmad-core/tasks/fagan-inspection.md b/bmad-core/tasks/fagan-inspection.md new file mode 100644 index 00000000..cf27a0b1 --- /dev/null +++ b/bmad-core/tasks/fagan-inspection.md @@ -0,0 +1,130 @@ +# fagan-inspection + +Comprehensive Fagan inspection for systematic bug analysis and resolution. 
+ +## Context + +This task performs systematic defect analysis using the proven 6-phase Fagan inspection methodology, achieving 60-90% defect detection rates through formal peer review. + +## Task Execution + +### Phase 1: Planning + +1. Identify inspection scope based on bug description +2. Define inspection criteria and success metrics +3. Generate inspection checklist based on bug type +4. Determine affected components and stakeholders + +### Phase 2: Overview + +1. Analyze recent commits for context and potential causes +2. Review feature specifications and implementation plans +3. Gather background context and related documentation +4. Identify impact scope and affected systems + +### Phase 3: Preparation + +1. Systematic artifact examination: + - Code analysis using pattern detection + - Test coverage analysis and execution results + - Configuration and environment analysis + - Documentation consistency check +2. Dependency analysis and version conflicts +3. Performance metrics and resource usage (if applicable) +4. Generate preliminary defect hypotheses + +### Phase 4: Inspection Meeting + +1. Execute systematic defect identification: + - Logic defects: Algorithm errors, control flow issues + - Interface defects: API misuse, parameter mismatches + - Data defects: Type mismatches, validation failures + - Documentation defects: Outdated or incorrect documentation +2. Root cause analysis using fishbone methodology +3. Impact assessment: Severity, scope, risk level +4. Categorize defects by type and priority + +### Phase 5: Rework Planning + +1. Generate fix proposals with tradeoff analysis +2. Design test strategy for validation +3. Risk assessment for proposed changes +4. Create implementation timeline +5. Plan regression testing approach + +### Phase 6: Follow-up + +1. Validate fix effectiveness against original bug +2. Update documentation and specifications +3. Capture lessons learned for prevention +4. 
Generate comprehensive debug report + +## Output Format + +Generate a structured debug report containing: + +```markdown +# Debug Report: [Bug Description] + +Session ID: [timestamp] +Date: [date] + +## Executive Summary + +[Brief overview of findings and recommendations] + +## Defect Analysis + +### Primary Defect + +- Type: [Logic/Interface/Data/Documentation] +- Severity: [Critical/High/Medium/Low] +- Location: [file:line] +- Description: [detailed description] + +### Contributing Factors + +[List of contributing issues] + +## Root Cause Identification + +### Root Cause + +[Detailed root cause explanation] + +### Evidence Trail + +[Step-by-step evidence leading to root cause] + +## Fix Recommendations + +### Immediate Fix + +[Code or configuration changes needed] + +### Long-term Prevention + +[Systemic improvements to prevent recurrence] + +## Test Strategy + +[Required tests to validate fix] + +## Risk Assessment + +- Regression Risk: [High/Medium/Low] +- Side Effects: [Potential side effects] +- Mitigation: [Risk mitigation steps] + +## Lessons Learned + +[Key takeaways for future prevention] +``` + +## Completion Criteria + +- [ ] All 6 phases completed +- [ ] Root cause identified with evidence +- [ ] Fix recommendations provided +- [ ] Test strategy defined +- [ ] Debug report generated diff --git a/bmad-core/tasks/instrumentation-analysis.md b/bmad-core/tasks/instrumentation-analysis.md new file mode 100644 index 00000000..98a63868 --- /dev/null +++ b/bmad-core/tasks/instrumentation-analysis.md @@ -0,0 +1,472 @@ +# instrumentation-analysis + +Design strategic logging and monitoring points for production debugging. + +## Context + +This task analyzes code to identify optimal locations for instrumentation (logging, metrics, tracing) that will aid in debugging production issues without impacting performance. It creates a comprehensive observability strategy. + +## Task Execution + +### Phase 1: Critical Path Analysis + +#### Identify Key Flows + +1. 
**User-Facing Paths**: Request → Response chains +2. **Business-Critical Paths**: Payment, authentication, data processing +3. **Performance-Sensitive Paths**: High-frequency operations +4. **Error-Prone Paths**: Historical problem areas +5. **Integration Points**: External service calls + +#### Map Decision Points + +- Conditional branches with business logic +- State transitions +- Error handling blocks +- Retry mechanisms +- Circuit breakers +- Cache hits/misses + +### Phase 2: Instrumentation Strategy + +#### Level 1: Essential Instrumentation + +**Entry/Exit Points:** + +``` +- Service boundaries (API endpoints) +- Function entry/exit for critical operations +- Database transaction boundaries +- External service calls +- Message queue operations +``` + +**Error Conditions:** + +``` +- Exception catches +- Validation failures +- Timeout occurrences +- Retry attempts +- Fallback activations +``` + +**Performance Markers:** + +``` +- Operation start/end times +- Queue depths +- Resource utilization +- Batch sizes +- Cache effectiveness +``` + +#### Level 2: Diagnostic Instrumentation + +**State Changes:** + +``` +- User state transitions +- Order/payment status changes +- Configuration updates +- Feature flag toggles +- Circuit breaker state changes +``` + +**Business Events:** + +``` +- User actions (login, purchase, etc.) 
+- System events (startup, shutdown) +- Scheduled job execution +- Data pipeline stages +- Workflow transitions +``` + +#### Level 3: Deep Debugging + +**Detailed Tracing:** + +``` +- Parameter values for complex functions +- Intermediate calculation results +- Loop iteration counts +- Branch decisions +- SQL query parameters +``` + +### Phase 3: Implementation Patterns + +#### Structured Logging Format + +**Standard Fields:** + +```json +{ + "timestamp": "ISO-8601", + "level": "INFO|WARN|ERROR", + "service": "service-name", + "trace_id": "correlation-id", + "span_id": "operation-id", + "user_id": "if-applicable", + "operation": "what-is-happening", + "duration_ms": "for-completed-ops", + "status": "success|failure", + "error": "error-details-if-any", + "metadata": { + "custom": "fields" + } +} +``` + +#### Performance-Conscious Patterns + +**Sampling Strategy:** + +``` +- 100% for errors +- 10% for normal operations +- 1% for high-frequency paths +- Dynamic adjustment based on load +``` + +**Async Logging:** + +``` +- Buffer non-critical logs +- Batch write to reduce I/O +- Use separate thread/process +- Implement backpressure handling +``` + +**Conditional Logging:** + +``` +- Debug level only in development +- Info level in staging +- Warn/Error in production +- Dynamic level adjustment via config +``` + +### Phase 4: Metrics Design + +#### Key Metrics to Track + +**RED Metrics:** + +- **Rate**: Requests per second +- **Errors**: Error rate/count +- **Duration**: Response time distribution + +**USE Metrics:** + +- **Utilization**: Resource usage percentage +- **Saturation**: Queue depth, wait time +- **Errors**: Resource allocation failures + +**Business Metrics:** + +- Transaction success rate +- Feature usage +- User journey completion +- Revenue impact + +#### Metric Implementation + +**Counter Examples:** + +``` +requests_total{method="GET", endpoint="/api/users", status="200"} +errors_total{type="database", operation="insert"} +``` + +**Histogram 
Examples:** + +``` +request_duration_seconds{method="GET", endpoint="/api/users"} +database_query_duration_ms{query_type="select", table="users"} +``` + +**Gauge Examples:** + +``` +active_connections{service="database"} +queue_depth{queue="email"} +``` + +### Phase 5: Tracing Strategy + +#### Distributed Tracing Points + +**Span Creation:** + +``` +- HTTP request handling +- Database operations +- Cache operations +- External API calls +- Message publishing/consuming +- Background job execution +``` + +**Context Propagation:** + +``` +- HTTP headers (X-Trace-Id) +- Message metadata +- Database comments +- Log correlation +``` + +## Output Format + +````markdown +# Instrumentation Analysis Report + +## Executive Summary + +**Components Analyzed:** [count] +**Current Coverage:** [percentage] +**Recommended Additions:** [count] +**Performance Impact:** [minimal/low/moderate] +**Implementation Effort:** [hours/days] + +## Critical Instrumentation Points + +### Priority 1: Immediate Implementation + +#### Service: [ServiceName] + +**Entry Points:** + +```[language] +// Location: [file:line] +// Current: No logging +// Recommended: +logger.info("Request received", { + method: req.method, + path: req.path, + user_id: req.user?.id, + trace_id: req.traceId +}); +``` +```` + +**Error Handling:** + +```[language] +// Location: [file:line] +// Current: Silent failure +// Recommended: +logger.error("Database operation failed", { + operation: "user_update", + user_id: userId, + error: err.message, + stack: err.stack, + retry_count: retries +}); +``` + +**Performance Tracking:** + +```[language] +// Location: [file:line] +// Recommended: +const startTime = Date.now(); +try { + const result = await expensiveOperation(); + metrics.histogram('operation_duration_ms', Date.now() - startTime, { + operation: 'expensive_operation', + status: 'success' + }); + return result; +} catch (error) { + metrics.histogram('operation_duration_ms', Date.now() - startTime, { + operation: 
'expensive_operation', + status: 'failure' + }); + throw error; +} +``` + +### Priority 2: Enhanced Observability + +[Similar format for medium priority points] + +### Priority 3: Deep Debugging + +[Similar format for low priority points] + +## Logging Strategy + +### Log Levels by Environment + +| Level | Development | Staging | Production | +| ----- | ----------- | ------- | ---------- | +| DEBUG | ✓ | ✓ | ✗ | +| INFO | ✓ | ✓ | Sampled | +| WARN | ✓ | ✓ | ✓ | +| ERROR | ✓ | ✓ | ✓ | + +### Sampling Configuration + +```yaml +sampling: + default: 0.01 # 1% sampling + rules: + - path: '/health' + sample_rate: 0.001 # 0.1% for health checks + - path: '/api/critical/*' + sample_rate: 0.1 # 10% for critical APIs + - level: 'ERROR' + sample_rate: 1.0 # 100% for errors +``` + +## Metrics Implementation + +### Application Metrics + +```[language] +// Metric definitions +const metrics = { + // Counters + requests: new Counter('http_requests_total', ['method', 'path', 'status']), + errors: new Counter('errors_total', ['type', 'operation']), + + // Histograms + duration: new Histogram('request_duration_ms', ['method', 'path']), + dbDuration: new Histogram('db_query_duration_ms', ['operation', 'table']), + + // Gauges + connections: new Gauge('active_connections', ['type']), + queueSize: new Gauge('queue_size', ['queue_name']) +}; +``` + +### Dashboard Queries + +```sql +-- Error rate by endpoint +SELECT + endpoint, + sum(errors) / sum(requests) as error_rate +FROM metrics +WHERE time > now() - 1h +GROUP BY endpoint + +-- P95 latency +SELECT + endpoint, + percentile(duration, 0.95) as p95_latency +FROM metrics +WHERE time > now() - 1h +GROUP BY endpoint +``` + +## Tracing Implementation + +### Trace Context + +```[language] +// Trace context propagation +class TraceContext { + constructor(traceId, spanId, parentSpanId) { + this.traceId = traceId || generateId(); + this.spanId = spanId || generateId(); + this.parentSpanId = parentSpanId; + } + + 
createChild() { + return new TraceContext(this.traceId, generateId(), this.spanId); + } +} + +// Usage +middleware.use((req, res, next) => { + req.trace = new TraceContext( + req.headers['x-trace-id'], + req.headers['x-span-id'], + req.headers['x-parent-span-id'] + ); + next(); +}); +``` + +## Performance Considerations + +### Impact Analysis + +| Instrumentation Type | CPU Impact | Memory Impact | I/O Impact | +| -------------------- | ---------- | ------------- | -------------- | +| Structured Logging | < 1% | < 10MB | Async buffered | +| Metrics Collection | < 0.5% | < 5MB | Batched | +| Distributed Tracing | < 2% | < 20MB | Sampled | + +### Optimization Techniques + +1. Use async logging with buffers +2. Implement sampling for high-frequency paths +3. Batch metric submissions +4. Use conditional compilation for debug logs +5. Implement circuit breakers for logging systems + +## Implementation Plan + +### Phase 1: Week 1 + +- [ ] Implement critical error logging +- [ ] Add service boundary instrumentation +- [ ] Set up basic metrics + +### Phase 2: Week 2 + +- [ ] Add performance tracking +- [ ] Implement distributed tracing +- [ ] Create initial dashboards + +### Phase 3: Week 3 + +- [ ] Add business event tracking +- [ ] Implement sampling strategies +- [ ] Performance optimization + +## Monitoring & Alerts + +### Critical Alerts + +```yaml +- name: high_error_rate + condition: error_rate > 0.01 + severity: critical + +- name: high_latency + condition: p95_latency > 1000ms + severity: warning + +- name: service_down + condition: health_check_failures > 3 + severity: critical +``` + +## Validation Checklist + +- [ ] No sensitive data in logs +- [ ] Trace IDs properly propagated +- [ ] Sampling rates appropriate +- [ ] Performance impact acceptable +- [ ] Dashboards created +- [ ] Alerts configured +- [ ] Documentation updated + +``` + +## Completion Criteria +- [ ] Critical paths identified +- [ ] Instrumentation points mapped +- [ ] Logging strategy defined +- 
[ ] Metrics designed +- [ ] Tracing plan created +- [ ] Performance impact assessed +- [ ] Implementation plan created +- [ ] Monitoring strategy defined +``` diff --git a/bmad-core/tasks/pattern-detection.md b/bmad-core/tasks/pattern-detection.md new file mode 100644 index 00000000..30fe908b --- /dev/null +++ b/bmad-core/tasks/pattern-detection.md @@ -0,0 +1,199 @@ +# pattern-detection + +Analyze code and commit history for defect patterns and systemic issues. + +## Context + +This task identifies recurring defect patterns, systemic issues, and common problem areas to enable proactive quality improvements. + +## Task Execution + +### Step 1: Historical Analysis + +#### Recent Commits Analysis + +1. Review last 20-50 commits for: + - Files frequently modified (hotspots) + - Repeated fix attempts + - Revert commits indicating instability + - Emergency/hotfix patterns + +#### Bug History Review + +1. Analyze recent bug reports for: + - Common symptoms + - Recurring locations + - Similar root causes + - Fix patterns + +### Step 2: Code Pattern Detection + +#### Anti-Pattern Identification + +Look for common problematic patterns: + +- God objects/functions (excessive responsibility) +- Copy-paste code (DRY violations) +- Dead code (unused functions/variables) +- Complex conditionals (cyclomatic complexity) +- Long parameter lists +- Inappropriate intimacy (tight coupling) + +#### Vulnerability Patterns + +Check for security/reliability issues: + +- Input validation gaps +- Error handling inconsistencies +- Resource leak patterns +- Race condition indicators +- SQL injection risks +- XSS vulnerabilities + +### Step 3: Architectural Pattern Analysis + +#### Dependency Issues + +- Circular dependencies +- Version conflicts +- Missing abstractions +- Leaky abstractions +- Inappropriate dependencies + +#### Design Smells + +- Violated SOLID principles +- Missing design patterns where needed +- Over-engineering indicators +- Technical debt accumulation + +### Step 4: Team 
Pattern Analysis + +#### Development Patterns + +- Rush commits (end of sprint) +- Incomplete implementations +- Missing tests for bug fixes +- Documentation gaps +- Code review oversights + +#### Communication Patterns + +- Misunderstood requirements +- Incomplete handoffs +- Knowledge silos +- Missing context in commits + +### Step 5: Pattern Correlation + +1. Group related patterns by: + - Component/module + - Developer/team + - Time period + - Feature area + +2. Identify correlations: + - Patterns that appear together + - Cascade effects + - Root pattern causing others + +## Output Format + +```markdown +# Defect Pattern Analysis Report + +## Executive Summary + +[High-level overview of key patterns found] + +## Critical Patterns Detected + +### Pattern 1: [Pattern Name] + +**Type:** [Anti-pattern/Vulnerability/Design/Process] +**Frequency:** [Number of occurrences] +**Locations:** + +- [file:line] +- [file:line] + +**Description:** [What the pattern is] +**Impact:** [Why it matters] +**Example:** [Code snippet or commit reference] +**Recommendation:** [How to address] + +## Hotspot Analysis + +### High-Change Files + +1. [filename] - [change count] changes, [bug count] bugs +2. [filename] - [change count] changes, [bug count] bugs + +### Complex Areas + +1. [component] - Complexity score: [number] +2. [component] - Complexity score: [number] + +## Systemic Issues + +### Issue 1: [Issue Name] + +**Pattern Indicators:** + +- [Pattern that indicates this issue] +- [Another indicator] + +**Root Cause:** [Underlying systemic problem] +**Affected Areas:** [Components/teams affected] +**Priority:** [Critical/High/Medium/Low] +**Remediation Strategy:** [How to fix systematically] + +## Trend Analysis + +### Improving Areas + +- [Area showing positive trends] + +### Degrading Areas + +- [Area showing negative trends] + +### Stable Problem Areas + +- [Persistent issues not getting better or worse] + +## Recommendations + +### Immediate Actions + +1. 
[Quick win to address patterns] +2. [Another quick action] + +### Short-term Improvements + +1. [1-2 sprint improvements] +2. [Process changes needed] + +### Long-term Strategy + +1. [Architectural changes] +2. [Team/process evolution] + +## Prevention Checklist + +- [ ] Add static analysis for [pattern] +- [ ] Implement pre-commit hooks for [issue] +- [ ] Create coding standards for [area] +- [ ] Add automated tests for [vulnerability] +- [ ] Improve documentation for [component] +``` + +## Completion Criteria + +- [ ] Historical analysis completed +- [ ] Code patterns identified +- [ ] Architectural issues found +- [ ] Team patterns analyzed +- [ ] Correlations established +- [ ] Recommendations provided +- [ ] Prevention strategies defined diff --git a/bmad-core/tasks/root-cause-analysis.md b/bmad-core/tasks/root-cause-analysis.md new file mode 100644 index 00000000..46554593 --- /dev/null +++ b/bmad-core/tasks/root-cause-analysis.md @@ -0,0 +1,148 @@ +# root-cause-analysis + +Focused root cause analysis using fishbone (Ishikawa) methodology. + +## Context + +This task performs systematic root cause analysis to identify the underlying causes of defects, moving beyond symptoms to address fundamental issues. + +## Task Execution + +### Step 1: Problem Definition + +1. Clearly state the problem/symptom +2. Define when it occurs (timing, frequency) +3. Define where it occurs (component, environment) +4. 
Quantify the impact (users affected, severity) + +### Step 2: Fishbone Analysis Categories + +Analyze the problem across these dimensions: + +#### People (Developer/User factors) + +- Knowledge gaps or misunderstandings +- Communication breakdowns +- Incorrect assumptions +- User behavior patterns + +#### Process (Development/Deployment) + +- Missing validation steps +- Inadequate testing coverage +- Deployment procedures +- Code review gaps + +#### Technology (Tools/Infrastructure) + +- Framework limitations +- Library bugs or incompatibilities +- Infrastructure issues +- Tool configuration problems + +#### Environment (System/Configuration) + +- Environment-specific settings +- Resource constraints +- External dependencies +- Network or connectivity issues + +#### Data (Input/State) + +- Invalid or unexpected input +- Data corruption or inconsistency +- State management issues +- Race conditions + +#### Methods (Algorithms/Design) + +- Algorithm flaws +- Design pattern misuse +- Architecture limitations +- Performance bottlenecks + +### Step 3: 5-Whys Deep Dive + +For each potential cause identified: + +1. Ask "Why does this happen?" +2. For each answer, ask "Why?" again +3. Continue until reaching the root cause (typically 5 iterations) +4. 
Document the chain of causation + +### Step 4: Evidence Collection + +For each identified root cause: + +- Gather supporting evidence (logs, code, metrics) +- Verify through reproduction or testing +- Rule out alternative explanations +- Establish confidence level + +### Step 5: Root Cause Prioritization + +Rank root causes by: + +- Likelihood (probability this is the true cause) +- Impact (severity if this is the cause) +- Effort (complexity to address) +- Risk (potential for recurrence) + +## Output Format + +```markdown +# Root Cause Analysis: [Problem Description] + +## Problem Statement + +**What:** [Clear problem description] +**When:** [Timing/frequency] +**Where:** [Location/component] +**Impact:** [Quantified impact] + +## Fishbone Analysis + +### Category: [People/Process/Technology/Environment/Data/Methods] + +**Potential Cause:** [Description] +**5-Whys Analysis:** + +1. Why? [Answer] +2. Why? [Answer] +3. Why? [Answer] +4. Why? [Answer] +5. Why? [Root cause] + +**Evidence:** [Supporting data/logs/code] +**Confidence:** [High/Medium/Low] + +## Root Cause Summary + +### Primary Root Cause + +[Most likely root cause with evidence] + +### Contributing Factors + +1. [Secondary cause] +2. [Tertiary cause] + +## Recommended Actions + +1. **Immediate:** [Quick fix to address symptom] +2. **Short-term:** [Fix root cause] +3. 
**Long-term:** [Prevent recurrence] + +## Verification Plan + +[How to verify the root cause is correctly identified] +``` + +## Completion Criteria + +- [ ] Problem clearly defined +- [ ] Fishbone analysis completed +- [ ] 5-Whys analysis performed +- [ ] Evidence collected and verified +- [ ] Root cause identified with confidence level +- [ ] Action plan created diff --git a/bmad-core/tasks/static-analysis.md b/bmad-core/tasks/static-analysis.md new file mode 100644 index 00000000..15c4fb5f --- /dev/null +++ b/bmad-core/tasks/static-analysis.md @@ -0,0 +1,294 @@ +# static-analysis + +Comprehensive static analysis for defect detection and code quality assessment. + +## Context + +This task performs deep static analysis to identify bugs, anti-patterns, security vulnerabilities, and code quality issues without executing the code. It combines multiple analysis techniques to provide a comprehensive view of potential problems. + +## Task Execution + +### Phase 1: Multi-Layer Analysis + +#### Layer 1: Syntax and Style Analysis + +1. **Syntax Errors**: Malformed code that won't compile/run +2. **Style Violations**: Inconsistent formatting, naming conventions +3. **Dead Code**: Unreachable code, unused variables/functions +4. **Code Duplication**: Copy-paste code blocks + +#### Layer 2: Semantic Analysis + +1. **Type Issues**: Type mismatches, implicit conversions +2. **Logic Errors**: Always true/false conditions, impossible states +3. **Resource Leaks**: Unclosed files, unreleased memory +4. **API Misuse**: Incorrect parameter order, deprecated methods + +#### Layer 3: Flow Analysis + +1. **Control Flow**: Infinite loops, unreachable code, missing returns +2. **Data Flow**: Uninitialized variables, unused assignments +3. **Exception Flow**: Unhandled exceptions, empty catch blocks +4. **Null Flow**: Potential null dereferences + +#### Layer 4: Security Analysis + +1. **Injection Vulnerabilities**: SQL, XSS, command injection +2. 
**Authentication Issues**: Hardcoded credentials, weak crypto +3. **Data Exposure**: Sensitive data in logs, unencrypted storage +4. **Access Control**: Missing authorization, privilege escalation + +### Phase 2: Pattern Detection + +#### Anti-Patterns to Detect + +**Code Smells:** + +``` +- God Classes/Functions (too much responsibility) +- Long Parameter Lists (>3-4 parameters) +- Feature Envy (excessive external data access) +- Data Clumps (repeated parameter groups) +- Primitive Obsession (overuse of primitives) +- Switch Statements (missing polymorphism) +- Lazy Class (too little responsibility) +- Speculative Generality (unused abstraction) +- Message Chains (deep coupling) +- Middle Man (unnecessary delegation) +``` + +**Performance Issues:** + +``` +- N+1 Queries (database inefficiency) +- Synchronous I/O in async context +- Inefficient Algorithms (O(n²) when O(n) possible) +- Memory Leaks (retained references) +- Excessive Object Creation (GC pressure) +- String Concatenation in Loops +- Missing Indexes (database) +- Blocking Operations (thread starvation) +``` + +**Concurrency Issues:** + +``` +- Race Conditions (unsynchronized access) +- Deadlocks (circular wait) +- Thread Leaks (unclosed threads) +- Missing Volatile (visibility issues) +- Double-Checked Locking (broken pattern) +- Lock Contention (performance bottleneck) +``` + +### Phase 3: Complexity Analysis + +#### Metrics Calculation + +1. **Cyclomatic Complexity**: Number of linearly independent paths +2. **Cognitive Complexity**: How difficult code is to understand +3. **Halstead Metrics**: Program vocabulary and difficulty +4. **Maintainability Index**: Composite maintainability score +5. **Technical Debt**: Estimated time to fix all issues +6. 
**Test Coverage**: Lines/branches/functions covered + +#### Thresholds + +``` +Cyclomatic Complexity: +- Good: < 10 +- Acceptable: 10-20 +- Complex: 20-50 +- Untestable: > 50 + +Cognitive Complexity: +- Simple: < 5 +- Moderate: 5-10 +- Complex: 10-15 +- Very Complex: > 15 +``` + +### Phase 4: Dependency Analysis + +#### Identify Issues + +1. **Circular Dependencies**: A→B→C→A cycles +2. **Version Conflicts**: Incompatible dependency versions +3. **Security Vulnerabilities**: Known CVEs in dependencies +4. **License Conflicts**: Incompatible license combinations +5. **Outdated Packages**: Dependencies needing updates +6. **Unused Dependencies**: Declared but not used + +### Phase 5: Architecture Analysis + +#### Structural Issues + +1. **Layer Violations**: Cross-layer dependencies +2. **Module Coupling**: High interdependence +3. **Missing Abstractions**: Direct implementation dependencies +4. **Inconsistent Patterns**: Mixed architectural styles +5. **God Objects**: Central points of failure + +## Automated Tools Integration + +Simulate output from common static analysis tools: + +**ESLint/TSLint** (JavaScript/TypeScript) +**Pylint/Flake8** (Python) +**SonarQube** (Multi-language) +**PMD/SpotBugs** (Java) +**RuboCop** (Ruby) +**SwiftLint** (Swift) + +## Output Format + +````markdown +# Static Analysis Report + +## Executive Summary + +**Files Analyzed:** [count] +**Total Issues:** [count] +**Critical:** [count] | **High:** [count] | **Medium:** [count] | **Low:** [count] +**Technical Debt:** [hours/days estimated] +**Code Coverage:** [percentage] + +## Critical Issues (Immediate Action Required) + +### Issue 1: [Security Vulnerability] + +**File:** [path:line] +**Category:** Security +**Rule:** [CWE-ID or rule name] + +```[language] +// Vulnerable code +[code snippet] +``` +```` + +**Risk:** [Description of security risk] +**Fix:** + +```[language] +// Secure code +[fixed code] +``` + +### Issue 2: [Logic Error] + +[Similar format] + +## High Priority Issues 
+ +### Category: Performance + +| File | Line | Issue | Impact | Fix Effort | +| ------ | ------ | --------------- | ------------ | ---------- | +| [file] | [line] | N+1 Query | High latency | 2 hours | +| [file] | [line] | O(n²) algorithm | CPU spike | 4 hours | + +### Category: Reliability + +[Similar table format] + +## Code Quality Metrics + +### Complexity Analysis + +| File | Cyclomatic | Cognitive | Maintainability | Action | +| ------ | ---------- | --------- | --------------- | -------- | +| [file] | 45 (High) | 28 (High) | 35 (Low) | Refactor | +| [file] | 32 (Med) | 18 (Med) | 55 (Med) | Review | + +### Duplication Analysis + +**Total Duplication:** [percentage] +**Largest Duplicate:** [lines] lines in [files] + +### Top Duplicated Blocks: + +1. [File A:lines] ↔ [File B:lines] - [line count] lines +2. [File C:lines] ↔ [File D:lines] - [line count] lines + +## Anti-Pattern Detection + +### God Classes + +1. **[ClassName]** - [methods] methods, [lines] lines + - Responsibilities: [list] + - Suggested Split: [recommendations] + +### Long Methods + +1. **[methodName]** - [lines] lines, complexity: [score] + - Extract Methods: [suggestions] + +## Security Scan Results + +### Vulnerabilities by Category + +- Injection: [count] +- Authentication: [count] +- Data Exposure: [count] +- Access Control: [count] + +### Detailed Findings + +[List each with severity, location, and fix] + +## Dependency Analysis + +### Security Vulnerabilities + +| Package | Version | CVE | Severity | Fixed Version | +| ------- | ------- | -------- | -------- | ------------- | +| [pkg] | [ver] | [CVE-ID] | Critical | [ver] | + +### Outdated Dependencies + +| Package | Current | Latest | Breaking Changes | +| ------- | ------- | ------ | ---------------- | +| [pkg] | [ver] | [ver] | [Yes/No] | + +## Recommendations + +### Immediate Actions (This Sprint) + +1. Fix all critical security vulnerabilities +2. Resolve high-severity logic errors +3. 
Update vulnerable dependencies + +### Short-term (Next Sprint) + +1. Refactor high-complexity functions +2. Remove code duplication +3. Add missing error handling + +### Long-term (Technical Debt) + +1. Architectural improvements +2. Comprehensive refactoring +3. Test coverage improvement + +## Trend Analysis + +**Compared to Last Scan:** + +- Issues: [+/-X] +- Complexity: [+/-Y] +- Coverage: [+/-Z%] +- Technical Debt: [+/-N hours] + +``` + +## Completion Criteria +- [ ] All analysis layers completed +- [ ] Issues categorized by severity +- [ ] Metrics calculated +- [ ] Anti-patterns identified +- [ ] Security vulnerabilities found +- [ ] Dependencies analyzed +- [ ] Recommendations prioritized +- [ ] Fixes suggested for critical issues +``` diff --git a/bmad-core/tasks/walkthrough-prep.md b/bmad-core/tasks/walkthrough-prep.md new file mode 100644 index 00000000..11f98d12 --- /dev/null +++ b/bmad-core/tasks/walkthrough-prep.md @@ -0,0 +1,363 @@ +# walkthrough-prep + +Generate comprehensive materials for code walkthrough sessions. + +## Context + +This task prepares all necessary documentation, checklists, and presentation materials for conducting effective code walkthroughs. It ensures reviewers have everything needed to provide valuable feedback while minimizing meeting time. + +## Task Execution + +### Phase 1: Scope Analysis + +#### Determine Walkthrough Type + +1. **Feature Walkthrough**: New functionality +2. **Bug Fix Walkthrough**: Defect resolution +3. **Refactoring Walkthrough**: Code improvement +4. **Architecture Walkthrough**: Design decisions +5. **Security Walkthrough**: Security-focused review + +#### Identify Key Components + +1. Changed files and their purposes +2. Dependencies affected +3. Test coverage added/modified +4. Documentation updates +5. Configuration changes + +### Phase 2: Material Generation + +#### 1. 
Executive Summary + +Create high-level overview: + +- Purpose and goals +- Business value/impact +- Technical approach +- Key decisions made +- Risks and mitigations + +#### 2. Technical Overview + +**Architecture Diagram:** + +``` +[Component A] → [Component B] → [Component C] + ↓ ↓ ↓ +[Database] [External API] [Cache] +``` + +**Data Flow:** + +``` +1. User Input → Validation +2. Validation → Processing +3. Processing → Storage +4. Storage → Response +``` + +**Sequence Diagram:** + +``` +User → Frontend: Request +Frontend → Backend: API Call +Backend → Database: Query +Database → Backend: Results +Backend → Frontend: Response +Frontend → User: Display +``` + +#### 3. Code Change Summary + +**Statistics:** + +- Files changed: [count] +- Lines added: [count] +- Lines removed: [count] +- Test coverage: [before]% → [after]% +- Complexity change: [delta] + +**Change Categories:** + +- New features: [list] +- Modifications: [list] +- Deletions: [list] +- Refactoring: [list] + +### Phase 3: Review Checklist Generation + +#### Core Review Areas + +**Functionality Checklist:** + +- [ ] Requirements met +- [ ] Edge cases handled +- [ ] Error handling complete +- [ ] Performance acceptable +- [ ] Backwards compatibility maintained + +**Code Quality Checklist:** + +- [ ] Naming conventions followed +- [ ] DRY principle applied +- [ ] SOLID principles followed +- [ ] Comments appropriate +- [ ] No code smells + +**Testing Checklist:** + +- [ ] Unit tests added +- [ ] Integration tests updated +- [ ] Edge cases tested +- [ ] Performance tested +- [ ] Regression tests pass + +**Security Checklist:** + +- [ ] Input validation implemented +- [ ] Authentication checked +- [ ] Authorization verified +- [ ] Data sanitized +- [ ] Secrets not exposed + +**Documentation Checklist:** + +- [ ] Code comments updated +- [ ] README updated +- [ ] API docs updated +- [ ] Changelog updated +- [ ] Deployment docs updated + +### Phase 4: Presentation Structure + +#### 
Slide/Section Outline + +**1. Introduction (2 min)** + +- Problem statement +- Solution overview +- Success criteria + +**2. Technical Approach (5 min)** + +- Architecture decisions +- Implementation choices +- Trade-offs made + +**3. Code Walkthrough (15 min)** + +- Key components tour +- Critical logic explanation +- Integration points + +**4. Testing Strategy (3 min)** + +- Test coverage +- Test scenarios +- Performance results + +**5. Discussion (5 min)** + +- Open questions +- Concerns +- Suggestions + +### Phase 5: Supporting Documentation + +#### Code Snippets + +Extract and annotate key code sections: + +```[language] +// BEFORE: Original implementation +[original code] + +// AFTER: New implementation +[new code] + +// KEY CHANGES: +// 1. [Change 1 explanation] +// 2. [Change 2 explanation] +``` + +#### Test Cases + +Document critical test scenarios: + +```[language] +// Test Case 1: [Description] +// Input: [test input] +// Expected: [expected output] +// Covers: [what it validates] +``` + +#### Performance Metrics + +If applicable: + +- Execution time: [before] → [after] +- Memory usage: [before] → [after] +- Database queries: [before] → [after] + +## Output Format + +````markdown +# Code Walkthrough Package: [Feature/Fix Name] + +## Quick Reference + +**Date:** [scheduled date] +**Duration:** [estimated time] +**Presenter:** [name] +**Reviewers:** [list] +**Repository:** [link] +**Branch/PR:** [link] + +## Executive Summary + +[2-3 paragraph overview] + +## Agenda + +1. Introduction (2 min) +2. Technical Overview (5 min) +3. Code Walkthrough (15 min) +4. Testing & Validation (3 min) +5. 
Q&A (5 min) + +## Pre-Review Checklist + +**For Reviewers - Complete Before Meeting:** + +- [ ] Read executive summary +- [ ] Review changed files list +- [ ] Note initial questions +- [ ] Check test results + +## Technical Overview + +### Architecture + +[Include diagrams] + +### Key Changes + +| Component | Type | Description | Risk | +| --------- | ------------- | -------------- | -------------- | +| [name] | [New/Mod/Del] | [what changed] | [Low/Med/High] | + +### Dependencies + +**Added:** [list] +**Modified:** [list] +**Removed:** [list] + +## Code Highlights + +### Critical Section 1: [Name] + +**File:** [path] +**Purpose:** [why this is important] + +```[language] +[annotated code snippet] +``` +```` + +**Discussion Points:** + +- [Question or concern] +- [Alternative considered] + +### Critical Section 2: [Name] + +[Similar format] + +## Testing Summary + +### Coverage + +- Unit Tests: [count] tests, [%] coverage +- Integration Tests: [count] tests +- Manual Testing: [checklist items] + +### Key Test Scenarios + +1. [Scenario]: [Result] +2. [Scenario]: [Result] + +## Review Checklist + +### Must Review + +- [ ] [Critical file/function] +- [ ] [Security-sensitive code] +- [ ] [Performance-critical section] + +### Should Review + +- [ ] [Important logic] +- [ ] [API changes] +- [ ] [Database changes] + +### Nice to Review + +- [ ] [Refactoring] +- [ ] [Documentation] +- [ ] [Tests] + +## Known Issues & Decisions + +### Open Questions + +1. [Question needing group input] +2. [Design decision to validate] + +### Technical Debt + +- [Debt item]: [Planned resolution] + +### Future Improvements + +- [Improvement]: [Timeline] + +## Post-Review Action Items + +**To be filled during review:** + +- [ ] Action: [description] - Owner: [name] +- [ ] Action: [description] - Owner: [name] + +## Appendix + +### A. Full File List + +[Complete list of changed files] + +### B. Test Results + +[Test execution summary] + +### C. 
Performance Benchmarks + +[If applicable] + +### D. Related Documentation + +- [Design Doc]: [link] +- [Requirements]: [link] +- [Previous Reviews]: [link] + +``` + +## Completion Criteria +- [ ] Scope analyzed +- [ ] Executive summary written +- [ ] Technical overview created +- [ ] Code highlights selected +- [ ] Review checklist generated +- [ ] Presentation structure defined +- [ ] Supporting docs prepared +- [ ] Package formatted for distribution +``` diff --git a/bmad-core/tasks/wolf-fence-search.md b/bmad-core/tasks/wolf-fence-search.md new file mode 100644 index 00000000..9c4eecce --- /dev/null +++ b/bmad-core/tasks/wolf-fence-search.md @@ -0,0 +1,168 @@ +# wolf-fence-search + +Binary search debugging to systematically isolate bug location. + +## Context + +This task implements the Wolf Fence algorithm (binary search debugging) to efficiently locate bugs by repeatedly dividing the search space in half. Named after the problem: "There's one wolf in Alaska; how do you find it? Build a fence down the middle, wait for the wolf to howl, determine which side it's on, and repeat." + +## Task Execution + +### Phase 1: Initial Analysis + +1. Identify the boundaries of the problem space: + - Entry point where system is working + - Exit point where bug manifests + - Code path between these points +2. Determine testable checkpoints +3. Calculate optimal division points + +### Phase 2: Binary Search Implementation + +#### Step 1: Divide Search Space + +1. Identify midpoint of current search area +2. Insert diagnostic checkpoint at midpoint: + - Add assertion to verify expected state + - Add logging to capture actual state + - Add breakpoint if interactive debugging available + +#### Step 2: Test and Observe + +1. Execute code up to checkpoint +2. Verify if bug has manifested: + - State is correct → Bug is in second half + - State is incorrect → Bug is in first half + - Cannot determine → Need better checkpoint + +#### Step 3: Narrow Focus + +1. 
Select the half containing the bug +2. Repeat division process +3. Continue until bug location is isolated to: + - Single function + - Few lines of code + - Specific data transformation + +### Phase 3: Refinement + +#### For Complex Bugs + +1. **Multi-dimensional search**: When bug depends on multiple factors + - Apply binary search on each dimension + - Create test matrix for combinations + +2. **Time-based search**: For timing/concurrency issues + - Binary search on execution timeline + - Add timestamps to narrow race conditions + +3. **Data-based search**: For data-dependent bugs + - Binary search on input size + - Isolate problematic data patterns + +### Phase 4: Bug Isolation + +Once narrowed to small code section: + +1. Analyze the isolated code thoroughly +2. Identify exact failure mechanism +3. Verify bug reproduction in isolation +4. Document minimal reproduction case + +## Automated Implementation + +### Checkpoint Generation Strategy + +```markdown +1. Identify all function boundaries in path +2. Select optimal checkpoint locations: + - Function entry/exit points + - Loop boundaries + - Conditional branches + - Data transformations + +3. 
Insert non-invasive checkpoints: + - Use existing logging if available + - Add temporary assertions + - Leverage existing test infrastructure +``` + +### Search Optimization + +- Start with coarse-grained divisions (module/class level) +- Progressively move to fine-grained (function/line level) +- Skip obviously correct sections based on static analysis +- Prioritize high-probability areas based on: + - Recent changes + - Historical bug density + - Code complexity metrics + +## Output Format + +````markdown +# Wolf Fence Debug Analysis + +## Search Summary + +**Initial Scope:** [entry point] → [exit point] +**Final Location:** [specific file:line] +**Iterations Required:** [number] +**Time to Isolate:** [duration] + +## Search Path + +### Iteration 1 + +- **Search Space:** [full range] +- **Checkpoint:** [location] +- **Result:** Bug in [first/second] half +- **Evidence:** [what was observed] + +### Iteration 2 + +- **Search Space:** [narrowed range] +- **Checkpoint:** [location] +- **Result:** Bug in [first/second] half +- **Evidence:** [what was observed] + +[Continue for all iterations...] 
+ +## Bug Location + +**File:** [path] +**Function:** [name] +**Lines:** [range] +**Description:** [what the bug is] + +## Minimal Reproduction + +```[language] +// Minimal code to reproduce +[code snippet] +``` +```` + +## Root Cause + +[Brief explanation of why bug occurs] + +## Recommended Fix + +[Suggested solution] + +## Verification Points + +- [ ] Bug reproducible at isolated location +- [ ] Fix resolves issue at checkpoint +- [ ] No regression in other checkpoints + +``` + +## Completion Criteria +- [ ] Search space properly bounded +- [ ] Binary search completed +- [ ] Bug location isolated +- [ ] Minimal reproduction created +- [ ] Root cause identified +- [ ] Fix recommendation provided +``` diff --git a/bmad-core/templates/debug-report-tmpl.yaml b/bmad-core/templates/debug-report-tmpl.yaml new file mode 100644 index 00000000..52266c31 --- /dev/null +++ b/bmad-core/templates/debug-report-tmpl.yaml @@ -0,0 +1,234 @@ +# +template: + id: debug-report-template-v1 + name: Debug Analysis Report + version: 1.0 + output: + format: markdown + filename: docs/debug/debug-report-{{timestamp}}.md + title: "Debug Analysis Report - {{problem_title}}" + +workflow: + mode: rapid + elicitation: false + +sections: + - id: header + title: Report Header + instruction: Generate report header with metadata + sections: + - id: metadata + title: Report Metadata + type: key-value + instruction: | + Report ID: DBG-{{timestamp}} + Date: {{current_date}} + Analyst: Debug Agent (Diana) + Severity: {{severity_level}} + Status: {{status}} + + - id: executive-summary + title: Executive Summary + instruction: Provide concise summary under 200 words + sections: + - id: problem + title: Problem + type: text + instruction: 1-2 sentence problem statement + - id: impact + title: Impact + type: text + instruction: Quantified impact on users/system + - id: root-cause + title: Root Cause + type: text + instruction: Brief root cause description + - id: solution + title: Solution + type: text + 
instruction: High-level fix description + - id: metrics + title: Key Metrics + type: key-value + instruction: | + Effort Required: {{effort_estimate}} + Risk Level: {{risk_level}} + + - id: problem-description + title: Problem Description + instruction: Detailed problem analysis + sections: + - id: symptoms + title: Symptoms + type: paragraphs + instruction: Detailed symptoms observed + - id: reproduction + title: Reproduction + type: numbered-list + instruction: Step-by-step reproduction steps with expected vs actual + - id: environment + title: Environment + type: bullet-list + instruction: | + - System: {{system_info}} + - Application: {{application_version}} + - Dependencies: {{dependencies_list}} + - Configuration: {{configuration_details}} + - id: timeline + title: Timeline + type: bullet-list + instruction: | + - First Observed: {{first_observed}} + - Frequency: {{occurrence_frequency}} + - Last Occurrence: {{last_occurrence}} + + - id: technical-analysis + title: Technical Analysis + instruction: Deep technical investigation results + sections: + - id: root-cause-analysis + title: Root Cause Analysis + type: paragraphs + instruction: Detailed root cause with evidence + - id: code-analysis + title: Code Analysis + type: code-block + instruction: | + [[LLM: Include problematic code snippet with language specified]] + Issue: {{code_issue_description}} + - id: pattern-analysis + title: Pattern Analysis + type: paragraphs + instruction: Any patterns detected in the defect + - id: test-coverage + title: Test Coverage + type: bullet-list + instruction: | + - Current Coverage: {{coverage_percentage}} + - Gap Identified: {{coverage_gaps}} + - Risk Areas: {{untested_areas}} + + - id: impact-assessment + title: Impact Assessment + instruction: Comprehensive impact analysis + sections: + - id: severity-matrix + title: Severity Matrix + type: table + columns: [Aspect, Impact, Severity] + instruction: | + [[LLM: Create table with Users Affected, Data Integrity, 
Performance, Security aspects]] + - id: business-impact + title: Business Impact + type: paragraphs + instruction: Business consequences of the issue + + - id: solution-recommendations + title: Solution & Recommendations + instruction: Fix proposals and prevention strategies + sections: + - id: immediate-fix + title: Immediate Fix + type: code-block + instruction: | + [[LLM: Include corrected code with validation steps]] + - id: short-term + title: Short-term Improvements + type: bullet-list + instruction: Improvements for this sprint + - id: long-term + title: Long-term Prevention + type: bullet-list + instruction: Strategic prevention measures + - id: process + title: Process Enhancements + type: bullet-list + instruction: Process improvements to prevent recurrence + + - id: implementation-plan + title: Implementation Plan + instruction: Phased approach to resolution + sections: + - id: phase1 + title: "Phase 1: Immediate (0-2 days)" + type: checkbox-list + instruction: Critical fixes to apply immediately + - id: phase2 + title: "Phase 2: Short-term (1 week)" + type: checkbox-list + instruction: Short-term improvements + - id: phase3 + title: "Phase 3: Long-term (1 month)" + type: checkbox-list + instruction: Long-term strategic changes + + - id: verification-testing + title: Verification & Testing + instruction: Validation strategy + sections: + - id: test-cases + title: Test Cases + type: numbered-list + instruction: Specific test cases to validate the fix + - id: regression + title: Regression Testing + type: paragraphs + instruction: Areas requiring regression testing + - id: monitoring + title: Monitoring + type: bullet-list + instruction: Metrics to monitor post-fix + + - id: lessons-learned + title: Lessons Learned + instruction: Knowledge capture for prevention + sections: + - id: what-went-wrong + title: What Went Wrong + type: paragraphs + instruction: Root causes beyond the code + - id: improvements + title: What Could Improve + type: bullet-list + 
instruction: Process and tool improvements + - id: knowledge-sharing + title: Knowledge Sharing + type: bullet-list + instruction: Information to share with team + + - id: appendices + title: Appendices + instruction: Supporting documentation + optional: true + sections: + - id: stack-traces + title: "Appendix A: Full Stack Traces" + type: code-block + instruction: Complete error traces if available + - id: logs + title: "Appendix B: Log Excerpts" + type: code-block + instruction: Relevant log entries + - id: metrics + title: "Appendix C: Performance Metrics" + type: paragraphs + instruction: Before/after performance data + - id: related + title: "Appendix D: Related Issues" + type: bullet-list + instruction: Links to similar problems + - id: references + title: "Appendix E: References" + type: bullet-list + instruction: Documentation, articles, tools used + + - id: footer + title: Report Footer + instruction: Closing metadata + sections: + - id: timestamps + title: Report Timestamps + type: key-value + instruction: | + Report Generated: {{generation_timestamp}} + Next Review: {{follow_up_date}} diff --git a/bmad-core/templates/defect-analysis-tmpl.yaml b/bmad-core/templates/defect-analysis-tmpl.yaml new file mode 100644 index 00000000..6cbc09d8 --- /dev/null +++ b/bmad-core/templates/defect-analysis-tmpl.yaml @@ -0,0 +1,339 @@ +# +template: + id: defect-analysis-template-v1 + name: Defect Analysis Report + version: 1.0 + output: + format: markdown + filename: docs/debug/defect-{{defect_id}}.md + title: "Defect Analysis Report - DEF-{{defect_id}}" + +workflow: + mode: rapid + elicitation: false + +sections: + - id: header + title: Report Header + instruction: Generate report header with metadata + sections: + - id: metadata + title: Report Metadata + type: key-value + instruction: | + Defect ID: DEF-{{defect_id}} + Date: {{current_date}} + Analyst: {{analyst_name}} + Component: {{affected_component}} + + - id: classification + title: Defect Classification + 
instruction: Categorize and classify the defect + sections: + - id: basic-info + title: Basic Information + type: key-value + instruction: | + Type: {{defect_type}} + Severity: {{severity_level}} + Priority: {{priority_level}} + Status: {{current_status}} + Environment: {{environment}} + - id: categorization + title: Categorization + type: key-value + instruction: | + Category: {{defect_category}} + Subcategory: {{defect_subcategory}} + Root Cause Type: {{root_cause_type}} + Detection Method: {{how_detected}} + + - id: description + title: Defect Description + instruction: Comprehensive defect details + sections: + - id: summary + title: Summary + type: text + instruction: Brief one-line defect summary + - id: detailed + title: Detailed Description + type: paragraphs + instruction: Complete description of the defect and its behavior + - id: expected + title: Expected Behavior + type: paragraphs + instruction: What should happen under normal conditions + - id: actual + title: Actual Behavior + type: paragraphs + instruction: What actually happens when the defect occurs + - id: delta + title: Delta Analysis + type: paragraphs + instruction: Analysis of the difference between expected and actual + + - id: reproduction + title: Reproduction + instruction: How to reproduce the defect + sections: + - id: prerequisites + title: Prerequisites + type: bullet-list + instruction: Required setup, data, or conditions before reproduction + - id: steps + title: Steps to Reproduce + type: numbered-list + instruction: Exact steps to trigger the defect + - id: frequency + title: Frequency + type: key-value + instruction: | + Reproducibility: {{reproducibility_rate}} + Occurrence Pattern: {{occurrence_pattern}} + Triggers: {{trigger_conditions}} + + - id: technical-analysis + title: Technical Analysis + instruction: Deep technical investigation + sections: + - id: location + title: Code Location + type: key-value + instruction: | + File: {{file_path}} + Function/Method: 
{{function_name}}
+          Line Numbers: {{line_numbers}}
+          Module: {{module_name}}
+      - id: code
+        title: Code Snippet
+        type: code-block
+        instruction: |
+          [[LLM: Include the defective code with proper syntax highlighting]]
+      - id: mechanism
+        title: Defect Mechanism
+        type: paragraphs
+        instruction: Detailed explanation of how the defect works
+      - id: data-flow
+        title: Data Flow Analysis
+        type: paragraphs
+        instruction: How data flows through the defective code
+      - id: control-flow
+        title: Control Flow Analysis
+        type: paragraphs
+        instruction: Control flow issues contributing to the defect
+
+  - id: impact-assessment
+    title: Impact Assessment
+    instruction: Comprehensive impact analysis
+    sections:
+      - id: user-impact
+        title: User Impact
+        type: key-value
+        instruction: |
+          Affected Users: {{users_affected}}
+          User Experience: {{ux_impact}}
+          Workaround Available: {{workaround_exists}}
+          Workaround Description: {{workaround_details}}
+      - id: system-impact
+        title: System Impact
+        type: key-value
+        instruction: |
+          Performance: {{performance_impact}}
+          Stability: {{stability_impact}}
+          Security: {{security_impact}}
+          Data Integrity: {{data_impact}}
+      - id: business-impact
+        title: Business Impact
+        type: key-value
+        instruction: |
+          Revenue Impact: {{revenue_impact}}
+          Reputation Risk: {{reputation_risk}}
+          Compliance Issues: {{compliance_impact}}
+          SLA Violations: {{sla_impact}}
+
+  - id: root-cause
+    title: Root Cause
+    instruction: Root cause identification
+    sections:
+      - id: immediate
+        title: Immediate Cause
+        type: paragraphs
+        instruction: The direct cause of the defect
+      - id: underlying
+        title: Underlying Cause
+        type: paragraphs
+        instruction: The deeper systemic cause
+      - id: contributing
+        title: Contributing Factors
+        type: bullet-list
+        instruction: Factors that contributed to the defect
+      - id: prevention-failure
+        title: Prevention Failure
+        type: paragraphs
+        instruction: Why existing processes didn't prevent this defect
+
+  - id: fix-analysis
+    title: Fix Analysis
+    instruction: Solution proposals
+    sections:
+      - id: proposed-fix
+        title: Proposed Fix
+        type: code-block
+        instruction: |
+          [[LLM: Include the corrected code with proper syntax highlighting]]
+      - id: explanation
+        title: Fix Explanation
+        type: paragraphs
+        instruction: Detailed explanation of how the fix works
+      - id: alternatives
+        title: Alternative Solutions
+        type: numbered-list
+        instruction: Other possible solutions with pros/cons
+      - id: tradeoffs
+        title: Trade-offs
+        type: bullet-list
+        instruction: Trade-offs of the chosen solution
+
+  - id: testing-strategy
+    title: Testing Strategy
+    instruction: Comprehensive test plan
+    sections:
+      - id: unit-tests
+        title: Unit Tests Required
+        type: checkbox-list
+        instruction: Unit tests to validate the fix
+      - id: integration-tests
+        title: Integration Tests Required
+        type: checkbox-list
+        instruction: Integration tests needed
+      - id: regression-tests
+        title: Regression Tests Required
+        type: checkbox-list
+        instruction: Regression tests to ensure no breaks
+      - id: edge-cases
+        title: Edge Cases to Test
+        type: bullet-list
+        instruction: Edge cases that must be tested
+      - id: performance-tests
+        title: Performance Tests
+        type: bullet-list
+        instruction: Performance tests if applicable
+
+  - id: risk-assessment
+    title: Risk Assessment
+    instruction: Fix implementation risks
+    sections:
+      - id: fix-risk
+        title: Fix Risk
+        type: key-value
+        instruction: |
+          Implementation Risk: {{implementation_risk}}
+          Regression Risk: {{regression_risk}}
+          Side Effects: {{potential_side_effects}}
+      - id: mitigation
+        title: Mitigation Strategy
+        type: paragraphs
+        instruction: How to mitigate identified risks
+      - id: rollback
+        title: Rollback Plan
+        type: numbered-list
+        instruction: Steps to rollback if fix causes issues
+
+  - id: quality-metrics
+    title: Quality Metrics
+    instruction: Defect and code quality metrics
+    sections:
+      - id: defect-metrics
+        title: Defect Metrics
+        type: key-value
+        instruction: |
+          Escape Stage: {{escape_stage}}
+          Detection Time: {{time_to_detect}}
+          Fix Time: {{time_to_fix}}
+          Test Coverage Before: {{coverage_before}}
+          Test Coverage After: {{coverage_after}}
+      - id: code-metrics
+        title: Code Quality Metrics
+        type: key-value
+        instruction: |
+          Cyclomatic Complexity: {{complexity_score}}
+          Code Duplication: {{duplication_percentage}}
+          Technical Debt: {{tech_debt_impact}}
+
+  - id: prevention-strategy
+    title: Prevention Strategy
+    instruction: How to prevent similar defects
+    sections:
+      - id: immediate-prevention
+        title: Immediate Prevention
+        type: bullet-list
+        instruction: Quick wins to prevent recurrence
+      - id: longterm-prevention
+        title: Long-term Prevention
+        type: bullet-list
+        instruction: Strategic prevention measures
+      - id: process-improvements
+        title: Process Improvements
+        type: bullet-list
+        instruction: Process changes to prevent similar defects
+      - id: tool-enhancements
+        title: Tool Enhancements
+        type: bullet-list
+        instruction: Tool improvements needed
+
+  - id: related-information
+    title: Related Information
+    instruction: Additional context
+    optional: true
+    sections:
+      - id: similar-defects
+        title: Similar Defects
+        type: bullet-list
+        instruction: Links to similar defects in the system
+      - id: related-issues
+        title: Related Issues
+        type: bullet-list
+        instruction: Related tickets or issues
+      - id: dependencies
+        title: Dependencies
+        type: bullet-list
+        instruction: Dependencies affected by this defect
+      - id: documentation
+        title: Documentation Updates Required
+        type: bullet-list
+        instruction: Documentation that needs updating
+
+  - id: action-items
+    title: Action Items
+    instruction: Tasks and assignments
+    sections:
+      - id: actions-table
+        title: Action Items Table
+        type: table
+        columns: [Action, Owner, Due Date, Status]
+        instruction: |
+          [[LLM: Create table with specific action items, owners, and dates]]
+
+  - id: approval
+    title: Approval & Sign-off
+    instruction: Review and approval tracking
+    sections:
+      - id: signoff
+        title: Sign-off
+        type: key-value
+        instruction: |
+          Developer: {{developer_name}} - {{developer_date}}
+          QA: {{qa_name}} - {{qa_date}}
+          Lead: {{lead_name}} - {{lead_date}}
+
+  - id: footer
+    title: Report Footer
+    instruction: Closing metadata
+    sections:
+      - id: timestamps
+        title: Report Timestamps
+        type: key-value
+        instruction: |
+          Report Generated: {{generation_timestamp}}
+          Last Updated: {{last_update}}
diff --git a/bmad-core/templates/root-cause-tmpl.yaml b/bmad-core/templates/root-cause-tmpl.yaml
new file mode 100644
index 00000000..6bb08f61
--- /dev/null
+++ b/bmad-core/templates/root-cause-tmpl.yaml
@@ -0,0 +1,268 @@
+#
+template:
+  id: root-cause-template-v1
+  name: Root Cause Analysis
+  version: 1.0
+  output:
+    format: markdown
+    filename: docs/debug/rca-{{timestamp}}.md
+    title: "Root Cause Analysis: {{problem_title}}"
+
+workflow:
+  mode: rapid
+  elicitation: false
+
+sections:
+  - id: header
+    title: Analysis Header
+    instruction: Generate analysis header with metadata
+    sections:
+      - id: metadata
+        title: Analysis Metadata
+        type: key-value
+        instruction: |
+          Analysis ID: RCA-{{timestamp}}
+          Date: {{current_date}}
+          Analyst: {{analyst_name}}
+          Method: Fishbone (Ishikawa) Diagram + 5-Whys
+
+  - id: problem-statement
+    title: Problem Statement
+    instruction: Clear problem definition
+    sections:
+      - id: what
+        title: What
+        type: text
+        instruction: Clear description of the problem
+      - id: when
+        title: When
+        type: text
+        instruction: Timing and frequency of occurrence
+      - id: where
+        title: Where
+        type: text
+        instruction: Location/component affected
+      - id: impact
+        title: Impact
+        type: text
+        instruction: Quantified impact on system/users
+
+  - id: fishbone-analysis
+    title: Fishbone Analysis
+    instruction: |
+      [[LLM: Create ASCII fishbone diagram showing all 6 categories branching into the problem]]
+    sections:
+      - id: diagram
+        title: Fishbone Diagram
+        type: code-block
+        instruction: |
+          ```
+                        {{problem_title}}
+                                |
+          People ---------------+--------------- Process
+                      \         |         /
+                       \        |        /
+                        \       |       /
+                         \      |      /
+                          \     |     /
+           Technology ----------+---------- Environment
+                      \         |         /
+                       \        |        /
+                        \       |       /
+                         \      |      /
+                          \     |     /
+                Data -----------+----------- Methods
+          ```
+      - id: people
+        title: People (Developer/User Factors)
+        type: bullet-list
+        instruction: Knowledge gaps, communication issues, training needs, user behavior
+      - id: process
+        title: Process (Development/Deployment)
+        type: bullet-list
+        instruction: Development process, deployment procedures, code review, testing
+      - id: technology
+        title: Technology (Tools/Infrastructure)
+        type: bullet-list
+        instruction: Framework limitations, library issues, tool configurations, infrastructure
+      - id: environment
+        title: Environment (System/Configuration)
+        type: bullet-list
+        instruction: Environment differences, resource constraints, external dependencies
+      - id: data
+        title: Data (Input/State)
+        type: bullet-list
+        instruction: Input validation, data integrity, state management, race conditions
+      - id: methods
+        title: Methods (Algorithms/Design)
+        type: bullet-list
+        instruction: Algorithm correctness, design patterns, architecture decisions
+
+  - id: five-whys
+    title: 5-Whys Analysis
+    instruction: Deep dive to root cause
+    sections:
+      - id: symptom
+        title: Primary Symptom
+        type: text
+        instruction: Starting point for analysis
+      - id: why1
+        title: "1. Why?"
+        type: text
+        instruction: First level cause
+      - id: why2
+        title: "2. Why?"
+        type: text
+        instruction: Second level cause
+      - id: why3
+        title: "3. Why?"
+        type: text
+        instruction: Third level cause
+      - id: why4
+        title: "4. Why?"
+        type: text
+        instruction: Fourth level cause
+      - id: why5
+        title: "5. Why?"
+        type: text
+        instruction: Fifth level cause (root cause)
+      - id: root-cause
+        title: Root Cause
+        type: text
+        instruction: Final identified root cause
+
+  - id: evidence-validation
+    title: Evidence & Validation
+    instruction: Support for conclusions
+    sections:
+      - id: evidence
+        title: Supporting Evidence
+        type: bullet-list
+        instruction: List all evidence supporting the root cause conclusion
+      - id: verification
+        title: Verification Method
+        type: paragraphs
+        instruction: How to verify this is the true root cause
+      - id: confidence
+        title: Confidence Level
+        type: key-value
+        instruction: |
+          Rating: {{confidence_rating}}
+          Justification: {{confidence_justification}}
+
+  - id: root-cause-summary
+    title: Root Cause Summary
+    instruction: Consolidated findings
+    sections:
+      - id: primary
+        title: Primary Root Cause
+        type: key-value
+        instruction: |
+          Cause: {{primary_root_cause}}
+          Category: {{cause_category}}
+          Evidence: {{primary_evidence}}
+      - id: contributing
+        title: Contributing Factors
+        type: numbered-list
+        instruction: Secondary factors that contributed to the problem
+      - id: eliminated
+        title: Eliminated Possibilities
+        type: numbered-list
+        instruction: Potential causes that were ruled out and why
+
+  - id: impact-analysis
+    title: Impact Analysis
+    instruction: Scope and consequences
+    sections:
+      - id: direct
+        title: Direct Impact
+        type: paragraphs
+        instruction: Immediate consequences of the problem
+      - id: indirect
+        title: Indirect Impact
+        type: paragraphs
+        instruction: Secondary effects and ripple impacts
+      - id: recurrence
+        title: Risk of Recurrence
+        type: key-value
+        instruction: |
+          Probability: {{recurrence_probability}}
+          Without intervention: {{risk_without_fix}}
+
+  - id: recommendations
+    title: Recommended Actions
+    instruction: Solutions and prevention
+    sections:
+      - id: immediate
+        title: Immediate Actions
+        type: checkbox-list
+        instruction: Actions to take right now to address the issue
+      - id: short-term
+        title: Short-term Solutions
+        type: checkbox-list
+        instruction: Solutions to implement within the current sprint
+      - id: long-term
+        title: Long-term Prevention
+        type: checkbox-list
+        instruction: Strategic changes to prevent recurrence
+      - id: process-improvements
+        title: Process Improvements
+        type: bullet-list
+        instruction: Process changes to prevent similar issues
+
+  - id: implementation-priority
+    title: Implementation Priority
+    instruction: Action prioritization
+    sections:
+      - id: priority-matrix
+        title: Priority Matrix
+        type: table
+        columns: [Action, Priority, Effort, Impact, Timeline]
+        instruction: |
+          [[LLM: Create prioritized action table with High/Medium/Low ratings]]
+
+  - id: verification-plan
+    title: Verification Plan
+    instruction: Ensuring fix effectiveness
+    sections:
+      - id: success-criteria
+        title: Success Criteria
+        type: bullet-list
+        instruction: How we'll know the root cause is addressed
+      - id: validation-steps
+        title: Validation Steps
+        type: numbered-list
+        instruction: Steps to validate the fix works
+      - id: monitoring-metrics
+        title: Monitoring Metrics
+        type: bullet-list
+        instruction: Metrics to track to ensure problem doesn't recur
+
+  - id: lessons-learned
+    title: Lessons Learned
+    instruction: Knowledge capture
+    sections:
+      - id: insights
+        title: Key Insights
+        type: bullet-list
+        instruction: What we learned from this analysis
+      - id: prevention
+        title: Prevention Strategies
+        type: bullet-list
+        instruction: How to prevent similar issues in the future
+      - id: knowledge-transfer
+        title: Knowledge Transfer
+        type: bullet-list
+        instruction: Information to share with the team
+
+  - id: footer
+    title: Analysis Footer
+    instruction: Closing information
+    sections:
+      - id: completion
+        title: Completion Details
+        type: key-value
+        instruction: |
+          Analysis Completed: {{completion_timestamp}}
+          Review Date: {{review_date}}
+          Owner: {{action_owner}}