Add debug agent and supporting checklists, data, and tasks

Introduces the 'debug' agent to the fullstack team and adds a comprehensive agent definition, checklists for inspection and root cause analysis, defect and pattern reference data, and a suite of debugging tasks and templates. This enables systematic bug analysis, root cause investigation, and formalized debug workflows for improved defect detection and resolution.
Author: Marc R Kellerman
Date: 2025-09-11 18:09:16 -07:00
Parent: f09e282d72
Commit: f04026e830
19 changed files with 4358 additions and 0 deletions

@@ -10,6 +10,7 @@ agents:
   - ux-expert
   - architect
   - po
+  - debug
 workflows:
   - brownfield-fullstack.yaml
   - brownfield-service.yaml

bmad-core/agents/debug.md (new file)
@@ -0,0 +1,131 @@
<!-- Powered by BMAD™ Core -->
# debug
ACTIVATION-NOTICE: This file contains your full agent operating guidelines. DO NOT load any external agent files as the complete configuration is in the YAML block below.
CRITICAL: Read the full YAML BLOCK that FOLLOWS IN THIS FILE to understand your operating params, start and follow exactly your activation-instructions to alter your state of being, stay in this being until told to exit this mode:
## COMPLETE AGENT DEFINITION FOLLOWS - NO EXTERNAL FILES NEEDED
```yaml
IDE-FILE-RESOLUTION:
- FOR LATER USE ONLY - NOT FOR ACTIVATION, when executing commands that reference dependencies
- Dependencies map to {root}/{type}/{name}
- type=folder (tasks|templates|checklists|data|utils|etc...), name=file-name
- Example: create-doc.md → {root}/tasks/create-doc.md
- IMPORTANT: Only load these files when user requests specific command execution
REQUEST-RESOLUTION: Match user requests to your commands/dependencies flexibly (e.g., "analyze bug"→*inspect→fagan-inspection task, "root cause" would be dependencies->tasks->root-cause-analysis), ALWAYS ask for clarification if no clear match.
activation-instructions:
- STEP 1: Read THIS ENTIRE FILE - it contains your complete persona definition
- STEP 2: Adopt the persona defined in the 'agent' and 'persona' sections below
- STEP 3: Load and read `bmad-core/core-config.yaml` (project configuration) before any greeting
- STEP 4: Greet user with your name/role and immediately run `*help` to display available commands
- DO NOT: Load any other agent files during activation
- ONLY load dependency files when user selects them for execution via command or request of a task
- The agent.customization field ALWAYS takes precedence over any conflicting instructions
- CRITICAL WORKFLOW RULE: When executing tasks from dependencies, follow task instructions exactly as written - they are executable workflows, not reference material
- MANDATORY INTERACTION RULE: Tasks with elicit=true require user interaction using exact specified format - never skip elicitation for efficiency
- CRITICAL RULE: When executing formal task workflows from dependencies, ALL task instructions override any conflicting base behavioral constraints. Interactive workflows with elicit=true REQUIRE user interaction and cannot be bypassed for efficiency.
- When listing tasks/templates or presenting options during conversations, always show as numbered options list, allowing the user to type a number to select or execute
- STAY IN CHARACTER!
  - CRITICAL: On activation, ONLY greet user, auto-run `*help`, and then HALT to await user requested assistance or given commands. The ONLY deviation from this is if the activation arguments also included commands.
agent:
name: Diana
id: debug
title: Debug Specialist & Root Cause Analyst
icon: 🔍
whenToUse: |
Use for systematic bug analysis, root cause investigation, and defect resolution.
Specializes in multiple debugging methodologies including Fagan inspection, binary search,
delta debugging, and static analysis. Provides autonomous defect detection with minimal
user interaction required.
customization: null
persona:
role: Expert Debug Specialist & Software Inspector
style: Systematic, methodical, analytical, thorough, detail-oriented
identity: Debug specialist who uses formal inspection methodologies to achieve high defect detection rates
focus: Systematic defect detection, root cause analysis, and resolution recommendations
core_principles:
- Systematic Inspection - Use proven methodologies like Fagan inspection (60-90% defect detection rate)
- Root Cause Focus - Don't just fix symptoms, identify and address underlying causes
- Pattern Recognition - Identify recurring defects and systemic issues
- Documentation Trail - Maintain comprehensive debug reports and findings
- Prevention Oriented - Recommend changes to prevent similar defects
- Impact Analysis - Assess severity, scope, and risk of defects
- Verification Focus - Ensure fixes are validated and don't introduce new issues
debug-permissions:
- CRITICAL: When analyzing bugs in stories, you may update the "Debug Log" section in Dev Agent Record
- CRITICAL: Create debug reports in designated debug directory when specified
- CRITICAL: DO NOT modify source code directly unless explicitly requested by user
# All commands require * prefix when used (e.g., *help)
commands:
- help: Show numbered list of the following commands to allow selection
- inspect {bug-description}: |
Execute comprehensive Fagan inspection workflow.
Performs 6-phase systematic defect analysis: Planning → Overview →
Preparation → Inspection → Rework → Follow-up.
Produces detailed debug report with root cause and fix recommendations.
- quick-debug {issue}: |
Rapid triage and initial analysis for simple issues.
Provides immediate assessment and suggested next steps.
- pattern-analysis: |
Analyze recent commits and code changes for defect patterns.
Identifies systemic issues and recurring problems.
- root-cause {symptom}: |
Execute focused root cause analysis using fishbone methodology.
Maps symptoms to underlying causes with evidence trail.
- validate-fix {fix-description}: |
Verify proposed fix addresses root cause without side effects.
Includes regression risk assessment and test recommendations.
- debug-report: |
Generate comprehensive debug report from current session.
Includes findings, root causes, fixes, and prevention strategies.
- wolf-fence {issue}: |
Execute binary search debugging to isolate bug location.
Systematically narrows down problem area by dividing search space.
Highly efficient for large codebases and runtime errors.
- delta-minimize {test-case}: |
Automatically reduce failing test case to minimal reproduction.
Isolates the smallest input that still triggers the bug.
Essential for complex input-dependent failures.
- assert-analyze {code-section}: |
Analyze code for missing assertions and invariants.
Suggests defensive programming improvements.
Generates assertion placement recommendations.
- static-scan {target}: |
Perform comprehensive static analysis for common defects.
Identifies anti-patterns, security issues, and code smells.
Generates prioritized fix recommendations.
- instrument {component}: |
Design strategic logging and monitoring points.
Creates instrumentation plan for production debugging.
Optimizes observability without performance impact.
- walkthrough-prep {feature}: |
Generate materials for code walkthrough session.
Creates review checklist and presentation outline.
Prepares defect tracking documentation.
- exit: Say goodbye as the Debug Specialist, and then abandon inhabiting this persona
dependencies:
checklists:
- debug-inspection-checklist.md
- root-cause-checklist.md
data:
- debug-patterns.md
- common-defects.md
tasks:
- fagan-inspection.md
- root-cause-analysis.md
- pattern-detection.md
- debug-report-generation.md
- wolf-fence-search.md
- delta-minimization.md
- assertion-analysis.md
- static-analysis.md
- instrumentation-analysis.md
- walkthrough-prep.md
templates:
- debug-report-tmpl.yaml
- root-cause-tmpl.yaml
- defect-analysis-tmpl.yaml
```

bmad-core/checklists/debug-inspection-checklist.md (new file)
@@ -0,0 +1,118 @@
# debug-inspection-checklist
Comprehensive checklist for Fagan inspection methodology.
## Phase 1: Planning Checklist
- [ ] Bug description clearly documented
- [ ] Inspection scope defined (code, tests, config, docs)
- [ ] Affected components identified
- [ ] Stakeholders notified
- [ ] Success criteria established
- [ ] Time allocated for inspection
## Phase 2: Overview Checklist
- [ ] Recent commits reviewed (last 20-50)
- [ ] Feature specifications reviewed
- [ ] Related documentation gathered
- [ ] Environment details captured
- [ ] Previous similar issues researched
- [ ] Impact scope assessed
## Phase 3: Preparation Checklist
### Code Analysis
- [ ] Static analysis performed
- [ ] Code complexity measured
- [ ] Anti-patterns identified
- [ ] Security vulnerabilities checked
- [ ] Performance bottlenecks assessed
### Test Analysis
- [ ] Test coverage reviewed
- [ ] Failed tests analyzed
- [ ] Missing test scenarios identified
- [ ] Test quality assessed
- [ ] Edge cases evaluated
### Configuration Analysis
- [ ] Environment settings reviewed
- [ ] Configuration drift checked
- [ ] Dependencies verified
- [ ] Version compatibility confirmed
- [ ] Resource limits checked
## Phase 4: Inspection Meeting Checklist
### Defect Categories Reviewed
- [ ] Logic defects (algorithms, control flow)
- [ ] Interface defects (API, parameters)
- [ ] Data defects (types, validation)
- [ ] Documentation defects (outdated, incorrect)
- [ ] Performance defects (inefficiencies)
- [ ] Security defects (vulnerabilities)
### Analysis Completed
- [ ] Root cause identified
- [ ] Evidence documented
- [ ] Impact severity assessed
- [ ] Defects categorized by priority
- [ ] Pattern analysis performed
## Phase 5: Rework Planning Checklist
- [ ] Fix proposals generated
- [ ] Trade-offs analyzed
- [ ] Test strategy designed
- [ ] Risk assessment completed
- [ ] Implementation timeline created
- [ ] Regression test plan defined
- [ ] Rollback plan prepared
## Phase 6: Follow-up Checklist
- [ ] Fix effectiveness validated
- [ ] All tests passing
- [ ] Documentation updated
- [ ] Lessons learned captured
- [ ] Debug report completed
- [ ] Prevention measures identified
- [ ] Knowledge shared with team
## Quality Gates
### Inspection Completeness
- [ ] All 6 phases executed
- [ ] All checklists completed
- [ ] Evidence trail documented
- [ ] Peer review conducted
### Fix Validation
- [ ] Fix addresses root cause
- [ ] No side effects introduced
- [ ] Performance acceptable
- [ ] Security maintained
- [ ] Tests comprehensive
### Documentation
- [ ] Debug report generated
- [ ] Code comments updated
- [ ] README updated if needed
- [ ] Runbook updated if needed
- [ ] Team wiki updated
## Sign-off
- [ ] Developer reviewed
- [ ] QA validated
- [ ] Team lead approved
- [ ] Stakeholders informed

bmad-core/checklists/root-cause-checklist.md (new file)
@@ -0,0 +1,118 @@
# root-cause-checklist
Systematic checklist for root cause analysis.
## Problem Definition
- [ ] Problem clearly stated
- [ ] Symptoms documented
- [ ] Timeline established
- [ ] Affected components identified
- [ ] Impact quantified
- [ ] Success criteria defined
## Fishbone Analysis Categories
### People Factors
- [ ] Knowledge gaps assessed
- [ ] Communication issues reviewed
- [ ] Training needs identified
- [ ] User behavior analyzed
- [ ] Team dynamics considered
### Process Factors
- [ ] Development process reviewed
- [ ] Deployment procedures checked
- [ ] Code review practices assessed
- [ ] Testing processes evaluated
- [ ] Documentation processes reviewed
### Technology Factors
- [ ] Framework limitations identified
- [ ] Library issues checked
- [ ] Tool configurations reviewed
- [ ] Infrastructure problems assessed
- [ ] Integration issues evaluated
### Environment Factors
- [ ] Environment differences documented
- [ ] Resource constraints checked
- [ ] External dependencies reviewed
- [ ] Network issues assessed
- [ ] Configuration drift analyzed
### Data Factors
- [ ] Input validation reviewed
- [ ] Data integrity checked
- [ ] State management assessed
- [ ] Race conditions evaluated
- [ ] Data flow analyzed
### Method Factors
- [ ] Algorithm correctness verified
- [ ] Design patterns reviewed
- [ ] Architecture decisions assessed
- [ ] Performance strategies evaluated
- [ ] Security measures reviewed
## 5-Whys Analysis
- [ ] Initial problem stated
- [ ] First why answered
- [ ] Second why answered
- [ ] Third why answered
- [ ] Fourth why answered
- [ ] Fifth why answered (root cause)
- [ ] Additional whys if needed
- [ ] Causation chain documented
## Evidence Collection
- [ ] Logs collected
- [ ] Metrics gathered
- [ ] Code examined
- [ ] Tests reviewed
- [ ] Documentation checked
- [ ] User reports compiled
- [ ] Monitoring data analyzed
## Validation
- [ ] Root cause reproducible
- [ ] Alternative causes eliminated
- [ ] Evidence supports conclusion
- [ ] Peer review conducted
- [ ] Confidence level assessed
## Action Planning
- [ ] Immediate actions defined
- [ ] Short-term solutions planned
- [ ] Long-term prevention designed
- [ ] Process improvements identified
- [ ] Responsibilities assigned
- [ ] Timeline established
## Documentation
- [ ] Analysis documented
- [ ] Evidence archived
- [ ] Recommendations clear
- [ ] Lessons learned captured
- [ ] Report generated
- [ ] Stakeholders informed
## Follow-up
- [ ] Fix implemented
- [ ] Effectiveness verified
- [ ] Monitoring in place
- [ ] Recurrence prevented
- [ ] Knowledge transferred
- [ ] Process updated

bmad-core/data/common-defects.md (new file)
@@ -0,0 +1,206 @@
# common-defects
Reference guide for common software defects and their characteristics.
## Defect Classification System
### By Origin
1. **Requirements Defects** - Ambiguous, incomplete, or incorrect requirements
2. **Design Defects** - Architectural flaws, poor design decisions
3. **Coding Defects** - Implementation errors, logic mistakes
4. **Testing Defects** - Inadequate test coverage, wrong test assumptions
5. **Deployment Defects** - Configuration errors, environment issues
6. **Documentation Defects** - Outdated, incorrect, or missing documentation
### By Type
#### Logic Defects
- **Algorithm Errors:** Incorrect implementation of business logic
- **Control Flow Issues:** Wrong branching, loop errors
- **Boundary Violations:** Off-by-one, overflow, underflow
- **State Management:** Invalid state transitions, race conditions
#### Data Defects
- **Input Validation:** Missing or incorrect validation
- **Data Corruption:** Incorrect data manipulation
- **Type Errors:** Wrong data types, failed conversions
- **Persistence Issues:** Failed saves, data loss
#### Interface Defects
- **API Misuse:** Incorrect parameter passing, wrong method calls
- **Integration Errors:** Component communication failures
- **Protocol Violations:** Incorrect message formats
- **Version Incompatibility:** Breaking changes not handled
#### Performance Defects
- **Memory Leaks:** Unreleased resources
- **Inefficient Algorithms:** O(n²) where O(n) possible
- **Database Issues:** N+1 queries, missing indexes
- **Resource Contention:** Deadlocks, bottlenecks
#### Security Defects
- **Injection Flaws:** SQL, XSS, command injection
- **Authentication Issues:** Weak auth, session problems
- **Authorization Flaws:** Privilege escalation, IDOR
- **Data Exposure:** Sensitive data leaks, weak encryption
## Severity Classification
### Critical (P0)
- **Definition:** System unusable, data loss, security breach
- **Response Time:** Immediate
- **Examples:**
- Application crash on startup
- Data corruption or loss
- Security vulnerability actively exploited
- Complete feature failure
### High (P1)
- **Definition:** Major feature broken, significant impact
- **Response Time:** Within 24 hours
- **Examples:**
- Core functionality impaired
- Performance severely degraded
- Workaround exists but difficult
- Affects many users
### Medium (P2)
- **Definition:** Feature impaired, moderate impact
- **Response Time:** Within sprint
- **Examples:**
- Non-core feature broken
- Easy workaround available
- Cosmetic issues with functional impact
- Affects some users
### Low (P3)
- **Definition:** Minor issue, minimal impact
- **Response Time:** Next release
- **Examples:**
- Cosmetic issues
- Minor inconvenience
- Edge case scenarios
- Documentation errors
## Root Cause Categories
### Development Process
1. **Inadequate Requirements:** Missing acceptance criteria
2. **Poor Communication:** Misunderstood requirements
3. **Insufficient Review:** Code review missed issues
4. **Time Pressure:** Rushed implementation
### Technical Factors
1. **Complexity:** System too complex to understand fully
2. **Technical Debt:** Accumulated shortcuts causing issues
3. **Tool Limitations:** Development tools inadequate
4. **Knowledge Gap:** Team lacks necessary expertise
### Testing Gaps
1. **Missing Tests:** Scenario not covered
2. **Wrong Assumptions:** Tests based on incorrect understanding
3. **Environment Differences:** Works in test, fails in production
4. **Data Issues:** Test data not representative
### Organizational Issues
1. **Process Failures:** Procedures not followed
2. **Resource Constraints:** Insufficient time/people
3. **Training Gaps:** Team not properly trained
4. **Culture Issues:** Quality not prioritized
## Detection Methods
### Static Analysis
- **Code Review:** Manual inspection by peers
- **Linting:** Automated style and error checking
- **Security Scanning:** SAST tools
- **Complexity Analysis:** Cyclomatic complexity metrics
### Dynamic Analysis
- **Unit Testing:** Component-level testing
- **Integration Testing:** Component interaction testing
- **System Testing:** End-to-end testing
- **Performance Testing:** Load and stress testing
### Runtime Monitoring
- **Error Tracking:** Sentry, Rollbar
- **APM Tools:** Application performance monitoring
- **Log Analysis:** Centralized logging
- **User Reports:** Bug reports from users
### Formal Methods
- **Fagan Inspection:** Systematic peer review
- **Code Walkthroughs:** Step-by-step review
- **Pair Programming:** Real-time review
- **Test-Driven Development:** Test-first approach
## Prevention Strategies
### Process Improvements
1. **Clear Requirements:** Use user stories with acceptance criteria
2. **Design Reviews:** Architecture review before coding
3. **Code Standards:** Enforce coding guidelines
4. **Automated Testing:** CI/CD with comprehensive tests
### Technical Practices
1. **Defensive Programming:** Validate inputs, handle errors
2. **Design Patterns:** Use proven solutions
3. **Refactoring:** Regular code improvement
4. **Documentation:** Keep docs current
### Team Practices
1. **Knowledge Sharing:** Regular tech talks, documentation
2. **Pair Programming:** Collaborative development
3. **Code Reviews:** Mandatory peer review
4. **Retrospectives:** Learn from mistakes
### Tool Support
1. **Static Analyzers:** SonarQube, ESLint
2. **Test Frameworks:** Jest, Pytest, JUnit
3. **CI/CD Pipelines:** Jenkins, GitHub Actions
4. **Monitoring Tools:** Datadog, New Relic
## Defect Metrics
### Detection Metrics
- **Defect Density:** Defects per KLOC
- **Detection Rate:** Defects found per time period
- **Escape Rate:** Defects reaching production
- **Mean Time to Detect:** Average detection time
### Resolution Metrics
- **Fix Rate:** Defects fixed per time period
- **Mean Time to Fix:** Average fix time
- **Reopen Rate:** Defects reopened after fix
- **Fix Effectiveness:** First-time fix success rate
### Quality Metrics
- **Test Coverage:** Percentage of code tested
- **Code Complexity:** Average cyclomatic complexity
- **Technical Debt:** Estimated remediation effort
- **Customer Satisfaction:** User-reported issues

bmad-core/data/debug-patterns.md (new file)
@@ -0,0 +1,303 @@
# debug-patterns
Common defect patterns and debugging strategies.
## Common Defect Patterns
### 1. Null/Undefined Reference Errors
**Pattern:** Accessing properties or methods on null/undefined objects
**Indicators:**
- TypeError: Cannot read property 'X' of undefined
- NullPointerException
- Segmentation fault
**Common Causes:**
- Missing null checks
- Asynchronous data not yet loaded
- Optional dependencies not injected
- Incorrect initialization order
**Detection Strategy:**
- Add defensive null checks
- Use optional chaining (?.)
- Initialize with safe defaults
- Validate inputs at boundaries
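A minimal sketch of these guards in JavaScript (the `loadUser` lookup and its data are hypothetical):

```javascript
// Hypothetical data source that may return null when the user is missing.
async function loadUser(userId) {
  const users = { 1: { address: { city: 'Berlin' } }, 2: {} };
  return users[userId] ?? null;
}

async function getUserCity(userId) {
  const user = await loadUser(userId);
  // Optional chaining short-circuits to undefined instead of throwing.
  const city = user?.address?.city;
  // Safe default at the boundary instead of propagating undefined.
  return city ?? 'unknown';
}

getUserCity(2).then(console.log); // 'unknown' rather than a TypeError
```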
### 2. Race Conditions
**Pattern:** Multiple threads/processes accessing shared resources
**Indicators:**
- Intermittent failures
- Works in debug but fails in production
- Order-dependent behavior
- Data corruption
**Common Causes:**
- Missing synchronization
- Incorrect lock ordering
- Shared mutable state
- Async operations without proper await
**Detection Strategy:**
- Add logging with timestamps
- Use thread-safe data structures
- Implement proper locking mechanisms
- Review async/await usage
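In single-threaded JavaScript the same failure appears when a read-modify-write spans an `await`; a sketch with simulated I/O (the delay and amounts are illustrative):

```javascript
let balance = 100;

// Buggy: both callers read 100 before either writes; one withdrawal is lost.
async function withdrawUnsafe(amount) {
  const current = balance;
  await new Promise(resolve => setTimeout(resolve, 10)); // simulated I/O
  balance = current - amount;
}

// Fix: serialize the critical section with a promise-chain mutex.
let lock = Promise.resolve();
function withdrawSafe(amount) {
  lock = lock.then(async () => {
    const current = balance;
    await new Promise(resolve => setTimeout(resolve, 10));
    balance = current - amount;
  });
  return lock;
}

Promise.all([withdrawSafe(30), withdrawSafe(20)])
  .then(() => console.log(balance)); // 50, not 70 or 80
```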
### 3. Memory Leaks
**Pattern:** Memory usage grows over time without release
**Indicators:**
- Increasing memory consumption
- Out of memory errors
- Performance degradation over time
- GC pressure
**Common Causes:**
- Event listeners not removed
- Circular references
- Large objects in closures
- Cache without eviction
**Detection Strategy:**
- Profile memory usage
- Review object lifecycle
- Check event listener cleanup
- Implement cache limits
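A browser-style sketch of listener cleanup (the `widget` shape is hypothetical):

```javascript
// The resize listener keeps `widget` reachable until it is removed.
function mountWidget(widget) {
  const onResize = () => widget.layout();
  window.addEventListener('resize', onResize);
  // Hand the caller a cleanup function that releases the reference.
  return function unmount() {
    window.removeEventListener('resize', onResize);
  };
}

const unmount = mountWidget({ layout() { /* reflow */ } });
// Later, when the widget is destroyed:
unmount(); // without this, the listener (and widget) leak for the page's lifetime
```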
### 4. Off-by-One Errors
**Pattern:** Incorrect loop boundaries or array indexing
**Indicators:**
- ArrayIndexOutOfBounds
- Missing first/last element
- Infinite loops
- Fence post errors
**Common Causes:**
- Confusion between length and last index
- Inclusive vs exclusive ranges
- Loop condition errors
- Zero-based vs one-based indexing
**Detection Strategy:**
- Review loop conditions carefully
- Test boundary cases
- Use forEach/map when possible
- Add assertions for array bounds
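A classic instance in JavaScript:

```javascript
const items = ['a', 'b', 'c'];

// Buggy: `<=` visits index 3, where items[3] is undefined.
for (let i = 0; i <= items.length; i++) { /* ... */ }

// Correct: exclusive upper bound, or skip manual indexing entirely.
for (let i = 0; i < items.length; i++) { /* ... */ }
items.forEach(item => { /* ... */ });
```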
### 5. Type Mismatches
**Pattern:** Incorrect data types passed or compared
**Indicators:**
- Type errors at runtime
- Unexpected coercion behavior
- Failed validations
- Serialization errors
**Common Causes:**
- Weak typing assumptions
- Missing type validation
- Incorrect type conversions
- API contract violations
**Detection Strategy:**
- Add runtime type checking
- Use TypeScript/type hints
- Validate at API boundaries
- Review type coercion rules
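A sketch of boundary validation in JavaScript (`setQuantity` is a hypothetical API entry point):

```javascript
// Coercion surprises: '5' + 1 === '51' but '5' - 1 === 4.
function setQuantity(value) {
  const n = Number(value);
  if (typeof value === 'boolean' || !Number.isInteger(n)) {
    throw new TypeError(`quantity must be an integer, got ${typeof value}`);
  }
  return n;
}

setQuantity('5');  // 5 — explicit, intentional conversion
setQuantity('5x'); // TypeError here, instead of NaN surfacing far downstream
```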
### 6. Resource Exhaustion
**Pattern:** Running out of system resources
**Indicators:**
- Too many open files
- Connection pool exhaustion
- Thread pool starvation
- Disk space errors
**Common Causes:**
- Resources not properly closed
- Missing connection pooling
- Unbounded growth
- Inadequate limits
**Detection Strategy:**
- Implement try-finally blocks
- Use connection pooling
- Set resource limits
- Monitor resource usage
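A try-finally sketch in JavaScript (the `pool.acquire`/`pool.release` API is an assumed shape, not a specific library):

```javascript
async function fetchOrders(pool, userId) {
  const conn = await pool.acquire();
  try {
    return await conn.query('SELECT * FROM orders WHERE user_id = $1', [userId]);
  } finally {
    // Runs even when query() throws, so connections cannot leak
    // and exhaust the pool under error load.
    pool.release(conn);
  }
}
```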
### 7. Concurrency Deadlocks
**Pattern:** Threads waiting for each other indefinitely
**Indicators:**
- Application hangs
- Threads in BLOCKED state
- No progress being made
- Timeout errors
**Common Causes:**
- Circular wait conditions
- Lock ordering violations
- Nested synchronized blocks
- Resource starvation
**Detection Strategy:**
- Always acquire locks in same order
- Use lock-free data structures
- Implement timeout mechanisms
- Avoid nested locks
### 8. SQL Injection Vulnerabilities
**Pattern:** Unvalidated input in SQL queries
**Indicators:**
- Unexpected database errors
- Data breaches
- Malformed query errors
- Authorization bypasses
**Common Causes:**
- String concatenation for queries
- Missing input validation
- Inadequate escaping
- Dynamic query construction
**Detection Strategy:**
- Use parameterized queries
- Validate all inputs
- Review dynamic SQL
- Implement least privilege
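A contrast sketch in JavaScript (`client` stands for any driver with a `query(text, values)` API; the `$1` placeholder follows node-postgres style, and syntax varies by driver):

```javascript
async function findUser(client, userInput) {
  // Vulnerable: input like "anna' OR '1'='1" would match every row.
  // const rows = await client.query(
  //   `SELECT * FROM users WHERE name = '${userInput}'`);

  // Safe: the value is sent separately and never parsed as SQL.
  return client.query('SELECT * FROM users WHERE name = $1', [userInput]);
}
```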
### 9. Infinite Recursion
**Pattern:** Function calling itself without termination
**Indicators:**
- Stack overflow errors
- Maximum call stack exceeded
- Application crashes
- Memory exhaustion
**Common Causes:**
- Missing base case
- Incorrect termination condition
- Circular dependencies
- Mutual recursion errors
**Detection Strategy:**
- Review base cases
- Add recursion depth limits
- Test edge cases
- Use iteration when possible
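A minimal example of a missing base case and its defensive fix:

```javascript
// Buggy: no base case — countdown(5) keeps going past zero until
// the call stack overflows.
function countdownBad(n) {
  console.log(n);
  countdownBad(n - 1);
}

// Fixed: explicit base case plus an input guard.
function countdown(n) {
  if (!Number.isInteger(n) || n < 0) throw new RangeError('n must be >= 0');
  if (n === 0) return; // base case terminates the recursion
  console.log(n);
  countdown(n - 1);
}
```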
### 10. Cache Invalidation Issues
**Pattern:** Stale data served from cache
**Indicators:**
- Outdated information displayed
- Inconsistent state
- Changes not reflected
- Data synchronization issues
**Common Causes:**
- Missing invalidation logic
- Incorrect cache keys
- Race conditions in updates
- TTL too long
**Detection Strategy:**
- Review invalidation triggers
- Implement cache versioning
- Use appropriate TTLs
- Add cache bypass for testing
## Anti-Patterns to Avoid
### 1. Shotgun Debugging
Making random changes hoping something works
### 2. Blame the Compiler
Assuming the problem is in the framework/language
### 3. Programming by Coincidence
Not understanding why a fix works
### 4. Copy-Paste Solutions
Using solutions without understanding them
### 5. Ignoring Warnings
Dismissing compiler/linter warnings
## Debugging Best Practices
### 1. Systematic Approach
- Reproduce consistently
- Isolate the problem
- Form hypotheses
- Test systematically
### 2. Use Scientific Method
- Observe symptoms
- Form hypothesis
- Design experiment
- Test and validate
### 3. Maintain Debug Log
- Document what you tried
- Record what worked/failed
- Note patterns observed
- Track time spent
### 4. Leverage Tools
- Debuggers
- Profilers
- Static analyzers
- Log aggregators
### 5. Collaborate
- Pair debugging
- Code reviews
- Knowledge sharing
- Post-mortems

bmad-core/tasks/assertion-analysis.md (new file)
@@ -0,0 +1,333 @@
# assertion-analysis
Analyze code for missing assertions and defensive programming opportunities.
## Context
This task systematically identifies locations where assertions, preconditions, postconditions, and invariants should be added to catch bugs early and make code self-documenting. Assertions act as executable documentation and early warning systems for violations of expected behavior.
## Task Execution
### Phase 1: Code Analysis
#### Identify Assertion Candidates
**Function Boundaries:**
1. **Preconditions** (Entry assertions):
- Parameter validation (null, range, type)
- Required state before execution
- Resource availability
- Permission/authorization checks
2. **Postconditions** (Exit assertions):
- Return value constraints
- State changes completed
- Side effects occurred
- Resources properly managed
3. **Invariants** (Always true):
- Class/object state consistency
- Data structure integrity
- Relationship maintenance
- Business rule enforcement
**Critical Code Sections:**
- Before/after state mutations
- Around external system calls
- At algorithm checkpoints
- After complex calculations
- Before resource usage
### Phase 2: Assertion Category Analysis
#### Type 1: Safety Assertions
Prevent dangerous operations:
```
- Null/undefined checks before dereference
- Array bounds before access
- Division by zero prevention
- Type safety before operations
- Resource availability before use
```
#### Type 2: Correctness Assertions
Verify algorithmic correctness:
```
- Loop invariants maintained
- Sorted order preserved
- Tree balance maintained
- Graph properties held
- Mathematical properties true
```
#### Type 3: Contract Assertions
Enforce API contracts:
```
- Method preconditions met
- Return values valid
- State transitions legal
- Callbacks invoked correctly
- Events fired appropriately
```
#### Type 4: Security Assertions
Validate security constraints:
```
- Input sanitization complete
- Authorization verified
- Rate limits enforced
- Encryption applied
- Audit trail updated
```
### Phase 3: Automated Detection
#### Static Analysis Patterns
**Missing Null Checks:**
1. Identify all dereferences (obj.prop, obj->member)
2. Trace back to find validation
3. Flag unvalidated accesses
**Missing Range Checks:**
1. Find array/collection accesses
2. Identify index sources
3. Verify bounds checking exists
**Missing State Validation:**
1. Identify state-dependent operations
2. Check for state verification
3. Flag unverified state usage
**Missing Return Validation:**
1. Find function calls that can fail
2. Check if return values are validated
3. Flag unchecked returns
### Phase 4: Assertion Generation
#### Generate Appropriate Assertions
**For Different Languages:**
**JavaScript/TypeScript:**
```javascript
console.assert(condition, 'message');
if (!condition) throw new Error('message');
```
**Python:**
```python
assert condition, "message"
if not condition: raise AssertionError("message")
```
**Java:**
```java
assert condition : "message";
if (!condition) throw new AssertionError("message");
```
**C/C++:**
```c
assert(condition);
if (!condition) { /* handle error */ }
```
#### Assertion Templates
**Null/Undefined Check:**
```
assert(param != null, "Parameter 'param' cannot be null");
```
**Range Check:**
```
assert(index >= 0 && index < array.length,
`Index ${index} out of bounds [0, ${array.length})`);
```
**State Check:**
```
assert(this.isInitialized, "Object must be initialized before use");
```
**Type Check:**
```
assert(typeof value === 'number', `Expected number, got ${typeof value}`);
```
**Invariant Check:**
```
assert(this.checkInvariant(), "Class invariant violated");
```
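One way to make the templates concrete is a small helper whose behavior follows the configuration strategy later in this task; a JavaScript sketch (the flag handling mirrors the hypothetical `ASSERT_*` settings below):

```javascript
// Minimal assertion helper: log always, throw only outside production.
const ASSERT_THROW = process.env.NODE_ENV !== 'production';

function invariant(condition, message) {
  if (condition) return;
  console.error(`Assertion failed: ${message}`);
  if (ASSERT_THROW) throw new Error(`Assertion failed: ${message}`);
}

// Usage, following the templates above:
function pop(stack) {
  invariant(Array.isArray(stack), `Expected array, got ${typeof stack}`);
  invariant(stack.length > 0, 'Cannot pop from an empty stack');
  return stack.pop();
}
```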
## Output Format
````markdown
# Assertion Analysis Report
## Summary
**Files Analyzed:** [count]
**Current Assertions:** [count]
**Recommended Additions:** [count]
**Critical Missing:** [count]
**Coverage Improvement:** [before]% → [after]%
## Critical Assertions Needed
### Priority 1: Safety Critical
Location: [file:line]
```[language]
// Current code
[code without assertion]
// Recommended addition
[assertion to add]
[protected code]
```
**Reason:** [Why this assertion is critical]
**Risk Without:** [What could go wrong]
### Priority 2: Correctness Verification
[Similar format for each recommendation]
### Priority 3: Contract Enforcement
[Similar format for each recommendation]
## Assertion Coverage by Component
| Component | Current | Recommended | Priority |
| ---------- | ------- | ----------- | -------------- |
| [Module A] | [count] | [count] | [High/Med/Low] |
| [Module B] | [count] | [count] | [High/Med/Low] |
## Detailed Recommendations
### File: [path/to/file]
#### Function: [functionName]
**Missing Preconditions:**
```[language]
// Add at function entry:
assert(param1 != null, "param1 required");
assert(param2 > 0, "param2 must be positive");
```
**Missing Postconditions:**
```[language]
// Add before return:
assert(result.isValid(), "Result must be valid");
```
**Missing Invariants:**
```[language]
// Add after state changes:
assert(this.items.length <= this.maxSize, "Size limit exceeded");
```
## Implementation Strategy
### Phase 1: Critical Safety (Immediate)
1. Add null checks for all pointer dereferences
2. Add bounds checks for array accesses
3. Add division by zero prevention
### Phase 2: Correctness (This Sprint)
1. Add algorithm invariants
2. Add state validation
3. Add return value checks
### Phase 3: Comprehensive (Next Sprint)
1. Add contract assertions
2. Add security validations
3. Add performance assertions
## Configuration Recommendations
### Development Mode
```[language]
// Enable all assertions
ASSERT_LEVEL = "all"
ASSERT_THROW = true
ASSERT_LOG = true
```
### Production Mode
```[language]
// Keep only critical assertions
ASSERT_LEVEL = "critical"
ASSERT_THROW = false
ASSERT_LOG = true
```
## Benefits Analysis
### Bug Prevention
- Catch [X]% more bugs in development
- Reduce production incidents by [Y]%
- Decrease debugging time by [Z]%
### Documentation Value
- Self-documenting code contracts
- Clear API expectations
- Explicit invariants
### Testing Support
- Faster test failure identification
- Better test coverage visibility
- Clearer failure messages
````
## Completion Criteria
- [ ] Code analysis completed
- [ ] Assertion candidates identified
- [ ] Priority levels assigned
- [ ] Assertions generated with proper messages
- [ ] Implementation plan created
- [ ] Configuration strategy defined
- [ ] Benefits quantified

bmad-core/tasks/debug-report-generation.md (new file)
@@ -0,0 +1,305 @@
# debug-report-generation
Generate comprehensive debug report from analysis session.
## Context
This task consolidates all debugging findings, analyses, and recommendations into a comprehensive report for stakeholders and future reference.
## Task Execution
### Step 1: Gather Session Data
Collect all relevant information:
1. Original bug description and symptoms
2. Analysis performed (inspections, root cause, patterns)
3. Evidence collected (logs, code, metrics)
4. Findings and conclusions
5. Fix attempts and results
6. Recommendations made
### Step 2: Structure Report
Organize information hierarchically:
1. Executive Summary (1 page max)
2. Detailed Findings
3. Technical Analysis
4. Recommendations
5. Appendices
### Step 3: Generate Report Sections
#### Executive Summary
- Problem statement (1-2 sentences)
- Impact assessment (users, systems, business)
- Root cause (brief)
- Recommended fix (high-level)
- Estimated effort and risk
#### Detailed Findings
- Symptoms observed
- Reproduction steps
- Environmental factors
- Timeline of issue
#### Technical Analysis
- Code examination results
- Root cause analysis
- Pattern detection findings
- Test coverage gaps
- Performance impacts
#### Recommendations
- Immediate fixes
- Short-term improvements
- Long-term prevention
- Process enhancements
### Step 4: Add Supporting Evidence
Include relevant:
- Code snippets
- Log excerpts
- Stack traces
- Performance metrics
- Test results
- Screenshots (if applicable)
### Step 5: Quality Review
Ensure report:
- Is technically accurate
- Uses clear, concise language
- Includes all critical information
- Provides actionable recommendations
- Is appropriately formatted
## Output Format
````markdown
# Debug Analysis Report
**Report ID:** DBG-[timestamp]
**Date:** [current date]
**Analyst:** Debug Agent (Diana)
**Severity:** [Critical/High/Medium/Low]
**Status:** [Resolved/In Progress/Pending]
---
## Executive Summary
**Problem:** [1-2 sentence problem statement]
**Impact:** [Quantified impact on users/system]
**Root Cause:** [Brief root cause description]
**Solution:** [High-level fix description]
**Effort Required:** [Hours/Days estimate]
**Risk Level:** [High/Medium/Low]
---
## 1. Problem Description
### Symptoms
[Detailed symptoms observed]
### Reproduction
1. [Step 1]
2. [Step 2]
3. [Expected vs Actual]
### Environment
- **System:** [OS, version]
- **Application:** [Version, build]
- **Dependencies:** [Relevant versions]
- **Configuration:** [Key settings]
### Timeline
- **First Observed:** [Date/time]
- **Frequency:** [How often]
- **Last Occurrence:** [Date/time]
---
## 2. Technical Analysis
### Root Cause Analysis
[Detailed root cause with evidence]
### Code Analysis
```[language]
// Problematic code
[code snippet]
```
**Issue:** [What's wrong with the code]
### Pattern Analysis
[Any patterns detected]
### Test Coverage
- **Current Coverage:** [percentage]
- **Gap Identified:** [What's not tested]
- **Risk Areas:** [Untested critical paths]
---
## 3. Impact Assessment
### Severity Matrix
| Aspect | Impact | Severity |
| -------------- | ------------------- | -------------- |
| Users Affected | [number/percentage] | [High/Med/Low] |
| Data Integrity | [description] | [High/Med/Low] |
| Performance | [metrics] | [High/Med/Low] |
| Security | [assessment] | [High/Med/Low] |
### Business Impact
[Business consequences of the issue]
---
## 4. Solution & Recommendations
### Immediate Fix
```[language]
// Corrected code
[code snippet]
```
**Validation:** [How to verify fix works]
### Short-term Improvements
1. [Improvement 1]
2. [Improvement 2]
### Long-term Prevention
1. [Strategy 1]
2. [Strategy 2]
### Process Enhancements
1. [Process improvement]
2. [Tool/automation suggestion]
---
## 5. Implementation Plan
### Phase 1: Immediate (0-2 days)
- [ ] Apply code fix
- [ ] Add regression test
- [ ] Deploy to staging
### Phase 2: Short-term (1 week)
- [ ] Improve test coverage
- [ ] Add monitoring
- [ ] Update documentation
### Phase 3: Long-term (1 month)
- [ ] Refactor problematic area
- [ ] Implement prevention measures
- [ ] Team training on issue
---
## 6. Verification & Testing
### Test Cases
1. **Test:** [Name]
**Steps:** [How to test]
**Expected:** [Result]
### Regression Testing
[Areas requiring regression testing]
### Monitoring
[Metrics to monitor post-fix]
---
## 7. Lessons Learned
### What Went Wrong
[Root causes beyond the code]
### What Could Improve
[Process/tool improvements]
### Knowledge Sharing
[Information to share with team]
---
## Appendices
### A. Full Stack Traces
[Complete error traces]
### B. Log Excerpts
[Relevant log entries]
### C. Performance Metrics
[Before/after metrics]
### D. Related Issues
[Links to similar problems]
### E. References
[Documentation, articles, tools used]
---
**Report Generated:** [timestamp]
**Next Review:** [date for follow-up]
````
## Completion Criteria
- [ ] All sections completed
- [ ] Evidence included
- [ ] Recommendations actionable
- [ ] Report reviewed for accuracy
- [ ] Formatted for readability
- [ ] Ready for distribution

bmad-core/tasks/delta-minimization.md (new file)
@@ -0,0 +1,228 @@
# delta-minimization
Automatically reduce failing test cases to minimal reproduction.
## Context
Delta debugging systematically minimizes failure-inducing inputs to find the smallest test case that still triggers a bug. This dramatically simplifies debugging by removing irrelevant complexity and isolating the essential trigger conditions.
## Task Execution
### Phase 1: Initial Setup
#### Capture Failing State
1. Record original failing test case:
- Input data
- Configuration settings
- Environment state
- Execution parameters
2. Verify bug reproduction
3. Measure initial complexity metrics:
- Input size
- Number of operations
- Data structure depth
- Configuration parameters
### Phase 2: Minimization Strategy
#### Select Minimization Approach
**For Data Inputs:**
1. **Binary reduction**: Remove half of input, test if still fails
2. **Line-by-line**: For text/config files
3. **Field elimination**: For structured data (JSON, XML)
4. **Value simplification**: Replace complex values with simple ones
**For Code/Test Cases:**
1. **Statement removal**: Delete non-essential lines
2. **Function inlining**: Replace calls with minimal implementations
3. **Loop unrolling**: Convert loops to minimal iterations
4. **Conditional simplification**: Remove unnecessary branches
**For Configuration:**
1. **Parameter elimination**: Remove non-essential settings
2. **Default substitution**: Replace with default values
3. **Range reduction**: Minimize numeric ranges
### Phase 3: Delta Algorithm Implementation
#### Core Algorithm
```
1. Start with failing test case T
2. While reduction is possible:
a. Generate smaller candidate C from T
b. Test if C still triggers bug
c. If yes: T = C (accept reduction)
d. If no: Try different reduction
3. Return minimal T
```
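A greedy JavaScript sketch of this loop for array-shaped inputs (`fails` is any predicate that returns true while the bug still reproduces; this approximates, rather than fully implements, the classic ddmin algorithm):

```javascript
function minimize(input, fails) {
  let current = input;
  let chunk = Math.ceil(current.length / 2);
  while (chunk >= 1) {
    let reduced = false;
    for (let start = 0; start < current.length; start += chunk) {
      // Candidate: current input with one chunk removed.
      const candidate = current.slice(0, start)
        .concat(current.slice(start + chunk));
      if (candidate.length < current.length && fails(candidate)) {
        current = candidate; // accept, then retry from the same position
        reduced = true;
        start -= chunk;
      }
    }
    if (!reduced) chunk = Math.floor(chunk / 2); // refine granularity
  }
  return current;
}

// Illustrative bug: fails whenever both 3 and 7 are present.
const fails = arr => arr.includes(3) && arr.includes(7);
console.log(minimize([1, 2, 3, 4, 5, 6, 7, 8], fails)); // [3, 7]
```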
#### Automated Reduction Process
**Step 1: Coarse-Grained Reduction**
1. Try removing large chunks (50%)
2. Binary search for largest removable section
3. Continue until no large removals possible
**Step 2: Fine-Grained Reduction**
1. Try removing individual elements
2. Test each element for necessity
3. Build minimal required set
**Step 3: Simplification Pass**
1. Replace complex values with simpler equivalents:
- Long strings → "a"
- Large numbers → 0 or 1
- Complex objects → empty objects
2. Maintain bug reproduction
### Phase 4: Validation
#### Verify Minimality
1. Confirm bug still reproduces
2. Verify no further reduction possible
3. Test that adding any removed element doesn't affect bug
4. Document reduction ratio achieved
#### Create Clean Reproduction
1. Format minimal test case
2. Remove all comments/documentation
3. Standardize naming (var1, var2, etc.)
4. Ensure standalone execution
## Intelligent Reduction Strategies
### Pattern-Based Reduction
Recognize common patterns and apply targeted reductions:
- **Array operations**: Reduce to 2-3 elements
- **Nested structures**: Flatten where possible
- **Async operations**: Convert to synchronous
- **External dependencies**: Mock with minimal stubs
### Semantic-Aware Reduction
Maintain semantic validity while reducing:
- Preserve type constraints
- Maintain referential integrity
- Keep required relationships
- Honor invariants
### Parallel Exploration
Test multiple reduction paths simultaneously:
- Try different reduction strategies
- Explore various simplification orders
- Combine successful reductions
## Output Format
````markdown
# Delta Debugging Minimization Report
## Original Test Case
**Size:** [original size/complexity]
**Components:** [number of elements/lines/fields]
**Execution Time:** [duration]
```[format]
[original test case - abbreviated if too long]
```
## Minimization Process
**Iterations:** [number]
**Time Taken:** [duration]
**Reduction Achieved:** [percentage]
### Reduction Path
1. [First major reduction] - Removed [what], Size: [new size]
2. [Second reduction] - Simplified [what], Size: [new size]
3. [Continue for significant reductions...]
## Minimal Reproduction
### Test Case
```[language]
// Minimal test case that reproduces bug
[minimized code/data]
```
### Requirements
- **Environment:** [minimal environment needed]
- **Dependencies:** [only essential dependencies]
- **Configuration:** [minimal config]
### Execution
```bash
# Command to reproduce
[exact command]
```
### Expected vs Actual
**Expected:** [what should happen]
**Actual:** [what happens (the bug)]
## Analysis
### Essential Elements
These elements are required for reproduction:
1. [Critical element 1] - Remove this and bug disappears
2. [Critical element 2] - Essential for triggering condition
3. [Continue for all essential elements]
### Removed Elements
These were safely removed without affecting the bug:
- [Category]: [what was removed and why it's non-essential]
- [Continue for major categories]
### Insights Gained
[What the minimization reveals about the bug's nature]
## Root Cause Hypothesis
Based on minimal reproduction:
[What the essential elements suggest about root cause]
## Next Steps
1. Debug the minimal case using other techniques
2. Focus on interaction between essential elements
3. Test fix against both minimal and original cases
````
## Completion Criteria
- [ ] Original failing case captured
- [ ] Minimization algorithm executed
- [ ] Minimal reproduction achieved
- [ ] Bug still reproduces with minimal case
- [ ] No further reduction possible
- [ ] Essential elements identified
- [ ] Clean reproduction documented

bmad-core/tasks/fagan-inspection.md (new file)
@@ -0,0 +1,130 @@
# fagan-inspection
Comprehensive Fagan inspection for systematic bug analysis and resolution.
## Context
This task performs systematic defect analysis using the proven 6-phase Fagan inspection methodology, achieving 60-90% defect detection rates through formal peer review.
## Task Execution
### Phase 1: Planning
1. Identify inspection scope based on bug description
2. Define inspection criteria and success metrics
3. Generate inspection checklist based on bug type
4. Determine affected components and stakeholders
### Phase 2: Overview
1. Analyze recent commits for context and potential causes
2. Review feature specifications and implementation plans
3. Gather background context and related documentation
4. Identify impact scope and affected systems
### Phase 3: Preparation
1. Systematic artifact examination:
- Code analysis using pattern detection
- Test coverage analysis and execution results
- Configuration and environment analysis
- Documentation consistency check
2. Dependency analysis and version conflicts
3. Performance metrics and resource usage (if applicable)
4. Generate preliminary defect hypotheses
### Phase 4: Inspection Meeting
1. Execute systematic defect identification:
- Logic defects: Algorithm errors, control flow issues
- Interface defects: API misuse, parameter mismatches
- Data defects: Type mismatches, validation failures
- Documentation defects: Outdated or incorrect documentation
2. Root cause analysis using fishbone methodology
3. Impact assessment: Severity, scope, risk level
4. Categorize defects by type and priority
### Phase 5: Rework Planning
1. Generate fix proposals with tradeoff analysis
2. Design test strategy for validation
3. Risk assessment for proposed changes
4. Create implementation timeline
5. Plan regression testing approach
### Phase 6: Follow-up
1. Validate fix effectiveness against original bug
2. Update documentation and specifications
3. Capture lessons learned for prevention
4. Generate comprehensive debug report
## Output Format
Generate a structured debug report containing:
```markdown
# Debug Report: [Bug Description]
Session ID: [timestamp]
Date: [date]
## Executive Summary
[Brief overview of findings and recommendations]
## Defect Analysis
### Primary Defect
- Type: [Logic/Interface/Data/Documentation]
- Severity: [Critical/High/Medium/Low]
- Location: [file:line]
- Description: [detailed description]
### Contributing Factors
[List of contributing issues]
## Root Cause Identification
### Root Cause
[Detailed root cause explanation]
### Evidence Trail
[Step-by-step evidence leading to root cause]
## Fix Recommendations
### Immediate Fix
[Code or configuration changes needed]
### Long-term Prevention
[Systemic improvements to prevent recurrence]
## Test Strategy
[Required tests to validate fix]
## Risk Assessment
- Regression Risk: [High/Medium/Low]
- Side Effects: [Potential side effects]
- Mitigation: [Risk mitigation steps]
## Lessons Learned
[Key takeaways for future prevention]
```
## Completion Criteria
- [ ] All 6 phases completed
- [ ] Root cause identified with evidence
- [ ] Fix recommendations provided
- [ ] Test strategy defined
- [ ] Debug report generated

bmad-core/tasks/instrumentation-analysis.md (new file)
@@ -0,0 +1,472 @@
# instrumentation-analysis
Design strategic logging and monitoring points for production debugging.
## Context
This task analyzes code to identify optimal locations for instrumentation (logging, metrics, tracing) that will aid in debugging production issues without impacting performance. It creates a comprehensive observability strategy.
## Task Execution
### Phase 1: Critical Path Analysis
#### Identify Key Flows
1. **User-Facing Paths**: Request → Response chains
2. **Business-Critical Paths**: Payment, authentication, data processing
3. **Performance-Sensitive Paths**: High-frequency operations
4. **Error-Prone Paths**: Historical problem areas
5. **Integration Points**: External service calls
#### Map Decision Points
- Conditional branches with business logic
- State transitions
- Error handling blocks
- Retry mechanisms
- Circuit breakers
- Cache hits/misses
### Phase 2: Instrumentation Strategy
#### Level 1: Essential Instrumentation
**Entry/Exit Points:**
```
- Service boundaries (API endpoints)
- Function entry/exit for critical operations
- Database transaction boundaries
- External service calls
- Message queue operations
```
**Error Conditions:**
```
- Exception catches
- Validation failures
- Timeout occurrences
- Retry attempts
- Fallback activations
```
**Performance Markers:**
```
- Operation start/end times
- Queue depths
- Resource utilization
- Batch sizes
- Cache effectiveness
```
#### Level 2: Diagnostic Instrumentation
**State Changes:**
```
- User state transitions
- Order/payment status changes
- Configuration updates
- Feature flag toggles
- Circuit breaker state changes
```
**Business Events:**
```
- User actions (login, purchase, etc.)
- System events (startup, shutdown)
- Scheduled job execution
- Data pipeline stages
- Workflow transitions
```
#### Level 3: Deep Debugging
**Detailed Tracing:**
```
- Parameter values for complex functions
- Intermediate calculation results
- Loop iteration counts
- Branch decisions
- SQL query parameters
```
### Phase 3: Implementation Patterns
#### Structured Logging Format
**Standard Fields:**
```json
{
"timestamp": "ISO-8601",
"level": "INFO|WARN|ERROR",
"service": "service-name",
"trace_id": "correlation-id",
"span_id": "operation-id",
"user_id": "if-applicable",
"operation": "what-is-happening",
"duration_ms": "for-completed-ops",
"status": "success|failure",
"error": "error-details-if-any",
"metadata": {
"custom": "fields"
}
}
```
#### Performance-Conscious Patterns
**Sampling Strategy:**
```
- 100% for errors
- 10% for normal operations
- 1% for high-frequency paths
- Dynamic adjustment based on load
```
**Async Logging:**
```
- Buffer non-critical logs
- Batch write to reduce I/O
- Use separate thread/process
- Implement backpressure handling
```
**Conditional Logging:**
```
- Debug level only in development
- Info level in staging
- Warn/Error in production
- Dynamic level adjustment via config
```
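A sketch combining the three patterns above (level thresholds, sampling, and batched async writes); the rates and flush interval are illustrative assumptions:

```javascript
const LEVELS = { DEBUG: 0, INFO: 1, WARN: 2, ERROR: 3 };
const MIN_LEVEL = process.env.NODE_ENV === 'production' ? LEVELS.INFO : LEVELS.DEBUG;
const SAMPLE_RATES = { DEBUG: 0.01, INFO: 0.1 }; // WARN/ERROR always kept

const buffer = [];
function log(level, message, fields = {}) {
  if (LEVELS[level] < MIN_LEVEL) return;       // conditional logging
  const rate = SAMPLE_RATES[level] ?? 1.0;
  if (Math.random() >= rate) return;           // sampling
  buffer.push({ timestamp: new Date().toISOString(), level, message, ...fields });
}

// Batched async flush instead of one write per log call.
setInterval(() => {
  if (buffer.length === 0) return;
  process.stdout.write(buffer.map(e => JSON.stringify(e)).join('\n') + '\n');
  buffer.length = 0;
}, 1000).unref(); // unref() so the timer never keeps the process alive
```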
### Phase 4: Metrics Design
#### Key Metrics to Track
**RED Metrics:**
- **Rate**: Requests per second
- **Errors**: Error rate/count
- **Duration**: Response time distribution
**USE Metrics:**
- **Utilization**: Resource usage percentage
- **Saturation**: Queue depth, wait time
- **Errors**: Resource allocation failures
**Business Metrics:**
- Transaction success rate
- Feature usage
- User journey completion
- Revenue impact
#### Metric Implementation
**Counter Examples:**
```
requests_total{method="GET", endpoint="/api/users", status="200"}
errors_total{type="database", operation="insert"}
```
**Histogram Examples:**
```
request_duration_seconds{method="GET", endpoint="/api/users"}
database_query_duration_ms{query_type="select", table="users"}
```
**Gauge Examples:**
```
active_connections{service="database"}
queue_depth{queue="email"}
```
### Phase 5: Tracing Strategy
#### Distributed Tracing Points
**Span Creation:**
```
- HTTP request handling
- Database operations
- Cache operations
- External API calls
- Message publishing/consuming
- Background job execution
```
**Context Propagation:**
```
- HTTP headers (X-Trace-Id)
- Message metadata
- Database comments
- Log correlation
```
## Output Format
````markdown
# Instrumentation Analysis Report
## Executive Summary
**Components Analyzed:** [count]
**Current Coverage:** [percentage]
**Recommended Additions:** [count]
**Performance Impact:** [minimal/low/moderate]
**Implementation Effort:** [hours/days]
## Critical Instrumentation Points
### Priority 1: Immediate Implementation
#### Service: [ServiceName]
**Entry Points:**
```[language]
// Location: [file:line]
// Current: No logging
// Recommended:
logger.info("Request received", {
method: req.method,
path: req.path,
user_id: req.user?.id,
trace_id: req.traceId
});
```
**Error Handling:**
```[language]
// Location: [file:line]
// Current: Silent failure
// Recommended:
logger.error("Database operation failed", {
operation: "user_update",
user_id: userId,
error: err.message,
stack: err.stack,
retry_count: retries
});
```
**Performance Tracking:**
```[language]
// Location: [file:line]
// Recommended:
const startTime = Date.now();
try {
const result = await expensiveOperation();
metrics.histogram('operation_duration_ms', Date.now() - startTime, {
operation: 'expensive_operation',
status: 'success'
});
return result;
} catch (error) {
metrics.histogram('operation_duration_ms', Date.now() - startTime, {
operation: 'expensive_operation',
status: 'failure'
});
throw error;
}
```
### Priority 2: Enhanced Observability
[Similar format for medium priority points]
### Priority 3: Deep Debugging
[Similar format for low priority points]
## Logging Strategy
### Log Levels by Environment
| Level | Development | Staging | Production |
| ----- | ----------- | ------- | ---------- |
| DEBUG | ✓ | ✓ | ✗ |
| INFO | ✓ | ✓ | Sampled |
| WARN | ✓ | ✓ | ✓ |
| ERROR | ✓ | ✓ | ✓ |
### Sampling Configuration
```yaml
sampling:
default: 0.01 # 1% sampling
rules:
- path: '/health'
sample_rate: 0.001 # 0.1% for health checks
- path: '/api/critical/*'
sample_rate: 0.1 # 10% for critical APIs
- level: 'ERROR'
sample_rate: 1.0 # 100% for errors
```
## Metrics Implementation
### Application Metrics
```[language]
// Metric definitions
const metrics = {
// Counters
requests: new Counter('http_requests_total', ['method', 'path', 'status']),
errors: new Counter('errors_total', ['type', 'operation']),
// Histograms
duration: new Histogram('request_duration_ms', ['method', 'path']),
dbDuration: new Histogram('db_query_duration_ms', ['operation', 'table']),
// Gauges
connections: new Gauge('active_connections', ['type']),
queueSize: new Gauge('queue_size', ['queue_name'])
};
```
### Dashboard Queries
```sql
-- Error rate by endpoint
SELECT
endpoint,
sum(errors) / sum(requests) as error_rate
FROM metrics
WHERE time > now() - 1h
GROUP BY endpoint;
-- P95 latency
SELECT
endpoint,
percentile(duration, 0.95) as p95_latency
FROM metrics
WHERE time > now() - 1h
GROUP BY endpoint
```
## Tracing Implementation
### Trace Context
```[language]
// Trace context propagation
class TraceContext {
constructor(traceId, spanId, parentSpanId) {
this.traceId = traceId || generateId();
this.spanId = spanId || generateId();
this.parentSpanId = parentSpanId;
}
createChild() {
return new TraceContext(this.traceId, generateId(), this.spanId);
}
}
// Usage
middleware.use((req, res, next) => {
req.trace = new TraceContext(
req.headers['x-trace-id'],
req.headers['x-span-id'],
req.headers['x-parent-span-id']
);
next();
});
```
## Performance Considerations
### Impact Analysis
| Instrumentation Type | CPU Impact | Memory Impact | I/O Impact |
| -------------------- | ---------- | ------------- | -------------- |
| Structured Logging | < 1% | < 10MB | Async buffered |
| Metrics Collection | < 0.5% | < 5MB | Batched |
| Distributed Tracing | < 2% | < 20MB | Sampled |
### Optimization Techniques
1. Use async logging with buffers
2. Implement sampling for high-frequency paths
3. Batch metric submissions
4. Use conditional compilation for debug logs
5. Implement circuit breakers for logging systems
## Implementation Plan
### Phase 1: Week 1
- [ ] Implement critical error logging
- [ ] Add service boundary instrumentation
- [ ] Set up basic metrics
### Phase 2: Week 2
- [ ] Add performance tracking
- [ ] Implement distributed tracing
- [ ] Create initial dashboards
### Phase 3: Week 3
- [ ] Add business event tracking
- [ ] Implement sampling strategies
- [ ] Performance optimization
## Monitoring & Alerts
### Critical Alerts
```yaml
- name: high_error_rate
condition: error_rate > 0.01
severity: critical
- name: high_latency
condition: p95_latency > 1000ms
severity: warning
- name: service_down
condition: health_check_failures > 3
severity: critical
```
## Validation Checklist
- [ ] No sensitive data in logs
- [ ] Trace IDs properly propagated
- [ ] Sampling rates appropriate
- [ ] Performance impact acceptable
- [ ] Dashboards created
- [ ] Alerts configured
- [ ] Documentation updated
````
## Completion Criteria
- [ ] Critical paths identified
- [ ] Instrumentation points mapped
- [ ] Logging strategy defined
- [ ] Metrics designed
- [ ] Tracing plan created
- [ ] Performance impact assessed
- [ ] Implementation plan created
- [ ] Monitoring strategy defined

bmad-core/tasks/pattern-detection.md (new file)
@@ -0,0 +1,199 @@
# pattern-detection
Analyze code and commit history for defect patterns and systemic issues.
## Context
This task identifies recurring defect patterns, systemic issues, and common problem areas to enable proactive quality improvements.
## Task Execution
### Step 1: Historical Analysis
#### Recent Commits Analysis
1. Review last 20-50 commits for:
- Files frequently modified (hotspots)
- Repeated fix attempts
- Revert commits indicating instability
- Emergency/hotfix patterns
#### Bug History Review
1. Analyze recent bug reports for:
- Common symptoms
- Recurring locations
- Similar root causes
- Fix patterns
### Step 2: Code Pattern Detection
#### Anti-Pattern Identification
Look for common problematic patterns:
- God objects/functions (excessive responsibility)
- Copy-paste code (DRY violations)
- Dead code (unused functions/variables)
- Complex conditionals (cyclomatic complexity)
- Long parameter lists
- Inappropriate intimacy (tight coupling)
#### Vulnerability Patterns
Check for security/reliability issues:
- Input validation gaps
- Error handling inconsistencies
- Resource leak patterns
- Race condition indicators
- SQL injection risks
- XSS vulnerabilities
### Step 3: Architectural Pattern Analysis
#### Dependency Issues
- Circular dependencies
- Version conflicts
- Missing abstractions
- Leaky abstractions
- Inappropriate dependencies
#### Design Smells
- Violated SOLID principles
- Missing design patterns where needed
- Over-engineering indicators
- Technical debt accumulation
### Step 4: Team Pattern Analysis
#### Development Patterns
- Rush commits (end of sprint)
- Incomplete implementations
- Missing tests for bug fixes
- Documentation gaps
- Code review oversights
#### Communication Patterns
- Misunderstood requirements
- Incomplete handoffs
- Knowledge silos
- Missing context in commits
### Step 5: Pattern Correlation
1. Group related patterns by:
- Component/module
- Developer/team
- Time period
- Feature area
2. Identify correlations:
- Patterns that appear together
- Cascade effects
- Root pattern causing others
## Output Format
```markdown
# Defect Pattern Analysis Report
## Executive Summary
[High-level overview of key patterns found]
## Critical Patterns Detected
### Pattern 1: [Pattern Name]
**Type:** [Anti-pattern/Vulnerability/Design/Process]
**Frequency:** [Number of occurrences]
**Locations:**
- [file:line]
- [file:line]
**Description:** [What the pattern is]
**Impact:** [Why it matters]
**Example:** [Code snippet or commit reference]
**Recommendation:** [How to address]
## Hotspot Analysis
### High-Change Files
1. [filename] - [change count] changes, [bug count] bugs
2. [filename] - [change count] changes, [bug count] bugs
### Complex Areas
1. [component] - Complexity score: [number]
2. [component] - Complexity score: [number]
## Systemic Issues
### Issue 1: [Issue Name]
**Pattern Indicators:**
- [Pattern that indicates this issue]
- [Another indicator]
**Root Cause:** [Underlying systemic problem]
**Affected Areas:** [Components/teams affected]
**Priority:** [Critical/High/Medium/Low]
**Remediation Strategy:** [How to fix systematically]
## Trend Analysis
### Improving Areas
- [Area showing positive trends]
### Degrading Areas
- [Area showing negative trends]
### Stable Problem Areas
- [Persistent issues not getting better or worse]
## Recommendations
### Immediate Actions
1. [Quick win to address patterns]
2. [Another quick action]
### Short-term Improvements
1. [1-2 sprint improvements]
2. [Process changes needed]
### Long-term Strategy
1. [Architectural changes]
2. [Team/process evolution]
## Prevention Checklist
- [ ] Add static analysis for [pattern]
- [ ] Implement pre-commit hooks for [issue]
- [ ] Create coding standards for [area]
- [ ] Add automated tests for [vulnerability]
- [ ] Improve documentation for [component]
```
## Completion Criteria
- [ ] Historical analysis completed
- [ ] Code patterns identified
- [ ] Architectural issues found
- [ ] Team patterns analyzed
- [ ] Correlations established
- [ ] Recommendations provided
- [ ] Prevention strategies defined

View File

@ -0,0 +1,148 @@
# root-cause-analysis
Focused root cause analysis using fishbone (Ishikawa) methodology.
## Context
This task performs systematic root cause analysis to identify the underlying causes of defects, moving beyond symptoms to address fundamental issues.
## Task Execution
### Step 1: Problem Definition
1. Clearly state the problem/symptom
2. Define when it occurs (timing, frequency)
3. Define where it occurs (component, environment)
4. Quantify the impact (users affected, severity)
### Step 2: Fishbone Analysis Categories
Analyze the problem across these dimensions:
#### People (Developer/User factors)
- Knowledge gaps or misunderstandings
- Communication breakdowns
- Incorrect assumptions
- User behavior patterns
#### Process (Development/Deployment)
- Missing validation steps
- Inadequate testing coverage
- Deployment procedures
- Code review gaps
#### Technology (Tools/Infrastructure)
- Framework limitations
- Library bugs or incompatibilities
- Infrastructure issues
- Tool configuration problems
#### Environment (System/Configuration)
- Environment-specific settings
- Resource constraints
- External dependencies
- Network or connectivity issues
#### Data (Input/State)
- Invalid or unexpected input
- Data corruption or inconsistency
- State management issues
- Race conditions
#### Methods (Algorithms/Design)
- Algorithm flaws
- Design pattern misuse
- Architecture limitations
- Performance bottlenecks
### Step 3: 5-Whys Deep Dive
For each potential cause identified:
1. Ask "Why does this happen?"
2. For each answer, ask "Why?" again
3. Continue until reaching the root cause (typically 5 iterations)
4. Document the chain of causation (a worked example follows this list)
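For illustration only, a hypothetical chain for a failed nightly export might read:

1. Why did the export fail? The job ran out of disk space.
2. Why? Temporary files from previous runs were never deleted.
3. Why? The cleanup step only runs when the job succeeds.
4. Why? The error path returns early and skips cleanup.
5. Why? The script predates the team's error-handling standard; the root cause is a process gap, not the disk.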
### Step 4: Evidence Collection
For each identified root cause:
- Gather supporting evidence (logs, code, metrics)
- Verify through reproduction or testing
- Rule out alternative explanations
- Establish confidence level
### Step 5: Root Cause Prioritization
Rank root causes by:
- Likelihood (probability this is the true cause)
- Impact (severity if this is the cause)
- Effort (complexity to address)
- Risk (potential for recurrence)
## Output Format
```markdown
# Root Cause Analysis: [Problem Description]
## Problem Statement
**What:** [Clear problem description]
**When:** [Timing/frequency]
**Where:** [Location/component]
**Impact:** [Quantified impact]
## Fishbone Analysis
### Category: [People/Process/Technology/Environment/Data/Methods]
**Potential Cause:** [Description]
**5-Whys Analysis:**
1. Why? [Answer]
2. Why? [Answer]
3. Why? [Answer]
4. Why? [Answer]
5. Why? [Root cause]
**Evidence:** [Supporting data/logs/code]
**Confidence:** [High/Medium/Low]
## Root Cause Summary
### Primary Root Cause
[Most likely root cause with evidence]
### Contributing Factors
1. [Secondary cause]
2. [Tertiary cause]
## Recommended Actions
1. **Immediate:** [Quick fix to address symptom]
2. **Short-term:** [Fix root cause]
3. **Long-term:** [Prevent recurrence]
## Verification Plan
[How to verify the root cause is correctly identified]
```
## Completion Criteria
- [ ] Problem clearly defined
- [ ] Fishbone analysis completed
- [ ] 5-Whys analysis performed
- [ ] Evidence collected and verified
- [ ] Root cause identified with confidence level
- [ ] Action plan created

View File

@ -0,0 +1,294 @@
# static-analysis
Comprehensive static analysis for defect detection and code quality assessment.
## Context
This task performs deep static analysis to identify bugs, anti-patterns, security vulnerabilities, and code quality issues without executing the code. It combines multiple analysis techniques to provide a comprehensive view of potential problems.
## Task Execution
### Phase 1: Multi-Layer Analysis
#### Layer 1: Syntax and Style Analysis
1. **Syntax Errors**: Malformed code that won't compile/run
2. **Style Violations**: Inconsistent formatting, naming conventions
3. **Dead Code**: Unreachable code, unused variables/functions
4. **Code Duplication**: Copy-paste code blocks
#### Layer 2: Semantic Analysis
1. **Type Issues**: Type mismatches, implicit conversions
2. **Logic Errors**: Always true/false conditions, impossible states
3. **Resource Leaks**: Unclosed files, unreleased memory
4. **API Misuse**: Incorrect parameter order, deprecated methods
#### Layer 3: Flow Analysis
1. **Control Flow**: Infinite loops, unreachable code, missing returns
2. **Data Flow**: Uninitialized variables, unused assignments
3. **Exception Flow**: Unhandled exceptions, empty catch blocks (see the sketch at the end of this phase)
4. **Null Flow**: Potential null dereferences
#### Layer 4: Security Analysis
1. **Injection Vulnerabilities**: SQL, XSS, command injection
2. **Authentication Issues**: Hardcoded credentials, weak crypto
3. **Data Exposure**: Sensitive data in logs, unencrypted storage
4. **Access Control**: Missing authorization, privilege escalation
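One of these checks is easy to sketch mechanically. The example below uses Python's `ast` module to flag only `pass`-bodied exception handlers; real analyzers cover far more cases, so treat this as an illustration of the mechanism, not a tool:

```python
import ast
import sys

def report_empty_handlers(source: str, filename: str = "<input>") -> None:
    """Flag `except` blocks whose body is only `pass`."""
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.ExceptHandler) and all(
            isinstance(stmt, ast.Pass) for stmt in node.body
        ):
            print(f"{filename}:{node.lineno}: empty except block swallows errors")

if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        report_empty_handlers(f.read(), sys.argv[1])
```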
### Phase 2: Pattern Detection
#### Anti-Patterns to Detect
**Code Smells:**
```
- God Classes/Functions (too much responsibility)
- Long Parameter Lists (>3-4 parameters)
- Feature Envy (excessive external data access)
- Data Clumps (repeated parameter groups)
- Primitive Obsession (overuse of primitives)
- Switch Statements (missing polymorphism)
- Lazy Class (too little responsibility)
- Speculative Generality (unused abstraction)
- Message Chains (deep coupling)
- Middle Man (unnecessary delegation)
```
**Performance Issues:**
```
- N+1 Queries (database inefficiency)
- Synchronous I/O in async context
- Inefficient Algorithms (O(n²) when O(n) possible)
- Memory Leaks (retained references)
- Excessive Object Creation (GC pressure)
- String Concatenation in Loops
- Missing Indexes (database)
- Blocking Operations (thread starvation)
```
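The N+1 item above, sketched against a generic DB-API connection; the `orders` schema and the qmark parameter style are assumptions for illustration:

```python
def load_orders_n_plus_1(conn, user_ids):
    """One query per user: N+1 round trips in total."""
    orders = []
    for uid in user_ids:
        orders.extend(conn.execute(
            "SELECT id, total FROM orders WHERE user_id = ?", (uid,)
        ).fetchall())
    return orders

def load_orders_batched(conn, user_ids):
    """Single round trip with an IN clause built from bound placeholders."""
    placeholders = ", ".join("?" for _ in user_ids)
    return conn.execute(
        f"SELECT id, total FROM orders WHERE user_id IN ({placeholders})",
        tuple(user_ids),
    ).fetchall()
```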
**Concurrency Issues:**
```
- Race Conditions (unsynchronized access)
- Deadlocks (circular wait)
- Thread Leaks (unclosed threads)
- Missing Volatile (visibility issues)
- Double-Checked Locking (broken pattern)
- Lock Contention (performance bottleneck)
```
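A minimal reproduction of the first concurrency item (unsynchronized access) and its fix, using Python's `threading`; the counts are arbitrary:

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n: int) -> None:
    global counter
    for _ in range(n):
        counter += 1                 # read-modify-write is not atomic

def safe_increment(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:                   # serializes the read-modify-write
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 400000 with the lock; often less if unsafe_increment is used
```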
### Phase 3: Complexity Analysis
#### Metrics Calculation
1. **Cyclomatic Complexity**: Number of linearly independent paths
2. **Cognitive Complexity**: How difficult code is to understand
3. **Halstead Metrics**: Program vocabulary and difficulty
4. **Maintainability Index**: Composite maintainability score
5. **Technical Debt**: Estimated time to fix all issues
6. **Test Coverage**: Lines/branches/functions covered
#### Thresholds
```
Cyclomatic Complexity:
- Good: < 10
- Acceptable: 10-20
- Complex: 20-50
- Untestable: > 50
Cognitive Complexity:
- Simple: < 5
- Moderate: 5-10
- Complex: 10-15
- Very Complex: > 15
```
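A crude way to measure code against these thresholds is to count decision points. The sketch below is an approximation, not a replacement for tools such as radon or SonarQube:

```python
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)

def cyclomatic_estimate(source: str) -> int:
    """1 + number of decision points; a rough proxy for cyclomatic complexity."""
    return 1 + sum(isinstance(n, DECISION_NODES) for n in ast.walk(ast.parse(source)))

print(cyclomatic_estimate("def f(x):\n    if x > 0:\n        return x\n    return -x"))  # 2
```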
### Phase 4: Dependency Analysis
#### Identify Issues
1. **Circular Dependencies**: A→B→C→A cycles (detection sketch after this list)
2. **Version Conflicts**: Incompatible dependency versions
3. **Security Vulnerabilities**: Known CVEs in dependencies
4. **License Conflicts**: Incompatible license combinations
5. **Outdated Packages**: Dependencies needing updates
6. **Unused Dependencies**: Declared but not used
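Circular dependencies can be found with a three-color depth-first search over the import graph; the graph literal here is a hypothetical stand-in for one extracted from real imports:

```python
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": []}   # hypothetical import graph

def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, ()):
            if color[dep] == GRAY:               # back edge: cycle found
                return path[path.index(dep):] + [dep]
            if color[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        color[node] = BLACK
        path.pop()
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

print(find_cycle(graph))   # ['a', 'b', 'c', 'a']
```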
### Phase 5: Architecture Analysis
#### Structural Issues
1. **Layer Violations**: Cross-layer dependencies
2. **Module Coupling**: High interdependence
3. **Missing Abstractions**: Direct implementation dependencies
4. **Inconsistent Patterns**: Mixed architectural styles
5. **God Objects**: Central points of failure
## Automated Tools Integration
Simulate the output of common static analysis tools:
**ESLint/TSLint** (JavaScript/TypeScript)
**Pylint/Flake8** (Python)
**SonarQube** (Multi-language)
**PMD/SpotBugs** (Java)
**RuboCop** (Ruby)
**SwiftLint** (Swift)
## Output Format
````markdown
# Static Analysis Report
## Executive Summary
**Files Analyzed:** [count]
**Total Issues:** [count]
**Critical:** [count] | **High:** [count] | **Medium:** [count] | **Low:** [count]
**Technical Debt:** [hours/days estimated]
**Code Coverage:** [percentage]
## Critical Issues (Immediate Action Required)
### Issue 1: [Security Vulnerability]
**File:** [path:line]
**Category:** Security
**Rule:** [CWE-ID or rule name]
```[language]
// Vulnerable code
[code snippet]
```
**Risk:** [Description of security risk]
**Fix:**
```[language]
// Secure code
[fixed code]
```
### Issue 2: [Logic Error]
[Similar format]
## High Priority Issues
### Category: Performance
| File | Line | Issue | Impact | Fix Effort |
| ------ | ------ | --------------- | ------------ | ---------- |
| [file] | [line] | N+1 Query | High latency | 2 hours |
| [file] | [line] | O(n²) algorithm | CPU spike | 4 hours |
### Category: Reliability
[Similar table format]
## Code Quality Metrics
### Complexity Analysis
| File | Cyclomatic | Cognitive | Maintainability | Action |
| ------ | ---------- | --------- | --------------- | -------- |
| [file] | 45 (High) | 28 (High) | 35 (Low) | Refactor |
| [file] | 32 (Med) | 18 (Med) | 55 (Med) | Review |
### Duplication Analysis
**Total Duplication:** [percentage]
**Largest Duplicate:** [lines] lines in [files]
### Top Duplicated Blocks:
1. [File A:lines] ↔ [File B:lines] - [line count] lines
2. [File C:lines] ↔ [File D:lines] - [line count] lines
## Anti-Pattern Detection
### God Classes
1. **[ClassName]** - [methods] methods, [lines] lines
- Responsibilities: [list]
- Suggested Split: [recommendations]
### Long Methods
1. **[methodName]** - [lines] lines, complexity: [score]
- Extract Methods: [suggestions]
## Security Scan Results
### Vulnerabilities by Category
- Injection: [count]
- Authentication: [count]
- Data Exposure: [count]
- Access Control: [count]
### Detailed Findings
[List each with severity, location, and fix]
## Dependency Analysis
### Security Vulnerabilities
| Package | Version | CVE | Severity | Fixed Version |
| ------- | ------- | -------- | -------- | ------------- |
| [pkg] | [ver] | [CVE-ID] | Critical | [ver] |
### Outdated Dependencies
| Package | Current | Latest | Breaking Changes |
| ------- | ------- | ------ | ---------------- |
| [pkg] | [ver] | [ver] | [Yes/No] |
## Recommendations
### Immediate Actions (This Sprint)
1. Fix all critical security vulnerabilities
2. Resolve high-severity logic errors
3. Update vulnerable dependencies
### Short-term (Next Sprint)
1. Refactor high-complexity functions
2. Remove code duplication
3. Add missing error handling
### Long-term (Technical Debt)
1. Architectural improvements
2. Comprehensive refactoring
3. Test coverage improvement
## Trend Analysis
**Compared to Last Scan:**
- Issues: [+/-X]
- Complexity: [+/-Y]
- Coverage: [+/-Z%]
- Technical Debt: [+/-N hours]
````
## Completion Criteria
- [ ] All analysis layers completed
- [ ] Issues categorized by severity
- [ ] Metrics calculated
- [ ] Anti-patterns identified
- [ ] Security vulnerabilities found
- [ ] Dependencies analyzed
- [ ] Recommendations prioritized
- [ ] Fixes suggested for critical issues

View File

@ -0,0 +1,363 @@
# walkthrough-prep
Generate comprehensive materials for code walkthrough sessions.
## Context
This task prepares all necessary documentation, checklists, and presentation materials for conducting effective code walkthroughs. It ensures reviewers have everything needed to provide valuable feedback while minimizing meeting time.
## Task Execution
### Phase 1: Scope Analysis
#### Determine Walkthrough Type
1. **Feature Walkthrough**: New functionality
2. **Bug Fix Walkthrough**: Defect resolution
3. **Refactoring Walkthrough**: Code improvement
4. **Architecture Walkthrough**: Design decisions
5. **Security Walkthrough**: Security-focused review
#### Identify Key Components
1. Changed files and their purposes
2. Dependencies affected
3. Test coverage added/modified
4. Documentation updates
5. Configuration changes
### Phase 2: Material Generation
#### 1. Executive Summary
Create high-level overview:
- Purpose and goals
- Business value/impact
- Technical approach
- Key decisions made
- Risks and mitigations
#### 2. Technical Overview
**Architecture Diagram:**
```
[Component A] → [Component B] → [Component C]
↓ ↓ ↓
[Database] [External API] [Cache]
```
**Data Flow:**
```
1. User Input → Validation
2. Validation → Processing
3. Processing → Storage
4. Storage → Response
```
**Sequence Diagram:**
```
User → Frontend: Request
Frontend → Backend: API Call
Backend → Database: Query
Database → Backend: Results
Backend → Frontend: Response
Frontend → User: Display
```
#### 3. Code Change Summary
**Statistics:**
- Files changed: [count]
- Lines added: [count]
- Lines removed: [count]
- Test coverage: [before]% → [after]%
- Complexity change: [delta]
**Change Categories:**
- New features: [list]
- Modifications: [list]
- Deletions: [list]
- Refactoring: [list]
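The statistics above can be collected mechanically. A minimal sketch, assuming `git` is on `PATH`; the base branch name is an assumption:

```python
import subprocess

def change_stats(base: str = "main") -> dict:
    """Summarize files changed and lines added/removed relative to `base`."""
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    added = removed = files = 0
    for line in out.splitlines():
        a, r, _path = line.split("\t", 2)
        files += 1
        if a != "-":                 # binary files report "-" for line counts
            added, removed = added + int(a), removed + int(r)
    return {"files": files, "added": added, "removed": removed}

print(change_stats())
```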
### Phase 3: Review Checklist Generation
#### Core Review Areas
**Functionality Checklist:**
- [ ] Requirements met
- [ ] Edge cases handled
- [ ] Error handling complete
- [ ] Performance acceptable
- [ ] Backwards compatibility maintained
**Code Quality Checklist:**
- [ ] Naming conventions followed
- [ ] DRY principle applied
- [ ] SOLID principles followed
- [ ] Comments appropriate
- [ ] No code smells
**Testing Checklist:**
- [ ] Unit tests added
- [ ] Integration tests updated
- [ ] Edge cases tested
- [ ] Performance tested
- [ ] Regression tests pass
**Security Checklist:**
- [ ] Input validation implemented
- [ ] Authentication checked
- [ ] Authorization verified
- [ ] Data sanitized
- [ ] Secrets not exposed
**Documentation Checklist:**
- [ ] Code comments updated
- [ ] README updated
- [ ] API docs updated
- [ ] Changelog updated
- [ ] Deployment docs updated
### Phase 4: Presentation Structure
#### Slide/Section Outline
**1. Introduction (2 min)**
- Problem statement
- Solution overview
- Success criteria
**2. Technical Approach (5 min)**
- Architecture decisions
- Implementation choices
- Trade-offs made
**3. Code Walkthrough (15 min)**
- Key components tour
- Critical logic explanation
- Integration points
**4. Testing Strategy (3 min)**
- Test coverage
- Test scenarios
- Performance results
**5. Discussion (5 min)**
- Open questions
- Concerns
- Suggestions
### Phase 5: Supporting Documentation
#### Code Snippets
Extract and annotate key code sections:
```[language]
// BEFORE: Original implementation
[original code]
// AFTER: New implementation
[new code]
// KEY CHANGES:
// 1. [Change 1 explanation]
// 2. [Change 2 explanation]
```
#### Test Cases
Document critical test scenarios:
```[language]
// Test Case 1: [Description]
// Input: [test input]
// Expected: [expected output]
// Covers: [what it validates]
```
#### Performance Metrics
If applicable:
- Execution time: [before] → [after]
- Memory usage: [before] → [after]
- Database queries: [before] → [after]
## Output Format
````markdown
# Code Walkthrough Package: [Feature/Fix Name]
## Quick Reference
**Date:** [scheduled date]
**Duration:** [estimated time]
**Presenter:** [name]
**Reviewers:** [list]
**Repository:** [link]
**Branch/PR:** [link]
## Executive Summary
[2-3 paragraph overview]
## Agenda
1. Introduction (2 min)
2. Technical Overview (5 min)
3. Code Walkthrough (15 min)
4. Testing & Validation (3 min)
5. Q&A (5 min)
## Pre-Review Checklist
**For Reviewers - Complete Before Meeting:**
- [ ] Read executive summary
- [ ] Review changed files list
- [ ] Note initial questions
- [ ] Check test results
## Technical Overview
### Architecture
[Include diagrams]
### Key Changes
| Component | Type | Description | Risk |
| --------- | ------------- | -------------- | -------------- |
| [name] | [New/Mod/Del] | [what changed] | [Low/Med/High] |
### Dependencies
**Added:** [list]
**Modified:** [list]
**Removed:** [list]
## Code Highlights
### Critical Section 1: [Name]
**File:** [path]
**Purpose:** [why this is important]
```[language]
[annotated code snippet]
```
**Discussion Points:**
- [Question or concern]
- [Alternative considered]
### Critical Section 2: [Name]
[Similar format]
## Testing Summary
### Coverage
- Unit Tests: [count] tests, [%] coverage
- Integration Tests: [count] tests
- Manual Testing: [checklist items]
### Key Test Scenarios
1. [Scenario]: [Result]
2. [Scenario]: [Result]
## Review Checklist
### Must Review
- [ ] [Critical file/function]
- [ ] [Security-sensitive code]
- [ ] [Performance-critical section]
### Should Review
- [ ] [Important logic]
- [ ] [API changes]
- [ ] [Database changes]
### Nice to Review
- [ ] [Refactoring]
- [ ] [Documentation]
- [ ] [Tests]
## Known Issues & Decisions
### Open Questions
1. [Question needing group input]
2. [Design decision to validate]
### Technical Debt
- [Debt item]: [Planned resolution]
### Future Improvements
- [Improvement]: [Timeline]
## Post-Review Action Items
**To be filled during review:**
- [ ] Action: [description] - Owner: [name]
- [ ] Action: [description] - Owner: [name]
## Appendix
### A. Full File List
[Complete list of changed files]
### B. Test Results
[Test execution summary]
### C. Performance Benchmarks
[If applicable]
### D. Related Documentation
- [Design Doc]: [link]
- [Requirements]: [link]
- [Previous Reviews]: [link]
````
## Completion Criteria
- [ ] Scope analyzed
- [ ] Executive summary written
- [ ] Technical overview created
- [ ] Code highlights selected
- [ ] Review checklist generated
- [ ] Presentation structure defined
- [ ] Supporting docs prepared
- [ ] Package formatted for distribution

View File

@ -0,0 +1,168 @@
# wolf-fence-search
Binary search debugging to systematically isolate bug location.
## Context
This task implements the Wolf Fence algorithm (binary search debugging) to efficiently locate bugs by repeatedly dividing the search space in half. Named after the problem: "There's one wolf in Alaska; how do you find it? Build a fence down the middle, wait for the wolf to howl, determine which side it's on, and repeat."
## Task Execution
### Phase 1: Initial Analysis
1. Identify the boundaries of the problem space:
- Entry point where system is working
- Exit point where bug manifests
- Code path between these points
2. Determine testable checkpoints
3. Calculate optimal division points
### Phase 2: Binary Search Implementation
#### Step 1: Divide Search Space
1. Identify midpoint of current search area
2. Insert diagnostic checkpoint at midpoint:
- Add assertion to verify expected state
- Add logging to capture actual state
- Add breakpoint if interactive debugging available
#### Step 2: Test and Observe
1. Execute code up to checkpoint
2. Verify if bug has manifested:
- State is correct → Bug is in second half
- State is incorrect → Bug is in first half
- Cannot determine → Need better checkpoint
#### Step 3: Narrow Focus
1. Select the half containing the bug
2. Repeat division process
3. Continue until bug location is isolated to:
- Single function
- Few lines of code
- Specific data transformation
### Phase 3: Refinement
#### For Complex Bugs
1. **Multi-dimensional search**: When bug depends on multiple factors
- Apply binary search on each dimension
- Create test matrix for combinations
2. **Time-based search**: For timing/concurrency issues
- Binary search on execution timeline
- Add timestamps to narrow race conditions
3. **Data-based search**: For data-dependent bugs
- Binary search on input size
- Isolate problematic data patterns
### Phase 4: Bug Isolation
Once narrowed to small code section:
1. Analyze the isolated code thoroughly
2. Identify exact failure mechanism
3. Verify bug reproduction in isolation
4. Document minimal reproduction case
## Automated Implementation
### Checkpoint Generation Strategy
```markdown
1. Identify all function boundaries in path
2. Select optimal checkpoint locations:
- Function entry/exit points
- Loop boundaries
- Conditional branches
- Data transformations
3. Insert non-invasive checkpoints:
- Use existing logging if available
- Add temporary assertions
- Leverage existing test infrastructure
```
### Search Optimization
- Start with coarse-grained divisions (module/class level)
- Progressively move to fine-grained (function/line level)
- Skip obviously correct sections based on static analysis
- Prioritize high-probability areas based on:
- Recent changes
- Historical bug density
- Code complexity metrics
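The divide-and-test loop above can be expressed directly as code when the failing path is a linear pipeline of steps. The sketch below assumes the corruption persists once introduced (monotonic failure); the steps and invariant are hypothetical:

```python
def first_bad_step(steps, state, invariant):
    """Return the index of the first step after which invariant(state) fails."""
    lo, hi = 0, len(steps)            # invariant is known to hold before step lo
    while lo < hi:
        mid = (lo + hi) // 2
        s = state
        for step in steps[: mid + 1]: # replay up to and including the checkpoint
            s = step(s)
        if invariant(s):
            lo = mid + 1              # wolf is in the second half
        else:
            hi = mid                  # wolf is in the first half
    return lo

steps = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 100, lambda x: x + 3]
print(first_bad_step(steps, 5, lambda x: x >= 0))   # 2: the "- 100" step
```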
## Output Format
````markdown
# Wolf Fence Debug Analysis
## Search Summary
**Initial Scope:** [entry point] → [exit point]
**Final Location:** [specific file:line]
**Iterations Required:** [number]
**Time to Isolate:** [duration]
## Search Path
### Iteration 1
- **Search Space:** [full range]
- **Checkpoint:** [location]
- **Result:** Bug in [first/second] half
- **Evidence:** [what was observed]
### Iteration 2
- **Search Space:** [narrowed range]
- **Checkpoint:** [location]
- **Result:** Bug in [first/second] half
- **Evidence:** [what was observed]
[Continue for all iterations...]
## Bug Location
**File:** [path]
**Function:** [name]
**Lines:** [range]
**Description:** [what the bug is]
## Minimal Reproduction
```[language]
// Minimal code to reproduce
[code snippet]
```
## Root Cause
[Brief explanation of why bug occurs]
## Recommended Fix
[Suggested solution]
## Verification Points
- [ ] Bug reproducible at isolated location
- [ ] Fix resolves issue at checkpoint
- [ ] No regression in other checkpoints
````
## Completion Criteria
- [ ] Search space properly bounded
- [ ] Binary search completed
- [ ] Bug location isolated
- [ ] Minimal reproduction created
- [ ] Root cause identified
- [ ] Fix recommendation provided

View File

@ -0,0 +1,234 @@
# <!-- Powered by BMAD™ Core -->
template:
id: debug-report-template-v1
name: Debug Analysis Report
version: 1.0
output:
format: markdown
filename: docs/debug/debug-report-{{timestamp}}.md
title: "Debug Analysis Report - {{problem_title}}"
workflow:
mode: rapid
elicitation: false
sections:
- id: header
title: Report Header
instruction: Generate report header with metadata
sections:
- id: metadata
title: Report Metadata
type: key-value
instruction: |
Report ID: DBG-{{timestamp}}
Date: {{current_date}}
Analyst: Debug Agent (Diana)
Severity: {{severity_level}}
Status: {{status}}
- id: executive-summary
title: Executive Summary
instruction: Provide concise summary under 200 words
sections:
- id: problem
title: Problem
type: text
instruction: 1-2 sentence problem statement
- id: impact
title: Impact
type: text
instruction: Quantified impact on users/system
- id: root-cause
title: Root Cause
type: text
instruction: Brief root cause description
- id: solution
title: Solution
type: text
instruction: High-level fix description
- id: metrics
title: Key Metrics
type: key-value
instruction: |
Effort Required: {{effort_estimate}}
Risk Level: {{risk_level}}
- id: problem-description
title: Problem Description
instruction: Detailed problem analysis
sections:
- id: symptoms
title: Symptoms
type: paragraphs
instruction: Detailed symptoms observed
- id: reproduction
title: Reproduction
type: numbered-list
instruction: Step-by-step reproduction steps with expected vs actual
- id: environment
title: Environment
type: bullet-list
instruction: |
- System: {{system_info}}
- Application: {{application_version}}
- Dependencies: {{dependencies_list}}
- Configuration: {{configuration_details}}
- id: timeline
title: Timeline
type: bullet-list
instruction: |
- First Observed: {{first_observed}}
- Frequency: {{occurrence_frequency}}
- Last Occurrence: {{last_occurrence}}
- id: technical-analysis
title: Technical Analysis
instruction: Deep technical investigation results
sections:
- id: root-cause-analysis
title: Root Cause Analysis
type: paragraphs
instruction: Detailed root cause with evidence
- id: code-analysis
title: Code Analysis
type: code-block
instruction: |
[[LLM: Include problematic code snippet with language specified]]
Issue: {{code_issue_description}}
- id: pattern-analysis
title: Pattern Analysis
type: paragraphs
instruction: Any patterns detected in the defect
- id: test-coverage
title: Test Coverage
type: bullet-list
instruction: |
- Current Coverage: {{coverage_percentage}}
- Gap Identified: {{coverage_gaps}}
- Risk Areas: {{untested_areas}}
- id: impact-assessment
title: Impact Assessment
instruction: Comprehensive impact analysis
sections:
- id: severity-matrix
title: Severity Matrix
type: table
columns: [Aspect, Impact, Severity]
instruction: |
[[LLM: Create table with Users Affected, Data Integrity, Performance, Security aspects]]
- id: business-impact
title: Business Impact
type: paragraphs
instruction: Business consequences of the issue
- id: solution-recommendations
title: Solution & Recommendations
instruction: Fix proposals and prevention strategies
sections:
- id: immediate-fix
title: Immediate Fix
type: code-block
instruction: |
[[LLM: Include corrected code with validation steps]]
- id: short-term
title: Short-term Improvements
type: bullet-list
instruction: Improvements for this sprint
- id: long-term
title: Long-term Prevention
type: bullet-list
instruction: Strategic prevention measures
- id: process
title: Process Enhancements
type: bullet-list
instruction: Process improvements to prevent recurrence
- id: implementation-plan
title: Implementation Plan
instruction: Phased approach to resolution
sections:
- id: phase1
title: "Phase 1: Immediate (0-2 days)"
type: checkbox-list
instruction: Critical fixes to apply immediately
- id: phase2
title: "Phase 2: Short-term (1 week)"
type: checkbox-list
instruction: Short-term improvements
- id: phase3
title: "Phase 3: Long-term (1 month)"
type: checkbox-list
instruction: Long-term strategic changes
- id: verification-testing
title: Verification & Testing
instruction: Validation strategy
sections:
- id: test-cases
title: Test Cases
type: numbered-list
instruction: Specific test cases to validate the fix
- id: regression
title: Regression Testing
type: paragraphs
instruction: Areas requiring regression testing
- id: monitoring
title: Monitoring
type: bullet-list
instruction: Metrics to monitor post-fix
- id: lessons-learned
title: Lessons Learned
instruction: Knowledge capture for prevention
sections:
- id: what-went-wrong
title: What Went Wrong
type: paragraphs
instruction: Root causes beyond the code
- id: improvements
title: What Could Improve
type: bullet-list
instruction: Process and tool improvements
- id: knowledge-sharing
title: Knowledge Sharing
type: bullet-list
instruction: Information to share with team
- id: appendices
title: Appendices
instruction: Supporting documentation
optional: true
sections:
- id: stack-traces
title: "Appendix A: Full Stack Traces"
type: code-block
instruction: Complete error traces if available
- id: logs
title: "Appendix B: Log Excerpts"
type: code-block
instruction: Relevant log entries
- id: metrics
title: "Appendix C: Performance Metrics"
type: paragraphs
instruction: Before/after performance data
- id: related
title: "Appendix D: Related Issues"
type: bullet-list
instruction: Links to similar problems
- id: references
title: "Appendix E: References"
type: bullet-list
instruction: Documentation, articles, tools used
- id: footer
title: Report Footer
instruction: Closing metadata
sections:
- id: timestamps
title: Report Timestamps
type: key-value
instruction: |
Report Generated: {{generation_timestamp}}
Next Review: {{follow_up_date}}

View File

@ -0,0 +1,339 @@
# <!-- Powered by BMAD™ Core -->
template:
id: defect-analysis-template-v1
name: Defect Analysis Report
version: 1.0
output:
format: markdown
filename: docs/debug/defect-{{defect_id}}.md
title: "Defect Analysis Report - DEF-{{defect_id}}"
workflow:
mode: rapid
elicitation: false
sections:
- id: header
title: Report Header
instruction: Generate report header with metadata
sections:
- id: metadata
title: Report Metadata
type: key-value
instruction: |
Defect ID: DEF-{{defect_id}}
Date: {{current_date}}
Analyst: {{analyst_name}}
Component: {{affected_component}}
- id: classification
title: Defect Classification
instruction: Categorize and classify the defect
sections:
- id: basic-info
title: Basic Information
type: key-value
instruction: |
Type: {{defect_type}}
Severity: {{severity_level}}
Priority: {{priority_level}}
Status: {{current_status}}
Environment: {{environment}}
- id: categorization
title: Categorization
type: key-value
instruction: |
Category: {{defect_category}}
Subcategory: {{defect_subcategory}}
Root Cause Type: {{root_cause_type}}
Detection Method: {{how_detected}}
- id: description
title: Defect Description
instruction: Comprehensive defect details
sections:
- id: summary
title: Summary
type: text
instruction: Brief one-line defect summary
- id: detailed
title: Detailed Description
type: paragraphs
instruction: Complete description of the defect and its behavior
- id: expected
title: Expected Behavior
type: paragraphs
instruction: What should happen under normal conditions
- id: actual
title: Actual Behavior
type: paragraphs
instruction: What actually happens when the defect occurs
- id: delta
title: Delta Analysis
type: paragraphs
instruction: Analysis of the difference between expected and actual
- id: reproduction
title: Reproduction
instruction: How to reproduce the defect
sections:
- id: prerequisites
title: Prerequisites
type: bullet-list
instruction: Required setup, data, or conditions before reproduction
- id: steps
title: Steps to Reproduce
type: numbered-list
instruction: Exact steps to trigger the defect
- id: frequency
title: Frequency
type: key-value
instruction: |
Reproducibility: {{reproducibility_rate}}
Occurrence Pattern: {{occurrence_pattern}}
Triggers: {{trigger_conditions}}
- id: technical-analysis
title: Technical Analysis
instruction: Deep technical investigation
sections:
- id: location
title: Code Location
type: key-value
instruction: |
File: {{file_path}}
Function/Method: {{function_name}}
Line Numbers: {{line_numbers}}
Module: {{module_name}}
- id: code
title: Code Snippet
type: code-block
instruction: |
[[LLM: Include the defective code with proper syntax highlighting]]
- id: mechanism
title: Defect Mechanism
type: paragraphs
instruction: Detailed explanation of how the defect works
- id: data-flow
title: Data Flow Analysis
type: paragraphs
instruction: How data flows through the defective code
- id: control-flow
title: Control Flow Analysis
type: paragraphs
instruction: Control flow issues contributing to the defect
- id: impact-assessment
title: Impact Assessment
instruction: Comprehensive impact analysis
sections:
- id: user-impact
title: User Impact
type: key-value
instruction: |
Affected Users: {{users_affected}}
User Experience: {{ux_impact}}
Workaround Available: {{workaround_exists}}
Workaround Description: {{workaround_details}}
- id: system-impact
title: System Impact
type: key-value
instruction: |
Performance: {{performance_impact}}
Stability: {{stability_impact}}
Security: {{security_impact}}
Data Integrity: {{data_impact}}
- id: business-impact
title: Business Impact
type: key-value
instruction: |
Revenue Impact: {{revenue_impact}}
Reputation Risk: {{reputation_risk}}
Compliance Issues: {{compliance_impact}}
SLA Violations: {{sla_impact}}
- id: root-cause
title: Root Cause
instruction: Root cause identification
sections:
- id: immediate
title: Immediate Cause
type: paragraphs
instruction: The direct cause of the defect
- id: underlying
title: Underlying Cause
type: paragraphs
instruction: The deeper systemic cause
- id: contributing
title: Contributing Factors
type: bullet-list
instruction: Factors that contributed to the defect
- id: prevention-failure
title: Prevention Failure
type: paragraphs
instruction: Why existing processes didn't prevent this defect
- id: fix-analysis
title: Fix Analysis
instruction: Solution proposals
sections:
- id: proposed-fix
title: Proposed Fix
type: code-block
instruction: |
[[LLM: Include the corrected code with proper syntax highlighting]]
- id: explanation
title: Fix Explanation
type: paragraphs
instruction: Detailed explanation of how the fix works
- id: alternatives
title: Alternative Solutions
type: numbered-list
instruction: Other possible solutions with pros/cons
- id: tradeoffs
title: Trade-offs
type: bullet-list
instruction: Trade-offs of the chosen solution
- id: testing-strategy
title: Testing Strategy
instruction: Comprehensive test plan
sections:
- id: unit-tests
title: Unit Tests Required
type: checkbox-list
instruction: Unit tests to validate the fix
- id: integration-tests
title: Integration Tests Required
type: checkbox-list
instruction: Integration tests needed
- id: regression-tests
title: Regression Tests Required
type: checkbox-list
instruction: Regression tests to ensure no breaks
- id: edge-cases
title: Edge Cases to Test
type: bullet-list
instruction: Edge cases that must be tested
- id: performance-tests
title: Performance Tests
type: bullet-list
instruction: Performance tests if applicable
- id: risk-assessment
title: Risk Assessment
instruction: Fix implementation risks
sections:
- id: fix-risk
title: Fix Risk
type: key-value
instruction: |
Implementation Risk: {{implementation_risk}}
Regression Risk: {{regression_risk}}
Side Effects: {{potential_side_effects}}
- id: mitigation
title: Mitigation Strategy
type: paragraphs
instruction: How to mitigate identified risks
- id: rollback
title: Rollback Plan
type: numbered-list
instruction: Steps to rollback if fix causes issues
- id: quality-metrics
title: Quality Metrics
instruction: Defect and code quality metrics
sections:
- id: defect-metrics
title: Defect Metrics
type: key-value
instruction: |
Escape Stage: {{escape_stage}}
Detection Time: {{time_to_detect}}
Fix Time: {{time_to_fix}}
Test Coverage Before: {{coverage_before}}
Test Coverage After: {{coverage_after}}
- id: code-metrics
title: Code Quality Metrics
type: key-value
instruction: |
Cyclomatic Complexity: {{complexity_score}}
Code Duplication: {{duplication_percentage}}
Technical Debt: {{tech_debt_impact}}
- id: prevention-strategy
title: Prevention Strategy
instruction: How to prevent similar defects
sections:
- id: immediate-prevention
title: Immediate Prevention
type: bullet-list
instruction: Quick wins to prevent recurrence
- id: longterm-prevention
title: Long-term Prevention
type: bullet-list
instruction: Strategic prevention measures
- id: process-improvements
title: Process Improvements
type: bullet-list
instruction: Process changes to prevent similar defects
- id: tool-enhancements
title: Tool Enhancements
type: bullet-list
instruction: Tool improvements needed
- id: related-information
title: Related Information
instruction: Additional context
optional: true
sections:
- id: similar-defects
title: Similar Defects
type: bullet-list
instruction: Links to similar defects in the system
- id: related-issues
title: Related Issues
type: bullet-list
instruction: Related tickets or issues
- id: dependencies
title: Dependencies
type: bullet-list
instruction: Dependencies affected by this defect
- id: documentation
title: Documentation Updates Required
type: bullet-list
instruction: Documentation that needs updating
- id: action-items
title: Action Items
instruction: Tasks and assignments
sections:
- id: actions-table
title: Action Items Table
type: table
columns: [Action, Owner, Due Date, Status]
instruction: |
[[LLM: Create table with specific action items, owners, and dates]]
- id: approval
title: Approval & Sign-off
instruction: Review and approval tracking
sections:
- id: signoff
title: Sign-off
type: key-value
instruction: |
Developer: {{developer_name}} - {{developer_date}}
QA: {{qa_name}} - {{qa_date}}
Lead: {{lead_name}} - {{lead_date}}
- id: footer
title: Report Footer
instruction: Closing metadata
sections:
- id: timestamps
title: Report Timestamps
type: key-value
instruction: |
Report Generated: {{generation_timestamp}}
Last Updated: {{last_update}}

View File

@ -0,0 +1,268 @@
# <!-- Powered by BMAD™ Core -->
template:
id: root-cause-template-v1
name: Root Cause Analysis
version: 1.0
output:
format: markdown
filename: docs/debug/rca-{{timestamp}}.md
title: "Root Cause Analysis: {{problem_title}}"
workflow:
mode: rapid
elicitation: false
sections:
- id: header
title: Analysis Header
instruction: Generate analysis header with metadata
sections:
- id: metadata
title: Analysis Metadata
type: key-value
instruction: |
Analysis ID: RCA-{{timestamp}}
Date: {{current_date}}
Analyst: {{analyst_name}}
Method: Fishbone (Ishikawa) Diagram + 5-Whys
- id: problem-statement
title: Problem Statement
instruction: Clear problem definition
sections:
- id: what
title: What
type: text
instruction: Clear description of the problem
- id: when
title: When
type: text
instruction: Timing and frequency of occurrence
- id: where
title: Where
type: text
instruction: Location/component affected
- id: impact
title: Impact
type: text
instruction: Quantified impact on system/users
- id: fishbone-analysis
title: Fishbone Analysis
instruction: |
[[LLM: Create ASCII fishbone diagram showing all 6 categories branching into the problem]]
sections:
- id: diagram
title: Fishbone Diagram
type: code-block
instruction: |
```
{{problem_title}}
|
People ---------------+--------------- Process
\ | /
\ | /
\ | /
\ | /
\ | /
Technology ----------+---------- Environment
\ | /
\ | /
\ | /
\ | /
\ | /
Data -----------+----------- Methods
```
- id: people
title: People (Developer/User Factors)
type: bullet-list
instruction: Knowledge gaps, communication issues, training needs, user behavior
- id: process
title: Process (Development/Deployment)
type: bullet-list
instruction: Development process, deployment procedures, code review, testing
- id: technology
title: Technology (Tools/Infrastructure)
type: bullet-list
instruction: Framework limitations, library issues, tool configurations, infrastructure
- id: environment
title: Environment (System/Configuration)
type: bullet-list
instruction: Environment differences, resource constraints, external dependencies
- id: data
title: Data (Input/State)
type: bullet-list
instruction: Input validation, data integrity, state management, race conditions
- id: methods
title: Methods (Algorithms/Design)
type: bullet-list
instruction: Algorithm correctness, design patterns, architecture decisions
- id: five-whys
title: 5-Whys Analysis
instruction: Deep dive to root cause
sections:
- id: symptom
title: Primary Symptom
type: text
instruction: Starting point for analysis
- id: why1
title: "1. Why?"
type: text
instruction: First level cause
- id: why2
title: "2. Why?"
type: text
instruction: Second level cause
- id: why3
title: "3. Why?"
type: text
instruction: Third level cause
- id: why4
title: "4. Why?"
type: text
instruction: Fourth level cause
- id: why5
title: "5. Why?"
type: text
instruction: Fifth level cause (root cause)
- id: root-cause
title: Root Cause
type: text
instruction: Final identified root cause
- id: evidence-validation
title: Evidence & Validation
instruction: Support for conclusions
sections:
- id: evidence
title: Supporting Evidence
type: bullet-list
instruction: List all evidence supporting the root cause conclusion
- id: verification
title: Verification Method
type: paragraphs
instruction: How to verify this is the true root cause
- id: confidence
title: Confidence Level
type: key-value
instruction: |
Rating: {{confidence_rating}}
Justification: {{confidence_justification}}
- id: root-cause-summary
title: Root Cause Summary
instruction: Consolidated findings
sections:
- id: primary
title: Primary Root Cause
type: key-value
instruction: |
Cause: {{primary_root_cause}}
Category: {{cause_category}}
Evidence: {{primary_evidence}}
- id: contributing
title: Contributing Factors
type: numbered-list
instruction: Secondary factors that contributed to the problem
- id: eliminated
title: Eliminated Possibilities
type: numbered-list
instruction: Potential causes that were ruled out and why
- id: impact-analysis
title: Impact Analysis
instruction: Scope and consequences
sections:
- id: direct
title: Direct Impact
type: paragraphs
instruction: Immediate consequences of the problem
- id: indirect
title: Indirect Impact
type: paragraphs
instruction: Secondary effects and ripple impacts
- id: recurrence
title: Risk of Recurrence
type: key-value
instruction: |
Probability: {{recurrence_probability}}
Without intervention: {{risk_without_fix}}
- id: recommendations
title: Recommended Actions
instruction: Solutions and prevention
sections:
- id: immediate
title: Immediate Actions
type: checkbox-list
instruction: Actions to take right now to address the issue
- id: short-term
title: Short-term Solutions
type: checkbox-list
instruction: Solutions to implement within the current sprint
- id: long-term
title: Long-term Prevention
type: checkbox-list
instruction: Strategic changes to prevent recurrence
- id: process-improvements
title: Process Improvements
type: bullet-list
instruction: Process changes to prevent similar issues
- id: implementation-priority
title: Implementation Priority
instruction: Action prioritization
sections:
- id: priority-matrix
title: Priority Matrix
type: table
columns: [Action, Priority, Effort, Impact, Timeline]
instruction: |
[[LLM: Create prioritized action table with High/Medium/Low ratings]]
- id: verification-plan
title: Verification Plan
instruction: Ensuring fix effectiveness
sections:
- id: success-criteria
title: Success Criteria
type: bullet-list
instruction: How we'll know the root cause is addressed
- id: validation-steps
title: Validation Steps
type: numbered-list
instruction: Steps to validate the fix works
- id: monitoring-metrics
title: Monitoring Metrics
type: bullet-list
instruction: Metrics to track to ensure problem doesn't recur
- id: lessons-learned
title: Lessons Learned
instruction: Knowledge capture
sections:
- id: insights
title: Key Insights
type: bullet-list
instruction: What we learned from this analysis
- id: prevention
title: Prevention Strategies
type: bullet-list
instruction: How to prevent similar issues in the future
- id: knowledge-transfer
title: Knowledge Transfer
type: bullet-list
instruction: Information to share with the team
- id: footer
title: Analysis Footer
instruction: Closing information
sections:
- id: completion
title: Completion Details
type: key-value
instruction: |
Analysis Completed: {{completion_timestamp}}
Review Date: {{review_date}}
Owner: {{action_owner}}