# Root Cause Analysis Workflow Focused root cause analysis using fishbone (Ishikawa) methodology and 5-Whys technique. ## Context This workflow performs systematic root cause analysis to identify the underlying causes of defects, moving beyond symptoms to address fundamental issues. ## Prerequisites - Clear symptom or problem statement - Access to system logs and error messages - Knowledge of recent changes or deployments - Understanding of system architecture - Access to code repository ## Instructions Execute the following steps in order: ### Step 1: Problem Definition 1. Ask user for problem/symptom description if not already provided 2. Clearly state the problem/symptom 3. Define when it occurs (timing, frequency, patterns) 4. Define where it occurs (component, environment, scope) 5. Quantify the impact (users affected, severity level P0-P3) 6. Document problem statement clearly ### Step 2: Fishbone Analysis Categories Load `root-cause-checklist.md` and analyze the problem across these six dimensions: #### People (Developer/User factors) - Knowledge gaps or misunderstandings - Communication breakdowns between teams - Incorrect assumptions made during development - User behavior patterns contributing to issue - Training or documentation gaps #### Process (Development/Deployment) - Missing validation or review steps - Inadequate testing coverage - Deployment procedures not followed - Code review gaps or oversights - Documentation process failures #### Technology (Tools/Infrastructure) - Framework limitations or bugs - Library bugs or incompatibilities - Infrastructure issues or constraints - Tool configuration problems - Version compatibility issues #### Environment (System/Configuration) - Environment-specific settings or differences - Resource constraints (memory, CPU, disk) - External dependencies or integrations - Network or connectivity issues - Configuration drift between environments #### Data (Input/State) - Invalid or unexpected input data - Data corruption or inconsistency - State management issues - Race conditions or timing problems - Data validation gaps #### Methods (Algorithms/Design) - Algorithm flaws or inefficiencies - Design pattern misuse - Architecture limitations - Performance bottlenecks - Logic errors in implementation ### Step 3: 5-Whys Deep Dive For each potential cause identified in Step 2: 1. Ask "Why does this happen?" - Document the answer 2. For each answer, ask "Why?" again - Go deeper 3. Continue asking "Why?" until reaching the root cause (typically 5 iterations) 4. Document the complete chain of causation 5. Ensure the final "Why" points to something actionable Example: - Problem: Application crashes - Why? Memory leak - Why? Objects not released - Why? Event listeners not removed - Why? Component unmount not handled - Why? Framework lifecycle not understood → ROOT CAUSE ### Step 4: Evidence Collection For each identified root cause: - Gather supporting evidence (logs, stack traces, metrics, code examples) - Verify through reproduction or testing - Rule out alternative explanations - Cross-reference with `common-defects.md` patterns - Establish confidence level (High/Medium/Low) - Document evidence trail clearly ### Step 5: Root Cause Prioritization Rank root causes using this matrix: | Root Cause | Likelihood | Impact | Effort | Risk | Priority | | ---------- | ---------- | ------ | ------ | ---- | -------- | Scoring criteria: - **Likelihood:** Probability this is the true cause (High/Medium/Low) - **Impact:** Severity if this is the cause (Critical/High/Medium/Low) - **Effort:** Complexity to address (High/Medium/Low) - **Risk:** Potential for recurrence if not fixed (High/Medium/Low) - **Priority:** Overall priority (P0-P3) ### Step 6: Recommendations For the highest priority root causes: 1. Propose specific fixes to address root cause 2. Suggest preventive measures to avoid recurrence 3. Recommend process improvements 4. Identify monitoring or instrumentation needs 5. Create validation plan for proposed fixes ## Output Requirements Generate a structured root cause analysis document containing: - Analysis metadata (ID, date, analyst, method) - Problem statement (what, when, where, impact) - Fishbone diagram (ASCII or description) - Analysis for all 6 categories (People, Process, Technology, Environment, Data, Methods) - 5-Whys analysis for top causes - Evidence supporting each root cause - Prioritization matrix - Recommendations for fixes and prevention - Next steps and validation plan ## Completion Criteria - [ ] Problem clearly defined with quantified impact - [ ] All 6 fishbone categories analyzed - [ ] 5-Whys performed for top causes - [ ] Evidence collected and documented - [ ] Root causes prioritized by likelihood and impact - [ ] Recommendations provided with validation plan - [ ] Root cause analysis report generated and saved ## Elicitation Points Problem/Symptom: Ask user to describe: - What is the observable problem or symptom? - When does it occur (always, intermittently, specific conditions)? - Where does it occur (which environment, component, user segment)? - What is the impact (how many users, severity, business impact)? Additional Context: Ask if available: - Recent changes or deployments - Error messages or logs - Reproduction steps - Related issues or patterns - User reports or feedback