154 lines
5.6 KiB
Markdown
154 lines
5.6 KiB
Markdown
# BMAD-Spec Orchestrator Error Handling Guide
|
|
|
|
## Error Classification
|
|
|
|
### Critical Errors (Stop Execution)
|
|
- **Missing Core Files**: Required agent prompts, templates, or workflows not found
|
|
- **Invalid Configuration**: Core configuration files are corrupted or missing required fields
|
|
- **Security Violations**: Attempts to access unauthorized resources or execute dangerous operations
|
|
- **Data Corruption**: Session state or artifact corruption that prevents safe continuation
|
|
|
|
### Warning Errors (Continue with Caution)
|
|
- **Missing Optional Files**: Non-essential templates or tasks are unavailable
|
|
- **Partial Data**: Some context information is missing but workflow can continue
|
|
- **Version Mismatches**: Compatible but non-optimal versions of dependencies
|
|
- **Performance Issues**: Operations are slow but functional
|
|
|
|
### Recoverable Errors (Retry/Fix)
|
|
- **User Input Errors**: Invalid or incomplete user responses
|
|
- **Template Variable Errors**: Missing or incorrect template placeholder values
|
|
- **Agent Output Errors**: Malformed or incomplete agent responses
|
|
- **File Access Errors**: Temporary issues accessing files or resources
|
|
|
|
## Error Response Procedures
|
|
|
|
### For Critical Errors:
|
|
1. **Immediate Stop**: Halt all execution immediately
|
|
2. **Clear Error Message**: Explain exactly what went wrong and why it's critical
|
|
3. **Recovery Instructions**: Provide specific steps to fix the issue
|
|
4. **Safe State**: Ensure system is in a safe, recoverable state
|
|
5. **User Notification**: Inform user of the issue and required actions
|
|
|
|
### For Warning Errors:
|
|
1. **Continue Operation**: Allow workflow to proceed with limitations
|
|
2. **Log Warning**: Record the issue for later review
|
|
3. **User Notification**: Inform user of the limitation and potential impacts
|
|
4. **Workaround Suggestion**: Provide alternative approaches if available
|
|
5. **Monitor Progress**: Watch for related issues as workflow continues
|
|
|
|
### For Recoverable Errors:
|
|
1. **Identify Cause**: Determine the specific reason for the failure
|
|
2. **Suggest Fix**: Provide clear instructions for correction
|
|
3. **Allow Retry**: Enable user to retry after making corrections
|
|
4. **Track Attempts**: Monitor retry attempts to prevent infinite loops
|
|
5. **Escalate if Needed**: Convert to warning or critical if retries fail
|
|
|
|
## Specific Error Scenarios
|
|
|
|
### Agent Activation Failures
|
|
```
|
|
ERROR: Cannot load agent prompt for 'analyst'
|
|
CAUSE: File '.claude/agents/analyst/prompt.md' not found
|
|
RECOVERY:
|
|
1. Check if the file exists in the correct location
|
|
2. Verify file permissions allow reading
|
|
3. Ensure the agent name is spelled correctly
|
|
4. Create missing agent files if necessary
|
|
```
|
|
|
|
### Template Processing Errors
|
|
```
|
|
ERROR: Missing template variable '{{project_name}}'
|
|
CAUSE: Required variable not provided by user or previous agent
|
|
RECOVERY:
|
|
1. Request missing variable from user
|
|
2. Check if variable should come from previous step
|
|
3. Verify template syntax is correct
|
|
4. Provide default value if appropriate
|
|
```
|
|
|
|
### Workflow State Errors
|
|
```
|
|
ERROR: Cannot proceed to step 3, step 2 not completed
|
|
CAUSE: Previous workflow step did not complete successfully
|
|
RECOVERY:
|
|
1. Review step 2 completion status
|
|
2. Complete any missing requirements from step 2
|
|
3. Validate step 2 outputs meet requirements
|
|
4. Update workflow state accordingly
|
|
```
|
|
|
|
### File System Errors
|
|
```
|
|
ERROR: Cannot write to '.claude/context/artifacts/prd.md'
|
|
CAUSE: File system permissions or disk space issues
|
|
RECOVERY:
|
|
1. Check available disk space
|
|
2. Verify write permissions for the directory
|
|
3. Ensure no file locks are preventing writes
|
|
4. Try alternative file location if needed
|
|
```
|
|
|
|
## Error Logging and Reporting
|
|
|
|
### Log Entry Format:
|
|
```
|
|
TIMESTAMP: 2024-01-01 12:34:56
|
|
LEVEL: ERROR|WARNING|INFO
|
|
COMPONENT: Agent|Workflow|Template|System
|
|
ERROR_CODE: BMAD_001
|
|
MESSAGE: Detailed error description
|
|
CONTEXT: Relevant system state information
|
|
USER_ACTION: Required user action (if any)
|
|
RESOLUTION: How the error was resolved
|
|
```
|
|
|
|
### Error Codes:
|
|
- **BMAD_001-099**: Agent-related errors
|
|
- **BMAD_100-199**: Workflow execution errors
|
|
- **BMAD_200-299**: Template processing errors
|
|
- **BMAD_300-399**: File system and I/O errors
|
|
- **BMAD_400-499**: User input and validation errors
|
|
- **BMAD_500-599**: System and configuration errors
|
|
|
|
## Recovery Strategies
|
|
|
|
### State Recovery:
|
|
- Maintain regular checkpoints at each workflow step
|
|
- Store partial results in recoverable format
|
|
- Enable resumption from any valid checkpoint
|
|
- Validate state integrity before continuing
|
|
|
|
### Data Recovery:
|
|
- Backup critical artifacts before modifications
|
|
- Maintain version history for all documents
|
|
- Enable rollback to previous valid state
|
|
- Verify data integrity after recovery
|
|
|
|
### User Communication:
|
|
- Provide clear, non-technical error explanations
|
|
- Offer specific actionable steps for resolution
|
|
- Indicate estimated time for fixes when possible
|
|
- Keep users informed of progress during recovery
|
|
|
|
## Prevention Strategies
|
|
|
|
### Input Validation:
|
|
- Validate all user inputs before processing
|
|
- Check file existence before attempting access
|
|
- Verify template completeness before rendering
|
|
- Confirm agent availability before activation
|
|
|
|
### Defensive Programming:
|
|
- Assume external resources may be unavailable
|
|
- Implement graceful degradation for non-critical features
|
|
- Use timeouts for potentially long-running operations
|
|
- Provide meaningful defaults where appropriate
|
|
|
|
### Testing and Monitoring:
|
|
- Test error conditions during development
|
|
- Monitor system health and performance
|
|
- Track error patterns and frequencies
|
|
- Implement automated recovery where possible
|
|
|
|
This error handling framework ensures robust, user-friendly operation of the BMAD-Spec Orchestrator System. |