# Data Science Project Checklist
## Problem Definition & Data Requirements
- [ ] Business problem is clearly defined
- [ ] Success metrics are established
- [ ] Required data sources are identified
- [ ] Data access methods are established
- [ ] Privacy and compliance requirements are identified
- [ ] Project timeline and resources are planned
## Data Collection & Exploration
- [ ] Data collection pipeline is established
- [ ] Data quality assessment is completed
- [ ] Exploratory data analysis is performed
- [ ] Data distributions and relationships are understood
- [ ] Missing data strategy is defined
- [ ] Outlier handling approach is determined
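
The data quality, missing data, and outlier items above can be backed by small, repeatable checks. The following is a minimal sketch using only the Python standard library. The helper names `missing_rate` and `iqr_outliers`, the list-of-dicts row format, and the 1.5 × IQR fence are illustrative assumptions, not part of this checklist.

```python
from statistics import quantiles


def missing_rate(rows, column):
    """Fraction of rows where `column` is absent or None (hypothetical helper)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a common convention, assumed here; tune it per dataset.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


rows = [{"age": 30}, {"age": None}, {"age": 41}, {}]
age_missing = missing_rate(rows, "age")
flagged = iqr_outliers([10, 12, 11, 13, 12, 11, 95])
```

Flagging a value is not the same as deciding what to do with it; whether to drop, cap, or keep flagged values remains the project decision this section asks for.
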
## Feature Engineering & Preprocessing
- [ ] Feature selection/creation strategy is defined
- [ ] Data transformations are implemented
- [ ] Feature scaling/normalization is applied where needed
- [ ] Categorical encoding is implemented appropriately
- [ ] Data splitting strategy (train/validation/test) is defined
- [ ] Data preprocessing pipeline is reproducible
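
One way to make the splitting and reproducibility items above concrete is to seed the shuffle and fit scaling statistics on the training split only. This is a sketch under stated assumptions: the function names, the 70/15/15 proportions, and the seed value are hypothetical choices for illustration.

```python
import random
from statistics import mean, pstdev


def split_indices(n, val_frac=0.15, test_frac=0.15, seed=42):
    """Return (train, val, test) index lists; the same seed gives the same split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def fit_standardizer(train_values):
    """Learn mean and standard deviation from training data only.

    The returned function applies those fixed statistics to any split,
    so validation and test data do not leak into the scaling step.
    """
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against constant features
    return lambda x: (x - mu) / sigma


train, val, test = split_indices(100)
scale = fit_standardizer([1.0, 2.0, 3.0])
```

Calling `split_indices(100)` twice returns identical splits, which is the property the reproducibility item is checking for.
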
## Model Development
- [ ] Appropriate algorithms are selected for the problem
- [ ] Baseline models are established
- [ ] Hyperparameter tuning strategy is defined
- [ ] Model evaluation metrics are appropriate for the problem
- [ ] Cross-validation approach is implemented
- [ ] Model interpretability requirements are met
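
The baseline and cross-validation items above can be illustrated together. The sketch below scores a majority-class baseline with k-fold cross-validation in plain Python. The names `k_fold_indices` and `majority_baseline_accuracy` are hypothetical, and the folds are contiguous, so the data is assumed to be shuffled beforehand and to contain at least `k` samples.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


def majority_baseline_accuracy(labels, k=5):
    """Mean validation accuracy of always predicting the training majority class."""
    scores = []
    for train, val in k_fold_indices(len(labels), k):
        majority = Counter(labels[i] for i in train).most_common(1)[0][0]
        correct = sum(1 for i in val if labels[i] == majority)
        scores.append(correct / len(val))
    return sum(scores) / len(scores)


labels = [0] * 8 + [1] * 2
baseline = majority_baseline_accuracy(labels, k=5)
```

Any candidate model evaluated on the same folds should beat this number before its added complexity is worth considering.
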
## Model Validation & Testing
- [ ] Models are evaluated on holdout data
- [ ] Performance meets business requirements
- [ ] Model generalization is assessed
- [ ] Model bias and fairness are evaluated
- [ ] Model limitations are documented
- [ ] A/B testing plan is defined (if applicable)
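
For the holdout and bias/fairness items above, one simple starting point is to break a holdout metric down by group and look at the spread. The sketch below does this for accuracy. The function names and the group labels are illustrative assumptions, and an accuracy gap is only one of several possible fairness measures.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Holdout accuracy computed separately for each group label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {g: hits[g] / totals[g] for g in totals}


def max_accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical holdout labels, predictions, and group membership.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max_accuracy_gap(per_group)
```

What counts as an acceptable gap is a business and compliance decision, and it belongs with the documented model limitations.
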
## Deployment & Monitoring
- [ ] Model deployment approach is defined
- [ ] Model versioning is implemented
- [ ] Inference performance is acceptable
- [ ] Monitoring for model drift is established
- [ ] Retraining strategy is defined
- [ ] Feedback loop for model improvement is established
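
Drift monitoring, as listed above, is often implemented by comparing the distribution of a feature or model score in production against a reference sample. The sketch below uses the Population Stability Index (PSI) as one possible statistic. The function name, equal-width binning over the reference range, and the `eps` floor for empty bins are assumptions made for illustration.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Bins are equal-width over the reference range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference

    def proportions(values):
        counts = [0] * bins
        for v in values:
            i = int((v - lo) / width)
            counts[min(max(i, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]

no_drift = psi(reference, reference)
drift = psi(reference, shifted)
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but that threshold is a convention rather than a guarantee; alert levels and the retraining trigger should be set per project.
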