Data Science Project Checklist

Problem Definition & Data Requirements

  • Business problem is clearly defined
  • Success metrics are established
  • Required data sources are identified
  • Data access methods are established
  • Privacy and compliance requirements are identified
  • Project timeline and resources are planned

Data Collection & Exploration

  • Data collection pipeline is established
  • Data quality assessment is completed
  • Exploratory data analysis is performed
  • Data distributions and relationships are understood
  • Missing data strategy is defined
  • Outlier handling approach is determined
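
As an illustration of the data quality, missing data, and outlier items above, here is a minimal sketch using only the Python standard library. The record layout, the column names (`age`, `income`), and the choice of Tukey's IQR rule with `k=1.5` are assumptions made for the example; the checklist itself does not prescribe a method.

```python
from statistics import quantiles

def missing_counts(rows, columns):
    """Count missing (None) values per column in a list of dict records."""
    return {col: sum(1 for r in rows if r.get(col) is None) for col in columns}

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    clean = [v for v in values if v is not None]
    # method="inclusive" treats the sample as containing its own min and max
    q1, _, q3 = quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < low or v > high]

# Hypothetical records, invented for illustration only
rows = [
    {"age": 31, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 29, "income": 51000},
    {"age": 35, "income": None},
    {"age": 33, "income": 50000},
    {"age": 30, "income": 950000},
]

print(missing_counts(rows, ["age", "income"]))
print(iqr_outliers([r["income"] for r in rows]))
```

Flagging a value is separate from deciding what to do with it; whether to drop, cap, or keep a flagged point is the "handling approach" the checklist asks for.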

Feature Engineering & Preprocessing

  • Feature selection/creation strategy is defined
  • Data transformations are implemented
  • Feature scaling/normalization is applied where needed
  • Categorical encoding is implemented appropriately
  • Data splitting strategy (train/validation/test) is defined
  • Data preprocessing pipeline is reproducible
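
One way to read the splitting, scaling, and reproducibility items together: fix the random seed for the split, and fit scaling statistics on the training portion only so that nothing leaks from validation or test data. The sketch below assumes a 70/15/15 split and z-score standardization; the function names and fractions are illustrative, not part of the checklist.

```python
import random
from statistics import mean, pstdev

def split_indices(n, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle indices with a fixed seed, then cut into train/val/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # local RNG: global state is untouched
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

def fit_standardizer(train_values):
    """Learn mean and standard deviation from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against constant features
    return mu, sigma

def standardize(values, mu, sigma):
    """Apply previously learned statistics to any split."""
    return [(v - mu) / sigma for v in values]

# Hypothetical single feature, invented for illustration
feature = [float(i) for i in range(100)]
train, val, test = split_indices(len(feature))
mu, sigma = fit_standardizer([feature[i] for i in train])
train_scaled = standardize([feature[i] for i in train], mu, sigma)
test_scaled = standardize([feature[i] for i in test], mu, sigma)
```

Calling `split_indices` again with the same `seed` returns the same partition, which is what makes the preprocessing step repeatable across runs.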

Model Development

  • Appropriate algorithms are selected for the problem
  • Baseline models are established
  • Hyperparameter tuning strategy is defined
  • Model evaluation metrics are appropriate for the problem
  • Cross-validation approach is implemented
  • Model interpretability requirements are met
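
The baseline and cross-validation items can be sketched together: a majority-class predictor scored with k-fold cross-validation gives a floor that any real model should beat. Everything here (the `fit_majority` helper, contiguous unshuffled folds, accuracy as the metric, the toy labels) is an assumption for illustration; in practice shuffled or stratified folds and a problem-specific metric are usually preferable.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def fit_majority(train_features, train_labels):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common

def cross_val_accuracy(features, labels, fit, k=5):
    """Return per-fold accuracy for a fit(features, labels) -> predict function."""
    scores = []
    for train, val in k_fold_indices(len(labels), k):
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(features[i]) == labels[i] for i in val)
        scores.append(correct / len(val))
    return scores

# Hypothetical toy data, invented for illustration
features = [[i] for i in range(10)]
labels = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
scores = cross_val_accuracy(features, labels, fit_majority, k=5)
print(scores, sum(scores) / len(scores))
```

Any candidate model can be passed in place of `fit_majority` as long as it follows the same `fit(features, labels) -> predict` shape, which keeps the comparison against the baseline like-for-like.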

Model Validation & Testing

  • Models are evaluated on holdout data
  • Performance meets business requirements
  • Model generalization is assessed
  • Model bias and fairness are evaluated
  • Model limitations are documented
  • A/B testing plan is defined (if applicable)
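
For the bias and fairness item, one commonly used check is the demographic parity difference: the gap between the highest and lowest positive-prediction rates across groups. The sketch below is only one of several possible fairness measures, and the group labels and predictions are made up for the example.

```python
def selection_rates(predictions, groups):
    """Fraction of positive (== 1) predictions within each group."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rate between any two groups (0 = parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())

# Hypothetical holdout predictions and group membership
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(predictions, groups))
print(demographic_parity_difference(predictions, groups))
```

What size of gap is acceptable is a business and compliance decision, so the threshold belongs with the documented model limitations rather than in the code.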

Deployment & Monitoring

  • Model deployment approach is defined
  • Model versioning is implemented
  • Inference performance is acceptable
  • Monitoring for model drift is established
  • Retraining strategy is defined
  • Feedback loop for model improvement is established
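
As one possible way to monitor drift, the Population Stability Index (PSI) compares the binned distribution of a feature or score at training time against what is seen in production. The bin edges, the `eps` floor for empty bins, and the sample data below are assumptions for the sketch; a frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift, but that cutoff should be validated for each use case.

```python
import math

def bin_proportions(values, edges, eps=1e-6):
    """Share of values in each bin defined by sorted edges; floored at eps."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        # Bin index = number of edges the value has reached or passed
        counts[sum(v >= e for e in edges)] += 1
    return [max(c / len(values), eps) for c in counts]

def population_stability_index(reference, live, edges, eps=1e-6):
    """PSI = sum over bins of (q - p) * ln(q / p)."""
    p = bin_proportions(reference, edges, eps)
    q = bin_proportions(live, edges, eps)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Hypothetical score samples: training-time reference vs. shifted live data
reference = [i / 100 for i in range(100)]      # spread over [0, 1)
live = [0.5 + i / 200 for i in range(100)]     # concentrated in [0.5, 1)
edges = [0.25, 0.5, 0.75]

print(population_stability_index(reference, reference, edges))
print(population_stability_index(reference, live, edges))
```

A scheduled job that computes this per feature and alerts past a chosen threshold is one simple trigger for the retraining strategy listed above.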