Task: Create Data Analysis Plan
Description
Create a comprehensive data analysis plan for extracting insights, developing machine learning models, and implementing data pipelines to support project objectives.
Input Required
- Business requirements or problem statement
- Available data sources and descriptions
- Expected outcomes or success criteria
- Technical constraints or limitations
Steps
- Problem Definition
  - Clearly articulate the business problem or opportunity
  - Define specific questions to be answered through analysis
  - Establish success metrics and evaluation criteria
  - Identify stakeholders and their requirements
- Data Assessment
  - Inventory available data sources
  - Assess data quality, completeness, and accessibility
  - Identify data gaps and acquisition needs
  - Evaluate data privacy and compliance requirements
  - Define data sampling strategy if applicable
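The completeness check in the Data Assessment step can be sketched as below. This is only an illustration under stated assumptions: records are held as a list of dictionaries, a value counts as missing when it is None or an empty string, and the column names (customer_id, age, region) are hypothetical.

```python
def profile_completeness(rows, columns):
    """Return the fraction of non-missing values for each column.

    Assumption: a value is "missing" when it is None or an empty string.
    """
    total = len(rows)
    report = {}
    for col in columns:
        present = sum(1 for r in rows if r.get(col) not in (None, ""))
        report[col] = present / total if total else 0.0
    return report


# Hypothetical sample records, used only to illustrate the check.
rows = [
    {"customer_id": 1, "age": 34, "region": "EU"},
    {"customer_id": 2, "age": None, "region": "US"},
    {"customer_id": 3, "age": 51, "region": ""},
    {"customer_id": 4, "age": 29, "region": "EU"},
]

report = profile_completeness(rows, ["customer_id", "age", "region"])
for column, fraction in report.items():
    print(f"{column}: {fraction:.0%} complete")
```

A report like this gives each data source a comparable completeness figure, which helps when deciding which gaps need new data acquisition.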
- Exploratory Analysis Planning
  - Define key variables to explore
  - Plan initial data profiling and visualization
  - Identify potential relationships to investigate
  - Design statistical tests to validate hypotheses
  - Plan for outlier detection and handling
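One possible way to plan outlier detection is the interquartile range (IQR) rule, sketched below. The plan does not prescribe this method; the 1.5 multiplier is the conventional default and the sample values are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, multiplier=1.5):
    """Return values outside [Q1 - m*IQR, Q3 + m*IQR].

    Assumption: the IQR rule is an acceptable first-pass outlier
    definition for this variable; other rules may suit the data better.
    """
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - multiplier * iqr
    upper = q3 + multiplier * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements containing one obvious outlier.
measurements = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(measurements)
print(outliers)
```

Whether flagged values are dropped, capped, or kept should be decided per variable and recorded in the plan, since removal can hide genuine rare events.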
- Feature Engineering Strategy
  - Identify potential features to create
  - Plan transformations and encoding methods
  - Define feature selection approach
  - Document dimensionality reduction techniques if needed
  - Plan feature validation methods
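Two common transformations that a feature engineering strategy might list are one-hot encoding for categorical variables and min-max scaling for numeric ones. The sketch below shows both in plain Python; the function names and sample values are illustrative assumptions, not part of the plan.

```python
def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors.

    Returns the sorted category list and one vector per input value.
    """
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw columns.
categories, region_encoded = one_hot(["EU", "US", "EU"])
age_scaled = min_max_scale([10, 20, 30])
print(categories, region_encoded)
print(age_scaled)
```

In practice the category list and the min and max values would be computed on training data only and reused at inference time, so that both pipelines apply identical transformations.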
- Model Development Strategy
  - Select candidate algorithms based on problem type
  - Define training and validation approach
  - Plan hyperparameter tuning methodology
  - Establish model evaluation metrics
  - Design model interpretability approach
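As one example of a training and validation approach, the sketch below generates k-fold cross-validation splits using only the standard library. It assumes samples are independent and can be shuffled; time-ordered or grouped data would need a different scheme. The function name and the fixed seed are illustrative choices.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds.

    Assumption: samples are exchangeable, so a random shuffle is valid.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(k_fold_indices(n_samples=10, k=5))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(validation)} validation")
```

Each sample appears in exactly one validation fold, so averaging the chosen evaluation metric across folds uses all available data without scoring a model on rows it was trained on.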
- Data Pipeline Architecture
  - Design data ingestion processes
  - Plan data transformation and storage
  - Define model training pipeline
  - Design inference pipeline for production
  - Plan for monitoring and retraining
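Monitoring and retraining could start from a simple drift check such as the one sketched below, which compares the mean of a feature in live data against its training-time reference. The function name, the threshold of 2.0 standard deviations, and the sample values are all assumptions for illustration; a production plan would likely use a more robust drift statistic.

```python
from statistics import mean, stdev


def needs_retraining(reference, live, threshold=2.0):
    """Flag drift when the live mean moves more than `threshold`
    reference standard deviations away from the reference mean.

    Assumption: a mean shift is an adequate drift signal for this feature.
    """
    ref_std = stdev(reference)
    if ref_std == 0:
        # Constant reference: any change in the mean counts as drift.
        return mean(live) != mean(reference)
    shift = abs(mean(live) - mean(reference)) / ref_std
    return shift > threshold


# Hypothetical feature values captured at training time and in production.
reference_values = [10, 11, 9, 10, 10]
stable_live = [10, 10, 11, 9]
drifted_live = [15, 16, 14, 15]

print(needs_retraining(reference_values, stable_live))
print(needs_retraining(reference_values, drifted_live))
```

A check like this would typically run on a schedule per feature, with a positive result triggering investigation or the model training pipeline rather than an automatic deployment.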
- Implementation Roadmap
  - Create phased implementation plan
  - Establish milestones and deliverables
  - Identify required resources and tools
  - Develop timeline aligned with project goals
  - Plan for knowledge transfer and documentation
- Review and Validation
  - Validate plan against business objectives
  - Ensure technical feasibility
  - Confirm alignment with project timeline
  - Verify ethical considerations are addressed
Output
A comprehensive data analysis plan that includes:
- Problem definition and success criteria
- Data assessment and preparation strategy
- Exploratory analysis approach
- Feature engineering plan
- Model development methodology
- Data pipeline architecture
- Implementation roadmap and timeline
Validation Criteria
- Plan addresses the business problem completely
- Data sources and quality issues are thoroughly assessed
- Modeling approach is appropriate for the problem type
- Technical implementation is feasible within constraints
- Ethical considerations are properly addressed
- Plan can be executed within project timeline