BMAD-METHOD/.claude/rules/pandas-scikit-learn-guide-c.../python-data-analysis-genera...

21 lines
1.3 KiB
Plaintext

---
description: General rules for Python data analysis and manipulation, emphasizing pandas, numpy, and vectorized operations.
globs: **/*.py
---
- Write concise, technical responses with accurate Python examples.
- Prioritize readability and reproducibility in data analysis workflows.
- Use functional programming where appropriate; avoid unnecessary classes.
- Prefer vectorized operations over explicit loops for better performance.
- Use descriptive variable names that reflect the data they contain.
- Follow PEP 8 style guidelines for Python code.
- Use pandas for data manipulation and analysis.
- Prefer method chaining for data transformations when possible.
- Use loc and iloc for explicit data selection.
- Utilize groupby operations for efficient data aggregation.
- Implement data quality checks at the beginning of analysis.
- Handle missing data appropriately (imputation, removal, or flagging).
- Use try-except blocks for error-prone operations, especially when reading external data.
- Validate data types and ranges to ensure data integrity.
- Use vectorized operations in pandas and numpy for improved performance.
- Utilize efficient data structures (e.g., categorical data types for low-cardinality string columns).
- Profile code to identify and optimize bottlenecks.