---
description: General rules for Python data analysis and manipulation, emphasizing pandas, numpy, and vectorized operations.
globs: **/*.py
---
- Write concise, technical responses with accurate Python examples.
- Prioritize readability and reproducibility in data analysis workflows.
- Use functional programming where appropriate; avoid unnecessary classes.
- Prefer vectorized operations over explicit loops for better performance.
- Use descriptive variable names that reflect the data they contain.
- Follow PEP 8 style guidelines for Python code.
- Use pandas for data manipulation and analysis.
- Prefer method chaining for data transformations when possible.
- Use loc and iloc for explicit data selection.
- Utilize groupby operations for efficient data aggregation.
- Implement data quality checks at the beginning of analysis.
- Handle missing data appropriately (imputation, removal, or flagging).
- Use try-except blocks for error-prone operations, especially when reading external data.
- Validate data types and ranges to ensure data integrity.
- Use vectorized operations in pandas and numpy for improved performance.
- Utilize efficient data structures (e.g., categorical data types for low-cardinality string columns).
- Profile code to identify and optimize bottlenecks.
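
A minimal sketch of how several of the rules above might combine in one workflow: guarded reading of external data, up-front quality checks, flagging and imputing missing values, categorical dtypes, vectorized arithmetic, `loc` selection, method chaining, and `groupby` aggregation. The column names (`region`, `units`, `unit_price`), the helper names (`load_orders`, `check_quality`, `summarize_revenue`), and the sample data are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd


def load_orders(csv_path: str) -> pd.DataFrame:
    """Read external data inside try-except (hypothetical helper)."""
    try:
        return pd.read_csv(csv_path)
    except (FileNotFoundError, pd.errors.EmptyDataError, pd.errors.ParserError) as error:
        raise RuntimeError(f"could not read {csv_path}") from error


def check_quality(orders: pd.DataFrame) -> pd.DataFrame:
    """Fail fast on missing columns, wrong dtypes, or out-of-range values."""
    required_columns = {"region", "units", "unit_price"}  # assumed schema
    missing_columns = required_columns - set(orders.columns)
    if missing_columns:
        raise ValueError(f"missing columns: {sorted(missing_columns)}")
    if not pd.api.types.is_numeric_dtype(orders["unit_price"]):
        raise TypeError("unit_price must be numeric")
    if (orders["unit_price"].dropna() < 0).any():
        raise ValueError("unit_price must be non-negative")
    return orders


def summarize_revenue(orders: pd.DataFrame) -> pd.DataFrame:
    """Total revenue and order count per region, written as one method chain."""
    return (
        orders
        .pipe(check_quality)  # quality checks run before any transformation
        .assign(
            # Flag missing units first, then impute them with 0.
            units_missing=lambda df: df["units"].isna(),
            units=lambda df: df["units"].fillna(0),
            # Low-cardinality string column stored as a categorical.
            region=lambda df: df["region"].astype("category"),
            # Vectorized arithmetic instead of a row-by-row loop.
            revenue=lambda df: df["units"] * df["unit_price"],
        )
        # Explicit row and column selection with loc.
        .loc[lambda df: df["revenue"] > 0, ["region", "revenue"]]
        .groupby("region", observed=True)
        .agg(total_revenue=("revenue", "sum"), order_count=("revenue", "size"))
        .reset_index()
    )


# Illustrative sample data, not taken from any real dataset.
orders = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south", "east"],
        "units": [2, np.nan, 1, 3, 0],
        "unit_price": [10.0, 5.0, 4.0, 5.0, 8.0],
    }
)
summary = summarize_revenue(orders)
```

With this sample, `summary` has one row per region that still has positive revenue after imputation (`north` and `south`). Passing `observed=True` to `groupby` keeps the unused `east` category out of the result.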