---
description: Rules for data handling and preprocessing scripts in chemistry ML projects, emphasizing robust pipelines and appropriate techniques for chemical data.
globs: data_processing/**/*.py
---
- Implement robust data loading and preprocessing pipelines.
- Use appropriate techniques for handling chemical data (e.g., molecular fingerprints, SMILES strings).
- Split data with chemical similarity in mind (e.g., scaffold- or cluster-based splits) so that the test set is not filled with near-duplicates of training molecules.
- Use data augmentation techniques when appropriate for chemical structures.
- Utilize efficient data structures for chemical representations.
- Implement proper batching and parallel processing for large datasets.
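
As an illustration of the splitting and batching rules above, here is a minimal standard-library sketch. The names `group_split`, `batched`, and `group_key` are hypothetical and not part of any existing project code. The sketch assumes each record already carries a grouping key (for example a scaffold or cluster label computed with a cheminformatics toolkit) and only shows how whole groups can be kept on one side of the split.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable, Hashable, Iterable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def group_split(
    items: Sequence[T],
    group_key: Callable[[T], Hashable],
    test_fraction: float = 0.2,
) -> tuple[list[T], list[T]]:
    """Split items so that no group appears in both train and test.

    group_key is an assumption of this sketch: it should map a record to a
    label shared by chemically similar molecules (e.g. a scaffold string).
    """
    groups: dict[Hashable, list[T]] = defaultdict(list)
    for item in items:
        groups[group_key(item)].append(item)

    # Visit the largest groups first. A group goes to train if it still fits
    # within the train quota; otherwise the whole group goes to test.
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_quota = (1.0 - test_fraction) * len(items)
    train: list[T] = []
    test: list[T] = []
    for members in ordered:
        if len(train) + len(members) <= train_quota:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


def batched(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield lists of at most batch_size items without loading everything."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: list[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch
```

With records stored as `(smiles, scaffold_label)` pairs, a call such as `group_split(records, group_key=lambda r: r[1])` keeps every scaffold entirely in train or entirely in test. The resulting test fraction is only approximate because groups are never divided. Batches from `batched` can then be handed to a worker pool if parallel featurization is needed.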