Refactoring of the MLIP modules and other non-urgent things #280
Now I need to adjust the fit_kwargs handling in the docs.
The PR is ready to be reviewed; I just need to make the last adjustments to the docs.
Co-authored-by: Aakash Ashok Naik <[email protected]>
…ed to write the file
…d to atomate2 ForceFieldMakers
I would say that this PR is ready to be merged if there are no other suggestions for improvement, @JaGeo 😃
@@ -160,24 +163,25 @@ In a similar way, the M3GNet fit hyperparameters can be passed using `make` as w
 ```python
 complete_flow = CompleteDFTvsMLBenchmarkWorkflow(
-    ml_models=["M3GNet"], ...,
+    ml_models=["M3GNet"], ..., apply_data_preprocessing=True,
What preprocessing happens and would the whole fit work without it?
Data preprocessing filters out data points with large forces. That filtering is only implemented for GAP, but the step also divides the dataset into the different data types (phonon, rattled). This means that, most of the time, nothing happens during this step.
But does the workflow run without it?
The fit won't automatically work without it, because this step also splits the data into training and test sets; without it, one would need to provide these files manually.
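To make the three things discussed in this thread concrete (force filtering, grouping by data type, train/test split), here is a minimal sketch in plain Python. It is not the project's implementation: the record layout, the function names `max_force` and `preprocess`, and the parameters `force_max`, `test_ratio` and `seed` are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical record layout (an assumption, not the project's data format):
# each data point is a dict with a "data_type" label and a list of
# per-atom force vectors (fx, fy, fz).


def max_force(point):
    """Return the largest per-atom force magnitude in one data point."""
    return max((fx * fx + fy * fy + fz * fz) ** 0.5 for fx, fy, fz in point["forces"])


def preprocess(points, force_max=40.0, test_ratio=0.1, seed=0):
    """Filter by force, group by data type, and split into train/test lists."""
    # 1. Drop data points whose largest force exceeds the assumed threshold.
    kept = [p for p in points if max_force(p) <= force_max]

    # 2. Group the remaining points by data type (e.g. "phonon", "rattled").
    by_type = defaultdict(list)
    for p in kept:
        by_type[p["data_type"]].append(p)

    # 3. Shuffle reproducibly, then split into train and test sets.
    rng = random.Random(seed)
    shuffled = kept[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return {
        "train": shuffled[n_test:],
        "test": shuffled[:n_test],
        "by_type": dict(by_type),
    }
```

In this sketch, skipping the step would leave a fit without any train and test sets, which matches the point above that those files would then have to be supplied separately.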
I want to refactor the code and restructure some Makers and the MLIP modules, check the VASP Maker and custodian settings, and improve the unit test setup (e.g. importing atomate2's mock_vasp directly and matching the unit tests to the src structure).
This PR might take a while and will not change any of the code's current functionality or features.
I am just collecting all issues that don't alter the functionality or features of the code under the umbrella of this PR. Tasks might have to be split up into several smaller PRs over time.
current ToDos:
- `from __future__ import annotations` from all files
- `rand_stuc` labels and suffixes to `rattled` to distinguish it from the RSS part
- `benchmark_kwargs={"relax_maker_kwargs": {"relax_cell": False, "relax_kwargs": "other_kwargs"}}`

final ToDo: