-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ProjwfcParser
: refactor code and temp retrieve PDOS files
#749
base: main
Are you sure you want to change the base?
Conversation
4040d49
to
581bd57
Compare
Note: the Python 3.6 tests fail because More importantly, the tests succeed for 581bd57 and Python >= 3.7, without touching the tests. I've also run both the old parser and refactored one for the 4 different spin settings, and compared the content of the orbitals and PDOS arrays. So I'm quite confident that the parsing is unaltered. |
7edefbe
to
521e34b
Compare
The `generate_calc_job_node` fixture has a `retrieve_temporary` input that allows the user to specify a list of files in the fixture test folder that should be copied to the temporary retrieved folder (as defined by the `_retrieve_temporary_list` class attribute in the corresponding calculation job). Here we extend this functionality to allow the user to specify glob patterns in the `retrieve_temporary` input. Second, we also adapt the code to use the `pathlib` module for handling paths. While doing so, we also revert a change in 0874d95 that introduced a try-except to handle the case where the XML is not in the fixture test folder. This is no longer necessary, because the `glob` call will only return files that are actually present in the fixture test folder.
521e34b
to
d7d344a
Compare
@Minotakm @t-reents @AndresOrtegaGuerrero I still have to add the Lowdin parsing, but other than that the PR is ready for review. A line-by-line code review might be quite a bit of work, I'm mainly asking you to test the new parser class and see if it still has any issues. E.g. I think you mentioned the parsing for some cases (spinorbit?) was incorrect. |
Thanks @mbercx I will give it a spin and see how it goes. |
The `ProjwfcParser` relied on passing a single `out_info_dict` dictionary between several functions to store the parsed information. This made the parser code so entangled that it was almost unreadable and difficult to trace down bugs. Here we refactor the class to properly isolate the various parsing functions as class methods, and make the code more readable in general. The parent class is also changed to the `BaseParser` one, so it can rely on the basic stdout parsing this class provides. Another substantial change is that the `.pdos` files, which can be many and are parsed completely, are now only temporarily retrieved instead of stored in the AiiDA repository in the `retrieved` folder.
d7d344a
to
bec5f23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mbercx, I went through it and it looks nice! Definitely easier to understand what is going on, compared to the last version.
As you mentioned, there was still an issue, namely the proper parsing of the total DOS #1041. I added the fix as suggestions (let me know if I should open a PR to your branch instead) and quickly tested them. @AndresOrtegaGuerrero, please have a look as well.
@t-reents thanks a lot for the review! Feel free to add a commit here with your changes. If possible it would be great if:
This PR will then have 4 commits to merge in the end. I think that is the most clear for future developers. |
thank you @mbercx I will test it once the new commits are merged then :) |
Currently, the `ProjwfcParser` does not correctly parse the total DOS data in the case of spin-polarised calculations. Only the first column is parsed, which corresponds to spin-up density only. This issue passed the tests unnoticed since the regression tests only check the attributes of the `Dos` output, not the actual content. Here we adapt the `ProjwfcParser` to treat the spin-polarised case `nspin = 2` case correctly, constructing an `XyData` node that has two `y` arrays, one for each spin. The tests for the spin-polarised case are also adapted to add the `XyData` contents to the regression files.
55e1253
to
121d1bb
Compare
energy, dos, pdos_array = self._parse_pdos_files(retrieved_temporary_folder, nspin, spinorbit, logs) | ||
energy, dos_node, pdos_array = self._parse_pdos_files(retrieved_temporary_folder, nspin, spinorbit, logs) | ||
|
||
self.out('Dos', dos_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be inclined to change this output name to dos
(all lowercase), but that is technically backwards-incompatible. However, for the spin-polarised case, by fixing the bug we are also adding an extra array to the XyData
output, so we are already breaking backwards compatibility in a sense? Of course changing the output link name would also break backwards compatibility for the non-spin-polarised case.
Thoughts @t-reents @sphuber @Minotakm @AndresOrtegaGuerrero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much is the impact on previous calculations , not much right ?
('states/eV', 'states/eV') | ||
) | ||
else: | ||
dos_node.set_y(pdostot_array[:, 1], 'Dos', 'states/eV') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment as above: I'd use lowercase dos
for the array name here.
Massive refactoring of the
ProjwfcParser
to make the code less entangledand more readable.
TODO: