Parsing of Mulliken charges from SIESTA stdout #691
I think it looks very good. The only thing would be to streamline the headers against Voronoi and Hirshfeld, and also to make it work for simpler spin systems.
Thanks for having a look!
Do you mean that the headers should be the same for the three different types of charges?
Good point, I'll look into that and update the implementation.
Thanks for extending sisl!
Yes, in the returned dataframe it would be ideal if they were the same. This should just be checked.
If this proves too hard, just start with this one, finish it up, then the other thing can always be added.
Once you are done with this, I would like to revisit it to make #695 functional for
I just noticed that one of the fields in the dataframe for Voronoi and Hirshfeld is
I also looked at the way the parsing of the Hirshfeld and Voronoi charges is currently being tested for inspiration. It reads an output that contains Voronoi and Hirshfeld charges from the
I see. And as for test, anything you prefer would be fine with me. :)
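The header check requested above could be sketched roughly as follows. This is only an illustration: the function name `header_mismatches` and the column labels are hypothetical and are not taken from sisl.

```python
def header_mismatches(headers_by_method):
    """Compare every method's column labels against the first method's.

    Returns {method: {"missing": [...], "extra": [...]}} for each method
    whose labels differ from the reference (the first key in the mapping).
    """
    methods = list(headers_by_method)
    ref_cols = list(headers_by_method[methods[0]])
    report = {}
    for method in methods[1:]:
        cols = list(headers_by_method[method])
        missing = [c for c in ref_cols if c not in cols]
        extra = [c for c in cols if c not in ref_cols]
        if missing or extra:
            report[method] = {"missing": missing, "extra": extra}
    return report


# Hypothetical labels, for illustration only.
headers = {
    "voronoi": ["e", "Sx", "Sy", "Sz"],
    "hirshfeld": ["e", "Sx", "Sy", "Sz"],
    "mulliken": ["e", "Sx", "Sy"],
}
print(header_mismatches(headers))
```

With pandas dataframes, the labels could be passed as `list(df.columns)` for each charge type.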
I used I looked into the I also started looking into making it work for simpler spin cases (non-polarized and collinear), but that looks more difficult than I thought because apparently the complete layout of the Mulliken charges changes for those spin types :'-) I will try to add some tests and look into the parsing of those cases tomorrow.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #691      +/-   ##
==========================================
+ Coverage   86.68%   86.92%   +0.23%
==========================================
  Files         402      403       +1
  Lines       51883    52509     +626
==========================================
+ Hits        44976    45642     +666
+ Misses       6907     6867      -40
Good, I'll open an issue to try and change this (or the other).
Yes, in terms of the documentation, you don't need to do anything else :)
Ok, thanks, yes, the Mulliken is pretty difficult to parse, as it has many options. It should probably be refactored in Siesta, because it really is a nightmare to parse ;)
I tried adding the parsing for simpler spin types. The parsing did get more complicated and I hope the code is not too convoluted now. I didn't get around to adding the tests yet but I'll try to get to that next week.
src/sisl/io/siesta/stdout.py
atom_idx = []
header = ["e"]
_parse_spin_pol()
elif ret[2] == IDX_POL:
I don't know if it makes sense to use self.info.spin == Spin.POLARIZED instead? The self.info.spin should be populated well before any Mulliken charges are created.
Ah, I didn't know that. If that works that would certainly be preferable. I'll give it a try.
Are you sure self.info.spin gets populated? I tried this but I get:
AttributeError: stdoutSileSiesta.info.spin does not exist, did you mistype?
I've added an additional spin property to the InfoAttr dict at the start of stdoutSileSiesta and now it works (see commit fece834). Was this sort of what you had in mind?
Ha, sorry, I had added that commit just after you branched off, see here:
sisl/src/sisl/io/siesta/stdout.py, line 41 in 13a67c7:
def _parse_spin(attr, match):
So basically we did the same thing, yes, it was exactly how I imagined it. If you could revert your commit, then rebase, or ask me to rebase :)
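The idea discussed in this thread, reading the spin type once and branching on it instead of on the layout of the charge table, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Spin` enum is a stand-in rather than sisl's class, the matched line layout is guessed, and the column labels are hypothetical.

```python
import re
from enum import Enum


class Spin(Enum):
    # Stand-in enumeration for this sketch; not sisl's Spin class.
    UNPOLARIZED = "none"
    POLARIZED = "collinear"
    NONCOLINEAR = "non-collinear"
    SPINORBIT = "spin-orbit"


# Assumed line layout, for illustration only.
_SPIN_RE = re.compile(r"Spin configuration\s*=\s*(\S+)")


def parse_spin(lines):
    """Scan the lines once and return the Spin found, or None if absent."""
    for line in lines:
        match = _SPIN_RE.search(line)
        if match:
            return Spin(match.group(1))
    return None


def charge_header(spin):
    """Choose column labels from the spin type (labels are hypothetical)."""
    if spin is Spin.UNPOLARIZED:
        return ["e"]
    if spin is Spin.POLARIZED:
        return ["e", "Sz"]
    # Non-collinear and spin-orbit cases carry all three spin components.
    return ["e", "Sx", "Sy", "Sz"]


spin = parse_spin(["some header", "redata: Spin configuration = collinear"])
print(spin, charge_header(spin))
```

Deciding on the spin type up front keeps the table parser from having to infer it from the number of columns it encounters.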
I reverted my last commit to remove any changes I made to add the self.info.spin property and then rebased on the latest version of main. Everything should be okay again now :)
I still have to add the tests to finalize this PR but I'll do my very best to allocate some time to do this next week.
Sounds really great!
In case you are not aware, files should go in the https://github.com/zerothi/sisl-files repository, and then the submodule files should be bumped. If you need help, let me know. :)
I didn't get around to this unfortunately and I'll be away next week so I won't be able to finalize this until the first week of April. I hope this is okay.
fece834 to d6f6ca9
No worries, let me know when/if you need feedback! :)
Hi @ahkole, sorry to ping, do you need any help here?
Hi, no need to apologize for pinging. I agree that this should be wrapped up and it was starting to be overshadowed by other projects. I am now trying to add the test cases. I have forked the
Do you know what's causing this and how to fix it?
I may have already figured out what's causing this issue. This
I don't mind either :)
Yeah, you can't do that. If you want to run a specific test, you should do something like this: pytest --pyargs sisl -k stdout_charges
It required some trial and error on how to point the test suite to the correct files (luckily the error messages usefully referred to the use of a
I read the discussion but am not 100% sure I understand what the structure should be. Am I understanding correctly that I should reshape the folder https://github.com/zerothi/sisl-files/tree/main/tests/sisl/io/siesta/outs that contains the current files that are used by the
No, you don't need to reshape the folders (currently).
Is this better? Let me know! 😄
I think so. I'll just give it a try and then let you review.
I've made a first try at adding files for the Mulliken tests, see zerothi/sisl-files#15. Is this close to what you wanted? Or are there now too many files in a single folder and should each file be moved to its own subfolder?
Yes, sub-folder would be nice! :)
Sorry to ping you again, I think your addition would be nice to have in a new release, what needs to be done here?
Hi, yes sorry that this is still open. It would also be nice for myself to finally wrap this up. The files for the tests have been merged into
Sorry, I had missed this. It isn't too important, I am right now reworking the directory structure, so if you retain this in a single directory, it would be ideal, then I will re-arrange accordingly, sorry for the delayed response.
I'm not entirely sure I understand what you mean but I'll describe how I'm doing it right now: In In I am running into some issues with the last test with Mulliken moments at every SCF step. The tests fail for that case so I have to figure out why, I'll look at that in the coming days.
Ok, let me know if you need assistance!
Long, long, long overdue but I have finally taken the time to finalize this PR. I fixed the issue that was causing some of the tests to crash (turns out it wasn't such a smart idea to keep parsing as long as you find new species for NC/SOC if you have more than 1 MD step). I have also synced everything with upstream/main and merged my tests into the new structure of the Can you check if everything is okay now to be merged?
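One way to bound a species loop when the output holds several MD steps is to close a step once a known number of species blocks has been read. The sketch below illustrates that idea on a toy layout; the `Species:` header lines and the two-column data rows are invented for the example and do not reproduce SIESTA's Mulliken output, and the function `split_steps` is hypothetical rather than the fix applied in this PR.

```python
def split_steps(lines, nspecies):
    """Group per-atom charges into steps of exactly `nspecies` species blocks.

    Toy layout (assumed): a "Species: <name>" header followed by rows of
    "<atom index> <charge>". Returns one {atom_index: charge} dict per step.
    """
    steps = []
    current = None
    headers = 0  # species headers seen in the current step
    for line in lines:
        if line.startswith("Species:"):
            # Open a new step at the first header, or once the current
            # step already holds all of its species blocks.
            if current is None or headers == nspecies:
                current = {}
                steps.append(current)
                headers = 0
            headers += 1
        elif current is not None:
            fields = line.split()
            if len(fields) == 2 and fields[0].isdigit():
                current[int(fields[0])] = float(fields[1])
    return steps


lines = [
    "Species: C", "1 4.0", "2 4.1",
    "Species: H", "3 0.9",
    "Species: C", "1 4.2", "2 3.9",
    "Species: H", "3 0.9",
]
print(split_steps(lines, nspecies=2))
```

Counting headers against `nspecies` means a repeated species name marks the start of the next step instead of extending the current one.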
nspecies is what is currently used in Atom, so better stick with that name. Signed-off-by: Nick Papior <[email protected]>
And a long overdue merge, I finalized a few things... Once CI passes, I'll merge! Thanks!
Thanks @ahkole!
isort . and black . [24.2.0] at top-level
docs/
CHANGELOG.md
I was missing the parsing of the Mulliken charges from SIESTA stdout so I started with an implementation of the parsing. Right now it only parses the total charges (so no orbital decomposition) but that is already very useful for me.
The parsing seems to work for the cases that I have tested but I still have to add some more robust tests (and do the other stuff from the todo list above). Therefore, I marked the PR as a draft. Any feedback about whether the implementation is going in the right direction or whether certain things should be handled differently would be much appreciated.