Refactor steps in blending code #453

Merged: 40 commits into master on Feb 3, 2025

Conversation

@sidekock (Contributor)

See #443

sidekock and others added 30 commits November 18, 2024 11:28
Co-authored-by: mats-knmi <[email protected]>
….velocity_perturbations = [] in __initialize_random_generators

codecov bot commented Jan 20, 2025

Codecov Report

Attention: Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.

Project coverage is 84.31%. Comparing base (a7dae54) to head (5fad664).
Report is 1 commit behind head on master.

Files with missing lines     Patch %   Lines
pysteps/nowcasts/steps.py    72.22%    10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #453      +/-   ##
==========================================
+ Coverage   84.26%   84.31%   +0.05%     
==========================================
  Files         160      160              
  Lines       13067    13251     +184     
==========================================
+ Hits        11011    11173     +162     
- Misses       2056     2078      +22     
Flag         Coverage Δ
unit_tests   84.31% <72.22%> (+0.05%) ⬆️


@sidekock (Contributor Author)

Both visual tests and assert statements give the same conclusion: the output is exactly the same!

@sidekock (Contributor Author)

@dnerini The codecov check is failing, but the only things it is actually failing on are the deepcopys I have done in the code and the timing, which does not seem worth writing tests for as it is a very basic subtraction. Would it be possible to disable the check for these things, or how does this work? I know you were able to do it for the refactoring of the steps nowcasting.
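
For what it's worth, one generic way to exclude such lines, not necessarily how it must be done here, is coverage.py's pragma comment, which its default configuration recognizes. A minimal sketch:

import time

def timed_call(func):
    """Run func and print the elapsed wall-clock time."""
    start = time.time()
    result = func()
    # A trivial subtraction like the one mentioned above can be excluded
    # from the coverage measurement with the standard pragma (hypothetical):
    elapsed = time.time() - start  # pragma: no cover
    print(f"elapsed: {elapsed:.3f} s")
    return result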

@RubenImhoff (Contributor) left a comment


Hi @sidekock and @mats-knmi,

Thanks a lot for your work here! I'm sorry I was not around the past few days to answer some questions and think along. However, this looks great! With the large number of changes, it is becoming difficult to check everything thoroughly, so at least I'm happy to see that it gives exactly the same results. I think a to-do for a new PR could be updating the documentation, but that is probably easier and cleaner once this PR is merged.

To further reduce memory usage, both this array and the ``velocity_models`` array
can be given as float32. They will then be converted to float64 before computations
to minimize loss in precision.
# TODO: compare old and new version of the code, run a benchmark to compare the two
Is this TODO still necessary? I can imagine we'd like to have this done prior to merging this PR.

@sidekock (Contributor Author) commented Jan 24, 2025

The two versions now produce identical output. I just want to make sure this is the case for all configurations, meaning:

  • Single core, single member
  • Single core, multiple members
  • Multiple cores, single member
  • Multiple cores, multiple members (fewer cores than members)
  • Multiple cores, multiple members (more cores than members)

The reason I want to do this is that I discovered that the pysteps tests do not properly handle this (all tests passed in a previous version of the code, but it crashed on issues related to the number of cores and the number of members).
If you can think of other tests, I would love to hear suggestions!
The tests would consist of running both the master and refactored branches and then doing the following checks (see the sketch after this list):

  • DataArray.identical(other): @mats-knmi, do you know if this actually checks the values of the nowcast or only the structure and metadata? It's a bit unclear to me, but since it is very fast, I would assume it does not test the data itself.
  • Depending on the previous answer, I could also check some random fields at random times and members.
  • Lastly, I can do a visual inspection, but if the previous two are fine I do not think this matters; it's a matter of having enough confidence before I actually make this the default code :)
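
Roughly, the sketch I have in mind (the file names, the variable "precip_intensity", and the dimension names "ens_number"/"time" are placeholder assumptions, not the actual test code):

import numpy as np
import xarray as xr

# Hypothetical nowcast outputs written by the master and refactored branches.
ds_master = xr.open_dataset("nowcast_master.nc")
ds_refactor = xr.open_dataset("nowcast_refactor.nc")

# Checks structure, metadata, and (per the xarray docs) element-wise values.
assert ds_master.identical(ds_refactor)

# Spot-check random members and lead times on the precipitation variable.
rng = np.random.default_rng(42)
for _ in range(5):
    member = int(rng.integers(ds_master.sizes["ens_number"]))
    t = int(rng.integers(ds_master.sizes["time"]))
    a = ds_master["precip_intensity"].isel(ens_number=member, time=t)
    b = ds_refactor["precip_intensity"].isel(ens_number=member, time=t)
    np.testing.assert_allclose(a.values, b.values)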


identical should do an element-wise comparison (at least, that is what I gather from the docs). The docs say that identical is the same as equals but also checks metadata, and the docs for equals say that it compares not only the structure but also the contents of the datasets. A very simple test seems to confirm this:

>>> import numpy as np
>>> import xarray as xr
>>> a = xr.Dataset({"x": ("x", np.array([1, 2, 3, 4], dtype=np.float64)), "y": ("x", np.array([4, 3, 2, 1], dtype=np.float64))})
>>> b = xr.Dataset({"x": ("x", np.array([1, 2, 3, 4], dtype=np.float64)), "y": ("x", np.array([4, 3, 2, 1], dtype=np.float64))})
>>> a.identical(b)
True
>>> a.identical(b.assign(y=("x", np.array([4, 3, 2, 2], dtype=np.float64))))
False
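
The metadata part of the distinction can be shown the same way: equals ignores attributes, while identical compares them too (again a toy example of my own, not from the docs):

>>> a = xr.Dataset({"x": ("t", np.arange(3.0))}, attrs={"units": "mm/h"})
>>> b = xr.Dataset({"x": ("t", np.arange(3.0))})
>>> a.equals(b)
True
>>> a.identical(b)
False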

@sidekock (Contributor Author)

Thanks, I should have done such a test myself but totally forgot. I'll do the last checks in the next few days, and then we can close this six-month work-in-progress :)
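
As a side note on the float32 remark quoted at the top of this thread, the usage pattern it describes would look roughly like this (array names and shapes are made up for illustration, not pysteps API calls):

import numpy as np

# Hypothetical inputs: passing the large precipitation and NWP velocity
# arrays as float32 roughly halves their memory footprint.
precip_models = np.random.rand(2, 12, 700, 700).astype(np.float32)
velocity_models = np.random.rand(2, 2, 700, 700).astype(np.float32)

# Per the quoted docstring, the blending code converts back to float64
# before computations to minimize precision loss (illustrative only):
precip_models_f64 = precip_models.astype(np.float64)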

@dnerini (Member) commented Jan 21, 2025

> @dnerini The codecov check is failing, but the only things it is actually failing on are the deepcopys I have done in the code and the timing, which does not seem worth writing tests for as it is a very basic subtraction. Would it be possible to disable the check for these things, or how does this work? I know you were able to do it for the refactoring of the steps nowcasting.

@sidekock we can simply ignore the failing check and merge whenever you feel it's ready

@sidekock (Contributor Author) commented Feb 1, 2025

Results of my tests on Belgian NWP blending:

  • Single core, single member ✔️
  • Single core, multiple members ✔️
  • Multiple cores, single member ✔️
  • Multiple cores, multiple members (fewer cores than members) ✔️
  • Multiple cores, multiple members (more cores than members) ✔️
  • Test with a longer lead time (6 hours at 5 min resolution; the previous ones were all 1 hour) ✔️

All tests seem to pass. I will run all the pysteps tests one more time, and if they all pass, I think it is time to release this into the wild :)

@sidekock (Contributor Author) commented Feb 1, 2025

@RubenImhoff @mats-knmi all tests are still succeeding, and all the extra tests that I have run say the two datasets are identical. It seems we can Squash and merge, or do you think more tests need to be done?

@mats-knmi (Contributor)

Sounds good, I don't have anything else, so squash and merge away!

@RubenImhoff (Contributor)

Same for me, fantastic work! :)

@sidekock (Contributor Author) commented Feb 3, 2025

@dnerini Do I also have to bump the version, or is this something I should leave up to you? Or is this not something that is done for code refactoring?

@mats-knmi (Contributor)

I would like a version bump, even if it is just for the RNG reproducibility change that I merged earlier, since that would help me out with the tests that we have running.

@sidekock (Contributor Author) commented Feb 3, 2025

> I would like a version bump, even if it is just for the RNG reproducibility change that I merged earlier, since that would help me out with the tests that we have running.

I'm making some last changes to the documentation, and then I'll make one final commit before I merge.

@sidekock merged commit 92deda1 into master on Feb 3, 2025 (8 of 9 checks passed)
@dnerini (Member) commented Feb 3, 2025

Fantastic to see this merged! Congratulations on all the hard work you put into this, well done!

@sidekock (Contributor Author) commented Feb 3, 2025

> Fantastic to see this merged! Congratulations on all the hard work you put into this, well done!

@dnerini Thanks! The tests for the docs seem to be failing due to the codecov check:
[screenshot of the failing check]

Also, the version does not seem to have been bumped. I don't know what I did wrong, because I changed the version in the PKG-INFO.

@dnerini (Member) commented Feb 3, 2025

@sidekock the error while compiling the docs was caused by a deprecation in scipy (see #455), while the version also needed to be bumped in setup.py (5ede008).

Now it should be all good. Shall we release v1.14?
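
For reference, the kind of setup.py version field meant here looks roughly like this (illustrative only; the real file has much more metadata):

# setup.py (minimal sketch, not pysteps' actual file)
from setuptools import setup

setup(
    name="pysteps",
    version="1.14.0",  # keep in sync with PKG-INFO when bumping
)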

@sidekock (Contributor Author) commented Feb 3, 2025

Yes, please! Then I can finally cross this one off my to-do list :)
