CI with Julia v1.11 #3836
I installed julia 1.11 on the Caltech cluster, but we haven't made a module yet (but it's coming today or so) |
There's only one manifest? |
I assume he's referring to this new feature of Julia 1.11: https://julialang.org/blog/2024/10/julia-1.11-highlights/#manifest_versioning |
The PR adds another Manifest specifically for v1.11 and keeps the older Manifest that works for v1.10. Or we can just have one Manifest (the one for v1.11) and drop the one for v1.10 |
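For reference, the lookup order behind the versioned-Manifest feature can be sketched in a few lines. This is an illustration only, not Pkg's actual implementation: `preferred_manifest` is a hypothetical helper, and the sketch assumes only the `Manifest-v{major}.{minor}.toml` and `Manifest.toml` names matter.

```julia
# Hypothetical helper (not part of Pkg): mimic how Julia >= 1.11 prefers a
# version-specific manifest over the plain Manifest.toml in a project directory.
function preferred_manifest(dir::AbstractString, v::VersionNumber = VERSION)
    plain = joinpath(dir, "Manifest.toml")
    if (v.major, v.minor) >= (1, 11)
        versioned = joinpath(dir, "Manifest-v$(v.major).$(v.minor).toml")
        isfile(versioned) && return versioned
    end
    return isfile(plain) ? plain : nothing
end

# Throwaway directory holding both manifests, as in this PR
dir = mktempdir()
touch(joinpath(dir, "Manifest.toml"))
touch(joinpath(dir, "Manifest-v1.11.toml"))

m111 = preferred_manifest(dir, v"1.11.0")  # what a 1.11 session would pick
m110 = preferred_manifest(dir, v"1.10.5")  # what a 1.10 session would pick
println(basename(m111))
println(basename(m110))
```

With both files present, a 1.10 job (such as the distributed CI) and a 1.11 job can each resolve against their own manifest.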
The Manifest was deleted in #3783 |
oh great! I missed that! |
Deleting it seemed to increase the likelihood that CI passed, although it did not fully solve the problem (note that a few other changes were also made in #3783). |
Noting that internal_tide.jl gives NaN with Julia v1.11 while all is OK with Julia v1.10; something with immersed boundaries....? I'm looking into it. |
I think it's a plotting issue. We are filling up the immersed boundaries with NaN and, apparently, we cannot plot NaNs anymore? The error says: ERROR: LoadError: On worker 2: Looking up a non-finite or NaN value in a colormap is undefined. |
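One quick way to tell a plotting-only problem from a simulation that has actually blown up is to count NaNs outside the masked cells. A minimal sketch on plain arrays follows; `count_unmasked_nans`, `u`, and `immersed` are hypothetical names and made-up data, not Oceananigans API.

```julia
# Hypothetical helper: count NaNs in cells that are NOT masked as immersed.
# A nonzero count means the NaNs come from the solution, not from the mask.
count_unmasked_nans(u::AbstractArray, immersed::AbstractArray{Bool}) =
    count(i -> isnan(u[i]) && !immersed[i], eachindex(u, immersed))

# Made-up data: the last column is "immersed"; one NaN sits in the fluid interior
u = [0.1 0.2 NaN; 0.3 NaN NaN]
immersed = [false false true; false false true]

n_bad = count_unmasked_nans(u, immersed)
println("NaNs outside the immersed region: ", n_bad)
```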
I ran the script and the actual simulation NaN-ed. |
That means Oceananigans isn't compatible with Julia 1.11. Do any other tests catch the issue? We can use this opportunity to add more tests. |
I’m trying to make an MWE
There are regression tests, but not sure if they are with immersed boundary or not. |
There are some tests, but they are quite fragmented, not very exhaustive, and run for only 10 time steps: here for diffusion, here for advection, and here for distributed vs serial. There is scope to add more tests for the immersed boundary grid, especially for a partial cell bottom, which I believe is still not tested. |
Good to add tests though I don't know why 100+ time steps matters specifically? |
For a regression test, 10 should be plenty. |
I can confirm that Julia 1.11 is changing how the immersed boundary (and free surface?) works. Here's the difference (including halos) in the internal tide u field after 1 time step (difference is Julia 1.10 - Julia 1.11 with + = red). Color range is ±0.003 m/s, so it's pretty significant for just 1 time step. Probably time to pull out Debugger.jl and step through a time step to find what causes the difference! |
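A comparison like this can be reduced to a single number and a location, which is easier to track while bisecting. The sketch below assumes the u field from each Julia version has already been loaded into plain arrays of the same size (halos included); `max_abs_difference` is a hypothetical helper and the values are placeholders, not output from the actual runs.

```julia
# Hypothetical helper: largest absolute difference between two fields
# and the index at which it occurs.
function max_abs_difference(a::AbstractArray, b::AbstractArray)
    size(a) == size(b) || throw(DimensionMismatch("fields must have the same size"))
    val, idx = findmax(abs.(a .- b))
    return val, idx
end

# Placeholder data standing in for the u field saved from each Julia version
u_110 = [0.0 0.001; 0.002 0.0]
u_111 = [0.0 0.001; -0.001 0.0]

val, idx = max_abs_difference(u_110, u_111)
println("max |Δu| = ", val, " at ", idx)
```

Reporting the index as well as the magnitude shows whether the largest differences sit next to the immersed boundary, at the free surface, or in the halos.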
Well, Debugger.jl doesn't work in Julia 1.11 either lol. X-Ref: JuliaDebug/Debugger.jl#361 |
That is weird and alarming. |
@wsmoses @vchuravy do you have any guesses how we could possibly observe a numerical difference between julia 1.10 and 1.11? This does not seem to be a syntax issue but rather a numerical calculation issue. Unless there is a latent syntax issue that has changed functionality which I suppose is also possible... |
@ali-ramadhan it might be worth testing the nonhydrostatic model and also problems without immersed boundaries to pinpoint the problem. |
The random number generator doesn’t guarantee the same results even with a seed across versions |
Good point! 👍🏼 Good to keep that in mind! But the differences we see in the internal_tide.jl example shouldn't be due to random number generator. |
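To make the random-number point concrete: seeding only guarantees identical streams within one Julia version. The sketch below checks that within-version property using the `Random` standard library; the cross-version behaviour cannot be demonstrated in a single session, so it is noted in a comment only.

```julia
using Random

# Same seed, same Julia version: the two streams are identical.
a = rand(MersenneTwister(1234), 4)
b = rand(MersenneTwister(1234), 4)
same_within_version = (a == b)
println("identical within this Julia version: ", same_within_version)

# Across Julia versions the numbers produced for a given seed may differ,
# so version-to-version comparisons are safer with deterministic initial
# conditions, e.g. an analytic profile instead of random noise:
x = range(0, 1; length = 5)
u0 = sin.(2π .* x)
```

If random initial conditions are needed for such a comparison, generating them once and saving them to disk is another option.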
One possibility is that syntax changed for something in a subtle way, so the code still runs but some function is being called incorrectly. Not sure what that could be though |
Here is a list of changes: the change to We should check if there are changes only on CPU, or on both CPU and GPU. @ali-ramadhan I'm assuming your test was on CPU. |
Good find. Is there a way to redefine/import |
I don't think we use |
This PR switches the CI to use Julia v1.11.
It also adds a Manifest with a v1.11 ending so that there is still compatibility with previous versions. Note that the distributed CI still does not have Julia v1.11 (right @Sbozzolo?), so Julia v1.10 is used there. This is possible because there are two Manifests.