I tried separating the hierarchical flux kernels (testing performance with potentially reduced register usage), see 3aad08b (and related inline PPM 5954a71 and Kokkos math 9d91ebf).
Performance still tbd, but an issue showed up when trying to change the loop pattern.
The default one works. The `mdrage_tag` one does NOT. Moreover, the raw Kokkos one also does NOT work.
Not working here means that the code compiles and runs, but the results are incorrect (indicated by triggering the `rho <= 0` fail-safe in `ConsToPrim`).

Tested on JEDI with
Right now it is completely unclear what/why/how/...; just opening this issue to keep track and revisit next year.
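One possible way to narrow this down when revisiting: run the same per-cell computation under two iteration patterns on the host and report the first cell where the outputs differ. The sketch below is plain C++ without Kokkos, and every name in it (`Cell`, `CellKernel`, `RunNested`, `RunFlat`, `FirstMismatch`) is hypothetical and not taken from the code base. It assumes the per-cell result does not depend on iteration order, so an exact comparison is meaningful.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical helper type: a (k, j, i) cell index.
struct Cell {
  int k, j, i;
};

// Stand-in for a per-cell kernel body (assumption: no cross-cell dependence).
using CellKernel = std::function<double(int k, int j, int i)>;

// Reference pattern: nested k/j/i loops writing into a flat array.
std::vector<double> RunNested(int nk, int nj, int ni, const CellKernel &f) {
  std::vector<double> out(static_cast<std::size_t>(nk) * nj * ni);
  for (int k = 0; k < nk; ++k)
    for (int j = 0; j < nj; ++j)
      for (int i = 0; i < ni; ++i)
        out[(static_cast<std::size_t>(k) * nj + j) * ni + i] = f(k, j, i);
  return out;
}

// Alternative pattern: one flat loop, with (k, j, i) recovered from the index.
std::vector<double> RunFlat(int nk, int nj, int ni, const CellKernel &f) {
  const std::size_t snj = static_cast<std::size_t>(nj);
  const std::size_t sni = static_cast<std::size_t>(ni);
  const std::size_t n = static_cast<std::size_t>(nk) * snj * sni;
  std::vector<double> out(n);
  for (std::size_t idx = 0; idx < n; ++idx) {
    const int i = static_cast<int>(idx % sni);
    const int j = static_cast<int>((idx / sni) % snj);
    const int k = static_cast<int>(idx / (sni * snj));
    out[idx] = f(k, j, i);
  }
  return out;
}

// Returns the first cell where a and b differ, or nullopt if they agree.
// Precondition: a and b have the same size and the same (nj, ni) layout.
std::optional<Cell> FirstMismatch(const std::vector<double> &a,
                                  const std::vector<double> &b, int nj, int ni) {
  const std::size_t snj = static_cast<std::size_t>(nj);
  const std::size_t sni = static_cast<std::size_t>(ni);
  for (std::size_t idx = 0; idx < a.size(); ++idx) {
    if (a[idx] != b[idx]) {
      return Cell{static_cast<int>(idx / (sni * snj)),
                  static_cast<int>((idx / sni) % snj),
                  static_cast<int>(idx % sni)};
    }
  }
  return std::nullopt;
}
```

Knowing the first differing cell (for example, whether it sits at a block boundary or in the interior) might help distinguish an index-mapping problem from a problem in the kernel body itself.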