
MDRange not working in some test case #134

Open
pgrete opened this issue Dec 23, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@pgrete
Contributor

pgrete commented Dec 23, 2024

I tried separating the hierarchical flux kernels (testing performance with potentially reduced register usage), see 3aad08b (and related inline PPM 5954a71 and Kokkos math 9d91ebf).

Performance still tbd, but an issue showed up when trying to change the loop pattern.
The default one

    parthenon::par_for(
        DEFAULT_LOOP_PATTERN, "x1 recon", parthenon::DevExecSpace(),
        0, cons_in.GetDim(5) - 1, 0, cons_in.GetDim(4) - 1, kb.s, kb.e, jb.s, jb.e, ib.s - 1, ib.e + 1

works.
The mdrange_tag one does NOT:

    parthenon::par_for(
        parthenon::loop_pattern_mdrange_tag, "x1 recon", parthenon::DevExecSpace(),
        0, cons_in.GetDim(5) - 1, 0, cons_in.GetDim(4) - 1, kb.s, kb.e, jb.s, jb.e, ib.s - 1, ib.e + 1

Moreover, the raw Kokkos one also does NOT work:

    Kokkos::parallel_for("x1 flux KMDRange",
        Kokkos::MDRangePolicy<Kokkos::Rank<5>>(parthenon::DevExecSpace(),
            {0, 0, kb.s, jb.s, ib.s - 1},
            {cons_in.GetDim(5), cons_in.GetDim(4), kb.e + 1, jb.e + 1, ib.e + 2},
            {1, 1, 1, 1, ib.e + 2 - (ib.s - 1)}),

"Not working" here means that the code compiles and runs, but the results are incorrect (indicated by triggering the rho <= 0 fail-safe in ConsToPrim).

Tested on JEDI with:

Currently Loaded Modules:
  1) Stages/2024     (S)   6) CUDA/12            (g)  11) hwloc/2.9.1          (g)  16) PMIx/4.2.6                21) ncurses/.6.4     (H)  26) Szip/.2.1.1      (H)  31) libffi/.3.4.4 (H)
  2) GCCcore/.12.3.0 (H)   7) NVHPC/24.3-CUDA-12 (g)  12) OpenSSL/1.1               17) NCCL/default-CUDA-12 (g)  22) bzip2/.1.0.8     (H)  27) HDF5/1.14.2           32) Python/3.11.3
  3) zlib/.1.2.13    (H)   8) XZ/.5.4.2          (H)  13) libevent/.2.1.12     (H)  18) UCC/default          (g)  23) cURL/8.0.1            28) libreadline/.8.2 (H)  33) Ninja/1.11.1
  4) binutils/.2.40  (H)   9) libxml2/.2.11.4    (H)  14) UCX-settings/RC-CUDA      19) MPI-settings/CUDA         24) libarchive/3.6.2      29) Tcl/8.6.13
  5) numactl/2.0.16       10) libpciaccess/.0.17 (H)  15) UCX/default          (g)  20) OpenMPI/4.1.6        (g)  25) CMake/3.26.3          30) SQLite/.3.42.0   (H)

Right now it is completely unclear what/why/how; just opening this issue to keep track and revisit next year.

@pgrete pgrete added the bug Something isn't working label Dec 23, 2024