diff --git a/CHANGELOG.md b/CHANGELOG.md index 181a58b2698c..051e8cc3fe66 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,8 @@ ## Current develop ### Added (new features/APIs/variables/...) +- [[PR 911]](https://github.com/parthenon-hpc-lab/parthenon/pull/911) Add infrastructure for geometric multi-grid +- [[PR 971]](https://github.com/parthenon-hpc-lab/parthenon/pull/971) Add UserWorkBeforeLoop - [[PR 907]](https://github.com/parthenon-hpc-lab/parthenon/pull/907) PEP1: Allow subclassing StateDescriptor - [[PR 932]](https://github.com/parthenon-hpc-lab/parthenon/pull/932) Add GetOrAddFlag to metadata - [[PR 931]](https://github.com/parthenon-hpc-lab/parthenon/pull/931) Allow SparsePacks with subsets of blocks diff --git a/doc/sphinx/src/boundary_communication.rst b/doc/sphinx/src/boundary_communication.rst index c83858727a7f..3df126de6f02 100644 --- a/doc/sphinx/src/boundary_communication.rst +++ b/doc/sphinx/src/boundary_communication.rst @@ -1,3 +1,5 @@ +.. _boundary_communication: + Boundary communication-in-one concepts ====================================== @@ -22,20 +24,60 @@ array of all the kinds of caches needed for the various kinds of boundary operations performed. We note that this infrastructure is more general than just ghost halos. -The same machinery could, for example, be used to prolongate or restrict -an entire meshblock. +The same machinery is used for communicating the interiors of meshblocks +that are restricted and/or prolongated between geometric multi-grid levels. +Additionally, with the ownership model for non-face fields, the basic +communication infrastructure could deal with flux correction as well +(although currently flux correction uses a somewhat separate code path). Buffer subsets -------------- Sometimes it is desirable, for example for custom prolongation and -restriction, to loop over a subset of the ghost sub-halos, rather than +restriction or to communicate on a single geometric multi-grid level, +to loop over a subset of the ghost sub-halos, rather than all halos at once. This is enabled by the ``buffer_subsets`` and ``buffer_subsets_h`` arrays, which are contained in ``BvarsSubCache_t``. The ``buffer_subsets`` array is a matrix, where the rows index the subset, and the columns point to the indices in the ``bnd_info`` array containing the subset of sub-halos you wish to operate on. +To communicate across a particular boundary type, the templated +boundary communication routines (see :boundary_comm_tasks:`boundary_comm_tasks`.) +should be instantiated with the desired ``BoundaryType``, i.e. + +.. code:: cpp + SendBoundBufs(md); + +The different ``BoundaryType``s are: + +- ``any``: Communications are performed between all leaf blocks (i.e. the + standard Parthenon grid that does not include multi-grid related blocks). +- ``local``: Communications are performed between all leaf blocks that + are on the current rank. *Currently, this option should not be used + because there are possibly associated bugs. This and nonlocal would + only be used as a potential performance enhancement, calling both + should result in the same communication as just calling + BoundaryType::any.* +- ``nonlocal``: Communications are performed between all leaf blocks that + are on different ranks than the current rank. *Currently, this option + should not be used because there are possibly associated bugs.* +- ``flxcor_send`` and ``flxcor_recv``: Used only for flux correction + routines, currently cannot be passed to regular boundary communication + routines. +- ``gmg_same``: Communicates ghost halos between blocks in the + same geometric multi-grid level. +- ``gmg_restrict_send`` and ``gmg_restrict_recv``: For restricting + block interiors between geometric multi-grid levels, i.e. inter-grid + communication. *It is probably not necessary to have separate + communicators for sending and receiving, but for now this is the way + if was written* +- ``gmg_prolongate_send`` and ``gmg_prolongate_recv``: For prolongating + block interiors between geometric multi-grid levels, i.e. inter-grid + communication. *It is probably not necessary to have separate + communicators for sending and receiving, but for now this is the way + if was written* + .. _sparse boundary comm: Sparse boundary communication @@ -327,6 +369,8 @@ generalizes to more realistic problems not being run with all ranks on the same node. See ``InitializeBufferCache(...)`` for how to choose the ordering.* +.. _boundary_comm_tasks: + Boundary Communication Tasks ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -426,3 +470,7 @@ cacheing if desired. - ``LoadAndSendFluxCorrections(std::shared_ptr>&)`` - ``ReceiveFluxCorrections(std::shared_ptr>&)`` - ``SetFluxCorrections(std::shared_ptr>&)`` + +*Now that non-cell-centered fields are implemented in Parthenon, the +flux correction tasks can be unified with the boundary communication +above.* diff --git a/doc/sphinx/src/interface/state.rst b/doc/sphinx/src/interface/state.rst index b2a91fa8e4cc..0841e8c5a1c7 100644 --- a/doc/sphinx/src/interface/state.rst +++ b/doc/sphinx/src/interface/state.rst @@ -112,6 +112,13 @@ several useful features and functions. deletgates to the ``std::function`` member ``PostStepDiagnosticsMesh`` if set (defaults to ``nullptr`` an therefore a no-op) to print diagnostics after the time-integration advance +- ``void UserWorkBeforeLoopMesh(Mesh *, ParameterInput *pin, SimTime + &tm)`` performs a per-package, mesh-wide calculation after the mesh + has been generated, and problem generators called, but before any + time evolution. This work is done both on first initialization and + on restart. If you would like to avoid doing the work upon restart, + you can check for the const ``is_restart`` member field of the ``Mesh`` + object. The reasoning for providing ``FillDerived*`` and ``EstimateTimestep*`` function pointers appropriate for usage with both ``MeshData`` and diff --git a/doc/sphinx/src/mesh/mesh.rst b/doc/sphinx/src/mesh/mesh.rst index a13b003079d7..1c046bc14c87 100644 --- a/doc/sphinx/src/mesh/mesh.rst +++ b/doc/sphinx/src/mesh/mesh.rst @@ -50,3 +50,64 @@ member. time-integration advance. The default behavior calls to each package's (StateDesrcriptor's) ``PreStepDiagnostics`` method which, in turn, delegates to a ``std::function`` member that defaults to a no-op. + +Multi-grid Grids Stored in ``Mesh`` +----------------------------------- + +If the parameter ``parthenon/mesh/multigrid`` is set to ``true``, the ``Mesh`` +constructor and AMR routines populate both +``std::vector Mesh::gmg_grid_locs`` and +``std::vector gmg_block_lists``, where each entry into the vectors +describes one level of the of the geometric multi-grid (GMG) mesh. For refined +meshes, each GMG level only includes blocks that are at a given logical level +(starting from the finest logical level on the grid and including both internal +and leaf nodes in the refinement tree) as well as leaf blocks on the next coarser +level that are neighbors to finer blocks, which implies that below the root grid +level the blocks may not cover the entire mesh. For levels above the root grid, +blocks may change shape so that they only cover the domain of the root grid. Note +that leaf blocks may be contained in multiple blocklists, and the lists all point +to the same block (not a separate copy). To be explicit, when ``parthenon/mesh/multigrid`` is set to ``true`` blocks corresponding to *all* internal nodes of the refinement tree are created, in addition to the leaf node blocks that are normally created. + +*GMG Implementation Note:* +The reason for including two levels in the GMG block lists is for dealing with +accurately setting the boundary values of the fine blocks. Convergence can be poor +or non-exististent if the fine-coarse boundaries of a given level are not +self-consistently updated (since the boundary prolongation from the coarse grid to +the fine grid also depends on interior values of the fine grid that are being updated +by a smoothing operation). This means that each smoothing step, boundary communication +must occur between all blocks corresponding to all internal and leaf nodes at a given +level in the tree and with all leaf nodes at the next coarser level which abut blocks +at the current level. Therefore, the GMG block lists contain blocks at two levels, but +smoothing operations etc. should usually only occur on the subset of those blocks that +are at the fine level. + +To work with these GMG levels, ``MeshData`` objects containing these blocks can +be recovered from a ``Mesh`` pointer using + +.. code:: c++ + auto &md = pmesh->gmg_mesh_data[level].GetOrAdd(level, "base", partition_idx); + +This ``MeshData`` will include blocks at the current level and possibly some +blocks at the next coarser level. Often, one will only want to operate on blocks +on the finer level (the coarser blocks are required mainly for boundary +communication). To make packs containing only a subset of blocks from a +GMG ``MeshData`` pointer ``md``, one would use + +.. code:: c++ + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + for (int b = 0; b < nblocks; ++b) + include_block[b] = + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get(), include_block); + +In addition to creating the ``LogicalLocation`` and block lists for the GMG levels, +``Mesh`` fills neigbor arrays in ``MeshBlock`` for intra- and inter-GMG block list +communication (i.e. boundary communication and internal prolongation/restriction, +respectively). Communication within and between GMG levels can be done by calling +boundary communication routines with the boundary tags ``gmg_same``, +``gmg_restrict_send``, ``gmg_restrict_recv``, ``gmg_prolongate_send``, +``gmg_prolongate_recv`` (see :boundary_communication:`boundary_communication`). + diff --git a/doc/sphinx/src/solvers.rst b/doc/sphinx/src/solvers.rst index 3eb83eb156fb..6377ac117b4a 100644 --- a/doc/sphinx/src/solvers.rst +++ b/doc/sphinx/src/solvers.rst @@ -3,9 +3,20 @@ Solvers ======= -Parthenon does not yet provide an exhaustive set of solvers. Currently, -a few basic building blocks are provided and we hope to develop more -capability in the future. +Parthenon does not yet provide an exhaustive set of plug and play solvers. +Nevertheless, the building blocks required for implementing Krylov subspace +methods (i.e. global reductions for vector dot products) like CG, BiCGStab, +and GMRES are available. An example of a Parthenon based implementation of +BiCGStab can be found in ``examples/poisson_gmg``. Additionally, the +infrastructure required for implementing multigrid solvers is also +included in Parthenon. The requisite hierarchy of grids is produced if +``parthenon/mesh/multigrid=true`` is set in the parameter input. An example +of a multi-grid based linear solver in Parthenon is also given in +``examples/poisson_gmg`` (and also an example of using multi-grid as a +preconditioner for BiCGStab). We plan to build wrappers that simplify the +use of these methods in down stream codes in the future. Note that the +example code does not currently rely on the Stencil and SparseMatrixAccessor +code described below. Stencil ------- diff --git a/example/CMakeLists.txt b/example/CMakeLists.txt index eb01d46e51bb..2e4229ca57b5 100644 --- a/example/CMakeLists.txt +++ b/example/CMakeLists.txt @@ -19,4 +19,5 @@ add_subdirectory(particles) add_subdirectory(particle_leapfrog) add_subdirectory(particle_tracers) add_subdirectory(poisson) +add_subdirectory(poisson_gmg) add_subdirectory(sparse_advection) diff --git a/example/advection/advection_package.cpp b/example/advection/advection_package.cpp index f618de1758c3..49200d0da87c 100644 --- a/example/advection/advection_package.cpp +++ b/example/advection/advection_package.cpp @@ -1,5 +1,5 @@ //======================================================================================== -// (C) (or copyright) 2020-2021. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -13,12 +13,14 @@ #include #include +#include #include #include #include #include #include +#include #include #include "advection_package.hpp" @@ -215,10 +217,19 @@ std::shared_ptr Initialize(ParameterInput *pin) { } pkg->CheckRefinementBlock = CheckRefinement; pkg->EstimateTimestepBlock = EstimateTimestepBlock; + pkg->UserWorkBeforeLoopMesh = AdvectionGreetings; return pkg; } +void AdvectionGreetings(Mesh *pmesh, ParameterInput *pin, parthenon::SimTime &tm) { + if (parthenon::Globals::my_rank == 0) { + std::cout << "Hello from the advection package in the advection example!\n" + << "This run is a restart: " << pmesh->is_restart << "\n" + << std::endl; + } +} + AmrTag CheckRefinement(MeshBlockData *rc) { // refine on advected, for example. could also be a derived quantity auto pmb = rc->GetBlockPointer(); diff --git a/example/advection/advection_package.hpp b/example/advection/advection_package.hpp index 7b529308f793..85d1d6371e5a 100644 --- a/example/advection/advection_package.hpp +++ b/example/advection/advection_package.hpp @@ -1,5 +1,5 @@ //======================================================================================== -// (C) (or copyright) 2020. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -21,6 +21,7 @@ namespace advection_package { using namespace parthenon::package::prelude; std::shared_ptr Initialize(ParameterInput *pin); +void AdvectionGreetings(Mesh *pmesh, ParameterInput *pin, parthenon::SimTime &tm); AmrTag CheckRefinement(MeshBlockData *rc); void PreFill(MeshBlockData *rc); void SquareIt(MeshBlockData *rc); diff --git a/example/poisson_gmg/CMakeLists.txt b/example/poisson_gmg/CMakeLists.txt new file mode 100644 index 000000000000..99c235986db2 --- /dev/null +++ b/example/poisson_gmg/CMakeLists.txt @@ -0,0 +1,27 @@ +#========================================================================================= +# (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +# +# This program was produced under U.S. Government contract 89233218CNA000001 for Los +# Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +# for the U.S. Department of Energy/National Nuclear Security Administration. All rights +# in the program are reserved by Triad National Security, LLC, and the U.S. Department +# of Energy/National Nuclear Security Administration. The Government is granted for +# itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +# license in this material to reproduce, prepare derivative works, distribute copies to +# the public, perform publicly and display publicly, and to permit others to do so. +#========================================================================================= + +get_property(DRIVER_LIST GLOBAL PROPERTY DRIVERS_USED_IN_TESTS) +if( "poisson-gmg-example" IN_LIST DRIVER_LIST OR NOT PARTHENON_DISABLE_EXAMPLES) + add_executable( + poisson-gmg-example + poisson_driver.cpp + poisson_driver.hpp + poisson_package.cpp + poisson_package.hpp + main.cpp + parthenon_app_inputs.cpp + ) + target_link_libraries(poisson-gmg-example PRIVATE Parthenon::parthenon) + lint_target(poisson-gmg-example) +endif() diff --git a/example/poisson_gmg/main.cpp b/example/poisson_gmg/main.cpp new file mode 100644 index 000000000000..d62d05d291ff --- /dev/null +++ b/example/poisson_gmg/main.cpp @@ -0,0 +1,93 @@ +//======================================================================================== +// (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#include "bvals/boundary_conditions_generic.hpp" +#include "parthenon_manager.hpp" + +#include "poisson_driver.hpp" + +using namespace parthenon; +using namespace parthenon::BoundaryFunction; +// We need to register FixedFace boundary conditions by hand since they can't +// be chosen in the parameter input file. FixedFace boundary conditions assume +// Dirichlet booundary conditions on the face of the domain and linearly extrapolate +// into the ghosts to ensure the linear reconstruction on the block face obeys the +// chosen boundary condition. Just setting the ghost zones of CC variables to a fixed +// value results in poor MG convergence because the effective BC at the face +// changes with MG level. +template +auto GetBoundaryCondition() { + return [](std::shared_ptr> &rc, bool coarse) -> void { + using namespace parthenon; + using namespace parthenon::BoundaryFunction; + GenericBC(rc, coarse, 0.0); + }; +} + +int main(int argc, char *argv[]) { + using parthenon::ParthenonManager; + using parthenon::ParthenonStatus; + ParthenonManager pman; + + // Redefine parthenon defaults + pman.app_input->ProcessPackages = poisson_example::ProcessPackages; + pman.app_input->MeshProblemGenerator = poisson_example::ProblemGenerator; + + // call ParthenonInit to initialize MPI and Kokkos, parse the input deck, and set up + auto manager_status = pman.ParthenonInitEnv(argc, argv); + if (manager_status == ParthenonStatus::complete) { + pman.ParthenonFinalize(); + return 0; + } + if (manager_status == ParthenonStatus::error) { + pman.ParthenonFinalize(); + return 1; + } + // Now that ParthenonInit has been called and setup succeeded, the code can now + // make use of MPI and Kokkos + + // Set boundary conditions + pman.app_input->boundary_conditions[parthenon::BoundaryFace::inner_x1] = + GetBoundaryCondition(); + pman.app_input->boundary_conditions[parthenon::BoundaryFace::inner_x2] = + GetBoundaryCondition(); + pman.app_input->boundary_conditions[parthenon::BoundaryFace::inner_x3] = + GetBoundaryCondition(); + pman.app_input->boundary_conditions[parthenon::BoundaryFace::outer_x1] = + GetBoundaryCondition(); + pman.app_input->boundary_conditions[parthenon::BoundaryFace::outer_x2] = + GetBoundaryCondition(); + pman.app_input->boundary_conditions[parthenon::BoundaryFace::outer_x3] = + GetBoundaryCondition(); + pman.ParthenonInitPackagesAndMesh(); + + // This needs to be scoped so that the driver object is destructed before Finalize + bool success = true; + { + // Initialize the driver + poisson_example::PoissonDriver driver(pman.pinput.get(), pman.app_input.get(), + pman.pmesh.get()); + + // This line actually runs the simulation + auto driver_status = driver.Execute(); + if (driver_status != parthenon::DriverStatus::complete || + driver.final_rms_residual > 1.e-10 || driver.final_rms_error > 1.e-12) + success = false; + } + // call MPI_Finalize and Kokkos::finalize if necessary + pman.ParthenonFinalize(); + if (Globals::my_rank == 0) printf("success: %i\n", success); + + // MPI and Kokkos can no longer be used + return static_cast(!success); +} diff --git a/example/poisson_gmg/parthenon_app_inputs.cpp b/example/poisson_gmg/parthenon_app_inputs.cpp new file mode 100644 index 000000000000..8a51c59f7baf --- /dev/null +++ b/example/poisson_gmg/parthenon_app_inputs.cpp @@ -0,0 +1,110 @@ +//======================================================================================== +// (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== +#include +#include +#include + +#include + +#include "config.hpp" +#include "defs.hpp" +#include "poisson_package.hpp" +#include "utils/error_checking.hpp" + +using namespace parthenon::package::prelude; +using namespace parthenon; + +// *************************************************// +// redefine some weakly linked parthenon functions *// +// *************************************************// + +namespace poisson_example { + +void ProblemGenerator(Mesh *pm, ParameterInput *pin, MeshData *md) { + auto pmb = md->GetBlockData(0)->GetBlockPointer(); + const int ndim = md->GetMeshPointer()->ndim; + + Real x0 = pin->GetOrAddReal("poisson", "x0", 0.0); + Real y0 = pin->GetOrAddReal("poisson", "y0", 0.0); + Real z0 = pin->GetOrAddReal("poisson", "z0", 0.0); + Real radius0 = pin->GetOrAddReal("poisson", "radius", 0.1); + Real interior_D = pin->GetOrAddReal("poisson", "interior_D", 1.0); + Real exterior_D = pin->GetOrAddReal("poisson", "exterior_D", 1.0); + + auto desc = + parthenon::MakePackDescriptor(md); + auto pack = desc.GetPack(md); + + constexpr auto te = poisson_package::te; + using TE = parthenon::TopologicalElement; + auto &cellbounds = pmb->cellbounds; + auto ib = cellbounds.GetBoundsI(IndexDomain::entire, te); + auto jb = cellbounds.GetBoundsJ(IndexDomain::entire, te); + auto kb = cellbounds.GetBoundsK(IndexDomain::entire, te); + pmb->par_for( + "Poisson::ProblemGenerator", 0, pack.GetNBlocks() - 1, kb.s, kb.e, jb.s, jb.e, ib.s, + ib.e, KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + const auto &coords = pack.GetCoordinates(b); + Real x1 = coords.X<1, te>(i); + Real x2 = coords.X<2, te>(j); + Real x3 = coords.X<2, te>(k); + Real x1f = coords.X<1, TE::F1>(k, j, i); + Real x2f = coords.X<2, TE::F2>(k, j, i); + Real x3f = coords.X<2, TE::F3>(k, j, i); + Real dx1 = coords.Dxc<1>(k, j, i); + Real dx2 = coords.Dxc<2>(k, j, i); + Real dx3 = coords.Dxc<3>(k, j, i); + Real rad = (x1 - x0) * (x1 - x0); + if (ndim > 1) rad += (x2 - y0) * (x2 - y0); + if (ndim > 2) rad += (x3 - z0) * (x3 - z0); + rad = std::sqrt(rad); + Real val = 0.0; + if (rad < radius0) { + val = 1.0; // / (4.0 / 3.0 * M_PI * std::pow(rad, 3)); + } + pack(b, te, poisson_package::rhs(), k, j, i) = val; + pack(b, te, poisson_package::res_err(), k, j, i) = 0.0; + pack(b, te, poisson_package::u(), k, j, i) = 0.0; + + pack(b, te, poisson_package::exact(), k, j, i) = -exp(-10.0 * rad * rad); + auto inside_region = [ndim](Real x, Real y, Real z) { + bool inside1 = (x < -0.25) && (x > -0.75); + if (ndim > 1) inside1 = inside1 && (y < 0.5) && (y > -0.5); + if (ndim > 2) inside1 = inside1 && (z < 0.25) && (z > -0.25); + + bool inside2 = (x < 0.5) && (x > -0.75); + if (ndim > 1) inside2 = inside2 && (y < -0.25) && (y > -0.75); + if (ndim > 2) inside2 = inside2 && (z < 0.25) && (z > -0.25); + + return inside1 || inside2; + }; + pack(b, TE::F1, poisson_package::D(), k, j, i) = + inside_region(x1f, x2, x3) ? interior_D : exterior_D; + pack(b, TE::F2, poisson_package::D(), k, j, i) = + inside_region(x1, x2f, x3) ? interior_D : exterior_D; + pack(b, TE::F3, poisson_package::D(), k, j, i) = + inside_region(x1, x2, x3f) ? interior_D : exterior_D; + }); +} + +Packages_t ProcessPackages(std::unique_ptr &pin) { + Packages_t packages; + auto pkg = poisson_package::Initialize(pin.get()); + packages.Add(pkg); + + return packages; +} + +} // namespace poisson_example diff --git a/example/poisson_gmg/parthinput.poisson b/example/poisson_gmg/parthinput.poisson new file mode 100644 index 000000000000..7ec0878059ec --- /dev/null +++ b/example/poisson_gmg/parthinput.poisson @@ -0,0 +1,86 @@ +# ======================================================================================== +# Parthenon performance portable AMR framework +# Copyright(C) 2020-2023 The Parthenon collaboration +# Licensed under the 3-clause BSD License, see LICENSE file for details +# ======================================================================================== +# (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +# +# This program was produced under U.S. Government contract 89233218CNA000001 for Los +# Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +# for the U.S. Department of Energy/National Nuclear Security Administration. All rights +# in the program are reserved by Triad National Security, LLC, and the U.S. Department +# of Energy/National Nuclear Security Administration. The Government is granted for +# itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +# license in this material to reproduce, prepare derivative works, distribute copies to +# the public, perform publicly and display publicly, and to permit others to do so. +# ======================================================================================== + + +problem_id = poisson + + +refinement = static +multigrid = true + +nx1 = 64 +x1min = -1.0 +x1max = 1.0 +ix1_bc = user +ox1_bc = user + +nx2 = 64 +x2min = -1.0 +x2max = 1.0 +ix2_bc = user +ox2_bc = user + +nx3 = 1 +x3min = 0.0 +x3max = 1.0 +ix3_bc = periodic +ox3_bc = periodic + + +nx1 = 32 +nx2 = 32 +nx3 = 1 + + +#nlim = -1 +#tlim = 1.0 +#integrator = rk2 +#ncycle_out_mesh = -10000 + + +file_type = hdf5 +dt = 0.05 +variables = poisson.res_err, poisson.u, poisson.x, poisson.r, poisson.rhs, poisson.p, poisson.s, poisson.t, poisson.v, poisson.exact +ghost_zones = true + + +x1min = -1.0 +x1max = -0.75 +x2min = -1.0 +x2max = -0.75 +level = 3 + + +solver = BiCGSTAB +precondition = true +precondition_vcycles = 1 +restart_threshold = 0.0 +flux_correct = true +smoother = SRJ2 +do_FAS = true +diagonal_alpha = 0.0 +max_iterations = 15 +pre_smooth_iterations = 2 +post_smooth_iterations = 2 +jacobi_damping = 0.0 + +x0 = 0.0 +y0 = 0.0 +z0 = 0.0 +radius = 0.5 +interior_D = 100.0 +exterior_D = 1.0 diff --git a/example/poisson_gmg/poisson_driver.cpp b/example/poisson_gmg/poisson_driver.cpp new file mode 100644 index 000000000000..2aeb1d6116cd --- /dev/null +++ b/example/poisson_gmg/poisson_driver.cpp @@ -0,0 +1,586 @@ +//======================================================================================== +// (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#include +#include +#include +#include + +// Local Includes +#include "amr_criteria/refinement_package.hpp" +#include "bvals/comms/bvals_in_one.hpp" +#include "interface/metadata.hpp" +#include "interface/update.hpp" +#include "mesh/meshblock_pack.hpp" +#include "parthenon/driver.hpp" +#include "poisson_driver.hpp" +#include "poisson_package.hpp" +#include "prolong_restrict/prolong_restrict.hpp" + +using namespace parthenon::driver::prelude; + +namespace poisson_example { + +parthenon::DriverStatus PoissonDriver::Execute() { + pouts->MakeOutputs(pmesh, pinput); + ConstructAndExecuteTaskLists<>(this); + pouts->MakeOutputs(pmesh, pinput); + return DriverStatus::complete; +} + +TaskCollection PoissonDriver::MakeTaskCollection(BlockList_t &blocks) { + auto pkg = pmesh->packages.Get("poisson_package"); + auto solver = pkg->Param("solver"); + if (solver == "BiCGSTAB") { + return MakeTaskCollectionMGBiCGSTAB(blocks); + } else if (solver == "MG") { + return MakeTaskCollectionMG(blocks); + } else { + PARTHENON_FAIL("Unknown solver type."); + } + return TaskCollection(); +} + +template +TaskID AddJacobiIteration(TaskList &tl, TaskID depends_on, bool multilevel, Real omega, + std::shared_ptr> &md) { + using namespace parthenon; + using namespace poisson_package; + TaskID none(0); + + auto comm = AddBoundaryExchangeTasks(depends_on, tl, md, multilevel); + auto flux = tl.AddTask(comm, CalculateFluxes, md); + auto mat_mult = tl.AddTask(flux, FluxMultiplyMatrix, md, false); + return tl.AddTask(mat_mult, FluxJacobi, md, omega, + GSType::all); +} + +template +TaskID AddSRJIteration(TaskList &tl, TaskID depends_on, int stages, bool multilevel, + std::shared_ptr> &md) { + using namespace parthenon; + using namespace poisson_package; + int ndim = md->GetParentPointer()->ndim; + + std::array, 3> omega_M1{ + {{1.0, 0.0, 0.0}, {1.0, 0.0, 0.0}, {1.0, 0.0, 0.0}}}; + // Damping factors from Yang & Mittal (2017) + std::array, 3> omega_M2{ + {{0.8723, 0.5395, 0.0000}, {1.3895, 0.5617, 0.0000}, {1.7319, 0.5695, 0.0000}}}; + std::array, 3> omega_M3{ + {{0.9372, 0.6667, 0.5173}, {1.6653, 0.8000, 0.5264}, {2.2473, 0.8571, 0.5296}}}; + + if (stages == 0) return depends_on; + auto omega = omega_M1; + if (stages == 2) omega = omega_M2; + if (stages == 3) omega = omega_M3; + // This copy is to set the boundaries of temp that will not be updated by boundary + // communication to the values in u + depends_on = tl.AddTask(depends_on, CopyData, md); + auto jacobi1 = AddJacobiIteration(tl, depends_on, multilevel, + omega[ndim - 1][0], md); + if (stages < 2) { + return tl.AddTask(jacobi1, CopyData, md); + } + auto jacobi2 = AddJacobiIteration(tl, jacobi1, multilevel, + omega[ndim - 1][1], md); + if (stages < 3) return jacobi2; + auto jacobi3 = AddJacobiIteration(tl, jacobi2, multilevel, + omega[ndim - 1][2], md); + return tl.AddTask(jacobi3, CopyData, md); +} + +template +TaskID Axpy(TaskList &tl, TaskID depends_on, std::shared_ptr> &md, + Real weight_Ax, Real weight_y, bool only_interior, bool do_flux_cor = false) { + using namespace parthenon; + using namespace poisson_package; + auto flux_res = tl.AddTask(depends_on, CalculateFluxes, md); + if (do_flux_cor && !only_md_level) { + auto start_flxcor = tl.AddTask(flux_res, StartReceiveFluxCorrections, md); + auto send_flxcor = tl.AddTask(flux_res, LoadAndSendFluxCorrections, md); + auto recv_flxcor = tl.AddTask(send_flxcor, ReceiveFluxCorrections, md); + flux_res = tl.AddTask(recv_flxcor, SetFluxCorrections, md); + } + auto Ax_res = tl.AddTask(flux_res, FluxMultiplyMatrix, md, + only_interior); + return tl.AddTask(Ax_res, + AddFieldsAndStoreInteriorSelect, md, + weight_Ax, weight_y, only_interior); +} + +template +TaskID DotProduct(TaskID dependency_in, TaskRegion ®ion, TaskList &tl, int partition, + int ®_dep_id, AllReduce *adotb, + std::shared_ptr> &md) { + using namespace parthenon; + using namespace poisson_package; + auto zero_adotb = (partition == 0 ? tl.AddTask( + dependency_in, + [](AllReduce *r) { + r->val = 0.0; + return TaskStatus::complete; + }, + adotb) + : dependency_in); + region.AddRegionalDependencies(reg_dep_id, partition, zero_adotb); + reg_dep_id++; + auto get_adotb = tl.AddTask(zero_adotb, DotProductLocal, md, &(adotb->val)); + region.AddRegionalDependencies(reg_dep_id, partition, get_adotb); + reg_dep_id++; + auto start_global_adotb = + (partition == 0 + ? tl.AddTask(get_adotb, &AllReduce::StartReduce, adotb, MPI_SUM) + : get_adotb); + auto finish_global_adotb = + tl.AddTask(start_global_adotb, &AllReduce::CheckReduce, adotb); + region.AddRegionalDependencies(reg_dep_id, partition, finish_global_adotb); + reg_dep_id++; + return finish_global_adotb; +} + +TaskID PoissonDriver::AddMultiGridTasksLevel(TaskRegion ®ion, TaskList &tl, + TaskID dependency, int partition, + int ®_dep_id, int level, int min_level, + int max_level) { + using namespace parthenon; + using namespace poisson_package; + + auto pkg = pmesh->packages.Get("poisson_package"); + auto damping = pkg->Param("jacobi_damping"); + auto smoother = pkg->Param("smoother"); + int pre_stages = pkg->Param("pre_smooth_iterations"); + int post_stages = pkg->Param("post_smooth_iterations"); + bool do_FAS = pkg->Param("do_FAS"); + if (smoother == "none") { + pre_stages = 0; + post_stages = 0; + } else if (smoother == "SRJ1") { + pre_stages = 1; + post_stages = 1; + } else if (smoother == "SRJ2") { + pre_stages = 2; + post_stages = 2; + } else if (smoother == "SRJ3") { + pre_stages = 3; + post_stages = 3; + } else { + PARTHENON_FAIL("Unknown solver type."); + } + + int ndim = pmesh->ndim; + bool multilevel = (level != min_level); + + auto &md = pmesh->gmg_mesh_data[level].GetOrAdd(level, "base", partition); + // 0. Receive residual from finer level if there is one + auto set_from_finer = dependency; + if (level < max_level) { + // Fill fields with restricted values + auto recv_from_finer = + tl.AddTask(set_from_finer, ReceiveBoundBufs, md); + set_from_finer = + tl.AddTask(recv_from_finer, SetBounds, md); + region.AddRegionalDependencies(reg_dep_id, partition, set_from_finer); + reg_dep_id++; + // 1. Copy residual from dual purpose communication field to the rhs, should be + // actual RHS for finest level + auto copy_u = tl.AddTask(set_from_finer, CopyData, md); + if (!do_FAS) { + auto zero_u = tl.AddTask(copy_u, SetToZero, md); + auto copy_rhs = tl.AddTask(set_from_finer, CopyData, md); + set_from_finer = zero_u | copy_u | copy_rhs; + } else { + set_from_finer = AddBoundaryExchangeTasks( + set_from_finer, tl, md, multilevel); + // This should set the rhs only in blocks that correspond to interior nodes, the + // RHS of leaf blocks that are on this GMG level should have already been set on + // entry into multigrid + set_from_finer = + Axpy(tl, set_from_finer, md, 1.0, 1.0, true); + set_from_finer = set_from_finer | copy_u; + } + } else { + set_from_finer = tl.AddTask(set_from_finer, CopyData, md); + } + + // 2. Do pre-smooth and fill solution on this level + auto pre_smooth = AddSRJIteration(tl, set_from_finer, + pre_stages, multilevel, md); + // If we are finer than the coarsest level: + auto post_smooth = pre_smooth; + if (level > min_level) { + // 3. Communicate same level boundaries so that u is up to date everywhere + auto comm_u = + AddBoundaryExchangeTasks(pre_smooth, tl, md, multilevel); + + // 4. Caclulate residual and store in communication field + auto residual = Axpy(tl, comm_u, md, -1.0, 1.0, false); + + // 5. Restrict communication field and send to next level + auto communicate_to_coarse = + tl.AddTask(residual, SendBoundBufs, md); + + auto coarser = AddMultiGridTasksLevel(region, tl, communicate_to_coarse, partition, + reg_dep_id, level - 1, min_level, max_level); + + // 6. Receive error field into communication field and prolongate + auto recv_from_coarser = + tl.AddTask(coarser, ReceiveBoundBufs, md); + auto set_from_coarser = + tl.AddTask(recv_from_coarser, SetBounds, md); + auto prolongate = tl.AddTask(set_from_coarser, + ProlongateBounds, md); + region.AddRegionalDependencies(reg_dep_id, partition, prolongate); + reg_dep_id++; + + // 7. Correct solution on this level with res_err field and store in + // communication field + auto update_sol = + tl.AddTask(prolongate, AddFieldsAndStore, md, 1.0, 1.0); + + // 8. Post smooth using communication field and stored RHS + post_smooth = AddSRJIteration(tl, update_sol, post_stages, + multilevel, md); + } else { + post_smooth = tl.AddTask(pre_smooth, CopyData, md); + } + + // 9. Send communication field to next finer level (should be error field for that + // level) + auto final_task = post_smooth; + if (level < max_level) { + auto copy_over = post_smooth; + if (!do_FAS) { + copy_over = tl.AddTask(post_smooth, CopyData, md); + } else { + auto calc_err = + tl.AddTask(post_smooth, AddFieldsAndStore, md, 1.0, -1.0); + copy_over = calc_err; + } + auto boundary = + AddBoundaryExchangeTasks(copy_over, tl, md, multilevel); + final_task = + tl.AddTask(boundary, SendBoundBufs, md); + } else { + final_task = + AddBoundaryExchangeTasks(post_smooth, tl, md, multilevel); + } + return final_task; +} + +TaskCollection PoissonDriver::MakeTaskCollectionMG(BlockList_t &blocks) { + using namespace parthenon; + using namespace poisson_package; + TaskCollection tc; + TaskID none(0); + + auto pkg = pmesh->packages.Get("poisson_package"); + auto max_iterations = pkg->Param("max_iterations"); + + const int num_partitions = pmesh->DefaultNumPartitions(); + int min_level = 0; + int max_level = pmesh->GetGMGMaxLevel(); + + { + TaskRegion ®ion = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + auto copy_exact = tl.AddTask(none, CopyData, md); + auto comm = AddBoundaryExchangeTasks(copy_exact, tl, md, true); + auto get_rhs = Axpy(tl, comm, md, 1.0, 0.0, false, false); + auto zero_u = tl.AddTask(get_rhs, SetToZero, md); + } + } + + for (int ivcycle = 0; ivcycle < max_iterations; ++ivcycle) { + int mg_reg_dep_id = 0; + TaskRegion ®ion = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + AddMultiGridTasksLevel(region, tl, none, i, mg_reg_dep_id, max_level, min_level, + max_level); + } + + int reg_dep_id = 0; + TaskRegion ®ion_res = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region_res[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + auto calc_pointwise_res = Axpy(tl, none, md, -1.0, 1.0, false); + auto get_res = DotProduct(calc_pointwise_res, region_res, tl, i, + reg_dep_id, &residual, md); + auto calc_err = + tl.AddTask(get_res, AddFieldsAndStore, md, 1.0, -1.0); + auto get_err = DotProduct(calc_err, region_res, tl, i, reg_dep_id, + &rhat0r, md); + if (i == 0) { + tl.AddTask( + get_err, + [&](PoissonDriver *driver) { + Real rms_res = std::sqrt(driver->residual.val / pmesh->GetTotalCells()); + Real rms_err = std::sqrt(driver->rhat0r.val / pmesh->GetTotalCells()); + this->final_rms_error = rms_err; + this->final_rms_residual = rms_res; + if (Globals::my_rank == 0) + printf("RMS residual: %e RMS error: %e\n", rms_res, rms_err); + return TaskStatus::complete; + }, + this); + } + AddBoundaryExchangeTasks(get_res, tl, md, true); + } + } + + return tc; +} + +TaskCollection PoissonDriver::MakeTaskCollectionMGBiCGSTAB(BlockList_t &blocks) { + using namespace parthenon; + using namespace poisson_package; + TaskCollection tc; + TaskID none(0); + + auto pkg = pmesh->packages.Get("poisson_package"); + auto max_iterations = pkg->Param("max_iterations"); + auto precondition = pkg->Param("precondition"); + bool flux_correct = pkg->Param("flux_correct"); + const int num_partitions = pmesh->DefaultNumPartitions(); + int n_vcycles = pkg->Param("precondition_vcycles"); + + Real restart_threshold = pkg->Param("restart_threshold"); + + int min_level = 0; + int max_level = pmesh->GetGMGMaxLevel(); + + auto AddGMGRegion = [&]() { + if (precondition) { + int mg_reg_dep_id = 0; + TaskRegion ®ion = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + AddMultiGridTasksLevel(region, tl, none, i, mg_reg_dep_id, max_level, min_level, + max_level); + } + } else { + TaskRegion ®ion = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + auto copy_u = tl.AddTask(none, CopyData, md); + } + } + }; + + // Solving A x = rhs with BiCGSTAB possibly with pre-conditioner M^{-1} such that A M ~ + // I Initialization: x <- 0, r <- rhs, rhat0 <- rhs, rhat0r_old <- (rhat0, r), p <- r, u + // <- 0 + { + int reg_dep_id = 0; + TaskRegion ®ion = tc.AddRegion(num_partitions); + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + auto zero_x = tl.AddTask(none, SetToZero, md); + auto zero_u = tl.AddTask(none, SetToZero, md); + auto copy_r = tl.AddTask(none, CopyData, md); + auto copy_p = tl.AddTask(none, CopyData, md); + auto copy_rhsbase = tl.AddTask(none, CopyData, md); + auto copy_rhat0 = tl.AddTask(none, CopyData, md); + auto get_rhat0r = + DotProduct(none, region, tl, i, reg_dep_id, &rhat0r, md); + if (i == 0) { + tl.AddTask( + get_rhat0r, + [this](PoissonDriver *driver) { + driver->rhat0r_old = driver->rhat0r.val; + driver->rhat0r.val = 0.0; + driver->rhat0v.val = 0.0; + driver->ts.val = 0.0; + driver->tt.val = 0.0; + driver->residual.val = 0.0; + return TaskStatus::complete; + }, + this); + } + } + } + + for (int ivcycle = 0; ivcycle < max_iterations; ++ivcycle) { + // 1. u <- M p (rhs = p is set in previous cycle or initialization) + for (int v = 0; v < n_vcycles; ++v) + AddGMGRegion(); + { + TaskRegion ®ion = tc.AddRegion(num_partitions); + int reg_dep_id = 0; + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + + // 2. v <- A u + auto comm = AddBoundaryExchangeTasks(none, tl, md, true); + auto get_v = Axpy(tl, comm, md, 1.0, 0.0, false, flux_correct); + + // 3. rhat0v <- (rhat0, v) + auto get_rhat0v = + DotProduct(get_v, region, tl, i, reg_dep_id, &rhat0v, md); + + // 4. h <- x + alpha u (alpha = rhat0r_old / rhat0v) + auto correct_h = tl.AddTask( + get_rhat0v, + [](PoissonDriver *driver, std::shared_ptr> &md) { + Real alpha = driver->rhat0r_old / driver->rhat0v.val; + return AddFieldsAndStore(md, 1.0, alpha); + }, + this, md); + + // 5. s <- r - alpha v (alpha = rhat0r_old / rhat0v) + auto correct_s = tl.AddTask( + get_rhat0v, + [](PoissonDriver *driver, std::shared_ptr> &md) { + Real alpha = driver->rhat0r_old / driver->rhat0v.val; + return AddFieldsAndStore(md, 1.0, -alpha); + }, + this, md); + + // 4.1 res_err <- A h - rhs + auto comm_h = + AddBoundaryExchangeTasks(correct_h, tl, md, true); + auto get_res = + Axpy(tl, comm_h, md, -1.0, 1.0, false, flux_correct); + + // Check and print out residual + get_res = DotProduct(get_res, region, tl, i, reg_dep_id, + &residual, md); + + region.AddRegionalDependencies(reg_dep_id, i, get_res); + if (i == 0) { + tl.AddTask( + get_res, + [&](PoissonDriver *driver) { + Real rms_err = std::sqrt(driver->residual.val / pmesh->GetTotalCells()); + if (Globals::my_rank == 0) + printf("RMS residual: %e (rhat0, r): %e (rhat0, v): %e\n", rms_err, + driver->rhat0r_old, driver->rhat0v.val); + return TaskStatus::complete; + }, + this); + } + // Setup for MG Precondition + tl.AddTask(correct_s, CopyData, md); + tl.AddTask(correct_s, SetToZero, md); + } + } + // 6. u <- M s (rhs = s) + for (int v = 0; v < n_vcycles; ++v) + AddGMGRegion(); + { + TaskRegion ®ion = tc.AddRegion(num_partitions); + int reg_dep_id = 0; + for (int i = 0; i < num_partitions; ++i) { + TaskList &tl = region[i]; + auto &md = pmesh->mesh_data.GetOrAdd("base", i); + + // 7. t <- A u + auto comm = AddBoundaryExchangeTasks(none, tl, md, true); + auto get_t = Axpy(tl, comm, md, 1.0, 0.0, false, flux_correct); + + // 8. omega <- (t,s) / (t,t) + auto get_ts = DotProduct(get_t, region, tl, i, reg_dep_id, &ts, md); + auto get_tt = DotProduct(get_t, region, tl, i, reg_dep_id, &tt, md); + + // 9. x <- h + omega u + auto correct_x = tl.AddTask( + get_tt | get_ts, + [](PoissonDriver *driver, std::shared_ptr> &md) { + Real omega = driver->ts.val / driver->tt.val; + return AddFieldsAndStore(md, 1.0, omega); + }, + this, md); + + // 10. r <- s - omega t + auto correct_r = tl.AddTask( + get_tt | get_ts, + [](PoissonDriver *driver, std::shared_ptr> &md) { + Real omega = driver->ts.val / driver->tt.val; + return AddFieldsAndStore(md, 1.0, -omega); + }, + this, md); + + // 10.5. Restart if (rhat0, r) is below threshold + region.AddRegionalDependencies(reg_dep_id, i, correct_r | correct_x); + reg_dep_id++; + auto restart = tl.AddTask( + correct_r, + [restart_threshold](PoissonDriver *driver, int i, + std::shared_ptr> &md) { + if (i == 0 && std::abs(driver->rhat0r_old) < restart_threshold) + printf("Restart rhat0r_old = %e (%e)\n", driver->rhat0r_old, + restart_threshold); + if (std::abs(driver->rhat0r_old) < restart_threshold) { + CopyData(md); + CopyData(md); + } + return TaskStatus::complete; + }, + this, i, md); + + // 11. rhat0r <- (rhat0, r) + auto get_rhat0r = + DotProduct(correct_r, region, tl, i, reg_dep_id, &rhat0r, md); + + // 12. beta <- rhat0r / rhat0r_old * alpha / omega + // 13. p <- r + beta * (p - omega * v) + auto update_p = tl.AddTask( + get_rhat0r, + [restart_threshold](PoissonDriver *driver, + std::shared_ptr> &md) { + if (std::abs(driver->rhat0r_old) >= restart_threshold) { + Real alpha = driver->rhat0r_old / driver->rhat0v.val; + Real omega = driver->ts.val / driver->tt.val; + Real beta = driver->rhat0r.val / driver->rhat0r_old * alpha / omega; + AddFieldsAndStore(md, 1.0, -omega); + return AddFieldsAndStore(md, 1.0, beta); + } + return TaskStatus::complete; + }, + this, md); + + // Set up for MG in next cycle + tl.AddTask(update_p, CopyData, md); + tl.AddTask(update_p, SetToZero, md); + + // 14. rhat0r_old <- rhat0r, zero all reductions + region.AddRegionalDependencies(reg_dep_id, i, update_p | correct_x); + if (i == 0) { + tl.AddTask( + update_p | correct_x, + [](PoissonDriver *driver) { + driver->rhat0r_old = driver->rhat0r.val; + driver->rhat0r.val = 0.0; + driver->rhat0v.val = 0.0; + driver->ts.val = 0.0; + driver->tt.val = 0.0; + driver->residual.val = 0.0; + return TaskStatus::complete; + }, + this); + } + } + } + } + + return tc; +} + +} // namespace poisson_example diff --git a/example/poisson_gmg/poisson_driver.hpp b/example/poisson_gmg/poisson_driver.hpp new file mode 100644 index 000000000000..7ea9b212befb --- /dev/null +++ b/example/poisson_gmg/poisson_driver.hpp @@ -0,0 +1,62 @@ +//======================================================================================== +// (C) (or copyright) 2021-2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#ifndef EXAMPLE_POISSON_GMG_POISSON_DRIVER_HPP_ +#define EXAMPLE_POISSON_GMG_POISSON_DRIVER_HPP_ + +#include +#include + +#include +#include +#include + +namespace poisson_example { +using namespace parthenon::driver::prelude; + +class PoissonDriver : public Driver { + public: + PoissonDriver(ParameterInput *pin, ApplicationInput *app_in, Mesh *pm) + : Driver(pin, app_in, pm) { + InitializeOutputs(); + } + // This next function essentially defines the driver. + TaskCollection MakeTaskCollection(BlockList_t &blocks); + TaskCollection MakeTaskCollectionProRes(BlockList_t &blocks); + TaskCollection MakeTaskCollectionMG(BlockList_t &blocks); + TaskCollection MakeTaskCollectionMGCG(BlockList_t &blocks); + TaskCollection MakeTaskCollectionMGBiCGSTAB(BlockList_t &blocks); + + DriverStatus Execute() override; + + TaskID AddMultiGridTasksLevel(TaskRegion ®ion, TaskList &tl, TaskID dependency, + int partition, int ®_dep_id, int level, int min_level, + int max_level); + void AddRestrictionProlongationLevel(TaskRegion ®ion, int level, int min_level, + int max_level); + + Real final_rms_error, final_rms_residual; + + private: + // Necessary reductions for BiCGStab dot products and residual calculations + AllReduce rtr, pAp, rhat0v, rhat0r, ts, tt, residual; + Real rtr_old, rhat0r_old; + AllReduce update_norm; +}; + +void ProblemGenerator(Mesh *pm, parthenon::ParameterInput *pin, MeshData *md); +parthenon::Packages_t ProcessPackages(std::unique_ptr &pin); + +} // namespace poisson_example + +#endif // EXAMPLE_POISSON_GMG_POISSON_DRIVER_HPP_ diff --git a/example/poisson_gmg/poisson_package.cpp b/example/poisson_gmg/poisson_package.cpp new file mode 100644 index 000000000000..561b0e73d2ef --- /dev/null +++ b/example/poisson_gmg/poisson_package.cpp @@ -0,0 +1,137 @@ +//======================================================================================== +// (C) (or copyright) 2021-2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "defs.hpp" +#include "kokkos_abstraction.hpp" +#include "poisson_package.hpp" + +using namespace parthenon::package::prelude; +using parthenon::HostArray1D; +namespace poisson_package { + +std::shared_ptr Initialize(ParameterInput *pin) { + auto pkg = std::make_shared("poisson_package"); + + int max_poisson_iterations = pin->GetOrAddInteger("poisson", "max_iterations", 10000); + pkg->AddParam<>("max_iterations", max_poisson_iterations); + + int pre_smooth_iterations = pin->GetOrAddInteger("poisson", "pre_smooth_iterations", 2); + pkg->AddParam<>("pre_smooth_iterations", pre_smooth_iterations); + + int post_smooth_iterations = + pin->GetOrAddInteger("poisson", "post_smooth_iterations", 2); + pkg->AddParam<>("post_smooth_iterations", post_smooth_iterations); + + Real diagonal_alpha = pin->GetOrAddReal("poisson", "diagonal_alpha", 0.0); + pkg->AddParam<>("diagonal_alpha", diagonal_alpha); + + Real jacobi_damping = pin->GetOrAddReal("poisson", "jacobi_damping", 0.5); + pkg->AddParam<>("jacobi_damping", jacobi_damping); + + std::string smoother_method = pin->GetOrAddString("poisson", "smoother", "SRJ2"); + pkg->AddParam<>("smoother", smoother_method); + + bool do_FAS = pin->GetOrAddBoolean("poisson", "do_FAS", false); + pkg->AddParam<>("do_FAS", do_FAS); + + std::string solver = pin->GetOrAddString("poisson", "solver", "MG"); + pkg->AddParam<>("solver", solver); + + bool precondition = pin->GetOrAddBoolean("poisson", "precondition", true); + pkg->AddParam<>("precondition", precondition); + + int precondition_vcycles = pin->GetOrAddInteger("poisson", "precondition_vcycles", 1); + pkg->AddParam<>("precondition_vcycles", precondition_vcycles); + + bool flux_correct = pin->GetOrAddBoolean("poisson", "flux_correct", false); + pkg->AddParam<>("flux_correct", flux_correct); + + Real restart_threshold = pin->GetOrAddReal("poisson", "restart_threshold", 0.0); + pkg->AddParam<>("restart_threshold", restart_threshold); + + int check_interval = pin->GetOrAddInteger("poisson", "check_interval", 100); + pkg->AddParam<>("check_interval", check_interval); + + Real err_tol = pin->GetOrAddReal("poisson", "error_tolerance", 1.e-8); + pkg->AddParam<>("error_tolerance", err_tol); + + bool fail_flag = pin->GetOrAddBoolean("poisson", "fail_without_convergence", false); + pkg->AddParam<>("fail_without_convergence", fail_flag); + + bool warn_flag = pin->GetOrAddBoolean("poisson", "warn_without_convergence", true); + pkg->AddParam<>("warn_without_convergence", warn_flag); + + // res_err enters a multigrid level as the residual from the previous level, which + // is the rhs, and leaves as the solution for that level, which is the error for the + // next finer level + auto te_type = Metadata::Cell; + if (GetTopologicalType(te) == parthenon::TopologicalType::Node) { + te_type = Metadata::Node; + } else if (GetTopologicalType(te) == parthenon::TopologicalType::Edge) { + te_type = Metadata::Edge; + } else if (GetTopologicalType(te) == parthenon::TopologicalType::Face) { + te_type = Metadata::Face; + } + using namespace parthenon::refinement_ops; + auto mres_err = Metadata({te_type, Metadata::Independent, Metadata::FillGhost, + Metadata::GMGRestrict, Metadata::GMGProlongate}); + mres_err.RegisterRefinementOps(); + pkg->AddField(res_err::name(), mres_err); + + auto mD = Metadata( + {Metadata::Independent, Metadata::OneCopy, Metadata::Face, Metadata::GMGRestrict}); + mD.RegisterRefinementOps(); + pkg->AddField(D::name(), mD); + + auto mflux_comm = Metadata({te_type, Metadata::Independent, Metadata::FillGhost, + Metadata::WithFluxes, Metadata::GMGRestrict}); + mflux_comm.RegisterRefinementOps(); + pkg->AddField(u::name(), mflux_comm); + + auto mflux = Metadata( + {te_type, Metadata::Independent, Metadata::FillGhost, Metadata::WithFluxes}); + mflux.RegisterRefinementOps(); + pkg->AddField(temp::name(), mflux); + + auto m_no_ghost = Metadata({te_type, Metadata::Derived, Metadata::OneCopy}); + // BiCGSTAB fields + pkg->AddField(rhat0::name(), m_no_ghost); + pkg->AddField(v::name(), m_no_ghost); + pkg->AddField(h::name(), mflux); + pkg->AddField(s::name(), m_no_ghost); + pkg->AddField(t::name(), m_no_ghost); + pkg->AddField(x::name(), m_no_ghost); + pkg->AddField(r::name(), m_no_ghost); + pkg->AddField(p::name(), m_no_ghost); + + // Other storage fields + pkg->AddField(exact::name(), m_no_ghost); + pkg->AddField(u0::name(), m_no_ghost); + pkg->AddField(rhs::name(), m_no_ghost); + pkg->AddField(rhs_base::name(), m_no_ghost); + + return pkg; +} +} // namespace poisson_package diff --git a/example/poisson_gmg/poisson_package.hpp b/example/poisson_gmg/poisson_package.hpp new file mode 100644 index 000000000000..7655594ca686 --- /dev/null +++ b/example/poisson_gmg/poisson_package.hpp @@ -0,0 +1,404 @@ +//======================================================================================== +// (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== +#ifndef EXAMPLE_POISSON_GMG_POISSON_PACKAGE_HPP_ +#define EXAMPLE_POISSON_GMG_POISSON_PACKAGE_HPP_ + +#include +#include +#include +#include + +#include +#include + +#define VARIABLE(ns, varname) \ + struct varname : public parthenon::variable_names::base_t { \ + template \ + KOKKOS_INLINE_FUNCTION varname(Ts &&...args) \ + : parthenon::variable_names::base_t(std::forward(args)...) {} \ + static std::string name() { return #ns "." #varname; } \ + } + +namespace poisson_package { +using namespace parthenon::package::prelude; +// GMG fields +VARIABLE(poisson, res_err); +VARIABLE(poisson, D); +VARIABLE(poisson, u); +VARIABLE(poisson, rhs); +VARIABLE(poisson, u0); +VARIABLE(poisson, temp); +VARIABLE(poisson, exact); +VARIABLE(poisson, rhs_base); + +// BiCGStab fields +VARIABLE(poisson, rhat0); +VARIABLE(poisson, v); +VARIABLE(poisson, h); +VARIABLE(poisson, s); +VARIABLE(poisson, t); +VARIABLE(poisson, x); +VARIABLE(poisson, r); +VARIABLE(poisson, p); + +// This just provides a convenient short hand for TE::CC and will make it +// easier for testing solves with different topological elements in the +// future (although other types of fields require significantly different +// condition boundary implementations) +constexpr parthenon::TopologicalElement te = parthenon::TopologicalElement::CC; + +std::shared_ptr Initialize(ParameterInput *pin); + +template +TaskStatus CopyData(std::shared_ptr> &md) { + using TE = parthenon::TopologicalElement; + IndexRange ib = md->GetBoundsI(IndexDomain::entire, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::entire, te); + IndexRange kb = md->GetBoundsK(IndexDomain::entire, te); + + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get(), include_block); + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "SetPotentialToZero", DevExecSpace(), 0, + pack.GetNBlocks() - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + pack(b, te, out(), k, j, i) = pack(b, te, in(), k, j, i); + }); + return TaskStatus::complete; +} + +template +TaskStatus AddFieldsAndStoreInteriorSelect(std::shared_ptr> &md, + Real wa = 1.0, Real wb = 1.0, + bool only_interior = false) { + using TE = parthenon::TopologicalElement; + IndexRange ib = md->GetBoundsI(IndexDomain::entire, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::entire, te); + IndexRange kb = md->GetBoundsK(IndexDomain::entire, te); + + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + if (only_interior) { + // The neighbors array will only be set for a block if its a leaf block + for (int b = 0; b < nblocks; ++b) + include_block[b] = md->GetBlockData(b)->GetBlockPointer()->neighbors.size() == 0; + } + + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + include_block[b] && + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get(), include_block); + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "SetPotentialToZero", DevExecSpace(), 0, + pack.GetNBlocks() - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + pack(b, te, out(), k, j, i) = + wa * pack(b, te, a_t(), k, j, i) + wb * pack(b, te, b_t(), k, j, i); + }); + return TaskStatus::complete; +} + +template +TaskStatus AddFieldsAndStore(std::shared_ptr> &md, Real wa = 1.0, + Real wb = 1.0) { + return AddFieldsAndStoreInteriorSelect(md, wa, wb, false); +} + +template +TaskStatus SetToZero(std::shared_ptr> &md) { + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get(), include_block); + const size_t scratch_size_in_bytes = 0; + const int scratch_level = 1; + const int ng = parthenon::Globals::nghost; + parthenon::par_for_outer( + DEFAULT_OUTER_LOOP_PATTERN, "Poisson::SetToZero", DevExecSpace(), + scratch_size_in_bytes, scratch_level, 0, pack.GetNBlocks() - 1, + KOKKOS_LAMBDA(parthenon::team_mbr_t member, const int b) { + auto cb = GetIndexShape(pack(b, te, 0), ng); + const auto &coords = pack.GetCoordinates(b); + IndexRange ib = cb.GetBoundsI(IndexDomain::interior, te); + IndexRange jb = cb.GetBoundsJ(IndexDomain::interior, te); + IndexRange kb = cb.GetBoundsK(IndexDomain::interior, te); + parthenon::par_for_inner( + parthenon::inner_loop_pattern_simdfor_tag, member, kb.s, kb.e, jb.s, jb.e, + ib.s, ib.e, [&](int k, int j, int i) { pack(b, te, var(), k, j, i) = 0.0; }); + }); + return TaskStatus::complete; +} + +template +TaskStatus DotProductLocal(std::shared_ptr> &md, Real *reduce_sum) { + using TE = parthenon::TopologicalElement; + IndexRange ib = md->GetBoundsI(IndexDomain::interior, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::interior, te); + IndexRange kb = md->GetBoundsK(IndexDomain::interior, te); + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get()); + Real gsum(0); + parthenon::par_reduce( + parthenon::loop_pattern_mdrange_tag, "DotProduct", DevExecSpace(), 0, + pack.GetNBlocks() - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i, Real &lsum) { + lsum += pack(b, te, a_t(), k, j, i) * pack(b, te, b_t(), k, j, i); + }, + Kokkos::Sum(gsum)); + *reduce_sum += gsum; + return TaskStatus::complete; +} + +template +TaskStatus CalculateFluxes(std::shared_ptr> &md) { + using namespace parthenon; + const int ndim = md->GetMeshPointer()->ndim; + IndexRange ib = md->GetBoundsI(IndexDomain::interior, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::interior, te); + IndexRange kb = md->GetBoundsK(IndexDomain::interior, te); + + using TE = parthenon::TopologicalElement; + + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + + auto desc = parthenon::MakePackDescriptor(md.get(), {}, {PDOpt::WithFluxes}); + auto pack = desc.GetPack(md.get(), include_block); + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "CaclulateFluxes", DevExecSpace(), 0, pack.GetNBlocks() - 1, + kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + const auto &coords = pack.GetCoordinates(b); + Real dx1 = coords.template Dxc(k, j, i); + pack.flux(b, X1DIR, var_t(), k, j, i) = + pack(b, TE::F1, D(), k, j, i) / dx1 * + (pack(b, te, var_t(), k, j, i - 1) - pack(b, te, var_t(), k, j, i)); + if (i == ib.e) + pack.flux(b, X1DIR, var_t(), k, j, i + 1) = + pack(b, TE::F1, D(), k, j, i + 1) / dx1 * + (pack(b, te, var_t(), k, j, i) - pack(b, te, var_t(), k, j, i + 1)); + + if (ndim > 1) { + Real dx2 = coords.template Dxc(k, j, i); + pack.flux(b, X2DIR, var_t(), k, j, i) = + pack(b, TE::F2, D(), k, j, i) * + (pack(b, te, var_t(), k, j - 1, i) - pack(b, te, var_t(), k, j, i)) / dx2; + if (j == jb.e) + pack.flux(b, X2DIR, var_t(), k, j + 1, i) = + pack(b, TE::F2, D(), k, j + 1, i) * + (pack(b, te, var_t(), k, j, i) - pack(b, te, var_t(), k, j + 1, i)) / dx2; + } + + if (ndim > 2) { + Real dx3 = coords.template Dxc(k, j, i); + pack.flux(b, X3DIR, var_t(), k, j, i) = + pack(b, TE::F3, D(), k, j, i) * + (pack(b, te, var_t(), k - 1, j, i) - pack(b, te, var_t(), k, j, i)) / dx3; + if (k == kb.e) + pack.flux(b, X2DIR, var_t(), k + 1, j, i) = + pack(b, TE::F3, D(), k + 1, j, i) * + (pack(b, te, var_t(), k, j, i) - pack(b, te, var_t(), k + 1, j, i)) / dx3; + } + }); + return TaskStatus::complete; +} + +template +TaskStatus FluxMultiplyMatrix(std::shared_ptr> &md, bool only_interior) { + using namespace parthenon; + const int ndim = md->GetMeshPointer()->ndim; + IndexRange ib = md->GetBoundsI(IndexDomain::interior, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::interior, te); + IndexRange kb = md->GetBoundsK(IndexDomain::interior, te); + + auto pkg = md->GetMeshPointer()->packages.Get("poisson_package"); + const auto alpha = pkg->Param("diagonal_alpha"); + + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + if (only_interior) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = md->GetBlockData(b)->GetBlockPointer()->neighbors.size() == 0; + } + + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + include_block[b] && + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + + auto desc = + parthenon::MakePackDescriptor(md.get(), {}, {PDOpt::WithFluxes}); + auto pack = desc.GetPack(md.get(), include_block); + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "FluxMultiplyMatrix", DevExecSpace(), 0, + pack.GetNBlocks() - 1, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + const auto &coords = pack.GetCoordinates(b); + Real dx1 = coords.template Dxc(k, j, i); + pack(b, te, out_t(), k, j, i) = -alpha * pack(b, te, in_t(), k, j, i); + pack(b, te, out_t(), k, j, i) += (pack.flux(b, X1DIR, in_t(), k, j, i) - + pack.flux(b, X1DIR, in_t(), k, j, i + 1)) / + dx1; + + if (ndim > 1) { + Real dx2 = coords.template Dxc(k, j, i); + pack(b, te, out_t(), k, j, i) += (pack.flux(b, X2DIR, in_t(), k, j, i) - + pack.flux(b, X2DIR, in_t(), k, j + 1, i)) / + dx2; + } + + if (ndim > 2) { + Real dx3 = coords.template Dxc(k, j, i); + pack(b, te, out_t(), k, j, i) += (pack.flux(b, X3DIR, in_t(), k, j, i) - + pack.flux(b, X3DIR, in_t(), k + 1, j, i)) / + dx3; + } + }); + return TaskStatus::complete; +} + +enum class GSType { all, red, black }; +template +TaskStatus FluxJacobi(std::shared_ptr> &md, double weight, + GSType gs_type = GSType::all) { + using namespace parthenon; + const int ndim = md->GetMeshPointer()->ndim; + IndexRange ib = md->GetBoundsI(IndexDomain::interior, te); + IndexRange jb = md->GetBoundsJ(IndexDomain::interior, te); + IndexRange kb = md->GetBoundsK(IndexDomain::interior, te); + + auto pkg = md->GetMeshPointer()->packages.Get("poisson_package"); + const auto alpha = pkg->Param("diagonal_alpha"); + + int nblocks = md->NumBlocks(); + std::vector include_block(nblocks, true); + + if (only_md_level) { + for (int b = 0; b < nblocks; ++b) + include_block[b] = + (md->grid.logical_level == md->GetBlockData(b)->GetBlockPointer()->loc.level()); + } + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get(), include_block); + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "CaclulateFluxes", DevExecSpace(), 0, pack.GetNBlocks() - 1, + kb.s, kb.e, jb.s, jb.e, ib.s, ib.e, + KOKKOS_LAMBDA(const int b, const int k, const int j, const int i) { + const auto &coords = pack.GetCoordinates(b); + if ((i + j + k) % 2 == 1 && gs_type == GSType::red) return; + if ((i + j + k) % 2 == 0 && gs_type == GSType::black) return; + // Build the unigrid diagonal of the matrix + Real dx1 = coords.template Dxc(k, j, i); + Real diag_elem = + -(pack(b, TE::F1, D(), k, j, i) + pack(b, TE::F1, D(), k, j, i + 1)) / + (dx1 * dx1) - + alpha; + if (ndim > 1) { + Real dx2 = coords.template Dxc(k, j, i); + diag_elem -= + (pack(b, TE::F2, D(), k, j, i) + pack(b, TE::F2, D(), k, j + 1, i)) / + (dx2 * dx2); + } + if (ndim > 2) { + Real dx3 = coords.template Dxc(k, j, i); + diag_elem -= + (pack(b, TE::F3, D(), k, j, i) + pack(b, TE::F3, D(), k + 1, j, i)) / + (dx3 * dx3); + } + + // Get the off-diagonal contribution to Ax = (D + L + U)x = y + Real off_diag = + pack(b, te, div_t(), k, j, i) - diag_elem * pack(b, te, in_t(), k, j, i); + + Real val = pack(b, te, rhs(), k, j, i) - off_diag; + pack(b, te, out_t(), k, j, i) = + weight * val / diag_elem + (1.0 - weight) * pack(b, te, in_t(), k, j, i); + }); + return TaskStatus::complete; +} + +template +TaskStatus PrintChosenValues(std::shared_ptr> &md, + const std::string &label) { + using TE = parthenon::TopologicalElement; + + auto desc = parthenon::MakePackDescriptor(md.get()); + auto pack = desc.GetPack(md.get()); + std::array names{vars::name()...}; + printf("%s\n", label.c_str()); + int col_num = 0; + for (auto &name : names) { + printf("var %i: %s\n", col_num, name.c_str()); + col_num++; + } + const int ng = parthenon::Globals::nghost; + parthenon::par_for( + DEFAULT_LOOP_PATTERN, "Print", DevExecSpace(), 0, pack.GetNBlocks() - 1, 0, 0, 0, 0, + KOKKOS_LAMBDA(const int b, int, int) { + auto cb = GetIndexShape(pack(b, te, 0), ng); + const auto &coords = pack.GetCoordinates(b); + IndexRange ib = cb.GetBoundsI(IndexDomain::entire, te); + IndexRange jb = cb.GetBoundsJ(IndexDomain::entire, te); + IndexRange kb = cb.GetBoundsK(IndexDomain::entire, te); + for (int k = kb.s; k <= kb.e; ++k) { + for (int j = jb.s; j <= jb.e; ++j) { + for (int i = ib.s; i <= ib.e; ++i) { + Real x = coords.template X<1, te>(i); + Real y = coords.template X<2, te>(j); + Real dx1 = coords.template Dxc<1>(k, j, i); + Real dx2 = coords.template Dxc<2>(k, j, i); + std::array vals{pack(b, te, vars(), k, j, i)...}; + printf("b = %i i = %2i x = %e dx1 = %e ", b, i, x, dx1); + for (int v = 0; v < sizeof...(vars); ++v) { + printf("%e ", vals[v]); + } + printf("\n"); + } + } + } + }); + printf("Done with MeshData\n\n"); + return TaskStatus::complete; +} + +} // namespace poisson_package + +#endif // EXAMPLE_POISSON_GMG_POISSON_PACKAGE_HPP_ diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 54a226c55383..088fe6269556 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -160,9 +160,11 @@ add_library(parthenon mesh/amr_loadbalance.cpp mesh/domain.hpp + mesh/logical_location.cpp mesh/logical_location.hpp mesh/mesh_refinement.cpp mesh/mesh_refinement.hpp + mesh/mesh-gmg.cpp mesh/mesh.cpp mesh/mesh.hpp mesh/meshblock.hpp @@ -221,6 +223,7 @@ add_library(parthenon utils/alias_method.cpp utils/alias_method.hpp utils/array_to_tuple.hpp + utils/bit_hacks.hpp utils/buffer_utils.cpp utils/buffer_utils.hpp utils/change_rundir.cpp diff --git a/src/application_input.hpp b/src/application_input.hpp index 543cb7a5100f..a9ec96c1551e 100644 --- a/src/application_input.hpp +++ b/src/application_input.hpp @@ -48,6 +48,7 @@ struct ApplicationInput { PostStepDiagnosticsInLoop = nullptr; std::function UserWorkAfterLoop = nullptr; + std::function UserWorkBeforeLoop = nullptr; BValFunc boundary_conditions[BOUNDARY_NFACES] = {nullptr}; SBValFunc swarm_boundary_conditions[BOUNDARY_NFACES] = {nullptr}; diff --git a/src/basic_types.hpp b/src/basic_types.hpp index 4a5fa41a6f24..9898140f82b4 100644 --- a/src/basic_types.hpp +++ b/src/basic_types.hpp @@ -16,6 +16,7 @@ #include #include #include +#include #include @@ -36,14 +37,48 @@ using Real = double; #endif #endif +// needed for arrays dimensioned over grid directions +// enumerator type only used in Mesh::EnrollUserMeshGenerator() +// X0DIR time-like direction +// X1DIR x, r, etc... +// X2DIR y, theta, etc... +// X3DIR z, phi, etc... +enum CoordinateDirection { NODIR = -1, X0DIR = 0, X1DIR = 1, X2DIR = 2, X3DIR = 3 }; +enum class BlockLocation { Left = 0, Center = 1, Right = 2 }; enum class TaskStatus { fail, complete, incomplete, iterate, skip, waiting }; + enum class AmrTag : int { derefine = -1, same = 0, refine = 1 }; enum class RefinementOp_t { Prolongation, Restriction, None }; // JMM: Not clear this is the best place for this but it minimizes // circular dependency nonsense. -constexpr int NUM_BNDRY_TYPES = 5; -enum class BoundaryType : int { local, nonlocal, any, flxcor_send, flxcor_recv }; +constexpr int NUM_BNDRY_TYPES = 10; +enum class BoundaryType : int { + local, + nonlocal, + any, + flxcor_send, + flxcor_recv, + gmg_same, + gmg_restrict_send, + gmg_restrict_recv, + gmg_prolongate_send, + gmg_prolongate_recv +}; + +constexpr bool IsSender(BoundaryType btype) { + if (btype == BoundaryType::flxcor_recv) return false; + if (btype == BoundaryType::gmg_restrict_recv) return false; + if (btype == BoundaryType::gmg_prolongate_recv) return false; + return true; +} + +constexpr bool IsReceiver(BoundaryType btype) { + if (btype == BoundaryType::flxcor_send) return false; + if (btype == BoundaryType::gmg_restrict_send) return false; + if (btype == BoundaryType::gmg_prolongate_send) return false; + return true; +} // Enumeration for accessing a field on different locations of the grid: // CC = cell center of (i, j, k) @@ -109,6 +144,14 @@ TopologicalType GetTopologicalType(TopologicalElement el) { } } +inline std::vector GetTopologicalElements(TopologicalType tt) { + using TE = TopologicalElement; + using TT = TopologicalType; + if (tt == TT::Node) return {TE::NN}; + if (tt == TT::Edge) return {TE::E1, TE::E2, TE::E3}; + if (tt == TT::Face) return {TE::F1, TE::F2, TE::F3}; + return {TE::CC}; +} using TE = TopologicalElement; // Returns one if the I coordinate of el is offset from the zone center coordinates, // and zero otherwise diff --git a/src/bvals/boundary_conditions_generic.hpp b/src/bvals/boundary_conditions_generic.hpp index 9ee8cf5eba8e..b7300e6298d2 100644 --- a/src/bvals/boundary_conditions_generic.hpp +++ b/src/bvals/boundary_conditions_generic.hpp @@ -32,7 +32,7 @@ namespace parthenon { namespace BoundaryFunction { enum class BCSide { Inner, Outer }; -enum class BCType { Outflow, Reflect, ConstantDeriv, Fixed }; +enum class BCType { Outflow, Reflect, ConstantDeriv, Fixed, FixedFace }; template void GenericBC(std::shared_ptr> &rc, bool coarse, @@ -95,6 +95,9 @@ void GenericBC(std::shared_ptr> &rc, bool coarse, q(b, el, l, k, j, i) = (reflect ? -1.0 : 1.0) * q(b, el, l, X3 ? offset - k : k, X2 ? offset - j : j, X1 ? offset - i : i); + } else if (TYPE == BCType::FixedFace) { + q(b, el, l, k, j, i) = 2.0 * val - q(b, el, l, X3 ? offset - k : k, + X2 ? offset - j : j, X1 ? offset - i : i); } else if (TYPE == BCType::ConstantDeriv) { Real dq = q(b, el, l, X3 ? ref + offsetin : k, X2 ? ref + offsetin : j, X1 ? ref + offsetin : i) - diff --git a/src/bvals/bvals.hpp b/src/bvals/bvals.hpp index 718499a9f804..35bba37746a3 100644 --- a/src/bvals/bvals.hpp +++ b/src/bvals/bvals.hpp @@ -80,7 +80,7 @@ class BoundaryBase { static int FindBufferID(int ox1, int ox2, int ox3, int fi1, int fi2); void - SearchAndSetNeighbors(MeshBlockTree &tree, int *ranklist, int *nslist, + SearchAndSetNeighbors(Mesh *mesh, MeshBlockTree &tree, int *ranklist, int *nslist, const std::unordered_set &newly_refined = {}); protected: diff --git a/src/bvals/bvals_base.cpp b/src/bvals/bvals_base.cpp index 057493f186e8..7c43c4917197 100644 --- a/src/bvals/bvals_base.cpp +++ b/src/bvals/bvals_base.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2021. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -27,6 +27,7 @@ #include // stringstream #include // runtime_error #include // c_str() +#include #include "globals.hpp" #include "mesh/logical_location.hpp" @@ -96,6 +97,38 @@ void NeighborBlock::SetNeighbor(LogicalLocation inloc, int irank, int ilevel, in return; } +NeighborBlock::NeighborBlock(Mesh *mesh, LogicalLocation loc, int rank, int gid, int lid, + std::array offsets, NeighborConnect type, int bid, + int target_id, int fi1, int fi2) + : snb{rank, loc.level(), lid, gid}, ni{offsets[0], offsets[1], offsets[2], + fi1, fi2, type}, + bufid{bid}, eid{0}, targetid{target_id}, fid{BoundaryFace::undef}, loc{loc}, + ownership(true), block_size(mesh->GetBlockSize(loc)) { + // TODO(LFR): Look and see if this stuff gets used anywhere + if (ni.type == NeighborConnect::face) { + if (ni.ox1 == -1) + fid = BoundaryFace::inner_x1; + else if (ni.ox1 == 1) + fid = BoundaryFace::outer_x1; + else if (ni.ox2 == -1) + fid = BoundaryFace::inner_x2; + else if (ni.ox2 == 1) + fid = BoundaryFace::outer_x2; + else if (ni.ox3 == -1) + fid = BoundaryFace::inner_x3; + else if (ni.ox3 == 1) + fid = BoundaryFace::outer_x3; + } + if (ni.type == NeighborConnect::edge) { + if (ni.ox3 == 0) + eid = ((((ni.ox1 + 1) >> 1) | ((ni.ox2 + 1) & 2))); + else if (ni.ox2 == 0) + eid = (4 + (((ni.ox1 + 1) >> 1) | ((ni.ox3 + 1) & 2))); + else if (ni.ox1 == 0) + eid = (8 + (((ni.ox2 + 1) >> 1) | ((ni.ox3 + 1) & 2))); + } +} + //---------------------------------------------------------------------------------------- //! \fn BoundaryBase::BoundaryBase(Mesh *pm, LogicalLocation iloc, RegionSize isize, // BoundaryFlag *input_bcs) @@ -302,7 +335,7 @@ int BoundaryBase::CreateBvalsMPITag(int lid, int bufid) { // TODO(felker): break-up this long function void BoundaryBase::SearchAndSetNeighbors( - MeshBlockTree &tree, int *ranklist, int *nslist, + Mesh *mesh, MeshBlockTree &tree, int *ranklist, int *nslist, const std::unordered_set &newly_refined) { Kokkos::Profiling::pushRegion("SearchAndSetNeighbors"); MeshBlockTree *neibt; @@ -635,7 +668,7 @@ void BoundaryBase::SearchAndSetNeighbors( void BoundaryBase::SetNeighborOwnership( const std::unordered_set &newly_refined) { // Set neighbor block ownership - std::set allowed_neighbors; + std::unordered_set allowed_neighbors; allowed_neighbors.insert(loc); // Insert the location of this block for (int n = 0; n < nneighbor; ++n) allowed_neighbors.insert(neighbor[n].loc); diff --git a/src/bvals/bvals_interfaces.hpp b/src/bvals/bvals_interfaces.hpp index 8cfa4f3fc5a2..f0a5116476ac 100644 --- a/src/bvals/bvals_interfaces.hpp +++ b/src/bvals/bvals_interfaces.hpp @@ -144,10 +144,15 @@ struct NeighborBlock { // aggregate and POD type. Inheritance breaks standard-la BoundaryFace fid; LogicalLocation loc; block_ownership_t ownership; + RegionSize block_size; void SetNeighbor(LogicalLocation inloc, int irank, int ilevel, int igid, int ilid, int iox1, int iox2, int iox3, NeighborConnect itype, int ibid, int itargetid, int ifi1 = 0, int ifi2 = 0); + NeighborBlock() = default; + NeighborBlock(Mesh *mesh, LogicalLocation loc, int rank, int gid, int lid, + std::array offsets, NeighborConnect type, int bid, int target_id, + int ifi1, int ifi2); }; //---------------------------------------------------------------------------------------- diff --git a/src/bvals/comms/bnd_info.cpp b/src/bvals/comms/bnd_info.cpp index b3b569e75ae8..fa4545705524 100644 --- a/src/bvals/comms/bnd_info.cpp +++ b/src/bvals/comms/bnd_info.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -38,7 +38,12 @@ namespace { enum class InterfaceType { SameToSame, CoarseToFine, FineToCoarse }; -enum class IndexRangeType { Interior, Exterior, Shared }; +enum class IndexRangeType { + BoundaryInteriorSend, + BoundaryExteriorRecv, + InteriorSend, + InteriorRecv +}; using namespace parthenon; @@ -77,7 +82,7 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, TopologicalElement el, IndexRangeType ir_type, bool prores, std::array tensor_shape) { const auto &ni = nb.ni; - auto loc = pmb->loc; + const auto &loc = pmb->loc; auto shape = pmb->cellbounds; // Both prolongation and restriction always operate in the coarse // index space. Also need to use the coarse index space if the @@ -85,11 +90,25 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, // interior or exterior cells if (prores || nb.loc.level() < loc.level()) shape = pmb->c_cellbounds; + // Re-create the index space for the neighbor block (either the main block or + // the coarse buffer as required) + int coarse_fac = 1; + if (nb.loc.level() > loc.level()) coarse_fac = 2; + auto neighbor_shape = IndexShape(nb.block_size.nx(X3DIR) / coarse_fac, + nb.block_size.nx(X2DIR) / coarse_fac, + nb.block_size.nx(X1DIR) / coarse_fac, Globals::nghost); + IndexDomain interior = IndexDomain::interior; std::array bounds{shape.GetBoundsI(interior, el), shape.GetBoundsJ(interior, el), shape.GetBoundsK(interior, el)}; + std::array neighbor_bounds{neighbor_shape.GetBoundsI(interior, el), + neighbor_shape.GetBoundsJ(interior, el), + neighbor_shape.GetBoundsK(interior, el)}; + std::array not_symmetry{!pmb->block_size.symmetry(X1DIR), + !pmb->block_size.symmetry(X2DIR), + !pmb->block_size.symmetry(X3DIR)}; // Account for the fact that the neighbor block may duplicate // some active zones on the loading block for face, edge, and nodal // fields, so the boundary of the neighbor block is one deeper into @@ -97,15 +116,15 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, std::array top_offset{TopologicalOffsetI(el), TopologicalOffsetJ(el), TopologicalOffsetK(el)}; std::array block_offset = {ni.ox1, ni.ox2, ni.ox3}; - std::array logic_loc{loc.lx1(), loc.lx2(), loc.lx3()}; - std::array nb_logic_loc{nb.loc.lx1(), nb.loc.lx2(), nb.loc.lx3()}; - int interior_offset = ir_type == IndexRangeType::Interior ? Globals::nghost : 0; - int exterior_offset = ir_type == IndexRangeType::Exterior ? Globals::nghost : 0; + int interior_offset = + ir_type == IndexRangeType::BoundaryInteriorSend ? Globals::nghost : 0; + int exterior_offset = + ir_type == IndexRangeType::BoundaryExteriorRecv ? Globals::nghost : 0; if (prores) { // The coarse ghosts cover twice as much volume as the fine ghosts, so when working in - // the exterior we must only go over the coarse ghosts that have corresponding fine - // ghosts + // the exterior (i.e. ghosts) we must only go over the coarse ghosts that have + // corresponding fine ghosts exterior_offset /= 2; } @@ -115,7 +134,7 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, s[dir] = bounds[dir].s; e[dir] = bounds[dir].e; if ((loc.level() < nb.loc.level()) && - bounds[dir].e > bounds[dir].s) { // Check that this dimension has ghost zones + not_symmetry[dir]) { // Check that this dimension has ghost zones // The requested neighbor block is at a finer level, so it only abuts // approximately half of the zones in any given direction with offset zero. If we // are asking for an interior index range, we also send nghost "extra" zones in @@ -123,16 +142,33 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, // for non-cell centered values the number of grid points may be odd, so we pick // up an extra zone that is communicated. I think this is ok, but something to // keep in mind if there are issues. - const int half_grid = (bounds[dir].e - bounds[dir].s + 1) / 2; - s[dir] += nb_logic_loc[dir] % 2 == 1 ? half_grid - interior_offset : 0; - e[dir] -= nb_logic_loc[dir] % 2 == 0 ? half_grid - interior_offset : 0; + const int extra_zones = (bounds[dir].e - bounds[dir].s + 1) - + (neighbor_bounds[dir].e - neighbor_bounds[dir].s + 1); + s[dir] += nb.loc.l(dir) % 2 == 1 ? extra_zones - interior_offset : 0; + e[dir] -= nb.loc.l(dir) % 2 == 0 ? extra_zones - interior_offset : 0; + if (ir_type == IndexRangeType::InteriorSend) { + // Include ghosts of finer block coarse array in message + s[dir] -= Globals::nghost; + e[dir] += Globals::nghost; + } } - if (loc.level() > nb.loc.level() && bounds[dir].e > bounds[dir].s) { + if (loc.level() > nb.loc.level() && not_symmetry[dir]) { // If we are setting (i.e. have non-zero exterior_offset) from a neighbor block // that is coarser, we got extra ghost zones from the neighbor (see inclusion of // interior_offset in the above if block) - s[dir] -= logic_loc[dir] % 2 == 1 ? exterior_offset : 0; - e[dir] += logic_loc[dir] % 2 == 0 ? exterior_offset : 0; + s[dir] -= loc.l(dir) % 2 == 1 ? exterior_offset : 0; + e[dir] += loc.l(dir) % 2 == 0 ? exterior_offset : 0; + if (ir_type == IndexRangeType::InteriorRecv) { + // Include ghosts of finer block coarse array in message + s[dir] -= Globals::nghost; + e[dir] += Globals::nghost; + } + } + // Prolongate into ghosts of interior receiver since we have the data available, + // having this is important for AMR MG + if (prores && not_symmetry[dir] && IndexRangeType::InteriorRecv == ir_type) { + s[dir] -= Globals::nghost / 2; + e[dir] += Globals::nghost / 2; } } else if (block_offset[dir] > 0) { s[dir] = bounds[dir].e - interior_offset + 1 - top_offset[dir]; @@ -148,16 +184,16 @@ SpatiallyMaskedIndexer6D CalcIndices(const NeighborBlock &nb, // index range, it is unecessary. This is probably not immediately obvious, // but it is possible to convince oneself that dealing with ownership in // only exterior index ranges works correctly - if (ir_type == IndexRangeType::Exterior) { + if (ir_type == IndexRangeType::BoundaryExteriorRecv) { int sox1 = -ni.ox1; int sox2 = -ni.ox2; int sox3 = -ni.ox3; if (nb.loc.level() < loc.level()) { // For coarse to fine interfaces, we are passing zones from only an // interior corner of the cell, never an entire face or edge - if (sox1 == 0) sox1 = logic_loc[0] % 2 == 1 ? 1 : -1; - if (sox2 == 0) sox2 = logic_loc[1] % 2 == 1 ? 1 : -1; - if (sox3 == 0) sox3 = logic_loc[2] % 2 == 1 ? 1 : -1; + if (sox1 == 0) sox1 = loc.l(0) % 2 == 1 ? 1 : -1; + if (sox2 == 0) sox2 = loc.l(1) % 2 == 1 ? 1 : -1; + if (sox3 == 0) sox3 = loc.l(2) % 2 == 1 ? 1 : -1; } owns = GetIndexRangeMaskFromOwnership(el, nb.ownership, sox1, sox2, sox3); } @@ -173,7 +209,7 @@ int GetBufferSize(std::shared_ptr pmb, const NeighborBlock &nb, // will always be enough storage auto &cb = pmb->cellbounds; int topo_comp = (v->IsSet(Metadata::Face) || v->IsSet(Metadata::Edge)) ? 3 : 1; - const IndexDomain in = IndexDomain::interior; + const IndexDomain in = IndexDomain::entire; // The plus 2 instead of 1 is to account for the possible size of face, edge, and nodal // fields const int isize = cb.ie(in) - cb.is(in) + 2; @@ -203,10 +239,12 @@ BndInfo BndInfo::GetSendBndInfo(std::shared_ptr pmb, const NeighborBl auto elements = v->GetTopologicalElements(); out.ntopological_elements = elements.size(); + auto idx_range_type = IndexRangeType::BoundaryInteriorSend; + if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) == 0) + idx_range_type = IndexRangeType::InteriorSend; for (auto el : elements) { int idx = static_cast(el) % 3; - out.idxer[idx] = - CalcIndices(nb, pmb, el, IndexRangeType::Interior, false, {Nt, Nu, Nv}); + out.idxer[idx] = CalcIndices(nb, pmb, el, idx_range_type, false, {Nt, Nu, Nv}); } if (nb.snb.level < mylevel) { out.var = v->coarse_s.Get(); @@ -216,7 +254,48 @@ BndInfo BndInfo::GetSendBndInfo(std::shared_ptr pmb, const NeighborBl return out; } +BndInfo BndInfo::GetSetBndInfo(std::shared_ptr pmb, const NeighborBlock &nb, + std::shared_ptr> v, + CommBuffer::owner_t> *buf) { + BndInfo out; + out.buf = buf->buffer(); + auto buf_state = buf->GetState(); + if (buf_state == BufferState::received) { + out.buf_allocated = true; + PARTHENON_DEBUG_REQUIRE(v->IsAllocated(), "Variable must be allocated to receive"); + } else if (buf_state == BufferState::received_null) { + out.buf_allocated = false; + } else { + PARTHENON_FAIL("Buffer should be in a received state."); + } + out.allocated = v->IsAllocated(); + + int Nv = v->GetDim(4); + int Nu = v->GetDim(5); + int Nt = v->GetDim(6); + + int mylevel = pmb->loc.level(); + + auto elements = v->GetTopologicalElements(); + out.ntopological_elements = elements.size(); + auto idx_range_type = IndexRangeType::BoundaryExteriorRecv; + if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) == 0) + idx_range_type = IndexRangeType::InteriorRecv; + for (auto el : elements) { + int idx = static_cast(el) % 3; + out.idxer[idx] = CalcIndices(nb, pmb, el, idx_range_type, false, {Nt, Nu, Nv}); + } + if (nb.snb.level < mylevel) { + out.var = v->coarse_s.Get(); + } else { + out.var = v->data.Get(); + } + + return out; +} + ProResInfo ProResInfo::GetInteriorRestrict(std::shared_ptr pmb, + const NeighborBlock & /*nb*/, std::shared_ptr> v) { ProResInfo out; @@ -234,23 +313,21 @@ ProResInfo ProResInfo::GetInteriorRestrict(std::shared_ptr pmb, out.fine = v->data.Get(); out.coarse = v->coarse_s.Get(); - NeighborBlock nb; - // Make the neighbor block coincide with this block - nb.SetNeighbor(pmb->loc, Globals::my_rank, mylevel, 0, 0, 0, 0, 0, - NeighborConnect::none, 0, 0); - nb.ownership = block_ownership_t(true); + NeighborBlock nb(pmb->pmy_mesh, pmb->loc, Globals::my_rank, 0, 0, {0, 0, 0}, + NeighborConnect::none, 0, 0, 0, 0); auto elements = v->GetTopologicalElements(); out.ntopological_elements = elements.size(); for (auto el : elements) { out.idxer[static_cast(el)] = - CalcIndices(nb, pmb, el, IndexRangeType::Interior, true, {Nt, Nu, Nv}); + CalcIndices(nb, pmb, el, IndexRangeType::InteriorSend, true, {Nt, Nu, Nv}); } out.refinement_op = RefinementOp_t::Restriction; return out; } ProResInfo ProResInfo::GetInteriorProlongate(std::shared_ptr pmb, + const NeighborBlock & /*nb*/, std::shared_ptr> v) { ProResInfo out; @@ -268,17 +345,14 @@ ProResInfo ProResInfo::GetInteriorProlongate(std::shared_ptr pmb, out.fine = v->data.Get(); out.coarse = v->coarse_s.Get(); - NeighborBlock nb; - // Make the neighbor block coincide with this block - nb.SetNeighbor(pmb->loc, Globals::my_rank, mylevel, 0, 0, 0, 0, 0, - NeighborConnect::none, 0, 0); - nb.ownership = block_ownership_t(true); + NeighborBlock nb(pmb->pmy_mesh, pmb->loc, Globals::my_rank, 0, 0, {0, 0, 0}, + NeighborConnect::none, 0, 0, 0, 0); auto elements = v->GetTopologicalElements(); out.ntopological_elements = elements.size(); for (auto el : {TE::CC, TE::F1, TE::F2, TE::F3, TE::E1, TE::E2, TE::E3, TE::NN}) out.idxer[static_cast(el)] = - CalcIndices(nb, pmb, el, IndexRangeType::Exterior, true, {Nt, Nu, Nv}); + CalcIndices(nb, pmb, el, IndexRangeType::InteriorRecv, true, {Nt, Nu, Nv}); out.refinement_op = RefinementOp_t::Prolongation; return out; } @@ -306,9 +380,8 @@ ProResInfo ProResInfo::GetSend(std::shared_ptr pmb, const NeighborBlo out.ntopological_elements = elements.size(); if (nb.snb.level < mylevel) { for (auto el : elements) { - int idx = static_cast(el) % 3; - out.idxer[static_cast(el)] = - CalcIndices(nb, pmb, el, IndexRangeType::Interior, true, {Nt, Nu, Nv}); + out.idxer[static_cast(el)] = CalcIndices( + nb, pmb, el, IndexRangeType::BoundaryInteriorSend, true, {Nt, Nu, Nv}); out.refinement_op = RefinementOp_t::Restriction; } } @@ -346,14 +419,13 @@ ProResInfo ProResInfo::GetSet(std::shared_ptr pmb, const NeighborBloc auto elements = v->GetTopologicalElements(); out.ntopological_elements = elements.size(); for (auto el : elements) { - int idx = static_cast(el) % 3; if (nb.snb.level < mylevel) { out.refinement_op = RefinementOp_t::Prolongation; } else { if (restricted) { out.refinement_op = RefinementOp_t::Restriction; - out.idxer[static_cast(el)] = - CalcIndices(nb, pmb, el, IndexRangeType::Exterior, true, {Nt, Nu, Nv}); + out.idxer[static_cast(el)] = CalcIndices( + nb, pmb, el, IndexRangeType::BoundaryExteriorRecv, true, {Nt, Nu, Nv}); } } } @@ -369,49 +441,12 @@ ProResInfo ProResInfo::GetSet(std::shared_ptr pmb, const NeighborBloc // 10 indexers per bound info even if the field isn't allocated if (nb.snb.level < mylevel) { for (auto el : {TE::CC, TE::F1, TE::F2, TE::F3, TE::E1, TE::E2, TE::E3, TE::NN}) - out.idxer[static_cast(el)] = - CalcIndices(nb, pmb, el, IndexRangeType::Exterior, true, {Nt, Nu, Nv}); + out.idxer[static_cast(el)] = CalcIndices( + nb, pmb, el, IndexRangeType::BoundaryExteriorRecv, true, {Nt, Nu, Nv}); } return out; } -BndInfo BndInfo::GetSetBndInfo(std::shared_ptr pmb, const NeighborBlock &nb, - std::shared_ptr> v, - CommBuffer::owner_t> *buf) { - BndInfo out; - out.buf = buf->buffer(); - auto buf_state = buf->GetState(); - if (buf_state == BufferState::received) { - out.buf_allocated = true; - } else if (buf_state == BufferState::received_null) { - out.buf_allocated = false; - } else { - PARTHENON_FAIL("Buffer should be in a received state."); - } - out.allocated = v->IsAllocated(); - - int Nv = v->GetDim(4); - int Nu = v->GetDim(5); - int Nt = v->GetDim(6); - - int mylevel = pmb->loc.level(); - - auto elements = v->GetTopologicalElements(); - out.ntopological_elements = elements.size(); - for (auto el : elements) { - int idx = static_cast(el) % 3; - out.idxer[idx] = - CalcIndices(nb, pmb, el, IndexRangeType::Exterior, false, {Nt, Nu, Nv}); - } - if (nb.snb.level < mylevel) { - out.var = v->coarse_s.Get(); - } else { - out.var = v->data.Get(); - } - - return out; -} - BndInfo BndInfo::GetSendCCFluxCor(std::shared_ptr pmb, const NeighborBlock &nb, std::shared_ptr> v, CommBuffer::owner_t> *buf) { diff --git a/src/bvals/comms/bnd_info.hpp b/src/bvals/comms/bnd_info.hpp index d49303221745..4fcc9ed3f706 100644 --- a/src/bvals/comms/bnd_info.hpp +++ b/src/bvals/comms/bnd_info.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2020 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -89,13 +89,19 @@ struct ProResInfo { // These are are used to generate the BndInfo struct for various // kinds of boundary types and operations. + static ProResInfo GetNull(std::shared_ptr pmb, const NeighborBlock &nb, + std::shared_ptr> v) { + return ProResInfo(); + } static ProResInfo GetSend(std::shared_ptr pmb, const NeighborBlock &nb, std::shared_ptr> v); static ProResInfo GetSet(std::shared_ptr pmb, const NeighborBlock &nb, std::shared_ptr> v); static ProResInfo GetInteriorProlongate(std::shared_ptr pmb, + const NeighborBlock &nb, std::shared_ptr> v); static ProResInfo GetInteriorRestrict(std::shared_ptr pmb, + const NeighborBlock &nb, std::shared_ptr> v); }; diff --git a/src/bvals/comms/boundary_communication.cpp b/src/bvals/comms/boundary_communication.cpp index fc5231c2da79..351ab6873a5a 100644 --- a/src/bvals/comms/boundary_communication.cpp +++ b/src/bvals/comms/boundary_communication.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2022-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -67,10 +67,18 @@ TaskStatus SendBoundBufs(std::shared_ptr> &md) { return TaskStatus::incomplete; } - if (rebuild) - RebuildBufferCache(md, nbound, BndInfo::GetSendBndInfo, - ProResInfo::GetSend); - + if (rebuild) { + if constexpr (bound_type == BoundaryType::gmg_restrict_send) { + RebuildBufferCache(md, nbound, BndInfo::GetSendBndInfo, + ProResInfo::GetInteriorRestrict); + } else if constexpr (bound_type == BoundaryType::gmg_prolongate_send) { + RebuildBufferCache(md, nbound, BndInfo::GetSendBndInfo, + ProResInfo::GetNull); + } else { + RebuildBufferCache(md, nbound, BndInfo::GetSendBndInfo, + ProResInfo::GetSend); + } + } // Restrict auto pmb = md->GetBlockData(0)->GetBlockPointer(); StateDescriptor *resolved_packages = pmb->resolved_packages.get(); @@ -152,6 +160,10 @@ template TaskStatus SendBoundBufs(std::shared_ptr(std::shared_ptr> &); template TaskStatus SendBoundBufs(std::shared_ptr> &); +template TaskStatus +SendBoundBufs(std::shared_ptr> &); +template TaskStatus +SendBoundBufs(std::shared_ptr> &); template TaskStatus StartReceiveBoundBufs(std::shared_ptr> &md) { @@ -175,6 +187,10 @@ template TaskStatus StartReceiveBoundBufs(std::shared_ptr> &); template TaskStatus StartReceiveBoundBufs(std::shared_ptr> &); +template TaskStatus +StartReceiveBoundBufs(std::shared_ptr> &); +template TaskStatus StartReceiveBoundBufs( + std::shared_ptr> &); template TaskStatus ReceiveBoundBufs(std::shared_ptr> &md) { @@ -220,6 +236,10 @@ template TaskStatus ReceiveBoundBufs(std::shared_ptr> &); template TaskStatus ReceiveBoundBufs(std::shared_ptr> &); +template TaskStatus +ReceiveBoundBufs(std::shared_ptr> &); +template TaskStatus +ReceiveBoundBufs(std::shared_ptr> &); template TaskStatus SetBounds(std::shared_ptr> &md) { @@ -229,10 +249,19 @@ TaskStatus SetBounds(std::shared_ptr> &md) { auto &cache = md->GetBvarsCache().GetSubCache(bound_type, false); auto [rebuild, nbound] = CheckReceiveBufferCacheForRebuild(md); - if (rebuild) - RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, - ProResInfo::GetSet); + if (rebuild) { + if constexpr (bound_type == BoundaryType::gmg_prolongate_recv) { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetInteriorProlongate); + } else if constexpr (bound_type == BoundaryType::gmg_restrict_recv) { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetNull); + } else { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetSet); + } + } // const Real threshold = Globals::sparse_config.allocation_threshold; auto &bnd_info = cache.bnd_info; Kokkos::parallel_for( @@ -301,6 +330,10 @@ TaskStatus SetBounds(std::shared_ptr> &md) { template TaskStatus SetBounds(std::shared_ptr> &); template TaskStatus SetBounds(std::shared_ptr> &); template TaskStatus SetBounds(std::shared_ptr> &); +template TaskStatus +SetBounds(std::shared_ptr> &); +template TaskStatus +SetBounds(std::shared_ptr> &); template TaskStatus ProlongateBounds(std::shared_ptr> &md) { @@ -310,9 +343,20 @@ TaskStatus ProlongateBounds(std::shared_ptr> &md) { auto &cache = md->GetBvarsCache().GetSubCache(bound_type, false); auto [rebuild, nbound] = CheckReceiveBufferCacheForRebuild(md); - if (rebuild) - RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, - ProResInfo::GetSet); + + if (rebuild) { + if constexpr (bound_type == BoundaryType::gmg_prolongate_recv) { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetInteriorProlongate); + } else if constexpr (bound_type == BoundaryType::gmg_restrict_recv) { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetNull); + } else { + RebuildBufferCache(md, nbound, BndInfo::GetSetBndInfo, + ProResInfo::GetSet); + } + } + if (nbound > 0 && pmesh->multilevel) { auto pmb = md->GetBlockData(0)->GetBlockPointer(); StateDescriptor *resolved_packages = pmb->resolved_packages.get(); @@ -333,15 +377,19 @@ template TaskStatus ProlongateBounds(std::shared_ptr> &); template TaskStatus ProlongateBounds(std::shared_ptr> &); +template TaskStatus +ProlongateBounds(std::shared_ptr> &); // Adds all relevant boundary communication to a single task list +template TaskID AddBoundaryExchangeTasks(TaskID dependency, TaskList &tl, std::shared_ptr> &md, bool multilevel) { // TODO(LFR): Splitting up the boundary tasks while doing prolongation can cause some // possible issues for sparse fields. In particular, the order in which // fields are allocated and then set could potentially result in different // results if the default sparse value is non-zero. - const auto any = BoundaryType::any; + // const auto any = BoundaryType::any; + static_assert(bounds == BoundaryType::any || bounds == BoundaryType::gmg_same); // const auto local = BoundaryType::local; // const auto nonlocal = BoundaryType::nonlocal; @@ -361,17 +409,23 @@ TaskID AddBoundaryExchangeTasks(TaskID dependency, TaskList &tl, // auto out = (pro_local | pro); - auto send = tl.AddTask(dependency, SendBoundBufs, md); - auto recv = tl.AddTask(dependency, ReceiveBoundBufs, md); - auto set = tl.AddTask(recv, SetBounds, md); + auto send = tl.AddTask(dependency, SendBoundBufs, md); + auto recv = tl.AddTask(dependency, ReceiveBoundBufs, md); + auto set = tl.AddTask(recv, SetBounds, md); auto pro = set; if (md->GetMeshPointer()->multilevel) { auto cbound = tl.AddTask(set, ApplyBoundaryConditionsOnCoarseOrFineMD, md, true); - pro = tl.AddTask(cbound, ProlongateBounds, md); + pro = tl.AddTask(cbound, ProlongateBounds, md); } auto fbound = tl.AddTask(pro, ApplyBoundaryConditionsOnCoarseOrFineMD, md, false); return fbound; } +template TaskID +AddBoundaryExchangeTasks(TaskID, TaskList &, + std::shared_ptr> &, bool); +template TaskID +AddBoundaryExchangeTasks(TaskID, TaskList &, + std::shared_ptr> &, bool); } // namespace parthenon diff --git a/src/bvals/comms/build_boundary_buffers.cpp b/src/bvals/comms/build_boundary_buffers.cpp index 01868f22ec9b..a76d86dadb94 100644 --- a/src/bvals/comms/build_boundary_buffers.cpp +++ b/src/bvals/comms/build_boundary_buffers.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2022-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -44,69 +44,70 @@ template void BuildBoundaryBufferSubset(std::shared_ptr> &md, Mesh::comm_buf_map_t &buf_map) { Mesh *pmesh = md->GetMeshPointer(); - ForEachBoundary(md, [&](sp_mb_t pmb, sp_mbd_t /*rc*/, nb_t &nb, - const sp_cv_t v) { - // Calculate the required size of the buffer for this boundary - int buf_size = GetBufferSize(pmb, nb, v); - - // Add a buffer pool if one does not exist for this size - if (pmesh->pool_map.count(buf_size) == 0) { - pmesh->pool_map.emplace(std::make_pair( - buf_size, buf_pool_t([buf_size](buf_pool_t *pool) { - using buf_t = buf_pool_t::base_t; - // TODO(LFR): Make nbuf a user settable parameter - const int nbuf = 200; - buf_t chunk("pool buffer", buf_size * nbuf); - for (int i = 1; i < nbuf; ++i) { - pool->AddFreeObjectToPool( - buf_t(chunk, std::make_pair(i * buf_size, (i + 1) * buf_size))); - } - return buf_t(chunk, std::make_pair(0, buf_size)); - }))); - } - - const int receiver_rank = nb.snb.rank; - const int sender_rank = Globals::my_rank; - - int tag = 0; + ForEachBoundary( + md, [&](sp_mb_t pmb, sp_mbd_t /*rc*/, nb_t &nb, const sp_cv_t v) { + // Calculate the required size of the buffer for this boundary + int buf_size = GetBufferSize(pmb, nb, v); + + // Add a buffer pool if one does not exist for this size + if (pmesh->pool_map.count(buf_size) == 0) { + pmesh->pool_map.emplace(std::make_pair( + buf_size, buf_pool_t([buf_size](buf_pool_t *pool) { + using buf_t = buf_pool_t::base_t; + // TODO(LFR): Make nbuf a user settable parameter + const int nbuf = 200; + buf_t chunk("pool buffer", buf_size * nbuf); + for (int i = 1; i < nbuf; ++i) { + pool->AddFreeObjectToPool( + buf_t(chunk, std::make_pair(i * buf_size, (i + 1) * buf_size))); + } + return buf_t(chunk, std::make_pair(0, buf_size)); + }))); + } + + const int receiver_rank = nb.snb.rank; + const int sender_rank = Globals::my_rank; + + int tag = 0; #ifdef MPI_PARALLEL - // Get a bi-directional mpi tag for this pair of blocks - tag = pmesh->tag_map.GetTag(pmb, nb); - auto comm_label = v->label(); - if constexpr (BTYPE == BoundaryType::flxcor_send || - BTYPE == BoundaryType::flxcor_recv) - comm_label += "_flcor"; - mpi_comm_t comm = pmesh->GetMPIComm(comm_label); + // Get a bi-directional mpi tag for this pair of blocks + tag = pmesh->tag_map.GetTag(pmb, nb); + auto comm_label = v->label(); + if constexpr (BTYPE == BoundaryType::flxcor_send || + BTYPE == BoundaryType::flxcor_recv) + comm_label += "_flcor"; + mpi_comm_t comm = pmesh->GetMPIComm(comm_label); #else // Setting to zero is fine here since this doesn't actually get used when everything // is on the same rank mpi_comm_t comm = 0; #endif - bool use_sparse_buffers = v->IsSet(Metadata::Sparse); - auto get_resource_method = [pmesh, buf_size]() { - return buf_pool_t::owner_t(pmesh->pool_map.at(buf_size).Get()); - }; - - // Build send buffer (unless this is a receiving flux boundary) - if constexpr (!(BTYPE == BoundaryType::flxcor_recv)) { - auto s_key = SendKey(pmb, nb, v); - PARTHENON_DEBUG_REQUIRE(buf_map.count(s_key) == 0, - "Two communication buffers have the same key."); - buf_map[s_key] = CommBuffer::owner_t>( - tag, sender_rank, receiver_rank, comm, get_resource_method, use_sparse_buffers); - } - - // Also build the non-local receive buffers here - if constexpr (!(BTYPE == BoundaryType::flxcor_send)) { - if (sender_rank != receiver_rank) { - auto r_key = ReceiveKey(pmb, nb, v); - buf_map[r_key] = CommBuffer::owner_t>( - tag, receiver_rank, sender_rank, comm, get_resource_method, - use_sparse_buffers); - } - } - }); + bool use_sparse_buffers = v->IsSet(Metadata::Sparse); + auto get_resource_method = [pmesh, buf_size]() { + return buf_pool_t::owner_t(pmesh->pool_map.at(buf_size).Get()); + }; + + // Build send buffer (unless this is a receiving flux boundary) + if constexpr (IsSender(BTYPE)) { + auto s_key = SendKey(pmb, nb, v); + if (buf_map.count(s_key) == 0) + buf_map[s_key] = CommBuffer::owner_t>( + tag, sender_rank, receiver_rank, comm, get_resource_method, + use_sparse_buffers); + } + + // Also build the non-local receive buffers here + if constexpr (IsReceiver(BTYPE)) { + if (sender_rank != receiver_rank) { + auto r_key = ReceiveKey(pmb, nb, v); + if (buf_map.count(r_key) == 0) + buf_map[r_key] = CommBuffer::owner_t>( + tag, receiver_rank, sender_rank, comm, get_resource_method, + use_sparse_buffers); + } + } + }); } } // namespace @@ -130,4 +131,27 @@ TaskStatus BuildBoundaryBuffers(std::shared_ptr> &md) { Kokkos::Profiling::popRegion(); // "Task_BuildSendBoundBufs" return TaskStatus::complete; } + +TaskStatus BuildGMGBoundaryBuffers(std::shared_ptr> &md) { + Kokkos::Profiling::pushRegion("Task_BuildSendBoundBufs"); + Mesh *pmesh = md->GetMeshPointer(); + auto &all_caches = md->GetBvarsCache(); + + // Clear the fast access vectors for this block since they are no longer valid + // after all MeshData call BuildBoundaryBuffers + all_caches.clear(); + BuildBoundaryBufferSubset(md, pmesh->boundary_comm_map); + BuildBoundaryBufferSubset(md, + pmesh->boundary_comm_map); + BuildBoundaryBufferSubset(md, + pmesh->boundary_comm_map); + BuildBoundaryBufferSubset(md, + pmesh->boundary_comm_map); + BuildBoundaryBufferSubset(md, + pmesh->boundary_comm_map); + + Kokkos::Profiling::popRegion(); // "Task_BuildSendBoundBufs" + return TaskStatus::complete; +} + } // namespace parthenon diff --git a/src/bvals/comms/bvals_in_one.hpp b/src/bvals/comms/bvals_in_one.hpp index 6ac792b18441..4559740a314f 100644 --- a/src/bvals/comms/bvals_in_one.hpp +++ b/src/bvals/comms/bvals_in_one.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2020 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -72,11 +72,13 @@ TaskStatus ReceiveFluxCorrections(std::shared_ptr> &md); TaskStatus SetFluxCorrections(std::shared_ptr> &md); // Adds all relevant boundary communication to a single task list +template TaskID AddBoundaryExchangeTasks(TaskID dependency, TaskList &tl, std::shared_ptr> &md, bool multilevel); -// This task should not be called in down stream code +// These tasks should not be called in down stream code TaskStatus BuildBoundaryBuffers(std::shared_ptr> &md); +TaskStatus BuildGMGBoundaryBuffers(std::shared_ptr> &md); } // namespace parthenon diff --git a/src/bvals/comms/tag_map.cpp b/src/bvals/comms/tag_map.cpp index 732c83e7c01f..7b0b16c6e478 100644 --- a/src/bvals/comms/tag_map.cpp +++ b/src/bvals/comms/tag_map.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2020-2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -33,9 +33,9 @@ TagMap::rank_pair_t TagMap::MakeChannelPair(const std::shared_ptr &pm BlockGeometricElementId bgei_nb{nb.snb.gid, location_idx_nb}; return UnorderedPair(bgei_me, bgei_nb); } - +template void TagMap::AddMeshDataToMap(std::shared_ptr> &md) { - ForEachBoundary(md, [&](sp_mb_t pmb, sp_mbd_t rc, nb_t &nb, const sp_cv_t v) { + ForEachBoundary(md, [&](sp_mb_t pmb, sp_mbd_t rc, nb_t &nb, const sp_cv_t v) { const int other_rank = nb.snb.rank; if (map_.count(other_rank) < 1) map_[other_rank] = rank_pair_map_t(); auto &pair_map = map_[other_rank]; @@ -43,6 +43,18 @@ void TagMap::AddMeshDataToMap(std::shared_ptr> &md) { pair_map[MakeChannelPair(pmb, nb)] = -1; }); } +template void +TagMap::AddMeshDataToMap(std::shared_ptr> &md); +template void +TagMap::AddMeshDataToMap(std::shared_ptr> &md); +template void TagMap::AddMeshDataToMap( + std::shared_ptr> &md); +template void TagMap::AddMeshDataToMap( + std::shared_ptr> &md); +template void TagMap::AddMeshDataToMap( + std::shared_ptr> &md); +template void TagMap::AddMeshDataToMap( + std::shared_ptr> &md); void TagMap::ResolveMap() { #ifdef MPI_PARALLEL diff --git a/src/bvals/comms/tag_map.hpp b/src/bvals/comms/tag_map.hpp index 1bcabfaa7697..8ecfb23d0fea 100644 --- a/src/bvals/comms/tag_map.hpp +++ b/src/bvals/comms/tag_map.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2020-2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -97,6 +97,7 @@ class TagMap { void clear() { map_.clear(); } // Inserts all of the communication channels known about by MeshData md into the map + template void AddMeshDataToMap(std::shared_ptr> &md); // Once all MeshData objects have inserted their known channels into the map, we can diff --git a/src/coordinates/uniform_cartesian.hpp b/src/coordinates/uniform_cartesian.hpp index 71dc98537ed3..b10cbe0ad06c 100644 --- a/src/coordinates/uniform_cartesian.hpp +++ b/src/coordinates/uniform_cartesian.hpp @@ -165,9 +165,12 @@ class UniformCartesian { template KOKKOS_FORCEINLINE_FUNCTION Real X(const int idx) const { using TE = TopologicalElement; - bool constexpr X1EDGE = el == TE::F1 || el == TE::E2 || el == TE::E3 || el == TE::NN; - bool constexpr X2EDGE = el == TE::F2 || el == TE::E3 || el == TE::E1 || el == TE::NN; - bool constexpr X3EDGE = el == TE::F3 || el == TE::E1 || el == TE::E2 || el == TE::NN; + [[maybe_unused]] bool constexpr X1EDGE = + el == TE::F1 || el == TE::E2 || el == TE::E3 || el == TE::NN; + [[maybe_unused]] bool constexpr X2EDGE = + el == TE::F2 || el == TE::E3 || el == TE::E1 || el == TE::NN; + [[maybe_unused]] bool constexpr X3EDGE = + el == TE::F3 || el == TE::E1 || el == TE::E2 || el == TE::NN; if constexpr (dir == X1DIR && X1EDGE) { return xmin_[dir - 1] + idx * dx_[dir - 1]; // idx - 1/2 } else if constexpr (dir == X2DIR && X2EDGE) { diff --git a/src/defs.hpp b/src/defs.hpp index 15dc0d031f83..ace72628f035 100644 --- a/src/defs.hpp +++ b/src/defs.hpp @@ -61,14 +61,6 @@ static_assert(NDIM >= 3, // named, weakly typed / unscoped enums: //------------------ -// needed for arrays dimensioned over grid directions -// enumerator type only used in Mesh::EnrollUserMeshGenerator() -// X0DIR time-like direction -// X1DIR x, r, etc... -// X2DIR y, theta, etc... -// X3DIR z, phi, etc... -enum CoordinateDirection { NODIR = -1, X0DIR = 0, X1DIR = 1, X2DIR = 2, X3DIR = 3 }; - struct RegionSize { RegionSize() = default; RegionSize(std::array xmin, std::array xmax, std::array xrat, @@ -95,6 +87,15 @@ struct RegionSize { int &nx(CoordinateDirection dir) { return nx_[dir - 1]; } const int &nx(CoordinateDirection dir) const { return nx_[dir - 1]; } + // A "symmetry" direction is a a direction that posesses a translational symmetry + // (or rotational symmetry, etc. for non-cartesian coordinate systems) in the given + // problem. In practice, this mean that the Parthenon mesh was setup to have only + // size one in a symmetry direction, the block size is one in those directions, + // and there are no ghost zones in that direction. Since we support multi-grid + // mesh hierarchies where blocks are not all the same size above the root grid + // and can end up having size one even in a non-symmetry direction, we need a different + // identifier for checking if a direction is a symmetry direction beyond just + // checking if the size is one in that direction. bool &symmetry(CoordinateDirection dir) { return symmetry_[dir - 1]; } const bool &symmetry(CoordinateDirection dir) const { return symmetry_[dir - 1]; } }; diff --git a/src/driver/driver.cpp b/src/driver/driver.cpp index 84371eeaeb36..03acf820e536 100644 --- a/src/driver/driver.cpp +++ b/src/driver/driver.cpp @@ -75,6 +75,19 @@ DriverStatus EvolutionDriver::Execute() { // Defaults must be set across all ranks DumpInputParameters(); + // Before loop do work + // App input version + Kokkos::Profiling::pushRegion("Driver_UserWorkBeforeLoop"); + if (app_input->UserWorkBeforeLoop != nullptr) { + app_input->UserWorkBeforeLoop(pmesh, pinput, tm); + } + + // packages version + for (auto &[name, pkg] : pmesh->packages.AllPackages()) { + pkg->UserWorkBeforeLoop(pmesh, pinput, tm); + } + Kokkos::Profiling::popRegion(); // Driver_UserWorkBeforeLoop + Kokkos::Profiling::pushRegion("Driver_Main"); while (tm.KeepGoing()) { if (Globals::my_rank == 0) OutputCycleDiagnostics(); diff --git a/src/interface/data_collection.cpp b/src/interface/data_collection.cpp index 8a03e60a585d..3ac429fd6df4 100644 --- a/src/interface/data_collection.cpp +++ b/src/interface/data_collection.cpp @@ -54,25 +54,53 @@ std::shared_ptr &DataCollection::AddShallow(const std::string &label, return Add(label, src, flags, true); } -template <> std::shared_ptr> & -DataCollection>::GetOrAdd(const std::string &mbd_label, - const int &partition_id) { - const std::string label = mbd_label + "_part-" + std::to_string(partition_id); +GetOrAdd_impl(Mesh *pmy_mesh_, + std::map>> &containers_, + BlockList_t &block_list, const std::string &mbd_label, + const int &partition_id, const int gmg_level) { + std::string label = mbd_label + "_part-" + std::to_string(partition_id); + if (gmg_level >= 0) label = label + "_gmg-" + std::to_string(gmg_level); auto it = containers_.find(label); if (it == containers_.end()) { // TODO(someone) add caching of partitions to Mesh at some point const int pack_size = pmy_mesh_->DefaultPackSize(); - auto partitions = partition::ToSizeN(pmy_mesh_->block_list, pack_size); + auto partitions = partition::ToSizeN(block_list, pack_size); + // Account for possibly empty block_list + if (partitions.size() == 0) partitions = std::vector(1); for (auto i = 0; i < partitions.size(); i++) { - const std::string md_label = mbd_label + "_part-" + std::to_string(i); + std::string md_label = mbd_label + "_part-" + std::to_string(i); + if (gmg_level >= 0) md_label = md_label + "_gmg-" + std::to_string(gmg_level); containers_[md_label] = std::make_shared>(mbd_label); - containers_[md_label]->Set(partitions[i]); + containers_[md_label]->Set(partitions[i], pmy_mesh_); + if (gmg_level >= 0) { + int min_gmg_logical_level = pmy_mesh_->GetGMGMinLogicalLevel(); + containers_[md_label]->grid = GridIdentifier{GridType::two_level_composite, + gmg_level + min_gmg_logical_level}; + } else { + containers_[md_label]->grid = GridIdentifier{GridType::leaf, 0}; + } } } return containers_[label]; } +template <> +std::shared_ptr> & +DataCollection>::GetOrAdd(const std::string &mbd_label, + const int &partition_id) { + return GetOrAdd_impl(pmy_mesh_, containers_, pmy_mesh_->block_list, mbd_label, + partition_id, -1); +} + +template <> +std::shared_ptr> & +DataCollection>::GetOrAdd(int gmg_level, const std::string &mbd_label, + const int &partition_id) { + return GetOrAdd_impl(pmy_mesh_, containers_, pmy_mesh_->gmg_block_lists[gmg_level], + mbd_label, partition_id, gmg_level); +} + template class DataCollection>; template class DataCollection>; diff --git a/src/interface/data_collection.hpp b/src/interface/data_collection.hpp index 1699a12ffb0e..6119bf0a8ae2 100644 --- a/src/interface/data_collection.hpp +++ b/src/interface/data_collection.hpp @@ -21,7 +21,6 @@ namespace parthenon { class Mesh; - /// The DataCollection class is an abstract container that contains at least a /// "base" container of some type (e.g., of MeshData or MeshBlockData) plus /// additional containers identified by string labels. @@ -75,6 +74,8 @@ class DataCollection { void Set(const std::string &name, std::shared_ptr &d) { containers_[name] = d; } std::shared_ptr &GetOrAdd(const std::string &mbd_label, const int &partition_id); + std::shared_ptr &GetOrAdd(int gmg_level, const std::string &mbd_label, + const int &partition_id); void PurgeNonBase() { auto c = containers_.begin(); diff --git a/src/interface/mesh_data.hpp b/src/interface/mesh_data.hpp index b5ef97f642ce..c1da0f13e614 100644 --- a/src/interface/mesh_data.hpp +++ b/src/interface/mesh_data.hpp @@ -181,6 +181,12 @@ const MeshBlockPack

&PackOnMesh(M &map, BlockDataList_t &block_data_, } // namespace pack_on_mesh_impl +enum class GridType { none, leaf, two_level_composite, single_level_with_internal }; +struct GridIdentifier { + GridType type = GridType::none; + int logical_level = 0; +}; + /// The MeshData class is a container for cached MeshBlockPacks, i.e., it /// contains both the pointers to the MeshBlockData of the MeshBlocks contained /// in the object as well as maps to the cached MeshBlockPacks of VariablePacks or @@ -192,6 +198,8 @@ class MeshData { MeshData() = default; explicit MeshData(const std::string &name) : stage_name_(name) {} + GridIdentifier grid; + const auto &StageName() const { return stage_name_; } Mesh *GetMeshPointer() const { return pmy_mesh_; } @@ -210,14 +218,25 @@ class MeshData { auto &GetBvarsCache() { return bvars_cache_; } - IndexRange GetBoundsI(const IndexDomain &domain) const { - return block_data_[0]->GetBoundsI(domain); + template + IndexRange GetBoundsI(Ts &&...args) const { + if (block_data_.size() > 0) + return block_data_[0]->GetBoundsI(std::forward(args)...); + return IndexRange{-1, -2}; } - IndexRange GetBoundsJ(const IndexDomain &domain) const { - return block_data_[0]->GetBoundsJ(domain); + + template + IndexRange GetBoundsJ(Ts &&...args) const { + if (block_data_.size() > 0) + return block_data_[0]->GetBoundsJ(std::forward(args)...); + return IndexRange{-1, -2}; } - IndexRange GetBoundsK(const IndexDomain &domain) const { - return block_data_[0]->GetBoundsK(domain); + + template + IndexRange GetBoundsK(Ts &&...args) const { + if (block_data_.size() > 0) + return block_data_[0]->GetBoundsK(std::forward(args)...); + return IndexRange{-1, -2}; } template @@ -227,10 +246,10 @@ class MeshData { } } - void Set(BlockList_t blocks) { + void Set(BlockList_t blocks, Mesh *pmesh) { const int nblocks = blocks.size(); block_data_.resize(nblocks); - SetMeshPointer(blocks[0]->pmy_mesh); + SetMeshPointer(pmesh); for (int i = 0; i < nblocks; i++) { block_data_[i] = blocks[i]->meshblock_data.Get(stage_name_); } diff --git a/src/interface/meshblock_data.hpp b/src/interface/meshblock_data.hpp index 9d01a9a8eba5..18fd041ee77c 100644 --- a/src/interface/meshblock_data.hpp +++ b/src/interface/meshblock_data.hpp @@ -78,14 +78,17 @@ class MeshBlockData { void SetAllowedDt(const Real dt) const { GetBlockPointer()->SetAllowedDt(dt); } Mesh *GetMeshPointer() const { return GetBlockPointer()->pmy_mesh; } - IndexRange GetBoundsI(const IndexDomain &domain) const { - return GetBlockPointer()->cellbounds.GetBoundsI(domain); + template + IndexRange GetBoundsI(Ts &&...args) const { + return GetBlockPointer()->cellbounds.GetBoundsI(std::forward(args)...); } - IndexRange GetBoundsJ(const IndexDomain &domain) const { - return GetBlockPointer()->cellbounds.GetBoundsJ(domain); + template + IndexRange GetBoundsJ(Ts &&...args) const { + return GetBlockPointer()->cellbounds.GetBoundsJ(std::forward(args)...); } - IndexRange GetBoundsK(const IndexDomain &domain) const { - return GetBlockPointer()->cellbounds.GetBoundsK(domain); + template + IndexRange GetBoundsK(Ts &&...args) const { + return GetBlockPointer()->cellbounds.GetBoundsK(std::forward(args)...); } /// Set the pointer to the mesh block for this container diff --git a/src/interface/metadata.hpp b/src/interface/metadata.hpp index 3373a326c137..7a8658164ee3 100644 --- a/src/interface/metadata.hpp +++ b/src/interface/metadata.hpp @@ -110,6 +110,10 @@ PARTHENON_INTERNAL_FOR_FLAG(WithFluxes) \ /** the variable needs to be communicated across ranks during remeshing */ \ PARTHENON_INTERNAL_FOR_FLAG(ForceRemeshComm) \ + /** the variable participate in GMG calculations */ \ + PARTHENON_INTERNAL_FOR_FLAG(GMGProlongate) \ + /** the variable participate in GMG calculations */ \ + PARTHENON_INTERNAL_FOR_FLAG(GMGRestrict) \ /** the variable must always be allocated for new blocks **/ \ PARTHENON_INTERNAL_FOR_FLAG(ForceAllocOnNewBlocks) namespace parthenon { @@ -472,7 +476,8 @@ class Metadata { // Returns true if this variable should do prolongation/restriction // and false otherwise. bool IsRefined() const { - return (IsSet(Independent) || IsSet(FillGhost) || IsSet(ForceRemeshComm)); + return (IsSet(Independent) || IsSet(FillGhost) || IsSet(ForceRemeshComm) || + IsSet(GMGProlongate) || IsSet(GMGRestrict)); } const std::vector &Shape() const { return shape_; } diff --git a/src/interface/sparse_pack.hpp b/src/interface/sparse_pack.hpp index 1cafee22e3a6..35ac04225393 100644 --- a/src/interface/sparse_pack.hpp +++ b/src/interface/sparse_pack.hpp @@ -34,6 +34,17 @@ namespace parthenon { +KOKKOS_INLINE_FUNCTION +IndexShape GetIndexShape(const ParArray3D &arr, int ng) { + int extra_zone = std::max(TopologicalOffsetJ(arr.topological_element), + TopologicalOffsetK(arr.topological_element)); + extra_zone = std::max(extra_zone, TopologicalOffsetI(arr.topological_element)); + int nx1 = arr.GetDim(1) > 1 ? arr.GetDim(1) - extra_zone - 2 * ng : 0; + int nx2 = arr.GetDim(2) > 1 ? arr.GetDim(2) - extra_zone - 2 * ng : 0; + int nx3 = arr.GetDim(3) > 1 ? arr.GetDim(3) - extra_zone - 2 * ng : 0; + return IndexShape::GetFromSeparateInts(nx3, nx2, nx1, ng); +} + // Sparse pack index type which allows for relatively simple indexing // into non-variable name type based SparsePacks (i.e. objects of // type SparsePack<> which are created with a vector of variable diff --git a/src/interface/sparse_pack_base.cpp b/src/interface/sparse_pack_base.cpp index d370c0f5f734..e8250c4b58bc 100644 --- a/src/interface/sparse_pack_base.cpp +++ b/src/interface/sparse_pack_base.cpp @@ -216,6 +216,13 @@ SparsePackBase SparsePackBase::Build(T *pmd, const PackDescriptor &desc, pack_h(1, b, idx).vector_component = X2DIR; pack_h(2, b, idx).vector_component = X3DIR; } + + if (pv->IsSet(Metadata::Face)) { + pack_h(0, b, idx).topological_element = TopologicalElement::E1; + pack_h(1, b, idx).topological_element = TopologicalElement::E2; + pack_h(2, b, idx).topological_element = TopologicalElement::E3; + } + } else { // This is a cell, node, or a variable that doesn't have // topology information if (pack.coarse_) { @@ -232,6 +239,10 @@ SparsePackBase SparsePackBase::Build(T *pmd, const PackDescriptor &desc, pack_h(3, b, idx) = pv->flux[X3DIR].Get(0, t, u, v); } } + for (auto el : + GetTopologicalElements(pack_h(0, b, idx).topological_type)) { + pack_h(static_cast(el) % 3, b, idx).topological_element = el; + } PARTHENON_REQUIRE( pack_h(0, b, idx).size() > 0, "Seems like this variable might not actually be allocated."); diff --git a/src/interface/state_descriptor.hpp b/src/interface/state_descriptor.hpp index 5bb37ea83b60..41a5ee2d3d4e 100644 --- a/src/interface/state_descriptor.hpp +++ b/src/interface/state_descriptor.hpp @@ -406,6 +406,10 @@ class StateDescriptor { if (InitNewlyAllocatedVarsBlock != nullptr) return InitNewlyAllocatedVarsBlock(rc); } + void UserWorkBeforeLoop(Mesh *pmesh, ParameterInput *pin, SimTime &tm) const { + if (UserWorkBeforeLoopMesh != nullptr) return UserWorkBeforeLoopMesh(pmesh, pin, tm); + } + std::vector> amr_criteria; std::function *rc)> PreCommFillDerivedBlock = nullptr; @@ -416,6 +420,8 @@ class StateDescriptor { std::function *rc)> PostFillDerivedMesh = nullptr; std::function *rc)> FillDerivedBlock = nullptr; std::function *rc)> FillDerivedMesh = nullptr; + std::function UserWorkBeforeLoopMesh = + nullptr; std::function *rc)> PreStepDiagnosticsMesh = nullptr; diff --git a/src/interface/variable_state.hpp b/src/interface/variable_state.hpp index 43dfefa8af35..d8a2ea1ba746 100644 --- a/src/interface/variable_state.hpp +++ b/src/interface/variable_state.hpp @@ -63,6 +63,7 @@ struct VariableState : public empty_state_t { bool initialized = true; TopologicalType topological_type = TopologicalType::Cell; + TopologicalElement topological_element = TopologicalElement::CC; std::size_t tensor_components; std::size_t tensor_shape[3]; }; diff --git a/src/mesh/amr_loadbalance.cpp b/src/mesh/amr_loadbalance.cpp index 32ee6d398d37..1bebea12216a 100644 --- a/src/mesh/amr_loadbalance.cpp +++ b/src/mesh/amr_loadbalance.cpp @@ -28,6 +28,7 @@ #include #include #include +#include #include "parthenon_mpi.hpp" @@ -749,9 +750,9 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput if (newloc[nn].level() < loclist[on].level()) { auto pmb = FindMeshBlock(on); for (auto &var : pmb->vars_cc_) { - restriction_cache.RegisterRegionHost(irestrict++, - ProResInfo::GetInteriorRestrict(pmb, var), - var.get(), resolved_packages.get()); + restriction_cache.RegisterRegionHost( + irestrict++, ProResInfo::GetInteriorRestrict(pmb, NeighborBlock(), var), + var.get(), resolved_packages.get()); } } } @@ -915,9 +916,9 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput if (newloc[nn].level() > loclist[on].level()) { auto pmb = FindMeshBlock(nn); for (auto &var : pmb->vars_cc_) { - prolongation_cache.RegisterRegionHost(iprolong++, - ProResInfo::GetInteriorProlongate(pmb, var), - var.get(), resolved_packages.get()); + prolongation_cache.RegisterRegionHost( + iprolong++, ProResInfo::GetInteriorProlongate(pmb, NeighborBlock(), var), + var.get(), resolved_packages.get()); } } } @@ -930,6 +931,7 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput loclist = std::move(newloc); ranklist = std::move(newrank); costlist = std::move(newcost); + PopulateLeafLocationMap(); // A block newly refined and prolongated may have neighbors which were // already refined to the new level. @@ -940,7 +942,7 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput // Thus we rebuild and synchronize the mesh now, but using a unique // neighbor precedence favoring the "old" fine blocks over "new" ones for (auto &pmb : block_list) { - pmb->pbval->SearchAndSetNeighbors(tree, ranklist.data(), nslist.data(), + pmb->pbval->SearchAndSetNeighbors(this, tree, ranklist.data(), nslist.data(), newly_refined); } // Make sure all old sends/receives are done before we reconfigure the mesh @@ -951,6 +953,9 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput #endif // Re-initialize the mesh with our temporary ownership/neighbor configurations. // No buffers are different when we switch to the final precedence order. + SetSameLevelNeighbors(block_list, leaf_grid_locs, this->GetRootGridInfo(), nbs, false, + 0, newly_refined); + BuildGMGHierarchy(nbs, pin, app_in); Initialize(false, pin, app_in); // Internal refinement relies on the fine shared values, which are only consistent after @@ -960,14 +965,14 @@ void Mesh::RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput // Rebuild just the ownership model, this time weighting the "new" fine blocks just like // any other blocks at their level. + SetSameLevelNeighbors(block_list, leaf_grid_locs, this->GetRootGridInfo(), nbs, false); for (auto &pmb : block_list) { - pmb->pbval->SearchAndSetNeighbors(tree, ranklist.data(), nslist.data()); + pmb->pbval->SearchAndSetNeighbors(this, tree, ranklist.data(), nslist.data()); } Kokkos::Profiling::popRegion(); // AMR: Recv data and unpack ResetLoadBalanceVariables(); - Kokkos::Profiling::popRegion(); // RedistributeAndRefineMeshBlocks } } // namespace parthenon diff --git a/src/mesh/domain.hpp b/src/mesh/domain.hpp index 8c5dee72b165..39fe63aea704 100644 --- a/src/mesh/domain.hpp +++ b/src/mesh/domain.hpp @@ -91,12 +91,14 @@ class IndexShape { return dim <= interior_dims.size(); } + KOKKOS_INLINE_FUNCTION void MakeZeroDimensional_(int const index) { x_[index] = IndexRange{0, 0}; entire_ncells_[index] = 1; } public: + KOKKOS_FUNCTION IndexShape() {} IndexShape(const int &nx3, const int &nx2, const int &nx1, const int &ng) @@ -140,6 +142,25 @@ class IndexShape { } } + KOKKOS_INLINE_FUNCTION + static IndexShape GetFromSeparateInts(int nx3, int nx2, int nx1, int ng) { + IndexShape out; + int idx = 0; + for (auto nx : {nx1, nx2, nx3}) { + // For symmetry directions, nx should be passed in as zero (rather than the actual + // value of one) + if (nx == 0) { + out.x_[idx] = IndexRange{0, 0}; + out.entire_ncells_[idx] = 1; + } else { + out.x_[idx] = IndexRange{ng, (ng + nx - 1)}; + out.entire_ncells_[idx] = nx + 2 * ng; + } + idx++; + } + return out; + } + KOKKOS_INLINE_FUNCTION const IndexRange GetBoundsI(const IndexDomain &domain, TE el = TE::CC) const noexcept { return (domain == IndexDomain::interior && el == TE::CC) diff --git a/src/mesh/logical_location.cpp b/src/mesh/logical_location.cpp new file mode 100644 index 000000000000..697cae366581 --- /dev/null +++ b/src/mesh/logical_location.cpp @@ -0,0 +1,396 @@ +//======================================================================================== +// Athena++ astrophysical MHD code +// Copyright(C) 2014 James M. Stone and other code contributors +// Licensed under the 3-clause BSD License, see LICENSE file for details +//======================================================================================== +// Parthenon performance portable AMR framework +// Copyright(C) 2020-2023 The Parthenon collaboration +// Licensed under the 3-clause BSD License, see LICENSE file for details +//======================================================================================== +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mesh/logical_location.hpp" +#include "utils/error_checking.hpp" +#include "utils/morton_number.hpp" + +namespace parthenon { + +bool LogicalLocation::IsContainedIn(const LogicalLocation &container) const { + if (container.level() > level()) return false; + bool is_contained = true; + const int level_shift = level() - container.level(); + for (int i = 0; i < 3; ++i) + is_contained = is_contained && (l(i) >> level_shift == container.l(i)); + return is_contained; +} + +bool LogicalLocation::Contains(const LogicalLocation &containee) const { + if (containee.level() < level_) return false; + const std::int64_t shifted_lx1 = containee.lx1() >> (containee.level() - level()); + const std::int64_t shifted_lx2 = containee.lx2() >> (containee.level() - level()); + const std::int64_t shifted_lx3 = containee.lx3() >> (containee.level() - level()); + return (shifted_lx1 == lx1()) && (shifted_lx2 == lx2()) && (shifted_lx3 == lx3()); +} + +std::array LogicalLocation::GetOffset(const LogicalLocation &neighbor, + const RootGridInfo &rg_info) const { + std::array offset; + const int level_diff_1 = std::max(neighbor.level() - level(), 0); + const int level_diff_2 = std::max(level() - neighbor.level(), 0); + const int n_per_root_block = 1 << (std::min(level(), neighbor.level()) - rg_info.level); + const int root_block_per_n = + 1 << std::max(rg_info.level - std::min(level(), neighbor.level()), 0); + for (int i = 0; i < 3; ++i) { + offset[i] = (neighbor.l(i) >> level_diff_1) - (l(i) >> level_diff_2); + if (rg_info.periodic[i]) { + int n_cells_level = std::max(n_per_root_block * rg_info.n[i], 1); + if (root_block_per_n > 1) + n_cells_level = + rg_info.n[i] / root_block_per_n + (rg_info.n[i] % root_block_per_n != 0); + if (std::abs(offset[i]) > (n_cells_level / 2)) { + offset[i] %= n_cells_level; + offset[i] += offset[i] > 0 ? -n_cells_level : n_cells_level; + } + } + } + + return offset; +} + +std::array, 3> +LogicalLocation::GetSameLevelOffsets(const LogicalLocation &neighbor, + const RootGridInfo &rg_info) const { + std::array, 3> offsets; + const int level_diff_1 = std::max(neighbor.level() - level(), 0); + const int level_diff_2 = std::max(level() - neighbor.level(), 0); + const int n_per_root_block = + 1 << std::max((std::min(level(), neighbor.level()) - rg_info.level), 0); + const int root_block_per_n = + 1 << std::max(rg_info.level - std::min(level(), neighbor.level()), 0); + for (int i = 0; i < 3; ++i) { + const auto idxt = l(i) >> level_diff_2; + const auto idxn = neighbor.l(i) >> level_diff_1; + if (std::abs(idxn - idxt) <= 1) offsets[i].push_back(idxn - idxt); + + int n_blocks_level = std::max(n_per_root_block * rg_info.n[i], 1); + if (root_block_per_n > 1) + n_blocks_level = + rg_info.n[i] / root_block_per_n + (rg_info.n[i] % root_block_per_n != 0); + if (rg_info.periodic[i]) { + if (std::abs(idxn - n_blocks_level - idxt) <= 1) + offsets[i].push_back(idxn - n_blocks_level - idxt); + if (std::abs(idxn + n_blocks_level - idxt) <= 1) + offsets[i].push_back(idxn + n_blocks_level - idxt); + } + } + + return offsets; +} + +template +bool LogicalLocation::NeighborFindingImpl(const LogicalLocation &in, + const std::array &te_offset, + const RootGridInfo &rg_info) const { + if (in.level() >= level() && Contains(in)) return false; // You share a volume + if (in.level() < level() && in.Contains(*this)) return false; // You share a volume + + // We work on the finer level of in.level() and this->level() + const int max_level = std::max(in.level(), level()); + const int level_shift_1 = max_level - level(); + const int level_shift_2 = max_level - in.level(); + const auto block_size_1 = 1 << level_shift_1; + const auto block_size_2 = 1 << level_shift_2; + + // TODO(LFR): Think about what this should do when we are above the root level + const int n_per_root_block = 1 << std::max(max_level - rg_info.level, 0); + const int root_block_per_n = 1 << std::max(rg_info.level - max_level, 0); + std::array b; + + for (int i = 0; i < 3; ++i) { + // Index range of daughters of this block on current level plus a one block halo on + // either side + auto low = (l(i) << level_shift_1) - 1; + auto hi = low + block_size_1 + 1; + // Indexing for topological offset calculation + if constexpr (TENeighbor) { + if (te_offset[i] == -1) { + // Left side offset, so only two possible block indices are allowed in this + // direction + hi -= block_size_1; + } else if (te_offset[i] == 0) { + // No offset in this direction, so only interior + low += 1; + hi -= 1; + } else { + // Right side offset, so only two possible block indices are allowed in this + // direction + low += block_size_1; + } + } + // Index range of daughters of possible neighbor block on current level + const auto in_low = in.l(i) << level_shift_2; + const auto in_hi = in_low + block_size_2 - 1; + // Check if these two ranges overlap at all + b[i] = in_hi >= low && in_low <= hi; + if (rg_info.periodic[i]) { + int n_cells_level = std::max(n_per_root_block * rg_info.n[i], 1); + if (root_block_per_n > 1) + n_cells_level = + rg_info.n[i] / root_block_per_n + (rg_info.n[i] % root_block_per_n != 0); + b[i] = b[i] || (in_hi + n_cells_level >= low && in_low + n_cells_level <= hi); + b[i] = b[i] || (in_hi - n_cells_level >= low && in_low - n_cells_level <= hi); + } + } + + return b[0] && b[1] && b[2]; +} +template bool +LogicalLocation::NeighborFindingImpl(const LogicalLocation &in, + const std::array &te_offset, + const RootGridInfo &rg_info) const; +template bool +LogicalLocation::NeighborFindingImpl(const LogicalLocation &in, + const std::array &te_offset, + const RootGridInfo &rg_info) const; + +std::vector LogicalLocation::GetDaughters() const { + std::vector daughters; + if (level() < 0) { + daughters.push_back(GetDaughter(0, 0, 0)); + return daughters; + } + daughters.reserve(8); + for (int i : {0, 1}) { + for (int j : {0, 1}) { + for (int k : {0, 1}) { + daughters.push_back(GetDaughter(i, j, k)); + } + } + } + return daughters; +} + +std::unordered_set +LogicalLocation::GetPossibleNeighbors(const RootGridInfo &rg_info) { + const std::vector irange{-1, 0, 1}; + const std::vector jrange{-1, 0, 1}; + const std::vector krange{-1, 0, 1}; + const std::vector daughter_irange{0, 1}; + const std::vector daughter_jrange{0, 1}; + const std::vector daughter_krange{0, 1}; + + return GetPossibleNeighborsImpl(irange, jrange, krange, daughter_irange, + daughter_jrange, daughter_krange, rg_info); +} + +std::unordered_set +LogicalLocation::GetPossibleBlocksSurroundingTopologicalElement( + int ox1, int ox2, int ox3, const RootGridInfo &rg_info) const { + const auto irange = + (std::abs(ox1) == 1) ? std::vector{0, ox1} : std::vector{0}; + const auto jrange = + (std::abs(ox2) == 1) ? std::vector{0, ox2} : std::vector{0}; + const auto krange = + (std::abs(ox3) == 1) ? std::vector{0, ox3} : std::vector{0}; + const auto daughter_irange = + (std::abs(ox1) == 1) ? std::vector{ox1 > 0} : std::vector{0, 1}; + const auto daughter_jrange = + (std::abs(ox2) == 1) ? std::vector{ox2 > 0} : std::vector{0, 1}; + const auto daughter_krange = + (std::abs(ox3) == 1) ? std::vector{ox3 > 0} : std::vector{0, 1}; + + return GetPossibleNeighborsImpl(irange, jrange, krange, daughter_irange, + daughter_jrange, daughter_krange, rg_info); +} + +std::unordered_set LogicalLocation::GetPossibleNeighborsImpl( + const std::vector &irange, const std::vector &jrange, + const std::vector &krange, const std::vector &daughter_irange, + const std::vector &daughter_jrange, const std::vector &daughter_krange, + const RootGridInfo &rg_info) const { + std::vector locs; + + auto AddNeighbors = [&](const LogicalLocation &loc, bool include_parents) { + const int n_per_root_block = 1 << std::max(loc.level() - rg_info.level, 0); + const int down_shift = 1 << std::max(rg_info.level - loc.level(), 0); + // Account for the fact that the root grid may be overhanging into a partial block + const int extra1 = (rg_info.n[0] % down_shift > 0); + const int extra2 = (rg_info.n[1] % down_shift > 0); + const int extra3 = (rg_info.n[2] % down_shift > 0); + int n1_cells_level = + std::max(n_per_root_block * (rg_info.n[0] / down_shift + extra1), 1); + int n2_cells_level = + std::max(n_per_root_block * (rg_info.n[1] / down_shift + extra2), 1); + int n3_cells_level = + std::max(n_per_root_block * (rg_info.n[2] / down_shift + extra3), 1); + for (int i : irange) { + for (int j : jrange) { + for (int k : krange) { + auto lx1 = loc.lx1() + i; + auto lx2 = loc.lx2() + j; + auto lx3 = loc.lx3() + k; + // This should include blocks that are connected by periodic boundaries + if (rg_info.periodic[0]) lx1 = (lx1 + n1_cells_level) % n1_cells_level; + if (rg_info.periodic[1]) lx2 = (lx2 + n2_cells_level) % n2_cells_level; + if (rg_info.periodic[2]) lx3 = (lx3 + n3_cells_level) % n3_cells_level; + if (0 <= lx1 && lx1 < n1_cells_level && 0 <= lx2 && lx2 < n2_cells_level && + 0 <= lx3 && lx3 < n3_cells_level) { + if (loc.level() > level()) { + const int s = loc.level() - level(); + if ((lx1 >> s) != this->lx1() || (lx2 >> s) != this->lx2() || + (lx3 >> s) != this->lx3()) { + locs.emplace_back(loc.level(), lx1, lx2, lx3); + } + } else { + locs.emplace_back(loc.level(), lx1, lx2, lx3); + } + if (include_parents) { + auto parent = locs.back().GetParent(); + if (IsNeighbor(parent, rg_info)) locs.push_back(parent); + } + } + } + } + } + }; + + // Find the same level and lower level blocks of this block + AddNeighbors(*this, true); + + // Iterate over daughters of this block that share the same topological element + for (int l : daughter_irange) { + for (int m : daughter_jrange) { + for (int n : daughter_krange) { + AddNeighbors(GetDaughter(l, m, n), false); + } + } + } + // The above procedure likely duplicated some blocks, so put them in a set + std::unordered_set unique_locs; + for (auto &loc : locs) + unique_locs.emplace(std::move(loc)); + return unique_locs; +} + +block_ownership_t +DetermineOwnership(const LogicalLocation &main_block, + const std::unordered_set &allowed_neighbors, + const RootGridInfo &rg_info, + const std::unordered_set &newly_refined) { + block_ownership_t main_owns; + + auto ownership_level = [&](const LogicalLocation &a) { + // Newly-refined blocks are treated as higher-level than blocks at their + // parent level, but lower-level than previously-refined blocks at their + // current level. + if (newly_refined.count(a)) return 2 * a.level() - 1; + return 2 * a.level(); + }; + + auto ownership_less_than = [ownership_level](const LogicalLocation &a, + const LogicalLocation &b) { + // Ownership is first determined by block with the highest level, then by maximum + // Morton number this is reversed in precedence from the normal comparators where + // Morton number takes precedence + if (ownership_level(a) == ownership_level(b)) return a.morton() < b.morton(); + return ownership_level(a) < ownership_level(b); + }; + + for (int ox1 : {-1, 0, 1}) { + for (int ox2 : {-1, 0, 1}) { + for (int ox3 : {-1, 0, 1}) { + main_owns(ox1, ox2, ox3) = true; + for (auto &n : allowed_neighbors) { + if (ownership_less_than(main_block, n) && + main_block.IsNeighborOfTE(n, ox1, ox2, ox3, rg_info)) { + main_owns(ox1, ox2, ox3) = false; + break; + } + } + } + } + } + return main_owns; +} + +// Given a topological element, ownership array of the sending block, and offset indices +// defining the location of an index region within the block (i.e. the ghost zones passed +// across the x-face or the ghost zones passed across the z-edge), return the index range +// masking array required for masking out unowned regions of the index space. ox? defines +// buffer location on the owner block +block_ownership_t +GetIndexRangeMaskFromOwnership(TopologicalElement el, + const block_ownership_t &sender_ownership, int ox1, + int ox2, int ox3) { + using vp_t = std::vector>; + + // Transform general block ownership to element ownership over entire block. For + // instance, x-faces only care about block ownership in the x-direction First index of + // the pair is the element index and the second index is the block index that is copied + // to that element index + block_ownership_t element_ownership = sender_ownership; + auto x1_idxs = TopologicalOffsetI(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} + : vp_t{{-1, 0}, {0, 0}, {1, 0}}; + auto x2_idxs = TopologicalOffsetJ(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} + : vp_t{{-1, 0}, {0, 0}, {1, 0}}; + auto x3_idxs = TopologicalOffsetK(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} + : vp_t{{-1, 0}, {0, 0}, {1, 0}}; + for (auto [iel, ibl] : x1_idxs) { + for (auto [jel, jbl] : x2_idxs) { + for (auto [kel, kbl] : x3_idxs) { + element_ownership(iel, jel, kel) = sender_ownership(ibl, jbl, kbl); + } + } + } + + // Now, the ownership status is correct for the entire interior index range of the + // block, but the offsets ox? define a subset of these indices (e.g. one edge of the + // interior). Therefore, we need to set the index ownership to true for edges of the + // index range that are contained in the interior of the sending block + if (ox1 != 0) { + for (auto j : {-1, 0, 1}) { + for (auto k : {-1, 0, 1}) { + element_ownership(-ox1, j, k) = element_ownership(0, j, k); + } + } + } + if (ox2 != 0) { + for (auto i : {-1, 0, 1}) { + for (auto k : {-1, 0, 1}) { + element_ownership(i, -ox2, k) = element_ownership(i, 0, k); + } + } + } + if (ox3 != 0) { + for (auto i : {-1, 0, 1}) { + for (auto j : {-1, 0, 1}) { + element_ownership(i, j, -ox3) = element_ownership(i, j, 0); + } + } + } + + return element_ownership; +} + +} // namespace parthenon diff --git a/src/mesh/logical_location.hpp b/src/mesh/logical_location.hpp index 33abdebb96d8..73d50ef02850 100644 --- a/src/mesh/logical_location.hpp +++ b/src/mesh/logical_location.hpp @@ -3,6 +3,10 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== +// Parthenon performance portable AMR framework +// Copyright(C) 2020-2023 The Parthenon collaboration +// Licensed under the 3-clause BSD License, see LICENSE file for details +//======================================================================================== // (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los @@ -23,11 +27,12 @@ #include #include #include +#include #include #include #include -#include "logical_location.hpp" +#include "basic_types.hpp" #include "utils/error_checking.hpp" #include "utils/morton_number.hpp" @@ -46,16 +51,13 @@ namespace parthenon { struct RootGridInfo { int level; - int nx1, nx2, nx3; - bool periodic1, periodic2, periodic3; + std::array n; + std::array periodic; // Defaults to root grid of single block at the // coarsest level - RootGridInfo() - : level(0), nx1(1), nx2(1), nx3(1), periodic1(false), periodic2(false), - periodic3(false) {} + RootGridInfo() : level(0), n{1, 1, 1}, periodic{false, false, false} {} RootGridInfo(int level, int nx1, int nx2, int nx3, bool p1, bool p2, bool p3) - : level(level), nx1(nx1), nx2(nx2), nx3(nx3), periodic1(p1), periodic2(p2), - periodic3(p3) {} + : level(level), n{nx1, nx2, nx3}, periodic{p1, p2, p3} {} }; //-------------------------------------------------------------------------------------- @@ -66,153 +68,110 @@ class LogicalLocation { // aggregate and POD type // These values can exceed the range of std::int32_t even if the root grid has only a // single MeshBlock if >30 levels of AMR are used, since the corresponding max index = // 1*2^31 > INT_MAX = 2^31 -1 for most 32-bit signed integer type impelementations - std::int64_t lx1_, lx2_, lx3_; + std::array l_; MortonNumber morton_; int level_; public: + // No check is provided that the requested LogicalLocation is in the allowed + // range of logical location in the requested level. LogicalLocation(int lev, std::int64_t l1, std::int64_t l2, std::int64_t l3) - : lx1_(l1), lx2_(l2), lx3_(l3), level_(lev), morton_(lev, l1, l2, l3) {} + : l_{l1, l2, l3}, level_(lev), morton_(lev, l1, l2, l3) {} LogicalLocation() : LogicalLocation(0, 0, 0, 0) {} - const auto &lx1() const { return lx1_; } - const auto &lx2() const { return lx2_; } - const auto &lx3() const { return lx3_; } + std::string label() const { + return "(" + std::to_string(level_) + ": " + std::to_string(l_[0]) + ", " + + std::to_string(l_[1]) + ", " + std::to_string(l_[2]) + ")"; + } + const auto &l(int i) const { return l_[i]; } + const auto &lx1() const { return l_[0]; } + const auto &lx2() const { return l_[1]; } + const auto &lx3() const { return l_[2]; } const auto &level() const { return level_; } const auto &morton() const { return morton_; } - // operators useful for sorting - bool operator==(LogicalLocation &ll) { - return ((ll.level() == level_) && (ll.lx1() == lx1_) && (ll.lx2() == lx2_) && - (ll.lx3() == lx3_)); + // Returns the coordinate in the range [0, 1] of the left side of + // a logical location in a given direction on refinement level level + Real LLCoord(CoordinateDirection dir, BlockLocation bloc = BlockLocation::Left) const { + auto nblocks_tot = 1 << std::max(level(), 0); + return (static_cast(l(dir - 1)) + 0.5 * static_cast(bloc)) / + static_cast(nblocks_tot); } - bool IsContainedIn(const LogicalLocation &container) const { - if (container.level() > level_) return false; - const std::int64_t shifted_lx1 = lx1_ >> (level_ - container.level()); - const std::int64_t shifted_lx2 = lx2_ >> (level_ - container.level()); - const std::int64_t shifted_lx3 = lx3_ >> (level_ - container.level()); - return (shifted_lx1 == container.lx1()) && (shifted_lx2 == container.lx2()) && - (shifted_lx3 == container.lx3()); - } + bool IsContainedIn(const LogicalLocation &container) const; - bool Contains(const LogicalLocation &containee) const { - if (containee.level() < level_) return false; - const std::int64_t shifted_lx1 = containee.lx1() >> (containee.level() - level_); - const std::int64_t shifted_lx2 = containee.lx2() >> (containee.level() - level_); - const std::int64_t shifted_lx3 = containee.lx3() >> (containee.level() - level_); - return (shifted_lx1 == lx1_) && (shifted_lx2 == lx2_) && (shifted_lx3 == lx3_); - } + bool Contains(const LogicalLocation &containee) const; - std::array GetOffset(const LogicalLocation &neighbor) const { - std::array offset; - offset[0] = (neighbor.lx1() >> std::max(neighbor.level() - level_, 0)) - - (lx1() >> std::max(level_ - neighbor.level(), 0)); - offset[1] = (neighbor.lx2() >> std::max(neighbor.level() - level_, 0)) - - (lx2() >> std::max(level_ - neighbor.level(), 0)); - offset[2] = (neighbor.lx3() >> std::max(neighbor.level() - level_, 0)) - - (lx3() >> std::max(level_ - neighbor.level(), 0)); - return offset; - } + std::array GetOffset(const LogicalLocation &neighbor, + const RootGridInfo &rg_info = RootGridInfo()) const; + std::array, 3> GetSameLevelOffsets(const LogicalLocation &neighbor, + const RootGridInfo &rg_info) const; // Being a neighbor implies that you share a face, edge, or node and don't share a // volume - bool IsNeighbor(const LogicalLocation &in) const { - if (in.level() < level()) return in.IsNeighbor(*this); - if (Contains(in)) return false; // You share a volume - // Only need to consider case where other block is equally or more refined than you - auto offset = 1 << (in.level() - level()); - const auto shifted_lx1 = lx1_ << (in.level() - level()); - const auto shifted_lx2 = lx2_ << (in.level() - level()); - const auto shifted_lx3 = lx3_ << (in.level() - level()); - const bool bx1 = - (in.lx1() >= (shifted_lx1 - 1)) && (in.lx1() <= (shifted_lx1 + offset)); - const bool bx2 = - (in.lx2() >= (shifted_lx2 - 1)) && (in.lx2() <= (shifted_lx2 + offset)); - const bool bx3 = - (in.lx3() >= (shifted_lx3 - 1)) && (in.lx3() <= (shifted_lx3 + offset)); - return bx1 && bx2 && bx3; + bool IsNeighbor(const LogicalLocation &in, + const RootGridInfo &rg_info = RootGridInfo()) const { + return NeighborFindingImpl(in, std::array(), rg_info); } - LogicalLocation GetSameLevelNeighbor(int ox1, int ox2, int ox3) const { - return LogicalLocation(level_, lx1_ + ox1, lx2_ + ox2, lx3_ + ox3); + bool IsNeighborOfTE(const LogicalLocation &in, int ox1, int ox2, int ox3, + const RootGridInfo &rg_info = RootGridInfo()) const { + return NeighborFindingImpl(in, std::array{ox1, ox2, ox3}, rg_info); } - LogicalLocation GetParent() const { - if (level_ == 0) return *this; - return LogicalLocation(level_ - 1, lx1_ >> 1, lx2_ >> 1, lx3_ >> 1); + LogicalLocation + GetSameLevelNeighbor(int ox1, int ox2, int ox3, + const RootGridInfo &rg_info = RootGridInfo()) const { + return LogicalLocation(level(), lx1() + ox1, lx2() + ox2, lx3() + ox3); } - std::vector GetDaughters() const { - std::vector daughters; - daughters.reserve(8); - for (int i : {0, 1}) { - for (int j : {0, 1}) { - for (int k : {0, 1}) { - daughters.push_back(GetDaughter(i, j, k)); - } - } - } - return daughters; + LogicalLocation GetParent() const { + if (level() <= 0) return LogicalLocation(level() - 1, 0, 0, 0); + return LogicalLocation(level() - 1, lx1() >> 1, lx2() >> 1, lx3() >> 1); } - LogicalLocation GetDaughter(int ox1, int ox2, int ox3) const { - return LogicalLocation(level_ + 1, (lx1_ << 1) + ox1, (lx2_ << 1) + ox2, - (lx3_ << 1) + ox3); - } + std::vector GetDaughters() const; - std::set GetPossibleBlocksSurroundingTopologicalElement( - int ox1, int ox2, int ox3, const RootGridInfo &rg_info = RootGridInfo()) const { - std::vector locs; - - const auto irange = - (std::abs(ox1) == 1) ? std::vector{0, ox1} : std::vector{0}; - const auto jrange = - (std::abs(ox2) == 1) ? std::vector{0, ox2} : std::vector{0}; - const auto krange = - (std::abs(ox3) == 1) ? std::vector{0, ox3} : std::vector{0}; - - auto AddNeighbors = [&](const LogicalLocation &loc) { - int n1_cells_level = std::pow(2, loc.level() - rg_info.level) * rg_info.nx1; - int n2_cells_level = std::pow(2, loc.level() - rg_info.level) * rg_info.nx2; - int n3_cells_level = std::pow(2, loc.level() - rg_info.level) * rg_info.nx3; - for (int i : irange) { - for (int j : jrange) { - for (int k : krange) { - auto lx1 = loc.lx1() + i; - auto lx2 = loc.lx2() + j; - auto lx3 = loc.lx3() + k; - // This should include blocks that are connected by periodic boundaries - if (rg_info.periodic1) lx1 = (lx1 + n1_cells_level) % n1_cells_level; - if (rg_info.periodic2) lx2 = (lx2 + n2_cells_level) % n2_cells_level; - if (rg_info.periodic3) lx3 = (lx3 + n3_cells_level) % n3_cells_level; - if (0 <= lx1 && lx1 < n1_cells_level && 0 <= lx2 && lx2 < n2_cells_level && - 0 <= lx3 && lx3 < n3_cells_level) { - locs.emplace_back(loc.level(), lx1, lx2, lx3); - auto parent = locs.back().GetParent(); - if (IsNeighbor(parent)) locs.push_back(parent); - } - } - } - } - }; - - AddNeighbors(*this); - - // Iterate over daughters of this block that share the same topological element - for (int l : - (std::abs(ox1) == 1) ? std::vector{ox1 > 0} : std::vector{0, 1}) { - for (int m : - (std::abs(ox2) == 1) ? std::vector{ox2 > 0} : std::vector{0, 1}) { - for (int n : - (std::abs(ox3) == 1) ? std::vector{ox3 > 0} : std::vector{0, 1}) { - AddNeighbors(GetDaughter(l, m, n)); - } - } + LogicalLocation GetDaughter(int ox1, int ox2, int ox3) const { + if (level() < 0) return LogicalLocation(level() + 1, 0, 0, 0); + return LogicalLocation(level() + 1, (lx1() << 1) + ox1, (lx2() << 1) + ox2, + (lx3() << 1) + ox3); + } + + // LFR: This returns the face offsets of fine-coarse neighbor blocks as defined in + // Athena++, which are stored in the NeighborBlock struct. I believe that these are + // currently only required for flux correction and can eventually be removed when flux + // correction is combined with boundary communication. + auto GetAthenaXXFaceOffsets(const LogicalLocation &neighbor, int ox1, int ox2, int ox3, + const RootGridInfo &rg_info = RootGridInfo()) const { + // The neighbor block struct should only use the first two, but we have three to allow + // for this being a parent of neighbor, this should be checked for elsewhere + std::array f{0, 0, 0}; + if (neighbor.level() == level() + 1) { + int idx = 0; + if (ox1 == 0) f[idx++] = neighbor.lx1() % 2; + if (ox2 == 0) f[idx++] = neighbor.lx2() % 2; + if (ox3 == 0) f[idx++] = neighbor.lx3() % 2; } - // The above procedure likely duplicated some blocks, so put them in a set - return std::set(std::begin(locs), std::end(locs)); + return f; } + + std::unordered_set + GetPossibleNeighbors(const RootGridInfo &rg_info = RootGridInfo()); + + std::unordered_set GetPossibleBlocksSurroundingTopologicalElement( + int ox1, int ox2, int ox3, const RootGridInfo &rg_info = RootGridInfo()) const; + + private: + template + bool NeighborFindingImpl(const LogicalLocation &in, const std::array &te_offset, + const RootGridInfo &rg_info = RootGridInfo()) const; + + std::unordered_set GetPossibleNeighborsImpl( + const std::vector &irange, const std::vector &jrange, + const std::vector &krange, const std::vector &daughter_irange, + const std::vector &daughter_jrange, const std::vector &daughter_krange, + const RootGridInfo &rg_info = RootGridInfo()) const; }; inline bool operator<(const LogicalLocation &lhs, const LogicalLocation &rhs) { @@ -261,111 +220,21 @@ struct block_ownership_t { bool ownership[3][3][3]; }; -inline block_ownership_t +block_ownership_t DetermineOwnership(const LogicalLocation &main_block, - const std::set &allowed_neighbors, + const std::unordered_set &allowed_neighbors, const RootGridInfo &rg_info = RootGridInfo(), - const std::unordered_set &newly_refined = {}) { - block_ownership_t main_owns; - - auto ownership_level = [&](const LogicalLocation &a) { - // Newly-refined blocks are treated as higher-level than blocks at their - // parent level, but lower-level than previously-refined blocks at their - // current level. - if (newly_refined.count(a)) return 2 * a.level() - 1; - return 2 * a.level(); - }; - - auto ownership_less_than = [ownership_level](const LogicalLocation &a, - const LogicalLocation &b) { - // Ownership is first determined by block with the highest level, then by maximum - // Morton number this is reversed in precedence from the normal comparators where - // Morton number takes precedence - if (ownership_level(a) == ownership_level(b)) return a.morton() < b.morton(); - return ownership_level(a) < ownership_level(b); - }; - - for (int ox1 : {-1, 0, 1}) { - for (int ox2 : {-1, 0, 1}) { - for (int ox3 : {-1, 0, 1}) { - auto possible_neighbors = - main_block.GetPossibleBlocksSurroundingTopologicalElement(ox1, ox2, ox3, - rg_info); - - std::vector actual_neighbors; - std::set_intersection(std::begin(allowed_neighbors), std::end(allowed_neighbors), - std::begin(possible_neighbors), - std::end(possible_neighbors), - std::back_inserter(actual_neighbors)); - - auto max = std::max_element(std::begin(actual_neighbors), - std::end(actual_neighbors), ownership_less_than); - main_owns(ox1, ox2, ox3) = - (*max == main_block || ownership_less_than(*max, main_block) || - actual_neighbors.size() == 0); - } - } - } - return main_owns; -} + const std::unordered_set &newly_refined = {}); // Given a topological element, ownership array of the sending block, and offset indices // defining the location of an index region within the block (i.e. the ghost zones passed // across the x-face or the ghost zones passed across the z-edge), return the index range // masking array required for masking out unowned regions of the index space. ox? defines // buffer location on the owner block -inline auto GetIndexRangeMaskFromOwnership(TopologicalElement el, - const block_ownership_t &sender_ownership, - int ox1, int ox2, int ox3) { - using vp_t = std::vector>; - - // Transform general block ownership to element ownership over entire block. For - // instance, x-faces only care about block ownership in the x-direction First index of - // the pair is the element index and the second index is the block index that is copied - // to that element index - block_ownership_t element_ownership = sender_ownership; - auto x1_idxs = TopologicalOffsetI(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} - : vp_t{{-1, 0}, {0, 0}, {1, 0}}; - auto x2_idxs = TopologicalOffsetJ(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} - : vp_t{{-1, 0}, {0, 0}, {1, 0}}; - auto x3_idxs = TopologicalOffsetK(el) ? vp_t{{-1, -1}, {0, 0}, {1, 1}} - : vp_t{{-1, 0}, {0, 0}, {1, 0}}; - for (auto [iel, ibl] : x1_idxs) { - for (auto [jel, jbl] : x2_idxs) { - for (auto [kel, kbl] : x3_idxs) { - element_ownership(iel, jel, kel) = sender_ownership(ibl, jbl, kbl); - } - } - } - - // Now, the ownership status is correct for the entire interior index range of the - // block, but the offsets ox? define a subset of these indices (e.g. one edge of the - // interior). Therefore, we need to set the index ownership to true for edges of the - // index range that are contained in the interior of the sending block - if (ox1 != 0) { - for (auto j : {-1, 0, 1}) { - for (auto k : {-1, 0, 1}) { - element_ownership(-ox1, j, k) = element_ownership(0, j, k); - } - } - } - if (ox2 != 0) { - for (auto i : {-1, 0, 1}) { - for (auto k : {-1, 0, 1}) { - element_ownership(i, -ox2, k) = element_ownership(i, 0, k); - } - } - } - if (ox3 != 0) { - for (auto i : {-1, 0, 1}) { - for (auto j : {-1, 0, 1}) { - element_ownership(i, j, -ox3) = element_ownership(i, j, 0); - } - } - } - - return element_ownership; -} +block_ownership_t +GetIndexRangeMaskFromOwnership(TopologicalElement el, + const block_ownership_t &sender_ownership, int ox1, + int ox2, int ox3); } // namespace parthenon diff --git a/src/mesh/mesh-gmg.cpp b/src/mesh/mesh-gmg.cpp new file mode 100644 index 000000000000..8b025776ff39 --- /dev/null +++ b/src/mesh/mesh-gmg.cpp @@ -0,0 +1,235 @@ +//======================================================================================== +// Parthenon performance portable AMR framework +// Copyright(C) 2020-2023 The Parthenon collaboration +// Licensed under the 3-clause BSD License, see LICENSE file for details +//======================================================================================== +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== +//! \file mesh_amr.cpp +// \brief implementation of Mesh::AdaptiveMeshRefinement() and related utilities + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "parthenon_mpi.hpp" + +#include "bvals/boundary_conditions.hpp" +#include "defs.hpp" +#include "globals.hpp" +#include "interface/update.hpp" +#include "mesh/mesh.hpp" +#include "mesh/mesh_refinement.hpp" +#include "mesh/meshblock.hpp" +#include "mesh/meshblock_tree.hpp" +#include "parthenon_arrays.hpp" +#include "utils/bit_hacks.hpp" +#include "utils/buffer_utils.hpp" +#include "utils/error_checking.hpp" + +namespace parthenon { + +void Mesh::PopulateLeafLocationMap() { + const int nbtot = ranklist.size(); + leaf_grid_locs.clear(); + for (int ib = 0; ib < nbtot; ++ib) { + leaf_grid_locs[loclist[ib]] = std::make_pair(ib, ranklist[ib]); + } +} + +void Mesh::SetSameLevelNeighbors( + BlockList_t &block_list, const LogicalLocMap_t &loc_map, RootGridInfo root_grid, + int nbs, bool gmg_neighbors, int composite_logical_level, + const std::unordered_set &newly_refined) { + for (auto &pmb : block_list) { + auto loc = pmb->loc; + auto gid = pmb->gid; + auto *neighbor_list = gmg_neighbors ? &(pmb->gmg_same_neighbors) : &(pmb->neighbors); + if (gmg_neighbors && loc.level() == composite_logical_level - 1) { + neighbor_list = &(pmb->gmg_composite_finer_neighbors); + } else if (gmg_neighbors && loc.level() == composite_logical_level) { + neighbor_list = &(pmb->gmg_same_neighbors); + } else if (gmg_neighbors) { + PARTHENON_FAIL("GMG grid was build incorrectly."); + } + + *neighbor_list = {}; + + auto possible_neighbors = loc.GetPossibleNeighbors(root_grid); + for (auto &pos_neighbor_location : possible_neighbors) { + if (gmg_neighbors && loc.level() == composite_logical_level - 1 && + loc.level() == pos_neighbor_location.level()) + continue; + if (loc_map.count(pos_neighbor_location) > 0) { + const auto &gid_rank = loc_map.at(pos_neighbor_location); + auto offsets = loc.GetSameLevelOffsets(pos_neighbor_location, root_grid); + // This inner loop is necessary in case a block pair has multiple neighbor + // connections due to periodic boundaries + for (auto ox1 : offsets[0]) { + for (auto ox2 : offsets[1]) { + for (auto ox3 : offsets[2]) { + NeighborConnect nc; + int connect_indicator = std::abs(ox1) + std::abs(ox2) + std::abs(ox3); + if (connect_indicator == 0) continue; + if (connect_indicator == 1) { + nc = NeighborConnect::face; + } else if (connect_indicator == 2) { + nc = NeighborConnect::edge; + } else if (connect_indicator == 3) { + nc = NeighborConnect::corner; + } + auto f = loc.GetAthenaXXFaceOffsets(pos_neighbor_location, ox1, ox2, ox3, + root_grid); + neighbor_list->emplace_back( + pmb->pmy_mesh, pos_neighbor_location, gid_rank.second, gid_rank.first, + gid_rank.first - nbs, std::array{ox1, ox2, ox3}, nc, 0, 0, f[0], + f[1]); + } + } + } + } + } + // Set neighbor block ownership + std::unordered_set allowed_neighbors; + allowed_neighbors.insert(pmb->loc); + for (auto &nb : *neighbor_list) + allowed_neighbors.insert(nb.loc); + for (auto &nb : *neighbor_list) { + nb.ownership = + DetermineOwnership(nb.loc, allowed_neighbors, root_grid, newly_refined); + nb.ownership.initialized = true; + } + } +} + +void Mesh::BuildGMGHierarchy(int nbs, ParameterInput *pin, ApplicationInput *app_in) { + if (!multigrid) return; + // Create GMG logical location lists, first just copy coarsest grid + auto block_size_default = GetBlockSize(); + + int gmg_level_offset = std::numeric_limits::max(); + for (auto dir : {X1DIR, X2DIR, X3DIR}) { + if (!mesh_size.symmetry(dir)) { + int dir_allowed_levels = + NumberOfBinaryTrailingZeros(block_size_default.nx(dir) * nrbx[dir - 1]); + gmg_level_offset = std::min(dir_allowed_levels, gmg_level_offset); + } + } + + const int gmg_min_level = root_level - gmg_level_offset; + gmg_min_logical_level_ = gmg_min_level; + + const int gmg_levels = current_level - gmg_min_level + 1; + gmg_grid_locs = std::vector(gmg_levels); + gmg_block_lists = std::vector(gmg_levels); + + // Create MeshData objects for GMG + gmg_mesh_data = std::vector>>(gmg_levels); + for (auto &mdc : gmg_mesh_data) + mdc.SetMeshPointer(this); + + // Add leaf grid locations to GMG grid levels + int gmg_gid = 0; + for (auto loc : loclist) { + const int gmg_level = gmg_levels - 1 + loc.level() - current_level; + gmg_grid_locs[gmg_level].insert( + {loc, std::pair(gmg_gid, ranklist[gmg_gid])}); + if (gmg_level < gmg_levels - 1) { + gmg_grid_locs[gmg_level + 1].insert( + {loc, std::pair(gmg_gid, ranklist[gmg_gid])}); + } + if (ranklist[gmg_gid] == Globals::my_rank) { + const int lid = gmg_gid - nslist[Globals::my_rank]; + gmg_block_lists[gmg_level].push_back(block_list[lid]); + if (gmg_level < gmg_levels - 1) + gmg_block_lists[gmg_level + 1].push_back(block_list[lid]); + } + gmg_gid++; + } + + // Fill in internal nodes for GMG grid levels from levels on finer GMG grid + for (int gmg_level = gmg_levels - 2; gmg_level >= 0; --gmg_level) { + int grid_logical_level = gmg_level - gmg_levels + 1 + current_level; + for (auto &[loc, gid_rank] : gmg_grid_locs[gmg_level + 1]) { + if (loc.level() == grid_logical_level + 1) { + auto parent = loc.GetParent(); + if (parent.morton() == loc.morton()) { + gmg_grid_locs[gmg_level].insert( + {parent, std::make_pair(gmg_gid, gid_rank.second)}); + if (gid_rank.second == Globals::my_rank) { + BoundaryFlag block_bcs[6]; + auto block_size = block_size_default; + SetBlockSizeAndBoundaries(parent, block_size, block_bcs); + gmg_block_lists[gmg_level].push_back( + MeshBlock::Make(gmg_gid, -1, parent, block_size, block_bcs, this, pin, + app_in, packages, resolved_packages, gflag)); + } + gmg_gid++; + } + } + } + } + + // Find same level neighbors on all GMG levels + auto root_grid = this->GetRootGridInfo(); + for (int gmg_level = 0; gmg_level < gmg_levels; ++gmg_level) { + int grid_logical_level = gmg_level - gmg_levels + 1 + current_level; + SetSameLevelNeighbors(gmg_block_lists[gmg_level], gmg_grid_locs[gmg_level], root_grid, + nbs, true, grid_logical_level); + } + + // Now find GMG coarser neighbor + for (int gmg_level = 1; gmg_level < gmg_levels; ++gmg_level) { + int grid_logical_level = gmg_level - gmg_levels + 1 + current_level; + for (auto &pmb : gmg_block_lists[gmg_level]) { + if (pmb->loc.level() != grid_logical_level) continue; + auto parent_loc = pmb->loc.GetParent(); + auto loc = pmb->loc; + auto gid = pmb->gid; + auto rank = Globals::my_rank; + if (gmg_grid_locs[gmg_level - 1].count(parent_loc) > 0) { + loc = parent_loc; + gid = gmg_grid_locs[gmg_level - 1][parent_loc].first; + rank = gmg_grid_locs[gmg_level - 1][parent_loc].second; + } else { + PARTHENON_FAIL("There is something wrong with GMG block list."); + } + pmb->gmg_coarser_neighbors.emplace_back(pmb->pmy_mesh, loc, rank, gid, gid - nbs, + std::array{0, 0, 0}, + NeighborConnect::none, 0, 0, 0, 0); + } + } + + // Now find finer GMG neighbors + for (int gmg_level = 0; gmg_level < gmg_levels - 1; ++gmg_level) { + int grid_logical_level = gmg_level - gmg_levels + 1 + current_level; + for (auto &pmb : gmg_block_lists[gmg_level]) { + if (pmb->loc.level() != grid_logical_level) continue; + auto daughter_locs = pmb->loc.GetDaughters(); + for (auto &daughter_loc : daughter_locs) { + if (gmg_grid_locs[gmg_level + 1].count(daughter_loc) > 0) { + auto &gid_rank = gmg_grid_locs[gmg_level + 1][daughter_loc]; + pmb->gmg_finer_neighbors.emplace_back( + pmb->pmy_mesh, daughter_loc, gid_rank.second, gid_rank.first, + gid_rank.first - nbs, std::array{0, 0, 0}, NeighborConnect::none, 0, + 0, 0, 0); + } + } + } + } +} +} // namespace parthenon diff --git a/src/mesh/mesh.cpp b/src/mesh/mesh.cpp index ed81e3e4e219..9150ac8583b0 100644 --- a/src/mesh/mesh.cpp +++ b/src/mesh/mesh.cpp @@ -63,7 +63,7 @@ namespace parthenon { Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, int mesh_test) : // public members: - modified(true), + modified(true), is_restart(false), // aggregate initialization of RegionSize struct: mesh_size({pin->GetReal("parthenon/mesh", "x1min"), pin->GetReal("parthenon/mesh", "x2min"), @@ -90,18 +90,21 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, adaptive(pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "adaptive" ? true : false), - multilevel((adaptive || - pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "static") - ? true - : false), + multilevel( + (adaptive || + pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "static" || + pin->GetOrAddString("parthenon/mesh", "multigrid", "false") == "true") + ? true + : false), + multigrid(pin->GetOrAddString("parthenon/mesh", "multigrid", "false") == "true" + ? true + : false), nbnew(), nbdel(), step_since_lb(), gflag(), packages(packages), // private members: num_mesh_threads_(pin->GetOrAddInteger("parthenon/mesh", "num_threads", 1)), tree(this), use_uniform_meshgen_fn_{true, true, true, true}, lb_flag_(true), - lb_automatic_(), lb_manual_(), MeshGenerator_{nullptr, UniformMeshGenerator, - UniformMeshGenerator, - UniformMeshGenerator}, - MeshBndryFnctn{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr} { + lb_automatic_(), + lb_manual_(), MeshBndryFnctn{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr} { std::stringstream msg; RegionSize block_size; BoundaryFlag block_bcs[6]; @@ -204,6 +207,7 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, nrbx[dir - 1] = mesh_size.nx(dir) / block_size.nx(dir); } nbmax = *std::max_element(std::begin(nrbx), std::end(nrbx)); + base_block_size = block_size; // check consistency of the block and mesh if (mesh_size.nx(X1DIR) % block_size.nx(X1DIR) != 0 || @@ -221,18 +225,6 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, } // initialize user-enrollable functions - if (mesh_size.xrat(X1DIR) != 1.0) { - use_uniform_meshgen_fn_[X1DIR] = false; - MeshGenerator_[X1DIR] = DefaultMeshGenerator; - } - if (mesh_size.xrat(X2DIR) != 1.0) { - use_uniform_meshgen_fn_[X2DIR] = false; - MeshGenerator_[X2DIR] = DefaultMeshGenerator; - } - if (mesh_size.xrat(X3DIR) != 1.0) { - use_uniform_meshgen_fn_[X3DIR] = false; - MeshGenerator_[X3DIR] = DefaultMeshGenerator; - } default_pack_size_ = pin->GetOrAddInteger("parthenon/mesh", "pack_size", -1); // calculate the logical root level and maximum level @@ -323,81 +315,38 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, << "Refinement region must be smaller than the whole mesh." << std::endl; PARTHENON_FAIL(msg); } - // find the logical range in the ref_level - // note: if this is too slow, this should be replaced with bi-section search. - std::int64_t lx1min = 0, lx1max = 0, lx2min = 0, lx2max = 0, lx3min = 0, - lx3max = 0; - std::int64_t lxmax = nrbx[X1DIR - 1] * (1LL << ref_lev); - for (lx1min = 0; lx1min < lxmax; lx1min++) { - Real rx = - ComputeMeshGeneratorX(lx1min + 1, lxmax, use_uniform_meshgen_fn_[X1DIR]); - if (MeshGenerator_[X1DIR](rx, mesh_size) > ref_size.xmin(X1DIR)) break; - } - for (lx1max = lx1min; lx1max < lxmax; lx1max++) { - Real rx = - ComputeMeshGeneratorX(lx1max + 1, lxmax, use_uniform_meshgen_fn_[X1DIR]); - if (MeshGenerator_[X1DIR](rx, mesh_size) >= ref_size.xmax(X1DIR)) break; - } - if (lx1min % 2 == 1) lx1min--; - if (lx1max % 2 == 0) lx1max++; - if (ndim >= 2) { // 2D or 3D - lxmax = nrbx[X2DIR - 1] * (1LL << ref_lev); - for (lx2min = 0; lx2min < lxmax; lx2min++) { - Real rx = - ComputeMeshGeneratorX(lx2min + 1, lxmax, use_uniform_meshgen_fn_[X2DIR]); - if (MeshGenerator_[X2DIR](rx, mesh_size) > ref_size.xmin(X2DIR)) break; - } - for (lx2max = lx2min; lx2max < lxmax; lx2max++) { - Real rx = - ComputeMeshGeneratorX(lx2max + 1, lxmax, use_uniform_meshgen_fn_[X2DIR]); - if (MeshGenerator_[X2DIR](rx, mesh_size) >= ref_size.xmax(X2DIR)) break; - } - if (lx2min % 2 == 1) lx2min--; - if (lx2max % 2 == 0) lx2max++; - } - if (ndim == 3) { // 3D - lxmax = nrbx[X3DIR - 1] * (1LL << ref_lev); - for (lx3min = 0; lx3min < lxmax; lx3min++) { - Real rx = - ComputeMeshGeneratorX(lx3min + 1, lxmax, use_uniform_meshgen_fn_[X3DIR]); - if (MeshGenerator_[X3DIR](rx, mesh_size) > ref_size.xmin(X3DIR)) break; - } - for (lx3max = lx3min; lx3max < lxmax; lx3max++) { - Real rx = - ComputeMeshGeneratorX(lx3max + 1, lxmax, use_uniform_meshgen_fn_[X3DIR]); - if (MeshGenerator_[X3DIR](rx, mesh_size) >= ref_size.xmax(X3DIR)) break; + std::int64_t l_region_min[3]{0, 0, 0}; + std::int64_t l_region_max[3]{1, 1, 1}; + for (auto dir : {X1DIR, X2DIR, X3DIR}) { + if (!mesh_size.symmetry(dir)) { + l_region_min[dir - 1] = + GetLLFromMeshCoordinate(dir, lrlev, ref_size.xmin(dir)); + l_region_max[dir - 1] = + GetLLFromMeshCoordinate(dir, lrlev, ref_size.xmax(dir)); + l_region_min[dir - 1] = + std::max(l_region_min[dir - 1], static_cast(0)); + l_region_max[dir - 1] = + std::min(l_region_max[dir - 1], + static_cast(nrbx[dir - 1] * (1LL << ref_lev) - 1)); + auto current_loc = + LogicalLocation(lrlev, l_region_max[0], l_region_max[1], l_region_max[2]); + // Remove last block if it just it's boundary overlaps with the region + if (GetMeshCoordinate(dir, BlockLocation::Left, current_loc) == + ref_size.xmax(dir)) + l_region_max[dir - 1]--; + if (l_region_min[dir - 1] % 2 == 1) l_region_min[dir - 1]--; + if (l_region_max[dir - 1] % 2 == 0) l_region_max[dir - 1]++; } - if (lx3min % 2 == 1) lx3min--; - if (lx3max % 2 == 0) lx3max++; } - // create the finest level - if (ndim == 1) { - for (std::int64_t i = lx1min; i < lx1max; i += 2) { - LogicalLocation nloc(lrlev, i, 0, 0); - int nnew; - tree.AddMeshBlock(nloc, nnew); - } - } - if (ndim == 2) { - for (std::int64_t j = lx2min; j < lx2max; j += 2) { - for (std::int64_t i = lx1min; i < lx1max; i += 2) { - LogicalLocation nloc(lrlev, i, j, 0); + for (std::int64_t k = l_region_min[2]; k < l_region_max[2]; k += 2) { + for (std::int64_t j = l_region_min[1]; j < l_region_max[1]; j += 2) { + for (std::int64_t i = l_region_min[0]; i < l_region_max[0]; i += 2) { + LogicalLocation nloc(lrlev, i, j, k); int nnew; tree.AddMeshBlock(nloc, nnew); } } } - if (ndim == 3) { - for (std::int64_t k = lx3min; k < lx3max; k += 2) { - for (std::int64_t j = lx2min; j < lx2max; j += 2) { - for (std::int64_t i = lx1min; i < lx1max; i += 2) { - LogicalLocation nloc(lrlev, i, j, k); - int nnew; - tree.AddMeshBlock(nloc, nnew); - } - } - } - } } pib = pib->pnext; } @@ -443,6 +392,7 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, costlist = std::vector(nbtotal, 1.0); CalculateLoadBalance(costlist, ranklist, nslist, nblist); + PopulateLeafLocationMap(); // Output some diagnostic information to terminal @@ -471,9 +421,11 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, block_list[i - nbs] = MeshBlock::Make(i, i - nbs, loclist[i], block_size, block_bcs, this, pin, app_in, packages, resolved_packages, gflag); - block_list[i - nbs]->SearchAndSetNeighbors(tree, ranklist.data(), nslist.data()); + block_list[i - nbs]->SearchAndSetNeighbors(this, tree, ranklist.data(), + nslist.data()); } - + SetSameLevelNeighbors(block_list, leaf_grid_locs, this->GetRootGridInfo(), nbs, false); + BuildGMGHierarchy(nbs, pin, app_in); ResetLoadBalanceVariables(); } @@ -484,7 +436,7 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, RestartReader &rr, : // public members: // aggregate initialization of RegionSize struct: // (will be overwritten by memcpy from restart file, in this case) - modified(true), + modified(true), is_restart(true), // aggregate initialization of RegionSize struct: mesh_size({pin->GetReal("parthenon/mesh", "x1min"), pin->GetReal("parthenon/mesh", "x2min"), @@ -511,18 +463,21 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, RestartReader &rr, adaptive(pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "adaptive" ? true : false), - multilevel((adaptive || - pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "static") - ? true - : false), + multilevel( + (adaptive || + pin->GetOrAddString("parthenon/mesh", "refinement", "none") == "static" || + pin->GetOrAddString("parthenon/mesh", "multigrid", "false") == "true") + ? true + : false), + multigrid(pin->GetOrAddString("parthenon/mesh", "multigrid", "false") == "true" + ? true + : false), nbnew(), nbdel(), step_since_lb(), gflag(), packages(packages), // private members: num_mesh_threads_(pin->GetOrAddInteger("parthenon/mesh", "num_threads", 1)), tree(this), use_uniform_meshgen_fn_{true, true, true, true}, lb_flag_(true), - lb_automatic_(), lb_manual_(), MeshGenerator_{nullptr, UniformMeshGenerator, - UniformMeshGenerator, - UniformMeshGenerator}, - MeshBndryFnctn{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr} { + lb_automatic_(), + lb_manual_(), MeshBndryFnctn{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr} { std::stringstream msg; RegionSize block_size; BoundaryFlag block_bcs[6]; @@ -594,26 +549,20 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, RestartReader &rr, const auto nGhost = rr.GetAttr("Info", "NGhost"); for (auto &dir : {X1DIR, X2DIR, X3DIR}) { + block_size.xrat(dir) = mesh_size.xrat(dir); block_size.nx(dir) = blockSize[dir - 1] - (blockSize[dir - 1] > 1) * includesGhost * 2 * nGhost; + if (block_size.nx(dir) == 1) { + block_size.symmetry(dir) = true; + mesh_size.symmetry(dir) = true; + } else { + block_size.symmetry(dir) = false; + mesh_size.symmetry(dir) = false; + } // calculate the number of the blocks nrbx[dir - 1] = mesh_size.nx(dir) / block_size.nx(dir); } - - // initialize user-enrollable functions - if (mesh_size.xrat(X1DIR) != 1.0) { - use_uniform_meshgen_fn_[X1DIR] = false; - MeshGenerator_[X1DIR] = DefaultMeshGenerator; - } - if (mesh_size.xrat(X2DIR) != 1.0) { - use_uniform_meshgen_fn_[X2DIR] = false; - MeshGenerator_[X2DIR] = DefaultMeshGenerator; - } - if (mesh_size.xrat(X3DIR) != 1.0) { - use_uniform_meshgen_fn_[X3DIR] = false; - MeshGenerator_[X3DIR] = DefaultMeshGenerator; - } - default_pack_size_ = pin->GetOrAddInteger("parthenon/mesh", "pack_size", -1); + base_block_size = block_size; // Load balancing flag and parameters RegisterLoadBalancing_(pin); @@ -697,6 +646,7 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, RestartReader &rr, } CalculateLoadBalance(costlist, ranklist, nslist, nblist); + PopulateLeafLocationMap(); // Output MeshBlock list and quit (mesh test only); do not create meshes if (mesh_test > 0) { @@ -729,9 +679,11 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, RestartReader &rr, block_list[i - nbs] = MeshBlock::Make(i, i - nbs, loclist[i], block_size, block_bcs, this, pin, app_in, packages, resolved_packages, gflag, costlist[i]); - block_list[i - nbs]->SearchAndSetNeighbors(tree, ranklist.data(), nslist.data()); + block_list[i - nbs]->SearchAndSetNeighbors(this, tree, ranklist.data(), + nslist.data()); } - + SetSameLevelNeighbors(block_list, leaf_grid_locs, this->GetRootGridInfo(), nbs, false); + BuildGMGHierarchy(nbs, pin, app_in); ResetLoadBalanceVariables(); } @@ -938,40 +890,6 @@ void Mesh::EnrollBndryFncts_(ApplicationInput *app_in) { } } -//---------------------------------------------------------------------------------------- -//! \fn void Mesh::EnrollUserMeshGenerator(CoordinateDirection,MeshGenFunc my_mg) -// \brief Enroll a user-defined function for Mesh generation - -void Mesh::EnrollUserMeshGenerator(CoordinateDirection dir, MeshGenFunc my_mg) { - std::stringstream msg; - if (dir < 0 || dir >= 3) { - msg << "### FATAL ERROR in EnrollUserMeshGenerator function" << std::endl - << "dirName = " << dir << " not valid" << std::endl; - PARTHENON_FAIL(msg); - } - if (dir == X1DIR && mesh_size.xrat(X1DIR) > 0.0) { - msg << "### FATAL ERROR in EnrollUserMeshGenerator function" << std::endl - << "x1rat = " << mesh_size.xrat(X1DIR) - << " must be negative for user-defined mesh generator in X1DIR " << std::endl; - PARTHENON_FAIL(msg); - } - if (dir == X2DIR && mesh_size.xrat(X2DIR) > 0.0) { - msg << "### FATAL ERROR in EnrollUserMeshGenerator function" << std::endl - << "x2rat = " << mesh_size.xrat(X2DIR) - << " must be negative for user-defined mesh generator in X2DIR " << std::endl; - PARTHENON_FAIL(msg); - } - if (dir == X3DIR && mesh_size.xrat(X3DIR) > 0.0) { - msg << "### FATAL ERROR in EnrollUserMeshGenerator function" << std::endl - << "x3rat = " << mesh_size.xrat(X3DIR) - << " must be negative for user-defined mesh generator in X3DIR " << std::endl; - PARTHENON_FAIL(msg); - } - use_uniform_meshgen_fn_[dir] = false; - MeshGenerator_[dir] = my_mg; - return; -} - //---------------------------------------------------------------------------------------- // \!fn void Mesh::ApplyUserWorkBeforeOutput(ParameterInput *pin) // \brief Apply MeshBlock::UserWorkBeforeOutput @@ -1040,7 +958,16 @@ void Mesh::Initialize(bool init_problem, ParameterInput *pin, ApplicationInput * tag_map.clear(); for (int i = 0; i < num_partitions; i++) { auto &md = mesh_data.GetOrAdd("base", i); - tag_map.AddMeshDataToMap(md); + tag_map.AddMeshDataToMap(md); + for (int gmg_level = 0; gmg_level < gmg_mesh_data.size(); ++gmg_level) { + auto &mdg = gmg_mesh_data[gmg_level].GetOrAdd(gmg_level, "base", i); + // tag_map.AddMeshDataToMap(mdg); + tag_map.AddMeshDataToMap(mdg); + tag_map.AddMeshDataToMap(mdg); + tag_map.AddMeshDataToMap(mdg); + tag_map.AddMeshDataToMap(mdg); + tag_map.AddMeshDataToMap(mdg); + } } tag_map.ResolveMap(); @@ -1071,6 +998,11 @@ void Mesh::Initialize(bool init_problem, ParameterInput *pin, ApplicationInput * for (int i = 0; i < num_partitions; i++) { auto &md = mesh_data.GetOrAdd("base", i); BuildBoundaryBuffers(md); + for (int gmg_level = 0; gmg_level < gmg_mesh_data.size(); ++gmg_level) { + auto &mdg = gmg_mesh_data[gmg_level].GetOrAdd(gmg_level, "base", i); + BuildBoundaryBuffers(mdg); + BuildGMGBoundaryBuffers(mdg); + } } std::vector sent(num_partitions, false); @@ -1173,7 +1105,7 @@ void Mesh::Initialize(bool init_problem, ParameterInput *pin, ApplicationInput * } while (!init_done); // Initialize the "base" MeshData object - mesh_data.Get()->Set(block_list); + mesh_data.Get()->Set(block_list, this); Kokkos::Profiling::popRegion(); // Mesh::Initialize } @@ -1194,40 +1126,68 @@ std::shared_ptr Mesh::FindMeshBlock(int tgid) const { // RegionSize &block_size, BundaryFlag *block_bcs) // \brief Set the physical part of a block_size structure and block boundary conditions -void Mesh::SetBlockSizeAndBoundaries(LogicalLocation loc, RegionSize &block_size, +bool Mesh::SetBlockSizeAndBoundaries(LogicalLocation loc, RegionSize &block_size, BoundaryFlag *block_bcs) { - const int &ll = loc.level(); - const std::array lx{-1, loc.lx1(), loc.lx2(), loc.lx3()}; + bool valid_region = true; + block_size = GetBlockSize(loc); for (auto &dir : {X1DIR, X2DIR, X3DIR}) { - block_size.xrat(dir) = mesh_size.xrat(dir); - block_size.symmetry(dir) = mesh_size.symmetry(dir); - if (mesh_size.nx(dir) == 1) { - block_size.xmin(dir) = mesh_size.xmin(dir); - block_size.xmax(dir) = mesh_size.xmax(dir); + if (!block_size.symmetry(dir)) { + std::int64_t nrbx_ll = nrbx[dir - 1] << (loc.level() - root_level); + if (loc.level() < root_level) { + std::int64_t fac = 1 << (root_level - loc.level()); + nrbx_ll = nrbx[dir - 1] / fac + (nrbx[dir - 1] % fac != 0); + } + block_bcs[GetInnerBoundaryFace(dir)] = + loc.l(dir - 1) == 0 ? mesh_bcs[GetInnerBoundaryFace(dir)] : BoundaryFlag::block; + block_bcs[GetOuterBoundaryFace(dir)] = loc.l(dir - 1) == nrbx_ll - 1 + ? mesh_bcs[GetOuterBoundaryFace(dir)] + : BoundaryFlag::block; + } else { block_bcs[GetInnerBoundaryFace(dir)] = mesh_bcs[GetInnerBoundaryFace(dir)]; block_bcs[GetOuterBoundaryFace(dir)] = mesh_bcs[GetOuterBoundaryFace(dir)]; - } else { - std::int64_t nrbx_ll = nrbx[dir - 1] << (ll - root_level); - if (lx[dir] == 0) { - block_size.xmin(dir) = mesh_size.xmin(dir); - block_bcs[GetInnerBoundaryFace(dir)] = mesh_bcs[GetInnerBoundaryFace(dir)]; - } else { - Real rx = ComputeMeshGeneratorX(lx[dir], nrbx_ll, use_uniform_meshgen_fn_[dir]); - block_size.xmin(dir) = MeshGenerator_[dir](rx, mesh_size); - block_bcs[GetInnerBoundaryFace(dir)] = BoundaryFlag::block; - } + } + } + return valid_region; +} + +//---------------------------------------------------------------------------------------- +// \!fn void Mesh::GetBlockSize(const LogicalLocation &loc) const +// \brief Find the (hyper-)rectangular region of the grid covered by the block at +// logical location loc - if (lx[dir] == nrbx_ll - 1) { +RegionSize Mesh::GetBlockSize(const LogicalLocation &loc) const { + RegionSize block_size = GetBlockSize(); + bool valid_region = true; + for (auto &dir : {X1DIR, X2DIR, X3DIR}) { + block_size.xrat(dir) = mesh_size.xrat(dir); + block_size.symmetry(dir) = mesh_size.symmetry(dir); + if (!block_size.symmetry(dir)) { + std::int64_t nrbx_ll = nrbx[dir - 1] << (loc.level() - root_level); + if (loc.level() < root_level) { + std::int64_t fac = 1 << (root_level - loc.level()); + nrbx_ll = nrbx[dir - 1] / fac + (nrbx[dir - 1] % fac != 0); + } + block_size.xmin(dir) = GetMeshCoordinate(dir, BlockLocation::Left, loc); + block_size.xmax(dir) = GetMeshCoordinate(dir, BlockLocation::Right, loc); + // Correct for possible overshooting, since the root grid may not cover the + // entire logical level zero block of the mesh + if (block_size.xmax(dir) > mesh_size.xmax(dir) || loc.level() < 0) { + // Need integer reduction factor, so transform location back to root level + PARTHENON_REQUIRE(loc.level() < root_level, "Something is messed up."); + std::int64_t loc_low = loc.l(dir - 1) << (root_level - loc.level()); + std::int64_t loc_hi = (loc.l(dir - 1) + 1) << (root_level - loc.level()); + if (block_size.nx(dir) * (nrbx[dir - 1] - loc_low) % (loc_hi - loc_low) != 0) + valid_region = false; + block_size.nx(dir) = + block_size.nx(dir) * (nrbx[dir - 1] - loc_low) / (loc_hi - loc_low); block_size.xmax(dir) = mesh_size.xmax(dir); - block_bcs[GetOuterBoundaryFace(dir)] = mesh_bcs[GetOuterBoundaryFace(dir)]; - } else { - Real rx = - ComputeMeshGeneratorX(lx[dir] + 1, nrbx_ll, use_uniform_meshgen_fn_[dir]); - block_size.xmax(dir) = MeshGenerator_[dir](rx, mesh_size); - block_bcs[GetOuterBoundaryFace(dir)] = BoundaryFlag::block; } + } else { + block_size.xmin(dir) = mesh_size.xmin(dir); + block_size.xmax(dir) = mesh_size.xmax(dir); } } + return block_size; } std::int64_t Mesh::GetTotalCells() { @@ -1239,7 +1199,7 @@ std::int64_t Mesh::GetTotalCells() { int Mesh::GetNumberOfMeshBlockCells() const { return block_list.front()->GetNumberOfMeshBlockCells(); } -const RegionSize &Mesh::GetBlockSize() const { return block_list.front()->block_size; } +const RegionSize &Mesh::GetBlockSize() const { return base_block_size; } // Functionality re-used in mesh constructor void Mesh::RegisterLoadBalancing_(ParameterInput *pin) { @@ -1273,7 +1233,9 @@ void Mesh::SetupMPIComms() { // Create both boundary and flux communicators for everything with either FillGhost // or WithFluxes just to be safe if (metadata.IsSet(Metadata::FillGhost) || metadata.IsSet(Metadata::WithFluxes) || - metadata.IsSet(Metadata::ForceRemeshComm)) { + metadata.IsSet(Metadata::ForceRemeshComm) || + metadata.IsSet(Metadata::GMGProlongate) || + metadata.IsSet(Metadata::GMGRestrict)) { MPI_Comm mpi_comm; PARTHENON_MPI_CHECK(MPI_Comm_dup(MPI_COMM_WORLD, &mpi_comm)); const auto ret = mpi_comm_map_.insert({pair.first.label(), mpi_comm}); diff --git a/src/mesh/mesh.hpp b/src/mesh/mesh.hpp index ad0d1a85539c..efc6feb57b9e 100644 --- a/src/mesh/mesh.hpp +++ b/src/mesh/mesh.hpp @@ -21,13 +21,16 @@ // The Mesh is the overall grid structure, and MeshBlocks are local patches of data // (potentially on different levels) that tile the entire domain. +#include #include #include +#include #include #include #include #include #include +#include #include #include @@ -61,6 +64,9 @@ class MeshRefinement; class ParameterInput; class RestartReader; +// Map from LogicalLocation to (gid, rank) pair of location +using LogicalLocMap_t = std::map>; + //---------------------------------------------------------------------------------------- //! \class Mesh // \brief data/functions associated with the overall mesh @@ -91,13 +97,16 @@ class Mesh { // TODO(JMM): Move block_size into mesh. int GetNumberOfMeshBlockCells() const; const RegionSize &GetBlockSize() const; + RegionSize GetBlockSize(const LogicalLocation &loc) const; // data bool modified; + const bool is_restart; RegionSize mesh_size; + RegionSize base_block_size; BoundaryFlag mesh_bcs[BOUNDARY_NFACES]; const int ndim; // number of dimensions - const bool adaptive, multilevel; + const bool adaptive, multilevel, multigrid; int nbtotal, nbnew, nbdel; std::uint64_t mbcnt; @@ -110,9 +119,16 @@ class Mesh { DataCollection> mesh_data; + LogicalLocMap_t leaf_grid_locs; + std::vector gmg_grid_locs; + std::vector gmg_block_lists; + std::vector>> gmg_mesh_data; + int GetGMGMaxLevel() { return gmg_grid_locs.size() - 1; } + int GetGMGMinLogicalLevel() { return gmg_min_logical_level_; } + // functions void Initialize(bool init_problem, ParameterInput *pin, ApplicationInput *app_in); - void SetBlockSizeAndBoundaries(LogicalLocation loc, RegionSize &block_size, + bool SetBlockSizeAndBoundaries(LogicalLocation loc, RegionSize &block_size, BoundaryFlag *block_bcs); void OutputCycleDiagnostics(); void LoadBalancingAndAdaptiveMeshRefinement(ParameterInput *pin, @@ -166,10 +182,11 @@ class Mesh { int GetRootLevel() const noexcept { return root_level; } RootGridInfo GetRootGridInfo() const noexcept { - return RootGridInfo(root_level, nrbx[0], nrbx[1], nrbx[2], - mesh_bcs[BoundaryFace::inner_x1] == BoundaryFlag::periodic, - mesh_bcs[BoundaryFace::inner_x2] == BoundaryFlag::periodic, - mesh_bcs[BoundaryFace::inner_x3] == BoundaryFlag::periodic); + return RootGridInfo( + root_level, nrbx[0], nrbx[1], nrbx[2], + mesh_bcs[BoundaryFace::inner_x1] == BoundaryFlag::periodic && ndim > 0, + mesh_bcs[BoundaryFace::inner_x2] == BoundaryFlag::periodic && ndim > 1, + mesh_bcs[BoundaryFace::inner_x3] == BoundaryFlag::periodic && ndim > 2); } int GetMaxLevel() const noexcept { return max_level; } int GetCurrentLevel() const noexcept { return current_level; } @@ -254,14 +271,14 @@ class Mesh { // size of default MeshBlockPacks int default_pack_size_; + int gmg_min_logical_level_ = 0; + #ifdef MPI_PARALLEL // Global map of MPI comms for separate variables std::unordered_map mpi_comm_map_; #endif // functions - MeshGenFunc MeshGenerator_[4]; - void CalculateLoadBalance(std::vector const &costlist, std::vector &ranklist, std::vector &nslist, std::vector &nblist); @@ -273,74 +290,43 @@ class Mesh { bool GatherCostListAndCheckBalance(); void RedistributeAndRefineMeshBlocks(ParameterInput *pin, ApplicationInput *app_in, int ntot); - + void BuildGMGHierarchy(int nbs, ParameterInput *pin, ApplicationInput *app_in); + void + SetSameLevelNeighbors(BlockList_t &block_list, const LogicalLocMap_t &loc_map, + RootGridInfo root_grid, int nbs, bool gmg_neighbors, + int composite_logical_level = 0, + const std::unordered_set &newly_refined = {}); // defined in either the prob file or default_pgen.cpp in ../pgen/ static void InitUserMeshDataDefault(Mesh *mesh, ParameterInput *pin); std::function InitUserMeshData = InitUserMeshDataDefault; void EnrollBndryFncts_(ApplicationInput *app_in); - void EnrollUserMeshGenerator(CoordinateDirection dir, MeshGenFunc my_mg); // Re-used functionality in constructor void RegisterLoadBalancing_(ParameterInput *pin); void SetupMPIComms(); -}; - -//---------------------------------------------------------------------------------------- -// \!fn Real ComputeMeshGeneratorX(std::int64_t index, std::int64_t nrange, -// bool sym_interval) -// \brief wrapper fn to compute Real x logical location for either [0., 1.] or [-0.5, 0.5] -// real cell ranges for MeshGenerator_[] functions (default/user vs. uniform) - -inline Real ComputeMeshGeneratorX(std::int64_t index, std::int64_t nrange, - bool sym_interval) { - // index is typically 0, ... nrange for non-ghost boundaries - if (!sym_interval) { - // to map to fractional logical position [0.0, 1.0], simply divide by # of faces - return static_cast(index) / static_cast(nrange); - } else { - // to map to a [-0.5, 0.5] range, rescale int indices around 0 before FP conversion - // if nrange is even, there is an index at center x=0.0; map it to (int) 0 - // if nrange is odd, the center x=0.0 is between two indices; map them to -1, 1 - std::int64_t noffset = index - (nrange) / 2; - std::int64_t noffset_ceil = index - (nrange + 1) / 2; // = noffset if nrange is even - // std::cout << "noffset, noffset_ceil = " << noffset << ", " << noffset_ceil << "\n"; - // average the (possibly) biased integer indexing - return static_cast(noffset + noffset_ceil) / (2.0 * nrange); + void PopulateLeafLocationMap(); + + // Transform from logical location coordinates to uniform mesh coordinates accounting + // for root grid + Real GetMeshCoordinate(CoordinateDirection dir, BlockLocation bloc, + const LogicalLocation &loc) const { + auto xll = loc.LLCoord(dir, bloc); + auto root_fac = static_cast(1 << root_level) / static_cast(nrbx[dir - 1]); + xll *= root_fac; + return mesh_size.xmin(dir) * (1.0 - xll) + mesh_size.xmax(dir) * xll; } -} -//---------------------------------------------------------------------------------------- -// \!fn Real DefaultMeshGenerator(Real x, RegionSize rs) -// \brief generic default mesh generator function, x is the logical location; x=i/nx, real -// in [0., 1.] -template -inline Real DefaultMeshGenerator(Real x, RegionSize rs) { - Real lw, rw; - if (rs.xrat(dir) == 1.0) { - rw = x, lw = 1.0 - x; - } else { - Real ratn = std::pow(rs.xrat(dir), rs.nx(dir)); - Real rnx = std::pow(rs.xrat(dir), x * rs.nx(dir)); - lw = (rnx - ratn) / (1.0 - ratn); - rw = 1.0 - lw; + std::int64_t GetLLFromMeshCoordinate(CoordinateDirection dir, int level, + Real xmesh) const { + auto root_fac = static_cast(1 << root_level) / static_cast(nrbx[dir - 1]); + auto xLL = (xmesh - mesh_size.xmin(dir)) / + (mesh_size.xmax(dir) - mesh_size.xmin(dir)) / root_fac; + return static_cast((1 << std::max(level, 0)) * xLL); } - // linear interp, equally weighted from left (x(xmin)=0.0) and right (x(xmax)=1.0) - return rs.xmin(dir) * lw + rs.xmax(dir) * rw; -} - -//---------------------------------------------------------------------------------------- -// \!fn Real UniformMeshGeneratorX1(Real x, RegionSize rs) -// \brief generic mesh generator function, x is the logical location; real cells in [-0.5, -// 0.5] -template -inline Real UniformMeshGenerator(Real x, RegionSize rs) { - // linear interp, equally weighted from left (x(xmin)=-0.5) and right (x(xmax)=0.5) - return static_cast(0.5) * (rs.xmin(dir) + rs.xmax(dir)) + - (x * rs.xmax(dir) - x * rs.xmin(dir)); -} +}; } // namespace parthenon diff --git a/src/mesh/meshblock.cpp b/src/mesh/meshblock.cpp index 4735e10c88c5..e2d7af7c6ac0 100644 --- a/src/mesh/meshblock.cpp +++ b/src/mesh/meshblock.cpp @@ -218,8 +218,17 @@ void MeshBlock::InitializeIndexShapesImpl(const int nx1, const int nx2, const in if (init_coarse) { if (multilevel) { + // Prevent the coarse bounds from going to zero + int cnx1 = nx1 / 2; + int cnx2 = nx2 / 2; + int cnx3 = nx3 / 2; + if (pmy_mesh != nullptr) { + cnx1 = pmy_mesh->mesh_size.symmetry(X1DIR) ? 0 : std::max(1, nx1 / 2); + cnx2 = pmy_mesh->mesh_size.symmetry(X2DIR) ? 0 : std::max(1, nx2 / 2); + cnx3 = pmy_mesh->mesh_size.symmetry(X3DIR) ? 0 : std::max(1, nx3 / 2); + } cnghost = (Globals::nghost + 1) / 2 + 1; - c_cellbounds = IndexShape(nx3 / 2, nx2 / 2, nx1 / 2, Globals::nghost); + c_cellbounds = IndexShape(cnx3, cnx2, cnx1, Globals::nghost); } else { c_cellbounds = IndexShape(nx3 / 2, nx2 / 2, nx1 / 2, 0); } diff --git a/src/mesh/meshblock.hpp b/src/mesh/meshblock.hpp index e488c9db424e..a00b2b5b0b4e 100644 --- a/src/mesh/meshblock.hpp +++ b/src/mesh/meshblock.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -161,6 +161,13 @@ class MeshBlock : public std::enable_shared_from_this { std::unique_ptr pbswarm; std::unique_ptr pmr; + // Block connectivity information + std::vector neighbors; + std::vector gmg_coarser_neighbors; + std::vector gmg_composite_finer_neighbors; + std::vector gmg_same_neighbors; + std::vector gmg_finer_neighbors; + BoundaryFlag boundary_flag[6]; // functions @@ -273,8 +280,9 @@ class MeshBlock : public std::enable_shared_from_this { int GetNumberOfMeshBlockCells() const { return block_size.nx(X1DIR) * block_size.nx(X2DIR) * block_size.nx(X3DIR); } - void SearchAndSetNeighbors(MeshBlockTree &tree, int *ranklist, int *nslist) { - pbval->SearchAndSetNeighbors(tree, ranklist, nslist); + void SearchAndSetNeighbors(Mesh *mesh, MeshBlockTree &tree, int *ranklist, + int *nslist) { + pbval->SearchAndSetNeighbors(mesh, tree, ranklist, nslist); } // inform MeshBlock which arrays contained in member Field, Particles, diff --git a/src/outputs/history.cpp b/src/outputs/history.cpp index f36586b4a0d9..3f6117f6adf2 100644 --- a/src/outputs/history.cpp +++ b/src/outputs/history.cpp @@ -67,13 +67,13 @@ void HistoryOutput::WriteOutputFile(Mesh *pm, ParameterInput *pin, SimTime *tm, auto &md_base = pm->mesh_data.Get(); // Populated with all blocks if (md_base->NumBlocks() == 0) { - md_base->Set(pm->block_list); + md_base->Set(pm->block_list, pm); } else if (md_base->NumBlocks() != pm->block_list.size()) { PARTHENON_WARN( "Resetting \"base\" MeshData to contain all blocks. This indicates that " "the \"base\" MeshData container has been modified elsewhere. Double check " "that the modification was intentional and is compatible with this reset.") - md_base->Set(pm->block_list); + md_base->Set(pm->block_list, pm); } auto result = hist_var.hst_fun(md_base.get()); #ifdef MPI_PARALLEL diff --git a/src/prolong_restrict/pr_ops.hpp b/src/prolong_restrict/pr_ops.hpp index f9252e748fbc..95eb94879923 100644 --- a/src/prolong_restrict/pr_ops.hpp +++ b/src/prolong_restrict/pr_ops.hpp @@ -1,9 +1,9 @@ //======================================================================================== // Parthenon performance portable AMR framework -// Copyright(C) 2020-2022 The Parthenon collaboration +// Copyright(C) 2020-2023 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 // for Los Alamos National Laboratory (LANL), which is operated by Triad @@ -94,9 +94,9 @@ GetGridSpacings(const Coordinates_t &coords, const Coordinates_t &coarse_coords, KOKKOS_FORCEINLINE_FUNCTION Real GradMinMod(const Real fc, const Real fm, const Real fp, const Real dxm, - const Real dxp) { - const Real gxm = (fc - fm) / dxm; - const Real gxp = (fp - fc) / dxp; + const Real dxp, Real &gxm, Real &gxp) { + gxm = (fc - fm) / dxm; + gxp = (fp - fc) / dxp; return 0.5 * (SIGN(gxm) + SIGN(gxp)) * std::min(std::abs(gxm), std::abs(gxp)); } @@ -163,7 +163,8 @@ struct RestrictAverage { } }; -struct ProlongateSharedMinMod { +template +struct ProlongateSharedGeneral { static constexpr bool OperationRequired(TopologicalElement fel, TopologicalElement cel) { return fel == cel; @@ -199,65 +200,84 @@ struct ProlongateSharedMinMod { Real dx1fm = 0; [[maybe_unused]] Real dx1fp = 0; - Real gx1c = 0; + [[maybe_unused]] Real gx1m = 0, gx1p = 0; if constexpr (INCLUDE_X1) { Real dx1m, dx1p; GetGridSpacings<1, el>(coords, coarse_coords, cib, ib, i, fi, &dx1m, &dx1p, &dx1fm, &dx1fp); - gx1c = GradMinMod(fc, coarse(element_idx, l, m, n, k, j, i - 1), - coarse(element_idx, l, m, n, k, j, i + 1), dx1m, dx1p); + + Real gx1c = + GradMinMod(fc, coarse(element_idx, l, m, n, k, j, i - 1), + coarse(element_idx, l, m, n, k, j, i + 1), dx1m, dx1p, gx1m, gx1p); + if constexpr (use_minmod_slope) { + gx1m = gx1c; + gx1p = gx1c; + } } Real dx2fm = 0; [[maybe_unused]] Real dx2fp = 0; - Real gx2c = 0; + [[maybe_unused]] Real gx2m = 0, gx2p = 0; if constexpr (INCLUDE_X2) { Real dx2m, dx2p; GetGridSpacings<2, el>(coords, coarse_coords, cjb, jb, j, fj, &dx2m, &dx2p, &dx2fm, &dx2fp); - gx2c = GradMinMod(fc, coarse(element_idx, l, m, n, k, j - 1, i), - coarse(element_idx, l, m, n, k, j + 1, i), dx2m, dx2p); + Real gx2c = + GradMinMod(fc, coarse(element_idx, l, m, n, k, j - 1, i), + coarse(element_idx, l, m, n, k, j + 1, i), dx2m, dx2p, gx2m, gx2p); + if constexpr (use_minmod_slope) { + gx2m = gx2c; + gx2p = gx2c; + } } Real dx3fm = 0; [[maybe_unused]] Real dx3fp = 0; - Real gx3c = 0; + [[maybe_unused]] Real gx3m = 0, gx3p = 0; if constexpr (INCLUDE_X3) { Real dx3m, dx3p; GetGridSpacings<3, el>(coords, coarse_coords, ckb, kb, k, fk, &dx3m, &dx3p, &dx3fm, &dx3fp); - gx3c = GradMinMod(fc, coarse(element_idx, l, m, n, k - 1, j, i), - coarse(element_idx, l, m, n, k + 1, j, i), dx3m, dx3p); + Real gx3c = + GradMinMod(fc, coarse(element_idx, l, m, n, k - 1, j, i), + coarse(element_idx, l, m, n, k + 1, j, i), dx3m, dx3p, gx3m, gx3p); + if constexpr (use_minmod_slope) { + gx3m = gx3c; + gx3p = gx3c; + } } // KGF: add the off-centered quantities first to preserve FP symmetry // JMM: Extraneous quantities are zero fine(element_idx, l, m, n, fk, fj, fi) = - fc - (gx1c * dx1fm + gx2c * dx2fm + gx3c * dx3fm); + fc - (gx1m * dx1fm + gx2m * dx2fm + gx3m * dx3fm); if constexpr (INCLUDE_X1) fine(element_idx, l, m, n, fk, fj, fi + 1) = - fc + (gx1c * dx1fp - gx2c * dx2fm - gx3c * dx3fm); + fc + (gx1p * dx1fp - gx2m * dx2fm - gx3m * dx3fm); if constexpr (INCLUDE_X2) fine(element_idx, l, m, n, fk, fj + 1, fi) = - fc - (gx1c * dx1fm - gx2c * dx2fp + gx3c * dx3fm); + fc - (gx1m * dx1fm - gx2p * dx2fp + gx3m * dx3fm); if constexpr (INCLUDE_X2 && INCLUDE_X1) fine(element_idx, l, m, n, fk, fj + 1, fi + 1) = - fc + (gx1c * dx1fp + gx2c * dx2fp - gx3c * dx3fm); + fc + (gx1p * dx1fp + gx2p * dx2fp - gx3m * dx3fm); if constexpr (INCLUDE_X3) fine(element_idx, l, m, n, fk + 1, fj, fi) = - fc - (gx1c * dx1fm + gx2c * dx2fm - gx3c * dx3fp); + fc - (gx1m * dx1fm + gx2m * dx2fm - gx3p * dx3fp); if constexpr (INCLUDE_X3 && INCLUDE_X1) fine(element_idx, l, m, n, fk + 1, fj, fi + 1) = - fc + (gx1c * dx1fp - gx2c * dx2fm + gx3c * dx3fp); + fc + (gx1p * dx1fp - gx2m * dx2fm + gx3p * dx3fp); if constexpr (INCLUDE_X3 && INCLUDE_X2) fine(element_idx, l, m, n, fk + 1, fj + 1, fi) = - fc - (gx1c * dx1fm - gx2c * dx2fp - gx3c * dx3fp); + fc - (gx1m * dx1fm - gx2p * dx2fp - gx3p * dx3fp); if constexpr (INCLUDE_X3 && INCLUDE_X2 && INCLUDE_X1) fine(element_idx, l, m, n, fk + 1, fj + 1, fi + 1) = - fc + (gx1c * dx1fp + gx2c * dx2fp + gx3c * dx3fp); + fc + (gx1p * dx1fp + gx2p * dx2fp + gx3p * dx3fp); } }; +using ProlongateSharedMinMod = ProlongateSharedGeneral; +using ProlongateSharedLinear = ProlongateSharedGeneral; + struct ProlongateInternalAverage { static constexpr bool OperationRequired(TopologicalElement fel, TopologicalElement cel) { diff --git a/src/utils/bit_hacks.hpp b/src/utils/bit_hacks.hpp new file mode 100644 index 000000000000..3d82d24206c2 --- /dev/null +++ b/src/utils/bit_hacks.hpp @@ -0,0 +1,93 @@ +//======================================================================================== +// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// +// This program was produced under U.S. Government contract 89233218CNA000001 for Los +// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +// for the U.S. Department of Energy/National Nuclear Security Administration. All rights +// in the program are reserved by Triad National Security, LLC, and the U.S. Department +// of Energy/National Nuclear Security Administration. The Government is granted for +// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +// license in this material to reproduce, prepare derivative works, distribute copies to +// the public, perform publicly and display publicly, and to permit others to do so. +//======================================================================================== + +#ifndef UTILS_BIT_HACKS_HPP_ +#define UTILS_BIT_HACKS_HPP_ + +namespace parthenon { +namespace impl { +template +inline constexpr uint64_t GetInterleaveConstant(int power) { + // For power = 2, NDIM = 3, this should return + // ...000011000011 + // For power = 1, NDIM = 3, this should return + // ...001001001001 + // For power = 2, NDIM = 2, this should return + // ...001100110011 + // etc. + constexpr int type_bit_size = sizeof(uint64_t) * 8; + if (power >= type_bit_size) return ~0ULL; // Return with all bits set + uint64_t i_const = ~((~0ULL) << power); // std::pow(2, power) - 1; + int cur_shift = type_bit_size * NDIM; // Works for anything that will fit in uint64_t + while (cur_shift >= NDIM * power) { + if (cur_shift < type_bit_size) i_const = (i_const << cur_shift) | i_const; + cur_shift /= 2; + } + return i_const; +} +} // namespace impl + +template +inline uint64_t InterleaveZeros(uint64_t x) { + // This is a standard bithack for interleaving zeros in binary numbers to make a Morton + // number + if constexpr (N_VALID_BITS >= 64) + x = (x | x << 64 * (NDIM - 1)) & impl::GetInterleaveConstant(64); + if constexpr (N_VALID_BITS >= 32) + x = (x | x << 32 * (NDIM - 1)) & impl::GetInterleaveConstant(32); + if constexpr (N_VALID_BITS >= 16) + x = (x | x << 16 * (NDIM - 1)) & impl::GetInterleaveConstant(16); + if constexpr (N_VALID_BITS >= 8) + x = (x | x << 8 * (NDIM - 1)) & impl::GetInterleaveConstant(8); + if constexpr (N_VALID_BITS >= 4) + x = (x | x << 4 * (NDIM - 1)) & impl::GetInterleaveConstant(4); + if constexpr (N_VALID_BITS >= 2) + x = (x | x << 2 * (NDIM - 1)) & impl::GetInterleaveConstant(2); + if constexpr (N_VALID_BITS >= 1) + x = (x | x << 1 * (NDIM - 1)) & impl::GetInterleaveConstant(1); + return x; +} + +inline int NumberOfBinaryTrailingZeros(std::uint64_t val) { + int n = 0; + if (val == 0) return sizeof(val) * 8; + if ((val & 0xFFFFFFFF) == 0) { + n += 32; + val = val >> 32; + } + if ((val & 0x0000FFFF) == 0) { + n += 16; + val = val >> 16; + } + if ((val & 0x000000FF) == 0) { + n += 8; + val = val >> 8; + } + if ((val & 0x0000000F) == 0) { + n += 4; + val = val >> 4; + } + if ((val & 0x00000003) == 0) { + n += 2; + val = val >> 2; + } + if ((val & 0x00000001) == 0) { + n += 1; + val = val >> 1; + } + return n; +} + +} // namespace parthenon + +#endif // UTILS_BIT_HACKS_HPP_ diff --git a/src/utils/indexer.hpp b/src/utils/indexer.hpp index 6a5b62328394..311f5c416440 100644 --- a/src/utils/indexer.hpp +++ b/src/utils/indexer.hpp @@ -14,6 +14,7 @@ #define UTILS_INDEXER_HPP_ #include +#include #include #include #include @@ -27,6 +28,14 @@ struct Indexer { KOKKOS_INLINE_FUNCTION Indexer() : N{}, start{}, _size{} {}; + std::string GetRangesString() const { + std::string out; + for (int i = 0; i < sizeof...(Ts); ++i) { + out += "[ " + std::to_string(start[i]) + ", " + std::to_string(end[i]) + "]"; + } + return out; + } + KOKKOS_INLINE_FUNCTION explicit Indexer(std::pair... Ns) : N{GetFactors(std::make_tuple((Ns.second - Ns.first + 1)...), diff --git a/src/utils/loop_utils.hpp b/src/utils/loop_utils.hpp index 9f0412b9c31b..38baed40f769 100644 --- a/src/utils/loop_utils.hpp +++ b/src/utils/loop_utils.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2022 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2022-2023. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -24,6 +24,7 @@ #include "bvals/comms/bnd_info.hpp" // TODO(JMM): Remove me when possible #include "interface/metadata.hpp" #include "mesh/domain.hpp" // TODO(JMM): Remove me when possible +#include "mesh/mesh.hpp" namespace parthenon { @@ -79,36 +80,82 @@ inline void ForEachBoundary(std::shared_ptr> &md, F func) { for (int block = 0; block < md->NumBlocks(); ++block) { auto &rc = md->GetBlockData(block); auto pmb = rc->GetBlockPointer(); + auto *gmg_same = pmb->loc.level() == md->grid.logical_level + ? &(pmb->gmg_same_neighbors) + : &(pmb->gmg_composite_finer_neighbors); for (auto &v : rc->GetVariableVector()) { - if (v->IsSet(Metadata::FillGhost) || v->IsSet(Metadata::WithFluxes)) { - for (int n = 0; n < pmb->pbval->nneighbor; ++n) { - auto &nb = pmb->pbval->neighbor[n]; - if constexpr (bound == BoundaryType::local) { - if (!v->IsSet(Metadata::FillGhost)) continue; - if (nb.snb.rank != Globals::my_rank) continue; - } else if constexpr (bound == BoundaryType::nonlocal) { - if (!v->IsSet(Metadata::FillGhost)) { - continue; + if constexpr (bound == BoundaryType::gmg_restrict_send) { + if (pmb->loc.level() != md->grid.logical_level) continue; + if (v->IsSet(Metadata::GMGRestrict)) { + for (auto &nb : pmb->gmg_coarser_neighbors) { + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) return; + } + } + } else if constexpr (bound == BoundaryType::gmg_restrict_recv) { + if (pmb->loc.level() != md->grid.logical_level) continue; + if (v->IsSet(Metadata::GMGRestrict)) { + for (auto &nb : pmb->gmg_finer_neighbors) { + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) { + return; + } + } + } + } else if constexpr (bound == BoundaryType::gmg_prolongate_send) { + if (pmb->loc.level() != md->grid.logical_level) continue; + if (v->IsSet(Metadata::GMGProlongate)) { + for (auto &nb : pmb->gmg_finer_neighbors) { + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) { + return; + } + } + } + } else if constexpr (bound == BoundaryType::gmg_prolongate_recv) { + if (pmb->loc.level() != md->grid.logical_level) continue; + if (v->IsSet(Metadata::GMGProlongate)) { + for (auto &nb : pmb->gmg_coarser_neighbors) { + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) { + return; + } + } + } + } else if constexpr (bound == BoundaryType::gmg_same) { + if (v->IsSet(Metadata::FillGhost)) { + for (auto &nb : *gmg_same) { + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) { + return; + } + } + } + } else { + if (v->IsSet(Metadata::FillGhost) || v->IsSet(Metadata::WithFluxes)) { + for (auto &nb : pmb->neighbors) { + if constexpr (bound == BoundaryType::local) { + if (!v->IsSet(Metadata::FillGhost)) continue; + if (nb.snb.rank != Globals::my_rank) continue; + } else if constexpr (bound == BoundaryType::nonlocal) { + if (!v->IsSet(Metadata::FillGhost)) { + continue; + } + if (nb.snb.rank == Globals::my_rank) continue; + } else if constexpr (bound == BoundaryType::any) { + if (!v->IsSet(Metadata::FillGhost)) continue; + } else if constexpr (bound == BoundaryType::flxcor_send) { + if (!v->IsSet(Metadata::WithFluxes)) continue; + // Check if this boundary requires flux correction + if (nb.snb.level != pmb->loc.level() - 1) continue; + // No flux correction required unless boundaries share a face + if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) != 1) + continue; + } else if constexpr (bound == BoundaryType::flxcor_recv) { + if (!v->IsSet(Metadata::WithFluxes)) continue; + // Check if this boundary requires flux correction + if (nb.snb.level - 1 != pmb->loc.level()) continue; + // No flux correction required unless boundaries share a face + if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) != 1) + continue; } - if (nb.snb.rank == Globals::my_rank) continue; - } else if constexpr (bound == BoundaryType::any) { - if (!v->IsSet(Metadata::FillGhost)) continue; - } else if constexpr (bound == BoundaryType::flxcor_send) { - if (!v->IsSet(Metadata::WithFluxes)) continue; - // Check if this boundary requires flux correction - if (nb.snb.level != pmb->loc.level() - 1) continue; - // No flux correction required unless boundaries share a face - if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) != 1) - continue; - } else if constexpr (bound == BoundaryType::flxcor_recv) { - if (!v->IsSet(Metadata::WithFluxes)) continue; - // Check if this boundary requires flux correction - if (nb.snb.level - 1 != pmb->loc.level()) continue; - // No flux correction required unless boundaries share a face - if (std::abs(nb.ni.ox1) + std::abs(nb.ni.ox2) + std::abs(nb.ni.ox3) != 1) - continue; + if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) return; } - if (func_caller(func, pmb, rc, nb, v) == LoopControl::break_out) return; } } } diff --git a/src/utils/morton_number.hpp b/src/utils/morton_number.hpp index 9cde833a1980..28aacb18b940 100644 --- a/src/utils/morton_number.hpp +++ b/src/utils/morton_number.hpp @@ -16,49 +16,10 @@ #include #include +#include "utils/bit_hacks.hpp" + namespace parthenon { namespace impl { -template -constexpr uint64_t GetInterleaveConstant(int power) { - // For power = 2, NDIM = 3, this should return - // ...000011000011 - // For power = 1, NDIM = 3, this should return - // ...001001001001 - // For power = 2, NDIM = 2, this should return - // ...001100110011 - // etc. - constexpr int type_bit_size = sizeof(uint64_t) * 8; - if (power >= type_bit_size) return ~0ULL; // Return with all bits set - uint64_t i_const = ~((~0ULL) << power); // std::pow(2, power) - 1; - int cur_shift = type_bit_size * NDIM; // Works for anything that will fit in uint64_t - while (cur_shift >= NDIM * power) { - if (cur_shift < type_bit_size) i_const = (i_const << cur_shift) | i_const; - cur_shift /= 2; - } - return i_const; -} - -template -uint64_t InterleaveZeros(uint64_t x) { - // This is a standard bithack for interleaving zeros in binary numbers to make a Morton - // number - if constexpr (N_VALID_BITS >= 64) - x = (x | x << 64 * (NDIM - 1)) & GetInterleaveConstant(64); - if constexpr (N_VALID_BITS >= 32) - x = (x | x << 32 * (NDIM - 1)) & GetInterleaveConstant(32); - if constexpr (N_VALID_BITS >= 16) - x = (x | x << 16 * (NDIM - 1)) & GetInterleaveConstant(16); - if constexpr (N_VALID_BITS >= 8) - x = (x | x << 8 * (NDIM - 1)) & GetInterleaveConstant(8); - if constexpr (N_VALID_BITS >= 4) - x = (x | x << 4 * (NDIM - 1)) & GetInterleaveConstant(4); - if constexpr (N_VALID_BITS >= 2) - x = (x | x << 2 * (NDIM - 1)) & GetInterleaveConstant(2); - if constexpr (N_VALID_BITS >= 1) - x = (x | x << 1 * (NDIM - 1)) & GetInterleaveConstant(1); - return x; -} - inline uint64_t GetMortonBits(int level, uint64_t x, uint64_t y, uint64_t z, int chunk) { constexpr int NBITS = 21; constexpr uint64_t lowest_nbits_mask = ~((~static_cast(0)) << NBITS); diff --git a/tst/regression/CMakeLists.txt b/tst/regression/CMakeLists.txt index bf902381bbe7..96a54da6e427 100644 --- a/tst/regression/CMakeLists.txt +++ b/tst/regression/CMakeLists.txt @@ -116,6 +116,12 @@ if (ENABLE_HDF5) --driver_input ${CMAKE_CURRENT_SOURCE_DIR}/test_suites/poisson/parthinput.poisson") list(APPEND EXTRA_TEST_LABELS "poisson") + list(APPEND TEST_DIRS poisson_gmg) + list(APPEND TEST_PROCS ${NUM_MPI_PROC_TESTING}) + list(APPEND TEST_ARGS "--driver ${PROJECT_BINARY_DIR}/example/poisson_gmg/poisson-gmg-example \ + --driver_input ${CMAKE_CURRENT_SOURCE_DIR}/test_suites/poisson_gmg/parthinput.poisson") + list(APPEND EXTRA_TEST_LABELS "poisson_gmg") + list(APPEND TEST_DIRS sparse_advection) list(APPEND TEST_PROCS ${NUM_MPI_PROC_TESTING}) list(APPEND TEST_ARGS "--driver ${PROJECT_BINARY_DIR}/example/sparse_advection/sparse_advection-example \ diff --git a/tst/regression/test_suites/poisson_gmg/__init__.py b/tst/regression/test_suites/poisson_gmg/__init__.py new file mode 100644 index 000000000000..e69de29bb2d1 diff --git a/tst/regression/test_suites/poisson_gmg/parthinput.poisson b/tst/regression/test_suites/poisson_gmg/parthinput.poisson new file mode 100644 index 000000000000..e30e721fb17b --- /dev/null +++ b/tst/regression/test_suites/poisson_gmg/parthinput.poisson @@ -0,0 +1,69 @@ +# ======================================================================================== +# (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +# +# This program was produced under U.S. Government contract 89233218CNA000001 for Los +# Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +# for the U.S. Department of Energy/National Nuclear Security Administration. All rights +# in the program are reserved by Triad National Security, LLC, and the U.S. Department +# of Energy/National Nuclear Security Administration. The Government is granted for +# itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +# license in this material to reproduce, prepare derivative works, distribute copies to +# the public, perform publicly and display publicly, and to permit others to do so. +# ======================================================================================== + + +problem_id = poisson + + +refinement = static +multigrid = true + +nx1 = 64 +x1min = -1.0 +x1max = 1.0 +ix1_bc = user +ox1_bc = user + +nx2 = 64 +x2min = -1.0 +x2max = 1.0 +ix2_bc = user +ox2_bc = user + +nx3 = 1 +x3min = 0.0 +x3max = 1.0 +ix3_bc = periodic +ox3_bc = periodic + + +nx1 = 32 +nx2 = 32 +nx3 = 1 + + +#nlim = -1 +#tlim = 1.0 +#integrator = rk2 +#ncycle_out_mesh = -10000 + + +x1min = -1.0 +x1max = -0.75 +x2min = -1.0 +x2max = -0.75 +level = 3 + + +solver = MG +smoother = SRJ2 +do_FAS = true +max_iterations = 15 + +x0 = 0.0 +y0 = 0.0 +z0 = 0.0 +radius = 0.5 +diagonal_alpha = 0.0 +interior_D = 1.0 +exterior_D = 1.0 diff --git a/tst/regression/test_suites/poisson_gmg/poisson_gmg.py b/tst/regression/test_suites/poisson_gmg/poisson_gmg.py new file mode 100644 index 000000000000..bb4b8e0a577a --- /dev/null +++ b/tst/regression/test_suites/poisson_gmg/poisson_gmg.py @@ -0,0 +1,31 @@ +# ======================================================================================== +# Parthenon performance portable AMR framework +# Copyright(C) 2020 The Parthenon collaboration +# Licensed under the 3-clause BSD License, see LICENSE file for details +# ======================================================================================== +# (C) (or copyright) 2021. Triad National Security, LLC. All rights reserved. +# +# This program was produced under U.S. Government contract 89233218CNA000001 for Los +# Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC +# for the U.S. Department of Energy/National Nuclear Security Administration. All rights +# in the program are reserved by Triad National Security, LLC, and the U.S. Department +# of Energy/National Nuclear Security Administration. The Government is granted for +# itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide +# license in this material to reproduce, prepare derivative works, distribute copies to +# the public, perform publicly and display publicly, and to permit others to do so. +# ======================================================================================== + +# Modules +import sys +import utils.test_case + +# To prevent littering up imported folders with .pyc files or __pycache_ folder +sys.dont_write_bytecode = True + + +class TestCase(utils.test_case.TestCaseAbs): + def Prepare(self, parameters, step): + return parameters + + def Analyse(self, parameters): + return True diff --git a/tst/unit/test_logical_location.cpp b/tst/unit/test_logical_location.cpp index 351c52b01b6b..b4de3a457bc6 100644 --- a/tst/unit/test_logical_location.cpp +++ b/tst/unit/test_logical_location.cpp @@ -132,7 +132,7 @@ TEST_CASE("Logical Location", "[Logical Location]") { hash_leaves.insert(std::begin(leaves), std::end(leaves)); // Create a set of the leaves - std::set set_leaves; + std::unordered_set set_leaves; for (const auto &[k, v] : leaves) set_leaves.insert(k); @@ -185,7 +185,7 @@ TEST_CASE("Logical Location", "[Logical Location]") { auto possible_neighbors = base_loc.GetPossibleBlocksSurroundingTopologicalElement(1, 0, 0); - std::set by_hand_elements, automatic_elements; + std::unordered_set by_hand_elements, automatic_elements; // There should be five total neighboring blocks of this face since one neighbor is // refined by_hand_elements.insert(LogicalLocation(2, 2, 3, 3)); @@ -207,7 +207,7 @@ TEST_CASE("Logical Location", "[Logical Location]") { auto possible_neighbors = base_loc.GetPossibleBlocksSurroundingTopologicalElement(1, -1, 0); - std::set by_hand_elements, automatic_elements; + std::unordered_set by_hand_elements, automatic_elements; // There should be five total neighboring blocks of this edge since one neighbor is // refined by_hand_elements.insert(LogicalLocation(2, 2, 2, 3)); @@ -229,7 +229,7 @@ TEST_CASE("Logical Location", "[Logical Location]") { auto possible_neighbors = base_loc.GetPossibleBlocksSurroundingTopologicalElement(1, -1, -1); - std::set by_hand_elements, automatic_elements; + std::unordered_set by_hand_elements, automatic_elements; // There should be eight total neighboring blocks for this node by_hand_elements.insert(LogicalLocation(2, 2, 2, 3)); by_hand_elements.insert(LogicalLocation(2, 2, 2, 2)); diff --git a/tst/unit/test_mesh_data.cpp b/tst/unit/test_mesh_data.cpp index 2c232a7a025c..a4cba8f30f8a 100644 --- a/tst/unit/test_mesh_data.cpp +++ b/tst/unit/test_mesh_data.cpp @@ -78,7 +78,7 @@ TEST_CASE("MeshData works as expected for simple packs", "[MeshData]") { BlockList_t block_list = MakeBlockList(pkg, NBLOCKS, N, NDIM); MeshData mesh_data("base"); - mesh_data.Set(block_list); + mesh_data.Set(block_list, nullptr); THEN("The number of blocks is correct") { REQUIRE(mesh_data.NumBlocks() == NBLOCKS); } diff --git a/tst/unit/test_sparse_pack.cpp b/tst/unit/test_sparse_pack.cpp index 48172d5723f3..013fadda811f 100644 --- a/tst/unit/test_sparse_pack.cpp +++ b/tst/unit/test_sparse_pack.cpp @@ -97,7 +97,7 @@ TEST_CASE("Test behavior of sparse packs", "[SparsePack]") { BlockList_t block_list = MakeBlockList(pkg, NBLOCKS, N, NDIM); MeshData mesh_data("base"); - mesh_data.Set(block_list); + mesh_data.Set(block_list, nullptr); WHEN("We initialize the independent variables by hand and deallocate one") { auto ib = block_list[0]->cellbounds.GetBoundsI(IndexDomain::entire);