Version 0.4 Merge with Stable (#426)
* Used "signed" version of size_t - ptrdiff_t in io_wrapper.cpp

* Remove `KOKKOS_INLINE_FUNCTION` from functions where it's not required/incorrect.

* Fix improperly captured `CellVariable`

* Reintroduce pack helper function (now with cache support)

* Joshua s brown/ci darwin speedup (#347)

* Updated ci to use existing spack install on darwin rather than building packages from scratch each time. 
* Also corrected scheduling to use the develop branch rather than the no longer existing master branch.

* Fixing parameter_input

* Fix formatting

* Added CHANGELOG comment

* Darwin should append on to CMAKE_CXX_FLAGS

* Add checks for all MPI function calls

* Update Changelog

* Add more doc

* fixed indexing errors in mesh domain for edge cases

* finished updating unit domain

* finish domain

* Fix artifacts

* first rewrite of boundaries. More modifications needed.

* address some of Josh Browns comments

* Pass coarse vs. fine boolean through boundary stack.

* jdolence/unify_more

* attempt to fix weird boundaries bug

* try this fix. Actually flip vector components

* fix the weird boundary thing(?)

* Seek throw rather than return error

* void IOWrapper::Close

* template some update functionality

* formatting

* update advection

* trying to limit duplicated code for FluxDivergence

* cleanup some

* template FillDerived

* formatting

* add support for FillDerived functionality on both MeshBlockData and MeshData simultaneously

* formatting

* HDF5 changes: Automatic resource cleanup and error checking

* gitignore cmake trash

* fix a "C++17" language warning

* first pass at boundaries rewrite. Still some debugging to do and the boundaries hack in prolongate/restrict still present.

* abstract out par_for_bndry

* make Parthenon::AppInput a StateDescriptor that automatically populates a Packages_t

* formatting

* oops need to return TaskStatus::complete

* add test for coarse flag for variable_pack and fix bug it uncovered

* added a test for repeated coarse vs fine access to a variable pack and fixed caching bug caught by test

* remove dead code.

* only prolongate boundaries if multilevel

* comment out applyBounds for now

* formatting

* add support for coexistent calls to EstimateTimestep calls with MeshData and MeshBlockData

* remove stray comment

* no bool overload for zero length pack

* formatting and replaced metadata::independent with metadata::fillghost

* add documentation

* changelog

* introduce a Tag function with MeshData and MeshBlockData overloads

* formatting

* replace special strings with constexpr char arrays

* address most of jdolence comments

* add advection outflow test

* partial gold file update

* remove applybounds as it seems unnecessary

* update gold standard for regression test

* fix warnings

* dont need that comment anymore

* fix comments in meshblock.hpp

* add missing endline

* fixed bug where empty variable packs segfault

* doc typo

* Update src/mesh/domain.hpp

* josh browns suggestion of not creating intermediate indexrange object

* comment out si and ei which may be needed for face centered fields but are currently unused

* variable pack constructor take references to avoid double copy. thanks Josh Brown

* add const where suggested

* reduce replicated code in bvals_refine and simplify API by using IndexRange

* non-member par_for_inner

* newline

* const

* Fix no-mpi build

* Brief documentation for H5Handle

* Remove unused variable

* More thorough error checking

PARTHENON_HDF5_CHECK now returns the error value - sometimes it is overloaded as the return value of a function

* Revert par_for_outer changes

* Update copyright

* close restart file prior to MPI finalize

* Fail hard on error

* Fix missing ;

* Remove comment

* Add changelog comment

* -Werror in parthenon-cuda-unit test

* CHANGELOG

* Force hard fail with set -e

* More elegant solution

* Fix submodules by calling update init

* Add CHANGELOG comment

* Update scripts/darwin/build_fast.sh

Co-authored-by: Jonah Miller <[email protected]>

* More elegant fix

* refactoring back to function pointers instead of params

* formatting

* some cleanup

* some updates to docs

* update changelog

* making more things const references

* generalize Update and Average functions

* Update src/parthenon_manager.cpp

Co-authored-by: Philipp Grete <[email protected]>

* more cleanup

* more cleanup

* addressing comments

* fix linting

* oops, undo stupid.

* addressing thread safety issue with setting dt

* cleanup as suggested by Forrest Glines

* update doc

* Fix Kokkos ver

* Add/remove const and force inline func

* addressing review comments

* Add more prof regions

* cpp-auto-formatter

* generalizing update functionality

* move to universal references

* how about const T

* namespace function overloads for task resolution

* move from shared_ptr<T> to *

* Add Changelog

* Add filtered by name for adding MeshBlockDatas to DataCollections

* remove extra templating in Add, and include generalized Copy in MeshData

* add a little error checking

* fix stupid

* refactor advection to use new tasks in Update

* add profiling for SumData

* move FluxDivergence to T *

* add some error checking

* add some documentation

* update changelog

* adding a unit test

* gpuify test

* aghh...formatting

* Output on fail

* Add Changelog comment

* addressing comments

* formatting

* more review comments

* intermediate state descriptor. doesn't compile.

* still working on design

* fix bug and address comments

* working on package conflict resolution

* add const that seems required

* update changelog

* code compiles

* making progress on testing

* more progress on testing

* have everything working, including printed metadata at the start of a run

* changelog and documentation

* fix stupid cmake

* codacy changes

* Update CHANGELOG.md

Co-authored-by: Andrew Gaspar <[email protected]>

* Update changelog to better describe new features.

* Update src/interface/state_descriptor.cpp

Co-authored-by: Andrew Gaspar <[email protected]>

* Update src/interface/metadata.hpp

Co-authored-by: Andrew Gaspar <[email protected]>

* same bits -> flags

* Add GetGlobalSparseID and GetLocalSparseID to VariablePack

* For now both these functions dispatch to GetSparse

* When sparse variables are actually being used, GetGlobalSparseID
  should return the global id and GetLocalSparseID should return the
  variable-pack-local id

* Add as contributor

* Format fixes

* Update CHANGELOG.md and add tests

* Change "ID" -> "Id" for consistency

* Rename methods

* `GetLocalSparseId` -> `GetSparseIndex`

* `GetGlobalSparseId` -> `GetSparseId`

* Update CHANGELOG.md for changes to method names

* Modify c++ linting for when parthenon is submodule

* uses the `--repository` argument to cpplint to set the repository to
  the `${PROJECT_SOURCE_DIR}`

* adds `CPPLINT.cfg` as a dependency to the linting command so changes
  to it will trigger re-linting

* update CHANGELOG.md

* initial refactoring

* fix linting

* formatting

* address comments by jdolence and agaspar

* make Pre/PostExecute virtual

* do not explicitly call Driver::PreExecute

* Add const correctness to `Params.Get`

* update CHANGELOG.md

* reimplement with `at(key)`

* Update state_descriptor.cpp

* Remove old files before running tests (#362)

* Remove old files before running tests

* Added changelog comment

* Fix indentation

* Use clean output folder when running tests

* Add API to state_descriptor for diagnostic outputs

* Extend APIs to support Pre and Post step user work

* Fix formatting

* update CHANGELOG.md

* Time arg const; post-step work after time update

* Fix format

* Initialize OutputDiagnosticsMesh to nullptr

* Implement application level diagnostics output

* Raw pointers in `StateDescriptor` -> std::function

* Fix init timestep calc for MeshData

* Don't overwrite MeshBlockData dt

* Update CHANGELOG.md and some comments

* expose TaskListStatus to downstream codes

* export EvolutionDriver

* update CHANGELOG

* Cleanup some Codacy warnings (#403)

* Cleanup some Codacy warnings

* Update Changelog

* Update comments

* Formatting

* Apply suggestions from code review

Co-authored-by: Andrew Gaspar <[email protected]>

* Update src/parameter_input.cpp

Co-authored-by: Andrew Gaspar <[email protected]>

Co-authored-by: Andrew Gaspar <[email protected]>

* remove unused constructor

* Move post-step user work and diagnostics to before time/cycle update

* Add documentation

* Update `StateDescriptor` documentation adding information about `Pre-`
  and `PostStepDiagnostics` functions

* Update `Mesh` documentation to reflect the additions of the `Pre-` and
  `PostStepUserWorkInLoop` and the `Pre-` and
  `PostStepDiagnosticsInLoop`

* update driver docs

* Joshua s brown/version 0.4 (#422)

Version 0.4

Co-authored-by: Andrew Gaspar <[email protected]>
Co-authored-by: Philipp Grete <[email protected]>
Co-authored-by: Josh Dolence <[email protected]>
Co-authored-by: Jonah Miller <[email protected]>
Co-authored-by: Jonah Miller <[email protected]>
Co-authored-by: par-hermes <[email protected]>
Co-authored-by: Clell J. (CJ) Solomon <[email protected]>
Co-authored-by: clellsolomon <[email protected]>
Co-authored-by: Philipp Grete <[email protected]>
10 people authored Jan 21, 2021
1 parent 3da72ea commit 9ab89bd
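
The commit message items "Add const correctness to `Params.Get`" and "reimplement with `at(key)`" can be illustrated with a minimal sketch. The class `SimpleParams` and the helper `ReadCfl` below are hypothetical stand-ins, not Parthenon's actual `Params` implementation. The sketch only shows why a const getter on a `std::map`-backed store would use `at` rather than `operator[]`.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified parameter store keyed by string.
// This is NOT Parthenon's Params class; it only illustrates the idea.
class SimpleParams {
 public:
  // Insert a new value; reject duplicate keys.
  void Add(const std::string &key, double value) {
    if (!store_.emplace(key, value).second) {
      throw std::invalid_argument("Key " + key + " already exists");
    }
  }

  // std::map::operator[] is non-const because it may insert a default
  // value, so it cannot be called from a const member function.
  // std::map::at has a const overload and throws std::out_of_range
  // when the key is missing.
  double Get(const std::string &key) const { return store_.at(key); }

 private:
  std::map<std::string, double> store_;
};

// Reads through a const reference; this compiles only because Get is const.
inline double ReadCfl(const SimpleParams &params) { return params.Get("cfl"); }
```

With a getter like this, code that holds only a `const SimpleParams &` can still read values, and a missing key surfaces as an exception instead of a silently inserted default.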
Showing 175 changed files with 9,324 additions and 3,764 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -34,3 +34,7 @@ compile_commands.jsonbin
*.phdf
*.phdf.xdmf
*.dat

# Files created by bad CMake hygiene
a.out
cmake_hdf5_test.o
33 changes: 33 additions & 0 deletions .gitlab-ci-darwin.yml
@@ -0,0 +1,33 @@
variables:
SCHEDULER_PARAMETERS: '--nodes=1 --partition=power9 --export=NONE'
GIT_SUBMODULE_STRATEGY: recursive

stages:
- performance-regression

.gcc-mpi-cuda-performance-regression:
variables:
CMAKE_CXX_COMPILER: $CI_PROJECT_DIR/external/Kokkos/bin/nvcc_wrapper
script:
- env -i bash --norc --noprofile ./scripts/darwin/build_fast.sh

artifacts:
expire_in: 3 days
paths:
- ${CI_PROJECT_DIR}/build/tst/regression/outputs/advection_performance/performance.png
- ${CI_PROJECT_DIR}/build/tst/regression/outputs/advection_performance_mpi/performance.png

parthenon-power9-gcc-mpi-cuda-perf-manual:
extends: .gcc-mpi-cuda-performance-regression
stage: performance-regression
when: manual
except:
- schedules

parthenon-power9-gcc-mpi-cuda-perf-schedule:
extends: .gcc-mpi-cuda-performance-regression
stage: performance-regression
only:
- schedules
- develop

1 change: 1 addition & 0 deletions .gitlab-ci-ias.yml
@@ -108,6 +108,7 @@ parthenon-build-cuda:
-DPARTHENON_DISABLE_MPI=ON
-DPARTHENON_DISABLE_HDF5=ON
-DPARTHENON_LINT_DEFAULT=OFF
-DNUM_MPI_PROC_TESTING=1
../
- make -j${J} advection-example
- nvidia-smi
21 changes: 17 additions & 4 deletions .gitlab-ci.yml
@@ -1,4 +1,4 @@
image: cuda10.0-mpi-hdf5-mpl
image: cuda11.0-mpi-hdf5

# Is performed before the scripts in the stages step
before_script:
@@ -18,7 +18,6 @@ cache:

variables:
GIT_SUBMODULE_STRATEGY: recursive

stages:
- short
- performance_and_regression
@@ -41,6 +40,7 @@ parthenon-cuda-unit:
- cd build-cuda-debug
- cmake -DCMAKE_BUILD_TYPE=Debug
-DMACHINE_VARIANT=cuda-mpi
-DCMAKE_CXX_FLAGS=-Werror
../
- make -j${J}
- ctest -LE 'performance|regression'
@@ -79,14 +79,27 @@ parthenon-cuda-short:
- cmake -DCMAKE_BUILD_TYPE=Release
-DMACHINE_VARIANT=cuda-mpi
../
- make -j${J}
- make -j${J} advection-example
- export OMPI_MCA_mpi_common_cuda_event_max=1000
- ctest -R regression_mpi_test:output_hdf5
# Now testing if there are no hidden memcopies between host and device.
# Using a static grid (i.e., not AMR) as additional transfers are expected
# during loadbalance and refinement, but not for a static grid.
# Also delaying start as there are explicit copies during initialization, e.g.,
# when the Variable caches are created.
- nsys profile --delay=5 --duration=5 --stats=true example/advection/advection-example
-i ../tst/regression/test_suites/advection_performance/parthinput.advection_performance
parthenon/mesh/nx1=128 parthenon/mesh/nx2=128 parthenon/mesh/nx3=128
parthenon/meshblock/nx1=64 parthenon/meshblock/nx2=64 parthenon/meshblock/nx3=64
parthenon/time/nlim=1000 |& tee profile.txt
- test $(grep HtoD profile.txt |wc -l) == 0
- test $(grep DtoH profile.txt |wc -l) == 0
artifacts:
when: always
expire_in: 3 days
paths:
- build-cuda-perf-mpi/CMakeFiles/CMakeOutput.log
- build-cuda-perf-mpi/profile.txt

parthenon-cpu-short:
tags:
@@ -98,7 +111,7 @@ parthenon-cpu-short:
- cmake -DCMAKE_BUILD_TYPE=Release
-DMACHINE_VARIANT=mpi
../
- make -j${J}
- make -j${J} advection-example
- ctest -R regression_mpi_test:output_hdf5
artifacts:
when: always
70 changes: 70 additions & 0 deletions CHANGELOG.md
@@ -8,8 +8,75 @@

### Fixed (not changing behavior/API/variables/...)

### Infrastructure (changes irrelevant to downstream codes)

### Removed (removing behavior/API/variables/...)

## Release 0.4.0
Date: 01/19/2021

### Added (new features/APIs/variables/...)
- [[PR 400]](https://github.com/lanl/parthenon/pull/400) Extend `StateDescriptor` for customizable output via user-customizable function pointers `PreStepDiagnosticsMesh` and `PostStepDiagnosticsMesh`
- [[PR 391]](https://github.com/lanl/parthenon/pull/391) Add `VariablePack<T>::GetSparseId` and `VariablePack<T>::GetSparseIndex` to return global sparse ids and pack-local sparse index, respectively.
- [[PR 381]](https://github.com/lanl/parthenon/pull/381) Overload `DataCollection::Add` to build `MeshData` and `MeshBlockData` objects with a subset of variables.
- [[PR 378]](https://github.com/lanl/parthenon/pull/378) Add Kokkos profiling regions throughout the code to allow the collection of characteristic application profiles
- [[PR 358]](https://github.com/lanl/parthenon/pull/358) Generalize code that interfaces with downstream apps to work with both `MeshData` and `MeshBlockData`.
- [[PR 335]](https://github.com/lanl/parthenon/pull/335) Support for project-relative `MACHINE_CFG` with `@PAR_ROOT@`
- [[PR 328]](https://github.com/lanl/parthenon/pull/328) New `MeshBlock` packing interface using `DataCollection`s of `MeshData` and `MeshBlockData`.
- [[PR 386]](https://github.com/lanl/parthenon/pull/386) Introduce `Private`, `Provides`, `Requires`, and `Overridable` variable metadata, allowing fine-grained control of conflict resolution between packages.

### Changed (changing behavior/API/variables/...)
- [[PR 393]](https://github.com/lanl/parthenon/pull/393) Small refactor to make driver code more flexible for downstream apps.
- [[PR 400]](https://github.com/lanl/parthenon/pull/400) Change `Mesh`, `ApplicationInput`, and `Driver` to support pre- and post-step user work
- [[PR 394]](https://github.com/lanl/parthenon/pull/394) Make `Params.Get` const-correct.
- [[PR 332]](https://github.com/lanl/parthenon/pull/332) Rewrote boundary conditions to work on GPUs with variable packs. Re-enabled user-defined boundary conditions via `ApplicationInput`.

### Fixed (not changing behavior/API/variables/...)
- [[\#401]](https://github.com/lanl/parthenon/issues/401) Fix missing initial timestep for MeshData functions
- [[PR 387]](https://github.com/lanl/parthenon/pull/387) Add missing const that was needed
- [[PR 353]](https://github.com/lanl/parthenon/pull/353) Fixed small error in input\_parameter logic
- [[PR 352]](https://github.com/lanl/parthenon/pull/352) Code compiles cleanly (no warnings) with nvcc_wrapper

### Infrastructure (changes irrelevant to downstream codes)
- [[PR 392]](https://github.com/lanl/parthenon/pull/392) Fix C++ linting for when parthenon is a submodule
- [[PR 335]](https://github.com/lanl/parthenon/pull/335) New machine configuration file for LANL's Darwin cluster
- [[PR 200]](https://github.com/lanl/parthenon/pull/200) Adds support for running ci on power9 nodes.
- [[PR 347]](https://github.com/lanl/parthenon/pull/347) Speed up darwin ci by using pre installed spack packages from project space
- [[PR 368]](https://github.com/lanl/parthenon/pull/368) Fixes false positive in ci.
- [[PR 369]](https://github.com/lanl/parthenon/pull/369) Initializes submodules when running on darwin ci.
- [[PR 382]](https://github.com/lanl/parthenon/pull/382) Adds output on fail for fast ci implementation on Darwin.
- [[PR 362]](https://github.com/lanl/parthenon/pull/362) Small fix to clean regression tests output folder on reruns
- [[PR 403]](https://github.com/lanl/parthenon/pull/403) Cleanup Codacy warnings

### Removed (removing behavior/API/variables/...)

## Release 0.3.0
Date: 10/29/2020

### Added (new features/APIs/variables/...)
- [[PR 317]](https://github.com/lanl/parthenon/pull/317) Add initial support for particles (no MPI support)
- [[PR 311]](https://github.com/lanl/parthenon/pull/311) Bugfix::Restart. Fixed restart parallel bug and also restart bug for simulations with reflecting boundary conditions. Added ability to write restart files with or without ghost cells by setting `ghost_zones` in the output block similar to other output formats.
- [[PR 314]](https://github.com/lanl/parthenon/pull/314) Generalized `par_for` abstractions to provide for reductions with a consistent interface.
- [[PR 308]](https://github.com/lanl/parthenon/pull/308) Added the ability to register and name `MeshBlockPack`s in the `Mesh` or in package initialization.
- [[PR 285]](https://github.com/lanl/parthenon/pull/285) Parthenon can now be linked in CMake as `Parthenon::parthenon` when used as a subdirectory, matching install.

### Changed (changing behavior/API/variables/...)
- [[PR 303]](https://github.com/lanl/parthenon/pull/303) Changed `Mesh::BlockList` from a `std::list<MeshBlock>` to a `std::vector<std::shared_ptr<MeshBlock>>`, making `FindMeshBlock` run in constant, rather than linear, time. Loops over `block_list` in application drivers must be changed accordingly.
- [[PR 300]](https://github.com/lanl/parthenon/pull/300): Changes to `AddTask` function signature. Requires re-ordering task dependency argument to front.
- [[PR 307]](https://github.com/lanl/parthenon/pull/307) Changed back-pointers in mesh structure to weak pointers. Cleaned up `MeshBlock` constructor and implemented `MeshBlock` factory function.

### Fixed (not changing behavior/API/variables/...)
- [[PR 293]](https://github.com/lanl/parthenon/pull/293) Changed `VariablePack` and related objects to use `ParArray1D` objects instead of `ParArrayND` objects under the hood to reduce the size of the captured objects.
- [[PR 313]](https://github.com/lanl/parthenon/pull/313) Add include guards for Kokkos in cmake.
- [[PR 321]](https://github.com/lanl/parthenon/pull/321) Make inner loop pattern tags constexpr

### Infrastructure (changes irrelevant to downstream codes)
- [[PR 336]](https://github.com/lanl/parthenon/pull/336) Automated testing now checks for extraneous HtoD or DtoH copies.
- [[PR 325]](https://github.com/lanl/parthenon/pull/325) Fixes regression in convergence tests with multiple MPI ranks.
- [[PR 310]](https://github.com/lanl/parthenon/pull/310) Fix Cuda 11 builds.
- [[PR 281]](https://github.com/lanl/parthenon/pull/281) Allows one to run regression tests with more than one cuda device, Also improves readability of regression tests output.
- [[PR 330]](https://github.com/lanl/parthenon/pull/330) Fixes restart regression test.


## Release 0.2.0
Date: 9/12/2020
@@ -32,6 +99,9 @@ Date: 9/12/2020
- [[PR 262]](https://github.com/lanl/parthenon/pull/262) Fix setting of "coverage" label in testing. Automatically applies coverage tag to all tests not containing "performance" label.
- [[PR 276]](https://github.com/lanl/parthenon/pull/276) Decrease required Python version from 3.6 to 3.5.
- [[PR 283]](https://github.com/lanl/parthenon/pull/283) Change CI to extended nightly develop tests and short push tests.
- [[PR 291]](https://github.com/lanl/parthenon/pull/291) Adds Task Diagram to documentation.

### Removed
- [[PR 282]](https://github.com/lanl/parthenon/pull/282) Integrated MeshBlockPack and tasking in pi example
- [[PR 294]](https://github.com/lanl/parthenon/pull/294) Fix `IndexShape::GetTotal(IndexDomain)` - previously was returning opposite of expected domain result.
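
The PR 400 changelog entry describes user-customizable diagnostics hooks on `StateDescriptor`, and the commit message notes "Raw pointers in `StateDescriptor` -> std::function" and "Move post-step user work and diagnostics to before time/cycle update". A rough sketch of that pattern follows. The types `SimTime` and `PackageDescriptor`, the member names, and the `Step` function are all hypothetical and are not taken from Parthenon's headers or signatures.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the simulation time state.
struct SimTime {
  double time;
  double dt;
};

// Hypothetical package descriptor with optional diagnostics hooks.
// An empty std::function plays the role of a null function pointer,
// but it can also hold capturing lambdas or functors.
struct PackageDescriptor {
  std::function<void(const SimTime &)> pre_step_diagnostics = nullptr;
  std::function<void(const SimTime &)> post_step_diagnostics = nullptr;
};

// One driver step: each hook runs only if the package registered it.
// The post-step hook is called before the time is advanced, mirroring
// the ordering named in the commit message.
inline void Step(const PackageDescriptor &pkg, SimTime &tm) {
  if (pkg.pre_step_diagnostics) pkg.pre_step_diagnostics(tm);
  // ... the actual update of the solution would happen here ...
  if (pkg.post_step_diagnostics) pkg.post_step_diagnostics(tm);
  tm.time += tm.dt;
}
```

A package that registers no hooks pays only for two boolean checks per step, and the hooks receive the time state by const reference, so they cannot modify it.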

85 changes: 46 additions & 39 deletions CMakeLists.txt
@@ -13,31 +13,14 @@

cmake_minimum_required(VERSION 3.12)

# Load machine specific defaults such as architecture and mpi launch command.
# Command line argument takes precedence over environment variable.
# Loading this before project definition to allow setting the compiler.
if (MACHINE_CFG)
if(EXISTS "${MACHINE_CFG}")
include(${MACHINE_CFG})
else()
message(FATAL_ERROR "Given machine configuration at "
"${MACHINE_CFG} not found.")
endif()
elseif (DEFINED ENV{MACHINE_CFG})
if(EXISTS "$ENV{MACHINE_CFG}")
include($ENV{MACHINE_CFG})
else()
message(FATAL_ERROR "Given machine configuration from environment variable "
"MACHINE_CFG at $ENV{MACHINE_CFG} not found.")
endif()
else()
message(WARNING "Not using any machine configuration. Consider creating a configuration "
"file following the examples in ${PROJECT_SOURCE_DIR}/cmake/machine_cfgs/ and then "
"point the MACHINE_CFG variable to your custom file."
"Note, that the machine file can be placed in any directory (also outside the repo).")
endif()
# Imports machine-specific configuration
include(cmake/MachineCfg.cmake)

project(parthenon VERSION 0.4.0 LANGUAGES C CXX)

project(parthenon VERSION 0.2.0 LANGUAGES C CXX)
if (${CMAKE_VERSION} VERSION_GREATER_EQUAL 3.19.0)
cmake_policy(SET CMP0110 NEW)
endif()

include(CTest)

@@ -62,6 +45,8 @@ built some, drivers needed by the regression tests will still be built."
OFF
)

option(PARTHENON_ENABLE_GPU_MPI_CHECKS "Checks if possible that the mpi num of procs and the number\
of gpu devices detected are appropriate." ${BUILD_TESTING})
option(PARTHENON_ENABLE_UNIT_TESTS "Enable unit tests" ${BUILD_TESTING})
option(PARTHENON_ENABLE_INTEGRATION_TESTS "Enable integration tests" ${BUILD_TESTING})
option(PARTHENON_ENABLE_PERFORMANCE_TESTS "Enable performance tests" ${BUILD_TESTING})
@@ -86,8 +71,10 @@ include(cmake/Lint.cmake)
set(NUMBER_GHOST_CELLS ${PARTHENON_NGHOST})

# regression test reference data
set(REGRESSION_GOLD_STANDARD_VER 2 CACHE STRING "Version of gold standard to download and use")
set(REGRESSION_GOLD_STANDARD_HASH "SHA512=45f57d16b76a3a44940e40ea642c1d5a2b3aea681a3064e4c06b031bf264d627f8009e967b92a2cb0fbc08e5cd0ffda381038483f390fa00f47c42a551ca3646" CACHE STRING "Hash of default gold standard file to download")
set(REGRESSION_GOLD_STANDARD_VER 3 CACHE STRING "Version of gold standard to download and use")
set(REGRESSION_GOLD_STANDARD_HASH
"SHA512=2445000a031aafc9a85a684aa7e82fb5ccda05530c5652ab432a8fb254642c7b18252ff55d8306b1120aba898713712b28b804eb30272b06278698046a6461cc"
CACHE STRING "Hash of default gold standard file to download")
option(REGRESSION_GOLD_STANDARD_SYNC "Automatically sync gold standard files." ON)

# set single precision #define
@@ -196,10 +183,16 @@ set(CMAKE_CXX_EXTENSIONS OFF)
SET (Kokkos_ENABLE_AGGRESSIVE_VECTORIZATION ON CACHE BOOL
"Kokkos aggressive vectorization")

# Tell Kokkos we need lambdas in Cuda.
# Check that gpu devices are actually detected
set(NUM_GPU_DEVICES_PER_NODE "1" CACHE STRING "Number of gpu devices to use when testing if built with Kokkos_ENABLE_CUDA")
set(NUM_OMP_THREADS_PER_RANK "1" CACHE STRING "Number of threads to use when testing if built with Kokkos_ENABLE_OPENMP")
if (Kokkos_ENABLE_CUDA)
# Tell Kokkos we need lambdas in Cuda.
SET (Kokkos_ENABLE_CUDA_LAMBDA ON CACHE BOOL
"Enable lambda expressions in CUDA")
if ( "${PARTHENON_ENABLE_GPU_MPI_CHECKS}" )
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cmake/CTestCustom.cmake.in ${CMAKE_BINARY_DIR}/CTestCustom.cmake @ONLY)
endif()
endif()

# If this is a debug build, set kokkos debug on
@@ -222,19 +215,23 @@ endif()
# We want Kokkos to be built with C++14, since that's what we're using in
# Parthenon.
set(CMAKE_CXX_STANDARD 14)
if (PARTHENON_IMPORT_KOKKOS)
find_package(Kokkos 3)
if (NOT Kokkos_FOUND)
unset(PARTHENON_IMPORT_KOKKOS CACHE)
message(FATAL_ERROR "Could not find external Kokkos. Consider importing a Kokkos installation into your environment or disabling external Kokkos with e.g. -DPARTHENON_IMPORT_KOKKOS=OFF")
endif()
else()
if (EXISTS ${Kokkos_ROOT}/CMakeLists.txt)
add_subdirectory(${Kokkos_ROOT} Kokkos)
elseif(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/external/Kokkos/CMakeLists.txt)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/external/Kokkos Kokkos)
if (NOT TARGET Kokkos::kokkos)
if (PARTHENON_IMPORT_KOKKOS)
find_package(Kokkos 3)
if (NOT Kokkos_FOUND)
unset(PARTHENON_IMPORT_KOKKOS CACHE)
message(FATAL_ERROR "Could not find external Kokkos. Consider importing a Kokkos installation into your environment or disabling external Kokkos with e.g. -DPARTHENON_IMPORT_KOKKOS=OFF")
endif()
else()
message(FATAL_ERROR "Could not find Kokkos source. Consider running `git submodule update --init`, providing the path to a Kokkos source directory with Kokkos_ROOT, or setting PARTHENON_IMPORT_KOKKOS=ON to link to an external Kokkos installation.")
if (EXISTS ${Kokkos_ROOT}/CMakeLists.txt)
add_subdirectory(${Kokkos_ROOT} Kokkos)
message(STATUS "Using Kokkos source from Kokkos_ROOT=${Kokkos_ROOT}")
elseif(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/external/Kokkos/CMakeLists.txt)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/external/Kokkos Kokkos)
message(STATUS "Using Kokkos source from Parthenon submodule at ${CMAKE_CURRENT_SOURCE_DIR}/external/Kokkos")
else()
message(FATAL_ERROR "Could not find Kokkos source. Consider running `git submodule update --init`, providing the path to a Kokkos source directory with Kokkos_ROOT, or setting PARTHENON_IMPORT_KOKKOS=ON to link to an external Kokkos installation.")
endif()
endif()
endif()

@@ -262,6 +259,16 @@ if (${PARTHENON_ENABLE_UNIT_TESTS} OR ${PARTHENON_ENABLE_INTEGRATION_TESTS} OR $
else()
message(STATUS "Gold standard is up-to-date. No download required.")
endif()


# Throw a warning if the number of processors available is less than the number specified for running the tests.
include(ProcessorCount)
ProcessorCount(N)
if( "${NUM_MPI_PROC_TESTING}" GREATER "${N}")
message(WARNING "Consider changing the number of MPI processors used in testing, currently "
"NUM_MPI_PROC_TESTING is set to ${NUM_MPI_PROC_TESTING} but cmake has only detected a"
" total of ${N} available processors")
endif()
endif()

# Try finding an installed Catch2 first