Using MPI and OpenMP
For parallel simulations with MPI, the computational domain is decomposed into small units. In Athena++, this decomposition unit is called a MeshBlock, and all MeshBlocks have the same logical size (i.e., the same number of cells). The MeshBlocks are stored in a tree structure and have unique integer IDs assigned by Z-ordering.
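As an illustration of Z-ordering, the sketch below computes a Morton (bit-interleaved) index for a uniform grid of MeshBlocks. It is not taken from the Athena++ source: the function name `morton_index` and the choice of x1 as the fastest-varying direction are assumptions, and the IDs Athena++ actually assigns come from its tree traversal, which may differ in detail.

```python
def morton_index(i, j, k, bits):
    """Interleave the bits of (i, j, k) into a single Z-order index.

    Assumption: i (the x1 direction) takes the least significant bit of
    each 3-bit group, so it varies fastest along the curve.
    """
    index = 0
    for b in range(bits):
        index |= ((i >> b) & 1) << (3 * b)
        index |= ((j >> b) & 1) << (3 * b + 1)
        index |= ((k >> b) & 1) << (3 * b + 2)
    return index


# A 4 x 4 x 4 grid of blocks needs 2 bits per direction.
blocks = [(i, j, k) for k in range(4) for j in range(4) for i in range(4)]
z_order = sorted(blocks, key=lambda ijk: morton_index(*ijk, bits=2))

# Print the logical locations of the first eight blocks along the curve.
for ijk in z_order[:8]:
    print(morton_index(*ijk, bits=2), ijk)
```

Blocks that are neighbors along this curve tend to be neighbors in space, which is why Z-ordering is a common choice for numbering domain-decomposition units.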
The MeshBlock size is specified by the <meshblock> parameters in the input file. The following example decomposes a Mesh of 256^3 cells into MeshBlocks of 64^3 cells, resulting in 64 MeshBlocks. The Mesh size must be divisible by the MeshBlock size in each direction.
<mesh>
nx1 = 256
...
nx2 = 256
...
nx3 = 256
...
<meshblock>
nx1 = 64
nx2 = 64
nx3 = 64
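The arithmetic behind this example can be checked with a short Python sketch. The helper `count_meshblocks` is hypothetical and not part of Athena++. It only tests divisibility in each direction and multiplies the per-direction block counts.

```python
def count_meshblocks(mesh, block):
    """Return (blocks per direction, total blocks) for a uniform decomposition.

    mesh and block are (nx1, nx2, nx3) tuples of cell counts.
    """
    per_dim = []
    for axis, (n_mesh, n_block) in enumerate(zip(mesh, block), start=1):
        if n_mesh % n_block != 0:
            raise ValueError(
                f"nx{axis}: Mesh size {n_mesh} is not divisible by "
                f"MeshBlock size {n_block}"
            )
        per_dim.append(n_mesh // n_block)

    total = 1
    for n in per_dim:
        total *= n
    return tuple(per_dim), total


per_dim, total = count_meshblocks((256, 256, 256), (64, 64, 64))
print(per_dim, total)  # (4, 4, 4) 64
```

A MeshBlock size that does not divide the Mesh, for example nx1 = 60 with a 256-cell Mesh, raises a `ValueError` in this sketch.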
For non-parallelized output formats (e.g., VTK), one file is generated per MeshBlock regardless of the actual number of processes. We recommend the HDF5 output because it combines all the MeshBlocks and writes only two files per output timestep. For details, see Outputs.
OpenMP is a standard for shared-memory parallelization within a node. In Athena++, OpenMP parallelizes calculations within each MeshBlock. To enable it, configure the code with the -omp option and set num_threads in the <mesh> block of your input file. You will probably also need to set an environment variable to specify the number of threads. Generally this is OMP_NUM_THREADS, but please check the documentation of your system.
OpenMP parallelization is not very scalable. Usually you will get the best performance with 2 or 4 threads per process. Because the threads within a process share some data, notably the MeshBlock tree, using OpenMP also saves some memory, which is helpful when you are running very large parallel simulations.
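When planning a hybrid run, it can help to work out how many MPI processes fit on a node for a given thread count. The sketch below assumes one OpenMP thread per core and a hypothetical 32-core node. The helper `ranks_per_node` is illustrative only and is not part of Athena++ or any job scheduler.

```python
def ranks_per_node(cores_per_node, threads_per_rank):
    """MPI processes per node, assuming each OpenMP thread gets its own core."""
    if threads_per_rank < 1:
        raise ValueError("threads per process must be at least 1")
    if cores_per_node % threads_per_rank != 0:
        raise ValueError("threads per process must divide the cores per node")
    return cores_per_node // threads_per_rank


cores = 32  # hypothetical node size; substitute the value for your system
for threads in (1, 2, 4):
    ranks = ranks_per_node(cores, threads)
    print(f"{threads} thread(s) per process -> {ranks} processes per node")
```

The thread count used here should match both num_threads in the input file and the environment variable (typically OMP_NUM_THREADS).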