Benchmark Gather Scatter performance

Sam Reeve edited this page Oct 16, 2023 · 3 revisions

The plots below show the performance of particle communication (gather and scatter) on the ORNL Frontier supercomputer. CPU and GPU performance are compared as a function of total particle count per rank, using 8 MPI ranks. These are the only core library benchmarks intended to run with multiple MPI ranks.

Create refers to building the communication steering vectors (constructing the Halo), while gather and scatter refer to executing the communication: copying particles to new ranks and updating particles copied to new ranks, respectively. Each point represents a single fraction of particles communicated.
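
To make the three timed phases concrete, here is a minimal serial sketch of create, gather, and scatter. It is not the library's API and uses no MPI: the function names (`build_halo`, `gather`, `scatter`) are hypothetical, the two "ranks" are plain Python lists, and the choice to accumulate ghost values into their owners during scatter is an assumption of this illustration.

```python
# Serial illustration only: the sender and receiver "ranks" are plain lists.
# All names below are hypothetical and are not part of any library API.

def build_halo(num_owned, fraction):
    """'Create' phase: pick which owned particles are communicated.

    Returns the steering vector, i.e. the owned indices to export.
    """
    num_export = int(num_owned * fraction)
    return list(range(num_export))


def gather(owned_values, steering):
    """'Gather' phase: copy the selected owned values into ghost slots."""
    return [owned_values[i] for i in steering]


def scatter(owned_values, ghost_values, steering):
    """'Scatter' phase: send ghost values back to the owning particles.

    Accumulating with += is an assumption of this sketch.
    """
    for ghost, i in zip(ghost_values, steering):
        owned_values[i] += ghost


owned = [1.0, 2.0, 3.0, 4.0]
steering = build_halo(len(owned), fraction=0.5)  # communicate half the particles
ghosts = gather(owned, steering)                 # receiver now holds ghost copies
ghosts = [g * 10.0 for g in ghosts]              # receiver modifies its ghosts
scatter(owned, ghosts, steering)                 # owners receive the updates
```

In this picture, the fraction passed to `build_halo` plays the role of the fraction of particles communicated, and the steering vector is built once and reused for both gather and scatter.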

Frontier

Device-Device

Device-Host

Host-Host

Implementation

Default parameters with the command-line "large" setting were used for these results.