
2020.10.07 Meeting Notes

Andrew Gaspar edited this page Oct 7, 2020 · 6 revisions

Agenda

  • Individual/group updates
    • AthenaPK scaling using cached MeshBlockPacks and single-kernel buffer send/set functions [image]
  • Discuss Regression Test Failures (Joshua and Andrew)
  • Dense on Block - @gshipman
  • Interface design (Container versus MeshBlockPacks) -- @pgrete
  • Discuss large-scale testing for AMR at smaller mesh block sizes -- @gshipman
  • Public headers vs. private headers
  • Review non-WIP PRs

Individual/group Updates

LANL CS

Mostly pull request reviews. @Joshua has been trying to wrap up the public headers change.

Sriram is still working on restart.

LANL Physics

Introduced par_reduce, an abstraction for parallel reductions.

Ben is getting close to having the particle framework changes merged in. On a large, static, uniform mesh, performance looks good. Still working on GPU performance.

Intel compiler errors are cropping up.

PKAthena

Scaling of mesh block packs: the overhead of going from 256^3 down to 16^3 mesh blocks is now 6x on GPU (compared to a baseline of 4x on CPU). That is a factor of 100 improvement over 3 months ago. This is on a full 2nd-order hydro problem.

[Figure: AthenaPK scaling with mesh block packs]

Forrest noticed there were a lot of limitations in Jim's code due to register usage. It's an experimental code.

Feedback from Jim:

  • Multi-value reduction - basically do a reduction on N variables and produce N results
    • Possibly provide a custom reduction operation or something
  • Output "tabs" files
    • Discussion on perhaps doing something generic so downstream apps can support their own output formats
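
To make the multi-value reduction request concrete, here is a minimal serial sketch of reducing N variables into N results in one pass, with the combining operation supplied by the caller. The name `MultiReduce` and its signature are hypothetical, chosen only for illustration; this is not the `par_reduce` abstraction mentioned above and makes no claim about how it would be implemented in Parthenon.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Parthenon API): reduces N per-cell variables
// into N results in a single pass. `op` is the caller-supplied reduction
// operation (sum, max, ...), applied independently to each variable.
template <std::size_t N, typename Op>
std::array<double, N> MultiReduce(
    const std::vector<std::array<double, N>> &cells,
    std::array<double, N> init, Op op) {
  for (const auto &cell : cells) {
    for (std::size_t v = 0; v < N; ++v) {
      init[v] = op(init[v], cell[v]);
    }
  }
  return init;
}
```

Called with `N = 2`, a zero initial value, and an addition lambda, this returns the two per-variable sums. A parallel version would need each worker to carry an N-element partial result and then combine the partials with the same `op`.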

Discuss Regression Test Failures (Joshua and Andrew)

https://github.com/lanl/parthenon/issues/312

Dense on Block

We'd be interested in doing dense-on-block variables to claw back some of the memory usage incurred by switching from cell-based to block-based.

Dense-on-block basically means that you only allocate variables on blocks where they're used, with an implicit 0 everywhere else.
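
As a rough illustration of that idea, the sketch below models one variable on one block: no storage exists until the first write, and reads from an unallocated block return the implicit 0. The class `BlockVariable` and its methods are hypothetical names invented for this example, not Parthenon types, and the sketch ignores the AMR boundary handling discussed next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration (not a Parthenon type): a variable that is
// dense within a block but only allocated on blocks where it is written.
class BlockVariable {
 public:
  explicit BlockVariable(std::size_t ncells) : ncells_(ncells) {}

  bool IsAllocated() const { return !data_.empty(); }

  // Reads return the implicit 0 when this block holds no storage.
  double Get(std::size_t i) const {
    assert(i < ncells_);
    return IsAllocated() ? data_[i] : 0.0;
  }

  // The first write allocates zero-filled storage for the whole block.
  void Set(std::size_t i, double value) {
    assert(i < ncells_);
    if (!IsAllocated()) data_.assign(ncells_, 0.0);
    data_[i] = value;
  }

 private:
  std::size_t ncells_;
  std::vector<double> data_;  // empty until the variable is used here
};
```

The memory saving comes from blocks that never call `Set`: they keep an empty vector instead of `ncells` doubles.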

The implementation complexity arises in AMR, specifically in the boundary functions.

Dense-on-processor was considered, but the thought is that it is too coarse, and you only get marginal benefits from it.

More discussion is needed. @jdolence will schedule a meeting.

Interface design (Container versus MeshBlockPacks)

Integration of mesh block packing revealed issues with everything being based on Containers, which are what tasks operate on. Containers are low-granularity: they only look at a set of variables per mesh block.

Longer discussion warranted. @andrewgaspar will schedule a meeting.

Discuss large-scale testing for AMR at smaller mesh block sizes

Last piece of this puzzle: https://github.com/lanl/parthenon/pull/302/files

Galen wants to do a weak scaling study with 16^3 blocks, maybe trying 8^3 as well. We need to look at comparisons between CPU and GPU, and at raw numbers. It is important to note that running on GPUs enables certain algorithms that are much more performant on GPU than on CPU.
