Put RNG in shared memory where beneficial #229
Can one of the admins verify this patch? |
As mentioned in #216 (comment), we can get rid of this super complex "compaction in shared memory" code and just use queues... Let me know if you want me to push my WIP code somewhere. |
Sure, please push your changes somewhere and share them with us! |
@bernhardmgruber is this PR still relevant? Is it worth using shared memory for other examples besides example19? |
@agheata it was totally worth it for example19 IIRC. It may be useful for other examples as well. I don't have the time to look into this anymore now that I am in the middle of my PhD writeup, so I leave this PR as a suggestion to you. Continue with it as you please. |
The shared memory hardware can handle the access pattern on the RNG better than the local memory (in case of spilling) or global memory. For most kernels, it is thus beneficial to move the RNG into shared memory at the beginning of the kernel and store it back at the end in case the track survives.
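A minimal CUDA sketch of this pattern, under stated assumptions: `ToyRng`, `Track`, `propagate`, `transportSM`, `run_demo` and `kThreadsPerBlock` are hypothetical names made up for illustration, and the toy xorshift-style generator is only a stand-in for the project's real RNG. The kernel copies each track's RNG state into a per-thread shared memory slot at the start, draws from that copy, and writes it back only if the track survives.

```cuda
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the generator state; NOT the project's actual RNG.
struct ToyRng {
  std::uint64_t s[4];
  __host__ __device__ std::uint64_t next() {
    // xorshift-style mixing across the four state words
    std::uint64_t t = s[0] ^ (s[0] << 13);
    s[0] = s[1]; s[1] = s[2]; s[2] = s[3];
    s[3] = s[3] ^ (s[3] >> 19) ^ t ^ (t >> 7);
    return s[3];
  }
};

// Hypothetical track record: RNG state, number of steps to take, survival flag.
struct Track { ToyRng rng; int steps; bool alive; };

constexpr int kThreadsPerBlock = 128; // assumed block size for this sketch

// Toy "physics": one draw per step, the track dies with probability 1/8.
__host__ __device__ void propagate(Track& t, ToyRng& rng) {
  for (int k = 0; k < t.steps; ++k)
    if ((rng.next() & 7u) == 0) { t.alive = false; return; }
}

__global__ void transportSM(Track* tracks, int n) {
  // One RNG slot per thread in shared memory (32 B * 128 = 4 KiB per block).
  __shared__ ToyRng rngSM[kThreadsPerBlock];
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;
  ToyRng& rng = rngSM[threadIdx.x];
  rng = tracks[i].rng;                      // load into shared memory at kernel start
  propagate(tracks[i], rng);                // all draws hit the shared-memory copy
  if (tracks[i].alive) tracks[i].rng = rng; // store back only if the track survives
}

// Runs the kernel and compares against the same logic executed on the host.
bool run_demo() {
  const int n = 1000;
  std::vector<Track> host(n), ref(n);
  for (int i = 0; i < n; ++i) {
    Track t{};
    t.rng.s[0] = std::uint64_t(i) + 1; t.rng.s[1] = 2; t.rng.s[2] = 3; t.rng.s[3] = 4;
    t.steps = 16; t.alive = true;
    host[i] = t;
    ref[i] = t;
    ToyRng rng = ref[i].rng;                // host reference uses a plain local copy
    propagate(ref[i], rng);
    if (ref[i].alive) ref[i].rng = rng;
  }
  Track* dev = nullptr;
  if (cudaMalloc(&dev, n * sizeof(Track)) != cudaSuccess) return false;
  cudaMemcpy(dev, host.data(), n * sizeof(Track), cudaMemcpyHostToDevice);
  int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
  transportSM<<<blocks, kThreadsPerBlock>>>(dev, n);
  bool ok = cudaMemcpy(host.data(), dev, n * sizeof(Track),
                       cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(dev);
  for (int i = 0; i < n && ok; ++i)
    ok = host[i].alive == ref[i].alive && host[i].rng.s[3] == ref[i].rng.s[3];
  return ok;
}

int main() {
  std::printf("%s\n", run_demo() ? "match" : "mismatch or CUDA error");
  return 0;
}
```

Each thread touches only its own slot, so the sketch needs no `__syncthreads()`. The cost is shared memory capacity: the per-block footprint grows with the RNG state size times the block size, which matters if a kernel already uses shared memory for something else.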
Running
example19 -particles 10000 -batch 5000 -gdml_file ./examples/Example14/macros/testEm3.gdml
on the V100.
master (V100):
Mean: 4.00011
Stddev: 0.00030375
rng_sm (V100, only TransportElectrons):
Mean: 3.70078
Stddev: 0.00141616
rng_sm (V100, TransportElectrons and TransportGammas):
Mean: 3.94435
Stddev: 0.00170546
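So it’s only beneficial to put the RNG in shared memory (SM) in TransportElectrons: the mean drops by about 7.5% relative to master with TransportElectrons alone, but by only about 1.4% when TransportGammas is included as well.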
rng_sm (V100, TransportElectrons and electron interaction kernels):
Mean: 3.68766
Stddev: 0.000562445
rng_sm (V100, TransportElectrons and electron/gamma interaction kernels):
Mean: 3.65415
Stddev: 0.00167166
The benefit in the interaction kernels is significantly smaller. Furthermore, contention for the shared memory is increased, since SM is also needed to compact active tracks at the beginning of the interaction kernels.
Opinions on the SM usage for the interaction kernels?
Depends on: