Put RNG in shared memory where beneficial #229

Draft
wants to merge 3 commits into
base: master


Contributor

@bernhardmgruber bernhardmgruber commented Oct 7, 2022

The shared memory hardware can handle the access pattern on the RNG better than the local memory (in case of spilling) or global memory. For most kernels, it is thus beneficial to move the RNG into shared memory at the beginning of the kernel and store it back at the end in case the track survives.
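
The load/use/store-back pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `ToyRng`, `Track`, `Propagate`, `TransportSketch`, `CountMismatches` and `kBlockSize` are hypothetical names, the generator is a toy LCG standing in for the real RNG, and the "physics" is a placeholder that kills a track with 10% probability per step.

```cuda
#include <cassert>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlockSize = 128; // assumed threads per block

// Toy stand-in generator (64-bit LCG); NOT the RNG used in the PR.
struct ToyRng {
  std::uint64_t state;
  __host__ __device__ double Rndm() {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (state >> 11) * (1.0 / 9007199254740992.0); // top 53 bits -> [0, 1)
  }
};

// Hypothetical track record; a real track also carries position, energy, etc.
struct Track {
  ToyRng rng;
  int steps;
  bool alive;
};

// Placeholder physics: each step draws once and kills the track with 10% probability.
__host__ __device__ bool Propagate(ToyRng& rng, int maxSteps, int& steps) {
  bool alive = true;
  for (steps = 0; alive && steps < maxSteps; ++steps) alive = rng.Rndm() >= 0.1;
  return alive;
}

__global__ void TransportSketch(Track* tracks, int n, int maxSteps) {
  __shared__ ToyRng rngShared[kBlockSize]; // one private slot per thread
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return; // safe here: the kernel contains no __syncthreads()

  ToyRng& rng = rngShared[threadIdx.x];
  rng = tracks[i].rng; // kernel start: copy RNG state from global to shared memory

  int steps = 0;
  const bool alive = Propagate(rng, maxSteps, steps); // draws operate on the shared copy

  tracks[i].steps = steps;
  tracks[i].alive = alive;
  if (alive) tracks[i].rng = rng; // kernel end: store back only if the track survives
}

// Runs the kernel and compares against a CPU reference.
// Returns the number of mismatching tracks, or -1 on a CUDA error.
int CountMismatches(int n, int maxSteps) {
  assert(n > 0 && maxSteps >= 0);
  std::vector<Track> host(n);
  for (int i = 0; i < n; ++i)
    host[i] = Track{ToyRng{0x9E3779B97F4A7C15ULL * (i + 1)}, 0, true};

  Track* dev = nullptr;
  if (cudaMalloc(&dev, n * sizeof(Track)) != cudaSuccess) return -1;
  cudaMemcpy(dev, host.data(), n * sizeof(Track), cudaMemcpyHostToDevice);
  TransportSketch<<<(n + kBlockSize - 1) / kBlockSize, kBlockSize>>>(dev, n, maxSteps);
  std::vector<Track> result(n);
  const cudaError_t err =
      cudaMemcpy(result.data(), dev, n * sizeof(Track), cudaMemcpyDeviceToHost);
  cudaFree(dev);
  if (err != cudaSuccess) return -1;

  int mismatches = 0;
  for (int i = 0; i < n; ++i) {
    ToyRng rng = host[i].rng;
    int steps = 0;
    const bool alive = Propagate(rng, maxSteps, steps);
    // Killed tracks keep their initial RNG state because nothing is stored back.
    const std::uint64_t expected = alive ? rng.state : host[i].rng.state;
    if (result[i].alive != alive || result[i].steps != steps ||
        result[i].rng.state != expected)
      ++mismatches;
  }
  return mismatches;
}
```

Calling `CountMismatches(1000, 20)` from a `main` function should return 0 on a machine with a CUDA device. Note that this layout costs `kBlockSize * sizeof(ToyRng)` bytes of shared memory per block; with a generator whose state is much larger than the 8 bytes used here, that budget grows accordingly and competes with any other shared-memory use in the same kernel.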

Running `example19 -particles 10000 -batch 5000 -gdml_file ./examples/Example14/macros/testEm3.gdml` on the V100:

master (V100):
Mean: 4.00011
Stddev: 0.00030375

rng_sm (V100, only TransportElectrons):
Mean: 3.70078
Stddev: 0.00141616

rng_sm (V100, TransportElectrons and TransportGammas):
Mean: 3.94435
Stddev: 0.00170546

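So among the transport kernels, moving the RNG into SM only pays off in TransportElectrons; also doing it in TransportGammas gives back most of the gain (mean 3.94435 vs. 3.70078).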

rng_sm (V100, TransportElectrons and electron interaction kernels):
Mean: 3.68766
Stddev: 0.000562445

rng_sm (V100, TransportElectrons and electron/gamma interaction kernels):
Mean: 3.65415
Stddev: 0.00167166

The benefit in the interaction kernels is significantly smaller. Furthermore, contention for the shared memory is increased, since SM is also needed to compact active tracks at the beginning of the interaction kernels.

Opinions on the SM usage for the interaction kernels?

Depends on:

@phsft-bot

Can one of the admins verify this patch?

@hahnjo
Contributor

hahnjo commented Oct 10, 2022

> The benefit in the interaction kernels is significantly smaller. Furthermore, contention for the shared memory is increased, since SM is also needed to compact active tracks at the beginning of the interaction kernels.

As mentioned in #216 (comment), we can get rid of this super complex "compaction in shared memory" code and just use queues... Let me know if you want me to push my WIP code somewhere.

@bernhardmgruber
Contributor Author

> > The benefit in the interaction kernels is significantly smaller. Furthermore, contention for the shared memory is increased, since SM is also needed to compact active tracks at the beginning of the interaction kernels.
>
> As mentioned in #216 (comment), we can get rid of this super complex "compaction in shared memory" code and just use queues... Let me know if you want me to push my WIP code somewhere.

Sure, please push your changes somewhere and share them with us!

@agheata
Contributor

agheata commented Oct 17, 2023

@bernhardmgruber is this PR still relevant? Is it worth using shared memory for other examples besides example19?

@bernhardmgruber
Contributor Author

@agheata it was totally worth it for example19 IIRC. It may be useful for other examples as well. I don't have the time to look into this anymore now that I am in the middle of my PhD writeup, so I leave this PR as a suggestion to you. Continue with it as you please.
