Hermetic CUDA Toolkit #283

Open
9 of 12 tasks
cloudhan opened this issue Oct 16, 2024 · 13 comments

Comments

@udaya2899
Contributor

@cloudhan, super excited for this! Thanks for starting the work on this. Roughly when do you expect this to be done?

@cloudhan
Collaborator Author

I am currently in the middle of jumping ship, that is, I am joining NVIDIA ;). It may take some time for me to settle down, so this might take a little longer. I hope to have a working version by the end of next month.

@honeway

honeway commented Dec 5, 2024

First of all, I wish you all the best in your work! Thank you for your efforts on this. We're eagerly looking forward to seeing progress on this feature, as it’s something we truly need. Please let us know if there’s any way we can assist or contribute.

@udaya2899
Contributor

@cloudhan, I hope your time at NVIDIA is going well! We're really excited about the possibilities of using hermetic CUDA in RBE. We're currently facing a decision about whether to build a temporary non-hermetic solution for RBE or wait for this issue to be resolved.

Could you give us an update on your plan here? Any information you can share would help us make the best decision for our project's roadmap. +1 to @honeway, and we'll be happy to assist in some way too.

@hofbi
Contributor

hofbi commented Dec 12, 2024

Once this effort starts, I can recommend using a rule-based toolchain, which was announced at the last BazelCon as the modern way of writing toolchains.

@cloudhan
Collaborator Author

@udaya2899 The cloudhan/hermetic-ctk-2 branch has actually been working for months. It would be best to test it and provide some feedback.

@cloudhan
Collaborator Author

@hofbi That seems very interesting, but it appears to be at a very early stage. Better to just wait for now.

@cloudhan
Collaborator Author

cloudhan commented Dec 23, 2024

For a preview,

https://github.com/cloudhan/cuda-samples/blob/bazel-cuda-components/WORKSPACE.bazel shows what a manually configured repo will look like. Branch cloudhan/hermetic-ctk-2 contains the related feature.

https://github.com/cloudhan/cuda-samples/blob/bazel-cuda-redist-json/WORKSPACE.bazel shows what an automatically configured repo will look like for a WORKSPACE-based project. Branch cloudhan/hermetic-ctk-3 contains the related feature.

@udaya2899
Contributor

udaya2899 commented Dec 23, 2024

Thanks for working on this now. We're in the holiday season and I haven't had time to experiment with your dev branch yet.

Expect to hear from me by the second week of January.

Unfortunately, we don't support WORKSPACE in our setup and only use MODULE.bazel. Which is the most recent branch to try? Is MODULE.bazel considered working in the tmp branch, or in the hermetic-ctk-2 branch?

@udaya2899
Contributor

Happy New Year 2025! I'm just back from vacation and want to try out your branch locally using git_override or local_path_override so I can give some early feedback, if any. Which branch has a possible working solution for MODULE.bazel? I see hermetic-ctk-2, hermetic-breaking-changes, as well as tmp. Let me know the best way to try this out on our RBE setup.

@cloudhan
Collaborator Author

cloudhan commented Jan 10, 2025

@udaya2899 I updated my previous comment. The branches are stacked one on top of another, so blindly picking the last one should be OK.
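
For anyone following along, a MODULE.bazel override along these lines is one way to point a Bzlmod project at a development branch, as asked above. This is a minimal sketch, not a configuration confirmed in this thread: the module name `rules_cuda`, the remote URL placeholder, the commit placeholder, and the local path are all assumptions to be replaced by hand with the real values for the branch under test.

```starlark
# MODULE.bazel (sketch; module name, remote, commit, and path are assumptions)

# The version can be left out because a non-registry override is supplied below.
bazel_dep(name = "rules_cuda")

# Option 1: fetch one specific commit of the development branch from Git.
git_override(
    module_name = "rules_cuda",
    remote = "https://github.com/<org>/<repo>.git",  # placeholder, not a real URL
    commit = "<commit-sha-at-the-tip-of-the-branch>",  # placeholder, fill in by hand
)

# Option 2 (use instead of option 1): build against a local checkout.
# local_path_override(
#     module_name = "rules_cuda",
#     path = "../rules_cuda",  # hypothetical path to the checkout
# )
```

Pinning a commit rather than a moving branch keeps the build reproducible while the stacked branches are still being rebased. Overrides only take effect when declared in the root module.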

@cloudhan
Collaborator Author

You can also find a MODULE-based config in the referenced cuda-samples repo.

@cloudhan
Collaborator Author

Auto config with redistrib.json in a MODULE-based project is not implemented at the moment; it may come in future PRs. Another unsolved problem is how to make switching the CUDA version easier, say, via an export or maybe a flag to build against different releases of CUDA. A possible solution is to extend the current alias mapping to a versioned mapping with a select in between.
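
The versioned-mapping idea could be sketched roughly as follows. This only illustrates the general `alias` plus `select` pattern in Bazel and is not the design adopted in any of the branches mentioned here: the flag name `cuda_version`, the version strings, and the `@cuda_cudart_*` repository names are all hypothetical, and `string_flag` is assumed to come from bazel_skylib.

```starlark
# BUILD.bazel (sketch; every target, flag, and repository name is hypothetical)
load("@bazel_skylib//rules:common_settings.bzl", "string_flag")

# Build-time switch. If this file is the root package, it is set with
# --//:cuda_version=12.4 on the command line.
string_flag(
    name = "cuda_version",
    build_setting_default = "12.6",
    values = ["12.4", "12.6"],
)

# One config_setting per supported release.
config_setting(
    name = "cuda_12_4",
    flag_values = {":cuda_version": "12.4"},
)

config_setting(
    name = "cuda_12_6",
    flag_values = {":cuda_version": "12.6"},
)

# Consumers depend on one stable label; select() forwards it to the
# component repository that matches the chosen release.
alias(
    name = "cudart",
    actual = select({
        ":cuda_12_4": "@cuda_cudart_12_4//:cudart",
        ":cuda_12_6": "@cuda_cudart_12_6//:cudart",
    }),
)
```

With a layout like this, each CUDA release would need its own set of component repositories, and the flag only chooses between them at analysis time.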
