
Compilation error on rocm-cupy framework (AMD GPU) #68

Open
danielsz opened this issue Apr 3, 2023 · 6 comments

danielsz commented Apr 3, 2023

I managed to run the project in the cloud, but on a local machine with an AMD GPU I get a compilation error. I've reported the problem on the appropriate repository, but I want to leave this here for future reference. I would also love to know if there is a workaround so that I can run this amazing project locally. Thank you!

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/cupy/cuda/compiler.py", line 686, in compile
    nvrtc.compileProgram(self.ptr, options)
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 154, in cupy_backends.cuda.libs.nvrtc.compileProgram
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 166, in cupy_backends.cuda.libs.nvrtc.compileProgram
  File "cupy_backends/cuda/libs/nvrtc.pyx", line 84, in cupy_backends.cuda.libs.nvrtc.check_status
cupy_backends.cuda.libs.nvrtc.NVRTCError: HIPRTC_ERROR_COMPILATION (6)

sniklaus (Owner) commented Apr 3, 2023

Thank you for your interest in our work! I am not sure this will ultimately pan out: I only use CuPy to compile and call custom CUDA code at runtime, and I don't think the AMD toolchain will be able to compile that into something that runs on their GPUs. 🤔
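To make the failure mode concrete: the pattern described above is CuPy's runtime JIT, where a CUDA C source string is compiled on first use. The kernel below is a made-up illustration, not one of this project's actual kernels, but it follows the same mechanism; on a cupy-rocm build the very same call is routed to HIPRTC instead of NVRTC, which is where the HIPRTC_ERROR_COMPILATION in the traceback surfaces.

```python
# Sketch of the pattern: CUDA C source kept as a Python string and
# JIT-compiled at runtime via cupy.RawKernel. The kernel is hypothetical.
kernel_source = r'''
extern "C" __global__ void add_one(const float* src, float* dst, int n) {
    int idx = blockDim.x * blockIdx.x + threadIdx.x;
    if (idx < n) {
        dst[idx] = src[idx] + 1.0f;
    }
}
'''

try:
    import cupy
    # On CUDA builds of CuPy this string is compiled with NVRTC; on
    # cupy-rocm it is handed to HIPRTC instead, so any construct the
    # HIP toolchain cannot digest fails right here.
    add_one = cupy.RawKernel(kernel_source, 'add_one')
    src = cupy.arange(8, dtype=cupy.float32)
    dst = cupy.empty_like(src)
    add_one((1,), (8,), (src, dst, cupy.int32(8)))  # grid, block, args
except ImportError:
    # No CuPy/GPU available; the source string alone still shows the idea.
    pass
```

Because the CUDA source only exists as a string inside the Python package, the usual ahead-of-time porting tools never see it, which is what makes this case harder than a regular CUDA codebase.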

danielsz (Author) commented Apr 3, 2023

Ideally this shouldn't require you to do anything special. The requirements of this project (cupy, pytorch) have experimental support for ROCm, AMD's equivalent of CUDA, which explains why I am able to run the PyTorch examples without any modification. Unfortunately, it seems that some code in this project is not handled well by the compatibility layer, hence the wording "experimental support". It makes me sad, but not all is lost: I've reported the problem on cupy. I keep my hopes in check because, adding insult to injury, my GPU card is not officially supported by ROCm.

From the rocm FAQ:

Is HIP a drop-in replacement for CUDA?

No. HIP provides porting tools which do most of the work to convert CUDA code into portable C++ code that uses the HIP APIs. Most developers will port their code from CUDA to HIP and then maintain the HIP version. HIP code provides the same performance as native CUDA code, plus the benefits of running on AMD platforms.
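The porting tools the FAQ mentions (hipify-perl, hipify-clang) largely rename CUDA runtime API identifiers to their HIP equivalents. A grossly simplified sketch of that idea, using a handful of real mappings (the actual tools cover far more, including kernel launch syntax, headers, and library calls):

```python
# Toy illustration of hipify-style renaming. The mapping entries are real
# CUDA-to-HIP equivalents, but real hipify tools do much more than this.
CUDA_TO_HIP = {
    'cuda_runtime.h': 'hip/hip_runtime.h',
    'cudaMemcpyHostToDevice': 'hipMemcpyHostToDevice',
    'cudaMemcpyDeviceToHost': 'hipMemcpyDeviceToHost',
    'cudaDeviceSynchronize': 'hipDeviceSynchronize',
    'cudaMalloc': 'hipMalloc',
    'cudaMemcpy': 'hipMemcpy',
    'cudaFree': 'hipFree',
}

def hipify(source: str) -> str:
    # Replace longest names first so cudaMemcpyHostToDevice is not
    # clobbered by the shorter cudaMemcpy substitution.
    for cuda_name in sorted(CUDA_TO_HIP, key=len, reverse=True):
        source = source.replace(cuda_name, CUDA_TO_HIP[cuda_name])
    return source

cuda_snippet = '''#include <cuda_runtime.h>
float* d_buf;
cudaMalloc(&d_buf, 1024 * sizeof(float));
cudaMemcpy(d_buf, h_buf, 1024 * sizeof(float), cudaMemcpyHostToDevice);
cudaFree(d_buf);
'''

print(hipify(cuda_snippet))
```

The catch for this project is that its CUDA lives inside Python strings rather than in .cu files, so this kind of offline conversion never gets a chance to run; the ROCm build of CuPy instead attempts the translation transparently at compile time.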

Again, thank you for your work!

sniklaus (Owner) commented Apr 3, 2023

Note that, as far as I can tell, the official PyTorch examples do not contain inline CUDA the way our project does. And from the FAQ you quoted, it seems that CUDA code would first have to be converted to HIP.

danielsz (Author) commented Apr 4, 2023

Oh, sorry, I didn't realize it was inline code. In that case, the problem is understood. Is the inline code absolutely necessary, or do the Python APIs not provide enough coverage?

sniklaus (Owner) commented Apr 4, 2023

I am afraid so; there are a number of operators we rely on that exist in neither PyTorch nor CuPy.

danielsz (Author) commented Apr 4, 2023

Oh well, never mind then. And thank you!
Hopefully this discussion will serve someone some day.
