
[3.2.x] ptx_get_version cannot handle CUDA>12.6 #5737

Open
h-vetinari opened this issue Jan 29, 2025 · 8 comments

@h-vetinari

NVIDIA recently released CUDA 12.8, and I'm seeing failures when running Triton if it is present:

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cuda_version = '12.8'

    @functools.lru_cache()
    def ptx_get_version(cuda_version) -> int:
        '''
        Get the highest PTX version supported by the current CUDA driver.
        '''
        assert isinstance(cuda_version, str)
        major, minor = map(int, cuda_version.split('.'))
        if major == 12:
            if minor < 6:
                return 80 + minor
            elif minor == 6:
                return 85
        if major == 11:
            return 70 + minor
        if major == 10:
            return 63 + minor
>       raise RuntimeError("Triton only support CUDA 10.0 or higher, but got CUDA version: " + cuda_version)
E       RuntimeError: Triton only support CUDA 10.0 or higher, but got CUDA version: 12.8

../../../../../lib/python3.11/site-packages/triton/backends/nvidia/compiler.py:57: RuntimeError

IMO it would be more appropriate to use >= 6 in

    if major == 12:
        if minor < 6:
            return 80 + minor
        elif minor == 6:
            return 85

since falling back to an older PTX version is far less of a problem than the whole thing erroring out.
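For concreteness, the proposed one-line change against the 3.2.x code would look like this (a sketch, not the exact patch):

```python
import functools


@functools.lru_cache()
def ptx_get_version(cuda_version: str) -> int:
    '''
    Get the highest PTX version supported by the given CUDA version.
    With `>=` instead of `==`, minors newer than 12.6 fall back to
    PTX 8.5 rather than raising.
    '''
    assert isinstance(cuda_version, str)
    major, minor = map(int, cuda_version.split('.'))
    if major == 12:
        if minor < 6:
            return 80 + minor
        elif minor >= 6:  # was `minor == 6`; now also covers 12.7, 12.8, ...
            return 85
    if major == 11:
        return 70 + minor
    if major == 10:
        return 63 + minor
    raise RuntimeError("Triton only support CUDA 10.0 or higher, but got CUDA version: " + cuda_version)


print(ptx_get_version('12.8'))  # 85, instead of a RuntimeError
```

The generated PTX would then target ISA 8.5 rather than the newest version the toolchain knows about, which is the conservative-but-working behavior.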

@h-vetinari h-vetinari changed the title ptx_get_version cannot handle CUDA>=12.6 ptx_get_version cannot handle CUDA>12.6 Jan 29, 2025
@h-vetinari h-vetinari changed the title ptx_get_version cannot handle CUDA>12.6 [3.2.x] ptx_get_version cannot handle CUDA>12.6 Jan 29, 2025
@ThomasRaoux
Collaborator

ToT (top-of-tree) should be fine. Here is the code:

@functools.lru_cache()
def ptx_get_version(cuda_version) -> int:
    '''
    Get the highest PTX version supported by the current CUDA driver.
    '''
    assert isinstance(cuda_version, str)
    major, minor = map(int, cuda_version.split('.'))
    if major == 12:
        if minor < 6:
            return 80 + minor
        else:
            return 80 + minor - 1
    if major == 11:
        return 70 + minor
    if major == 10:
        return 63 + minor
    raise RuntimeError("Triton only support CUDA 10.0 or higher, but got CUDA version: " + cuda_version)
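So on top-of-tree, any CUDA 12.x minor maps to a PTX version instead of raising. Restating just that branch (a minimal sketch, not Triton's actual module; the `minor - 1` offset reflects that CUDA 12.6 shipped PTX ISA 8.5):

```python
def ptx_for_cuda12(minor: int) -> int:
    # Mirrors the top-of-tree 12.x branch above:
    # 12.0-12.5 -> PTX 8.<minor>, 12.6 and newer -> PTX 8.<minor - 1>
    return 80 + minor if minor < 6 else 80 + minor - 1


print([ptx_for_cuda12(m) for m in (4, 5, 6, 8)])  # [84, 85, 85, 87]
```

Unlike the hard fallback to 85 proposed for 3.2.x, this keeps tracking newer PTX versions as new CUDA 12.x minors appear.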

@h-vetinari
Author

h-vetinari commented Jan 29, 2025

I see that b39c1e1 landed on main, which is almost certainly too big for backporting, but I'm wondering if

    if major == 12:
        if minor < 6:
            return 80 + minor
-       elif minor == 6:
+       elif minor >= 6:
            return 85
    if major == 11:
        return 70 + minor

would be acceptable for 3.2.x.

@h-vetinari
Author

h-vetinari commented Jan 29, 2025

We're going to need Triton 3.2 for PyTorch 2.6, and it would be a pity if it couldn't be used with CUDA 12.8. I'm not talking about sm100 support, just being able to use a CUDA 12.8 toolchain.

@ThomasRaoux
Collaborator

@bertmaher is handling the release branch; I'll defer to him.

@h-vetinari
Author

Thanks. Can you please reopen the issue in the meantime? Otherwise the reduced visibility makes it all too easy for this to fall through the cracks.

@bertmaher
Collaborator

@atalman Can we still patch this into release/3.2.x in time for PyTorch 2.6? People will almost certainly be using CUDA 12.8 soon, and it'll be really frustrating if torch.compile doesn't work there because of this.

@h-vetinari
Author

h-vetinari commented Jan 30, 2025

From our limited testing, I can confirm that

    if major == 12:
        if minor < 6:
            return 80 + minor
-       elif minor == 6:
+       elif minor >= 6:
            return 85
    if major == 11:
        return 70 + minor

works.

@bertmaher
Collaborator

Proposing #5765 as a cherry-pick to Triton 3.2, but since we just pushed PT 2.6, I think it'll be a while before we can get this into a patch release. @atalman can clarify, hopefully.
