Fix horrible Shuffle bug in GPU_C_Codegen and add test. #8553

Open
wants to merge 2 commits into
base: main

mcourteaux
Contributor

Shuffle emission in OpenCL was broken when the inputs to the Shuffle node were actual vectors instead of scalars. For some reason, in most scenarios where the codegen reaches a Shuffle node, the Shuffle contains only vectors with 1 lane, so the bug caused no real issue before this PR. Today I ran into a case where the OpenCL codegen had to shuffle multiple actual vectors.
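For readers unfamiliar with the node, here is a minimal standalone sketch of the semantics a multi-vector shuffle is expected to have: the input vectors are concatenated lane by lane, and each output lane picks one lane of that concatenation. This is not the code from this PR, and it does not use Halide; the names `shuffle_lanes` and `locate_lane` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reference semantics (illustrative only): concatenate the lanes of all
// inputs, then select one lane of the concatenation per output index.
std::vector<int> shuffle_lanes(const std::vector<std::vector<int>> &inputs,
                               const std::vector<int> &indices) {
    std::vector<int> concat;
    for (const auto &v : inputs) {
        concat.insert(concat.end(), v.begin(), v.end());
    }
    std::vector<int> result;
    result.reserve(indices.size());
    for (int idx : indices) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < concat.size());
        result.push_back(concat[static_cast<std::size_t>(idx)]);
    }
    return result;
}

// Hypothetical helper: map a flat index into the concatenation to
// (which input vector, which lane inside that vector). A C-like backend
// needs this mapping to emit a per-lane access on the right operand.
std::pair<std::size_t, std::size_t>
locate_lane(const std::vector<std::size_t> &lane_counts, std::size_t flat_index) {
    for (std::size_t i = 0; i < lane_counts.size(); ++i) {
        if (flat_index < lane_counts[i]) {
            return {i, flat_index};
        }
        flat_index -= lane_counts[i];
    }
    assert(false && "flat index out of range");
    return {lane_counts.size(), 0};
}
```

When every input has exactly one lane, the flat index and the input index coincide, which would explain why treating the inputs as scalars goes unnoticed in most pipelines. With two 4-lane inputs the two diverge: flat index 6 refers to lane 2 of the second input, not to a seventh operand.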

@derek-gerstmann I had to disable Vulkan testing, because there is an issue on my machine with that. Could you check that out? Last excerpt from DEBUG_CODEGEN=1 output:

Skipping Hexagon offload...
Offloading GPU loops...
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x8 vectors=2 is_interleave=false is_extract_element=false
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=4 is_interleave=false is_extract_element=false
    vector shuffle x4 : 0 1 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=4 is_interleave=false is_extract_element=false
    vector shuffle x4 : 0 1 2 3 
    vector shuffle x2 : 3 1 6 7 2 4 0 5 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 0 1 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 4 5 6 7 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x2 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 0 1 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x2 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32 vectors=1 is_interleave=false is_extract_element=true
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32 vectors=1 is_interleave=false is_extract_element=true
Vulkan: Using static workgroup local size [8, 1, 1]...
  kernel_count = 1
  spirv_module_size[0] = 2432 bytes
Lowering Parallel Tasks...
Embedding image vulkan_buf
Embedding image vulkan_gpu_source_kernels
Target triple of initial module: x86_64--linux-gnu
Generating llvm bitcode...
Generating llvm bitcode prolog for function g...
Generating llvm bitcode for function g...
JIT compiling g for x86-64-linux-tune_znver1-avx-avx2-f16c-fma-jit-sse41-user_context-vk_v13-vulkan
[New Thread 0x7fffed4006c0 (LWP 434073)]
[New Thread 0x7fffeca006c0 (LWP 434074)]
[New Thread 0x7fffe5c006c0 (LWP 434075)]
[New Thread 0x7fffe40006c0 (LWP 434077)]
[New Thread 0x7fffe36006c0 (LWP 434079)]
NVVM compilation failed: 1
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreatePipeline: failed to compile internal representation
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreatePipeline: unexpected compilation failure
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreateComputePipeline: unexpected failure compiling SPIR-V shader: 0x9c4841a2cbb3db9d
User error triggered at /home/martijn/zec/3rd/halide/src/JITModule.cpp:1232
Error: Vulkan: Failed to create compute pipeline! vkCreateComputePipelines returned <Unknown Vulkan Result Code>
Vulkan: Failed to create compute pipeline!
Vulkan: Failed to setup compute pipeline!

@mcourteaux
Contributor Author

I can't format the code right now. I have clang-format 18, and the formatting setup doesn't work with it because it uses options introduced in clang-format 19. 😢
