Backport to 2.8: PTX support for Blackwell #3624

Merged 14 commits on Jan 31, 2025

bernhardmgruber and others added 10 commits January 31, 2025 10:58
* ptx: Update existing instructions
* ptx: Add new instructions
* Fix returning error out values
See:
- https://gitlab-master.nvidia.com/CCCL/libcuda-ptx/-/merge_requests/74
- https://gitlab-master.nvidia.com/CCCL/libcuda-ptx/-/merge_requests/73
* ptx: Fix out var declaration
See https://gitlab-master.nvidia.com/CCCL/libcuda-ptx/-/merge_requests/75
* mbarrier.{test,try}_wait: Fix test. Wrong files were included.
* docs: Fix special registers include
* Allow non-included documentation pages
* Workaround NVRTC

Co-authored-by: Allard Hendriksen <[email protected]>
* barrier.cluster.aligned: Remove
This is not supposed to be exposed in CCCL.

* elect.sync: Remove
Not ready for inclusion yet. This needs to handle the optional extra
output mask as well.

* mapa: Remove
This has compiler bugs. We should use intrinsics instead.

Co-authored-by: Allard Hendriksen <[email protected]>
* mbarrier.expect_tx: Add missing source and test
It was already documented(!)

* cp.async.bulk.tensor: Add .{gather,scatter}4
* fence: Add .sync_restrict, .proxy.async.sync_restrict

Co-authored-by: Allard Hendriksen <[email protected]>
* Add multimem.ld_reduce
* Add multimem.red
* Add multimem.st

Co-authored-by: Allard Hendriksen <[email protected]>
* ptx: Add tcgen05.alloc

* ptx: Add tcgen05.commit

* ptx: Add tcgen05.cp

* ptx: Add tcgen05.fence

* ptx: Add tcgen05.ld

* ptx: Add tcgen05.mma

* ptx: Add tcgen05.mma.ws

* ptx: Add tcgen05.shift

* ptx: Add tcgen05.st

* ptx: Add tcgen05.wait

* fix docs

---------

Co-authored-by: Allard Hendriksen <[email protected]>

🟩 CI finished in 1h 32m: Pass: 100%/169 | Total: 3d 07h | Avg: 28m 22s | Max: 1h 21m | Hits: 213%/20880
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 10h 13m | Avg: 12m 47s | Max: 38m 01s | Hits: 385%/10028

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  9h 49m | Avg: 12m 48s | Max: 38m 01s | Hits: 385%/10028 
      🟩 arm64              Pass: 100%/2   | Total: 24m 37s | Avg: 12m 18s | Max: 20m 26s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  1h 09m | Avg:  9m 53s | Max: 31m 38s | Hits: 356%/2324  
      🟩 12.5               Pass: 100%/2   | Total:  1h 09m | Avg: 34m 52s | Max: 36m 33s
      🟩 12.6               Pass: 100%/39  | Total:  7h 54m | Avg: 12m 10s | Max: 38m 01s | Hits: 394%/7704  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 05m | Avg: 16m 20s | Max: 19m 48s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  1h 09m | Avg:  9m 53s | Max: 31m 38s | Hits: 356%/2324  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 09m | Avg: 34m 52s | Max: 36m 33s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  6h 49m | Avg: 11m 42s | Max: 38m 01s | Hits: 394%/7704  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 05m | Avg: 16m 20s | Max: 19m 48s
      🟩 nvcc               Pass: 100%/44  | Total:  9h 08m | Avg: 12m 27s | Max: 38m 01s | Hits: 385%/10028 
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 35m 39s | Avg:  8m 54s | Max: 22m 45s
      🟩 Clang10            Pass: 100%/1   | Total:  5m 27s | Avg:  5m 27s | Max:  5m 27s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 20s | Avg:  4m 20s | Max:  4m 20s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 31s | Avg:  4m 31s | Max:  4m 31s
      🟩 Clang13            Pass: 100%/1   | Total: 19m 59s | Avg: 19m 59s | Max: 19m 59s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 03s | Avg:  5m 03s | Max:  5m 03s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
      🟩 Clang17            Pass: 100%/1   | Total: 20m 18s | Avg: 20m 18s | Max: 20m 18s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 33m | Avg: 11m 41s | Max: 19m 48s
      🟩 GCC6               Pass: 100%/2   | Total:  5m 20s | Avg:  2m 40s | Max:  2m 47s
      🟩 GCC7               Pass: 100%/2   | Total: 19m 58s | Avg:  9m 59s | Max: 16m 05s
      🟩 GCC8               Pass: 100%/1   | Total:  4m 07s | Avg:  4m 07s | Max:  4m 07s
      🟩 GCC9               Pass: 100%/3   | Total: 10m 01s | Avg:  3m 20s | Max:  4m 00s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
      🟩 GCC13              Pass: 100%/10  | Total:  2h 03m | Avg: 12m 23s | Max: 21m 23s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 31m 38s | Avg: 31m 38s | Max: 31m 38s | Hits: 356%/2324  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 37m 26s | Avg: 37m 26s | Max: 37m 26s | Hits: 396%/2519  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 13m | Avg: 36m 41s | Max: 38m 01s | Hits: 393%/5185  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 09m | Avg: 34m 52s | Max: 36m 33s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  3h 17m | Avg:  9m 53s | Max: 22m 45s
      🟩 GCC                Pass: 100%/21  | Total:  2h 56m | Avg:  8m 24s | Max: 21m 23s
      🟩 Intel              Pass: 100%/1   | Total: 27m 06s | Avg: 27m 06s | Max: 27m 06s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 22m | Avg: 35m 36s | Max: 38m 01s | Hits: 385%/10028 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 52s | Max: 36m 33s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/8   | Total:  1h 48m | Avg: 13m 35s | Max: 21m 23s
      🟩 v100               Pass: 100%/40  | Total:  8h 25m | Avg: 12m 37s | Max: 38m 01s | Hits: 385%/10028 
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  8h 32m | Avg: 12m 30s | Max: 38m 01s | Hits: 385%/10028 
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 15m | Avg: 18m 54s | Max: 21m 23s
      🟩 Test               Pass: 100%/2   | Total: 23m 20s | Avg: 11m 40s | Max: 14m 12s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 57s | Avg:  1m 57s | Max:  1m 57s
    🟩 sm
      🟩 75                 Pass: 100%/4   | Total:  1h 15m | Avg: 18m 54s | Max: 21m 23s
      🟩 90                 Pass: 100%/1   | Total: 12m 44s | Avg: 12m 44s | Max: 12m 44s
      🟩 90a                Pass: 100%/2   | Total: 21m 13s | Avg: 10m 36s | Max: 13m 19s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total:  1h 09m | Avg: 11m 34s | Max: 22m 45s
      🟩 14                 Pass: 100%/5   | Total: 57m 22s | Avg: 11m 28s | Max: 31m 38s | Hits: 356%/2324  
      🟩 17                 Pass: 100%/13  | Total:  3h 23m | Avg: 15m 39s | Max: 37m 26s | Hits: 396%/5038  
      🟩 20                 Pass: 100%/23  | Total:  4h 41m | Avg: 12m 14s | Max: 38m 01s | Hits: 390%/2666  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 16h | Avg: 51m 06s | Max: 1h 13m | Hits: 26%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 14h | Avg: 50m 47s | Max:  1h 13m | Hits:  26%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 14s | Max: 58m 31s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 49m | Avg: 49m 52s | Max: 57m 29s | Hits:  26%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 05m
      🟩 12.6               Pass: 100%/38  | Total:  1d 08h | Avg: 50m 34s | Max:  1h 13m | Hits:  26%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 55m | Avg: 57m 38s | Max: 58m 15s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 49m | Avg: 49m 52s | Max: 57m 29s | Hits:  26%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 05m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 06h | Avg: 50m 11s | Max:  1h 13m | Hits:  26%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 38s | Max: 58m 15s
      🟩 nvcc               Pass: 100%/45  | Total:  1d 14h | Avg: 50m 49s | Max:  1h 13m | Hits:  26%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 22m | Avg: 50m 43s | Max: 57m 43s
      🟩 Clang10            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang11            Pass: 100%/1   | Total: 59m 27s | Avg: 59m 27s | Max: 59m 27s
      🟩 Clang12            Pass: 100%/1   | Total: 56m 18s | Avg: 56m 18s | Max: 56m 18s
      🟩 Clang13            Pass: 100%/1   | Total: 59m 30s | Avg: 59m 30s | Max: 59m 30s
      🟩 Clang14            Pass: 100%/1   | Total: 59m 16s | Avg: 59m 16s | Max: 59m 16s
      🟩 Clang15            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang16            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang17            Pass: 100%/1   | Total: 59m 34s | Avg: 59m 34s | Max: 59m 34s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 27m | Avg: 46m 42s | Max: 58m 38s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 37m | Avg: 48m 56s | Max: 50m 35s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 48m | Avg: 54m 11s | Max: 55m 04s
      🟩 GCC8               Pass: 100%/1   | Total: 54m 13s | Avg: 54m 13s | Max: 54m 13s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 36m | Avg: 52m 14s | Max: 54m 55s
      🟩 GCC10              Pass: 100%/1   | Total: 55m 06s | Avg: 55m 06s | Max: 55m 06s
      🟩 GCC11              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC12              Pass: 100%/3   | Total:  1h 53m | Avg: 37m 52s | Max:  1h 03m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 41m | Avg: 35m 14s | Max:  1h 01m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
      🟩 MSVC14.16          Pass: 100%/1   | Total: 57m 29s | Avg: 57m 29s | Max: 57m 29s | Hits:  26%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m | Hits:  26%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 13m | Hits:  27%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 05m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 16h 48m | Avg: 53m 03s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 15h 29m | Avg: 44m 14s | Max:  1h 03m
      🟩 Intel              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 30m | Avg:  1h 07m | Max:  1h 13m | Hits:  26%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 05m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 50m 27s | Avg: 25m 13s | Max: 25m 51s
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 02m | Avg: 30m 15s | Max:  1h 01m
      🟩 v100               Pass: 100%/37  | Total:  1d 11h | Avg: 57m 01s | Max:  1h 13m | Hits:  26%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 13h | Avg: 56m 23s | Max:  1h 13m | Hits:  26%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 18m 45s | Avg: 18m 45s | Max: 18m 45s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 34s | Avg: 15m 34s | Max: 15m 34s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 16s | Max: 24m 58s
      🟩 TestGPU            Pass: 100%/2   | Total: 39m 02s | Avg: 19m 31s | Max: 20m 29s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 27s | Avg: 25m 13s | Max: 25m 51s
      🟩 90a                Pass: 100%/1   | Total: 25m 07s | Avg: 25m 07s | Max: 25m 07s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  4h 14m | Avg: 50m 53s | Max: 55m 04s
      🟩 14                 Pass: 100%/4   | Total:  3h 35m | Avg: 53m 57s | Max: 57m 43s | Hits:  26%/783   
      🟩 17                 Pass: 100%/12  | Total: 11h 51m | Avg: 59m 17s | Max:  1h 13m | Hits:  26%/1566  
      🟩 20                 Pass: 100%/26  | Total: 20h 20m | Avg: 46m 56s | Max:  1h 07m | Hits:  26%/783   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 1d 02h | Avg: 35m 26s | Max: 1h 21m | Hits: 66%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 42m 49s | Avg: 21m 24s | Max: 31m 49s
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 01h | Avg: 35m 31s | Max:  1h 21m | Hits:  66%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 42s | Max: 34m 03s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 44m | Avg: 32m 04s | Max:  1h 03m | Hits:  74%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m
      🟩 12.6               Pass: 100%/36  | Total: 20h 24m | Avg: 34m 00s | Max:  1h 21m | Hits:  64%/5556  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 03m | Avg: 31m 43s | Max: 31m 59s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 44m | Avg: 32m 04s | Max:  1h 03m | Hits:  74%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m
      🟩 nvcc12.6           Pass: 100%/34  | Total: 19h 20m | Avg: 34m 08s | Max:  1h 21m | Hits:  64%/5556  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 43s | Max: 31m 59s
      🟩 nvcc               Pass: 100%/43  | Total:  1d 01h | Avg: 35m 37s | Max:  1h 21m | Hits:  66%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 56m | Avg: 29m 13s | Max: 33m 53s
      🟩 Clang10            Pass: 100%/1   | Total: 36m 56s | Avg: 36m 56s | Max: 36m 56s
      🟩 Clang11            Pass: 100%/1   | Total: 34m 48s | Avg: 34m 48s | Max: 34m 48s
      🟩 Clang12            Pass: 100%/1   | Total: 36m 13s | Avg: 36m 13s | Max: 36m 13s
      🟩 Clang13            Pass: 100%/1   | Total: 34m 29s | Avg: 34m 29s | Max: 34m 29s
      🟩 Clang14            Pass: 100%/1   | Total: 31m 57s | Avg: 31m 57s | Max: 31m 57s
      🟩 Clang15            Pass: 100%/1   | Total: 31m 43s | Avg: 31m 43s | Max: 31m 43s
      🟩 Clang16            Pass: 100%/1   | Total: 34m 41s | Avg: 34m 41s | Max: 34m 41s
      🟩 Clang17            Pass: 100%/1   | Total: 32m 52s | Avg: 32m 52s | Max: 32m 52s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 58m | Avg: 25m 28s | Max: 33m 21s
      🟩 GCC6               Pass: 100%/2   | Total: 52m 50s | Avg: 26m 25s | Max: 29m 06s
      🟩 GCC7               Pass: 100%/2   | Total: 59m 12s | Avg: 29m 36s | Max: 34m 00s
      🟩 GCC8               Pass: 100%/1   | Total: 36m 46s | Avg: 36m 46s | Max: 36m 46s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 31m | Avg: 30m 31s | Max: 36m 43s
      🟩 GCC10              Pass: 100%/1   | Total: 37m 58s | Avg: 37m 58s | Max: 37m 58s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 51s | Avg: 34m 51s | Max: 34m 51s
      🟩 GCC12              Pass: 100%/1   | Total: 38m 50s | Avg: 38m 50s | Max: 38m 50s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 16m | Avg: 24m 33s | Max: 40m 51s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 47m 00s | Avg: 47m 00s | Max: 47m 00s
      🟩 MSVC14.16          Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m | Hits:  74%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m | Hits:  64%/1852  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 21m | Hits:  64%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  9h 28m | Avg: 29m 56s | Max: 36m 56s
      🟩 GCC                Pass: 100%/19  | Total:  9h 08m | Avg: 28m 52s | Max: 40m 51s
      🟩 Intel              Pass: 100%/1   | Total: 47m 00s | Avg: 47m 00s | Max: 47m 00s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 44m | Avg:  1h 11m | Max:  1h 21m | Hits:  66%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 32m | Avg: 19m 04s | Max: 40m 51s
      🟩 v100               Pass: 100%/37  | Total:  1d 00h | Avg: 38m 59s | Max:  1h 21m | Hits:  66%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 01h | Avg: 38m 41s | Max:  1h 21m | Hits:  66%/7408  
      🟩 TestCPU            Pass: 100%/2   | Total: 14m 59s | Avg:  7m 29s | Max:  7m 49s
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 37s | Avg: 10m 52s | Max: 11m 19s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 22m 46s | Avg: 22m 46s | Max: 22m 46s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 06m | Avg: 25m 19s | Max: 30m 02s
      🟩 14                 Pass: 100%/4   | Total:  2h 40m | Avg: 40m 12s | Max:  1h 03m | Hits:  74%/1852  
      🟩 17                 Pass: 100%/12  | Total:  8h 52m | Avg: 44m 23s | Max:  1h 14m | Hits:  64%/3704  
      🟩 20                 Pass: 100%/22  | Total: 12h 12m | Avg: 33m 16s | Max:  1h 21m | Hits:  64%/1852  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 24m | Avg: 5m 32s | Max: 14m 48s | Hits: 67%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 10m | Avg:  5m 55s | Max: 14m 48s | Hits:  67%/312   
      🟩 arm64              Pass: 100%/4   | Total: 13m 51s | Avg:  3m 27s | Max:  3m 33s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 18m 41s | Avg:  6m 13s | Max: 11m 06s | Hits:  67%/156   
      🟩 12.5               Pass: 100%/2   | Total: 17m 29s | Avg:  8m 44s | Max:  8m 54s
      🟩 12.6               Pass: 100%/21  | Total:  1h 47m | Avg:  5m 08s | Max: 14m 48s | Hits:  67%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 18m 41s | Avg:  6m 13s | Max: 11m 06s | Hits:  67%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 17m 29s | Avg:  8m 44s | Max:  8m 54s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 47m | Avg:  5m 08s | Max: 14m 48s | Hits:  67%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 24m | Avg:  5m 32s | Max: 14m 48s | Hits:  67%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 38s | Avg:  3m 38s | Max:  3m 38s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 02s | Avg:  4m 02s | Max:  4m 02s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 07s | Avg:  4m 07s | Max:  4m 07s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 Clang18            Pass: 100%/4   | Total: 25m 37s | Avg:  6m 24s | Max: 14m 48s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 41s | Avg:  3m 41s | Max:  3m 41s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 36s | Avg:  3m 36s | Max:  3m 36s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 40s | Avg:  3m 40s | Max:  3m 40s
      🟩 GCC12              Pass: 100%/2   | Total: 16m 10s | Avg:  8m 05s | Max: 11m 49s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 42s | Avg:  3m 25s | Max:  3m 32s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 11m 06s | Avg: 11m 06s | Max: 11m 06s | Hits:  67%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 41s | Avg: 12m 41s | Max: 12m 41s | Hits:  67%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 17m 29s | Avg:  8m 44s | Max:  8m 54s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 02m | Avg:  4m 46s | Max: 14m 48s
      🟩 GCC                Pass: 100%/9   | Total: 40m 49s | Avg:  4m 32s | Max: 11m 49s
      🟩 MSVC               Pass: 100%/2   | Total: 23m 47s | Avg: 11m 53s | Max: 12m 41s | Hits:  67%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 29s | Avg:  8m 44s | Max:  8m 54s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 34m 53s | Avg:  8m 43s | Max: 14m 48s
      🟩 v100               Pass: 100%/22  | Total:  1h 49m | Avg:  4m 57s | Max: 12m 41s | Hits:  67%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 57m | Avg:  4m 53s | Max: 12m 41s | Hits:  67%/312   
      🟩 Test               Pass: 100%/2   | Total: 26m 37s | Avg: 13m 18s | Max: 14m 48s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
      🟩 90a                Pass: 100%/1   | Total:  3m 22s | Avg:  3m 22s | Max:  3m 22s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 26m 31s | Avg:  4m 25s | Max:  8m 35s
      🟩 20                 Pass: 100%/20  | Total:  1h 57m | Avg:  5m 52s | Max: 14m 48s | Hits:  67%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 26s | Avg: 4m 13s | Max: 6m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  8m 26s | Avg:  4m 13s | Max:  6m 15s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 11s | Avg:  2m 11s | Max:  2m 11s
      🟩 Test               Pass: 100%/1   | Total:  6m 15s | Avg:  6m 15s | Max:  6m 15s
    
  • 🟩 python: Pass: 100%/1 | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 30m 28s | Avg: 30m 28s | Max: 30m 28s
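
Every row in the report above follows the same `Pass: P%/N | Total: … | Avg: … | Max: …` layout, so the figures can be extracted mechanically. The following is a minimal sketch in Python, assuming that layout as inferred from the report text; the helper names `parse_duration` and `parse_summary` are hypothetical and not part of the CCCL CI tooling.

```python
import re

# Seconds per unit letter used in the report's durations (d, h, m, s).
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a duration such as '3d 07h' or '28m 22s' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*([dhms])", text):
        total += int(value) * _UNIT_SECONDS[unit]
    return total


def parse_summary(line):
    """Extract pass rate, job count, and timings from one report row.

    Fields that are absent from the row are simply omitted from the result.
    """
    fields = {}
    match = re.search(r"Pass:\s*(\d+)%/(\d+)", line)
    if match:
        fields["pass_percent"] = int(match.group(1))
        fields["jobs"] = int(match.group(2))
    for key in ("Total", "Avg", "Max"):
        # Capture everything up to the next '|' separator (or end of line).
        match = re.search(key + r":\s*([^|]+)", line)
        if match:
            fields[key.lower() + "_seconds"] = parse_duration(match.group(1))
    return fields


if __name__ == "__main__":
    overall = ("CI finished in 1h 32m: Pass: 100%/169 | Total: 3d 07h | "
               "Avg: 28m 22s | Max: 1h 21m | Hits: 213%/20880")
    print(parse_summary(overall))
```

Because the totals in the report are rounded to the displayed units, values derived this way are approximate.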
    

👃 Inspect Changes

Modifications in project?

      Project
      CCCL Infrastructure
  +/- libcu++
      CUB
      Thrust
      CUDA Experimental
      python
      CCCL C Parallel Library
      Catch2Helper

Modifications in project or dependencies?

      Project
      CCCL Infrastructure
  +/- libcu++
  +/- CUB
  +/- Thrust
  +/- CUDA Experimental
  +/- python
  +/- CCCL C Parallel Library
  +/- Catch2Helper

🏃‍ Runner counts (total jobs: 169)

  #  Runner
125  linux-amd64-cpu16
 14  windows-amd64-cpu16
 10  linux-amd64-gpu-rtx2080-latest-1
 10  linux-arm64-cpu16
  6  linux-amd64-gpu-rtxa6000-latest-1
  3  linux-amd64-gpu-rtx4090-latest-1
  1  linux-amd64-gpu-h100-latest-1
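
The runner counts can be cross-checked against the stated total of 169 jobs. Below is a small Python sketch that sums the rows and groups them by platform; the function names are hypothetical, and grouping on the first two dash-separated parts of the runner name is an assumption about the naming scheme.

```python
from collections import Counter

# Runner rows copied from the table above: "<count> <runner name>".
RUNNER_TABLE = """\
125 linux-amd64-cpu16
14 windows-amd64-cpu16
10 linux-amd64-gpu-rtx2080-latest-1
10 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1
"""


def parse_rows(table):
    """Yield (count, runner_name) pairs for each non-empty row."""
    for row in table.splitlines():
        if row.strip():
            count, name = row.split(None, 1)
            yield int(count), name.strip()


def total_jobs(table):
    """Sum the job counts over all runners."""
    return sum(count for count, _ in parse_rows(table))


def jobs_by_platform(table):
    """Group job counts by the assumed 'os-arch' prefix of the runner name."""
    totals = Counter()
    for count, name in parse_rows(table):
        platform = "-".join(name.split("-")[:2])
        totals[platform] += count
    return dict(totals)


if __name__ == "__main__":
    print(total_jobs(RUNNER_TABLE))
    print(jobs_by_platform(RUNNER_TABLE))
```

The rows sum to 169, which matches the total given in the heading of the runner table.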

🟩 CI finished in 1h 25m: Pass: 100%/169 | Total: 3d 03h | Avg: 26m 55s | Max: 1h 19m | Hits: 407%/20880
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 6h 56m | Avg: 8m 40s | Max: 26m 56s | Hits: 681%/10028

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  6h 49m | Avg:  8m 53s | Max: 26m 56s | Hits: 681%/10028 
      🟩 arm64              Pass: 100%/2   | Total:  7m 34s | Avg:  3m 47s | Max:  4m 05s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 57m 04s | Avg:  8m 09s | Max: 26m 38s | Hits: 682%/2324  
      🟩 12.5               Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max:  8m 18s
      🟩 12.6               Pass: 100%/39  | Total:  5h 43m | Avg:  8m 48s | Max: 26m 56s | Hits: 681%/7704  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 08m | Avg: 17m 06s | Max: 21m 30s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 57m 04s | Avg:  8m 09s | Max: 26m 38s | Hits: 682%/2324  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max:  8m 18s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  4h 34m | Avg:  7m 51s | Max: 26m 56s | Hits: 681%/7704  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 08m | Avg: 17m 06s | Max: 21m 30s
      🟩 nvcc               Pass: 100%/44  | Total:  5h 48m | Avg:  7m 54s | Max: 26m 56s | Hits: 681%/10028 
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 28m 44s | Avg:  7m 11s | Max: 15m 33s
      🟩 Clang10            Pass: 100%/1   | Total:  5m 10s | Avg:  5m 10s | Max:  5m 10s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 27s | Avg:  4m 27s | Max:  4m 27s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 56s | Avg:  4m 56s | Max:  4m 56s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 40s | Avg:  4m 40s | Max:  4m 40s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 49s | Avg:  4m 49s | Max:  4m 49s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 30m | Avg: 11m 16s | Max: 21m 30s
      🟩 GCC6               Pass: 100%/2   | Total:  5m 38s | Avg:  2m 49s | Max:  2m 58s
      🟩 GCC7               Pass: 100%/2   | Total:  7m 13s | Avg:  3m 36s | Max:  3m 48s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s
      🟩 GCC9               Pass: 100%/3   | Total:  9m 47s | Avg:  3m 15s | Max:  3m 58s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 10s | Avg:  4m 10s | Max:  4m 10s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 26s | Avg:  4m 26s | Max:  4m 26s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 GCC13              Pass: 100%/10  | Total:  1h 33m | Avg:  9m 19s | Max: 19m 24s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  5m 46s | Avg:  5m 46s | Max:  5m 46s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 26m 38s | Avg: 26m 38s | Max: 26m 38s | Hits: 682%/2324  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 26m 56s | Avg: 26m 56s | Max: 26m 56s | Hits: 682%/2519  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 52m 17s | Avg: 26m 08s | Max: 26m 14s | Hits: 681%/5185  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max:  8m 18s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  2h 36m | Avg:  7m 49s | Max: 21m 30s
      🟩 GCC                Pass: 100%/21  | Total:  2h 12m | Avg:  6m 18s | Max: 19m 24s
      🟩 Intel              Pass: 100%/1   | Total:  5m 46s | Avg:  5m 46s | Max:  5m 46s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 45m | Avg: 26m 27s | Max: 26m 56s | Hits: 681%/10028 
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max:  8m 18s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/8   | Total:  1h 32m | Avg: 11m 35s | Max: 19m 24s
      🟩 v100               Pass: 100%/40  | Total:  5h 23m | Avg:  8m 05s | Max: 26m 56s | Hits: 681%/10028 
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  5h 30m | Avg:  8m 03s | Max: 26m 56s | Hits: 681%/10028 
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 07m | Avg: 16m 45s | Max: 19m 24s
      🟩 Test               Pass: 100%/2   | Total: 16m 55s | Avg:  8m 27s | Max:  8m 30s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 02s | Avg:  2m 02s | Max:  2m 02s
    🟩 sm
      🟩 75                 Pass: 100%/4   | Total:  1h 07m | Avg: 16m 45s | Max: 19m 24s
      🟩 90                 Pass: 100%/1   | Total: 13m 16s | Avg: 13m 16s | Max: 13m 16s
      🟩 90a                Pass: 100%/2   | Total: 17m 11s | Avg:  8m 35s | Max: 13m 19s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total: 35m 44s | Avg:  5m 57s | Max: 18m 59s
      🟩 14                 Pass: 100%/5   | Total: 53m 09s | Avg: 10m 37s | Max: 26m 38s | Hits: 682%/2324  
      🟩 17                 Pass: 100%/13  | Total:  2h 27m | Avg: 11m 19s | Max: 26m 56s | Hits: 682%/5038  
      🟩 20                 Pass: 100%/23  | Total:  2h 58m | Avg:  7m 45s | Max: 26m 14s | Hits: 681%/2666  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 15h | Avg: 50m 28s | Max: 1h 12m | Hits: 212%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 13h | Avg: 50m 09s | Max:  1h 12m | Hits: 212%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 35s | Max: 59m 01s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 44m | Avg: 49m 12s | Max: 58m 44s | Hits: 212%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 11m
      🟩 12.6               Pass: 100%/38  | Total:  1d 07h | Avg: 49m 49s | Max:  1h 12m | Hits: 212%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 44m | Avg: 49m 12s | Max: 58m 44s | Hits: 212%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 11m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 05h | Avg: 49m 14s | Max:  1h 12m | Hits: 212%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 13h | Avg: 50m 02s | Max:  1h 12m | Hits: 212%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 24m | Avg: 51m 00s | Max: 54m 35s
      🟩 Clang10            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang11            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang12            Pass: 100%/1   | Total: 54m 29s | Avg: 54m 29s | Max: 54m 29s
      🟩 Clang13            Pass: 100%/1   | Total: 58m 39s | Avg: 58m 39s | Max: 58m 39s
      🟩 Clang14            Pass: 100%/1   | Total: 57m 11s | Avg: 57m 11s | Max: 57m 11s
      🟩 Clang15            Pass: 100%/1   | Total: 56m 27s | Avg: 56m 27s | Max: 56m 27s
      🟩 Clang16            Pass: 100%/1   | Total: 56m 53s | Avg: 56m 53s | Max: 56m 53s
      🟩 Clang17            Pass: 100%/1   | Total: 53m 21s | Avg: 53m 21s | Max: 53m 21s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 31m | Avg: 47m 23s | Max:  1h 02m
      🟩 GCC6               Pass: 100%/2   | Total:  1h 31m | Avg: 45m 41s | Max: 47m 04s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 45s | Max: 54m 43s
      🟩 GCC8               Pass: 100%/1   | Total: 59m 07s | Avg: 59m 07s | Max: 59m 07s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 37m | Avg: 52m 28s | Max:  1h 00m
      🟩 GCC10              Pass: 100%/1   | Total: 58m 09s | Avg: 58m 09s | Max: 58m 09s
      🟩 GCC11              Pass: 100%/1   | Total: 59m 31s | Avg: 59m 31s | Max: 59m 31s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 48m | Avg: 36m 01s | Max: 58m 15s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 29m | Avg: 33m 41s | Max: 59m 45s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
      🟩 MSVC14.16          Pass: 100%/1   | Total: 58m 44s | Avg: 58m 44s | Max: 58m 44s | Hits: 212%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m | Hits: 213%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 12m | Hits: 212%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 11m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 16h 34m | Avg: 52m 20s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 15h 10m | Avg: 43m 22s | Max:  1h 00m
      🟩 Intel              Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 27m | Avg:  1h 06m | Max:  1h 12m | Hits: 212%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 11m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 49m 50s | Avg: 24m 55s | Max: 27m 09s
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 49m | Avg: 28m 42s | Max: 59m 45s
      🟩 v100               Pass: 100%/37  | Total:  1d 10h | Avg: 56m 33s | Max:  1h 12m | Hits: 212%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 13h | Avg: 55m 55s | Max:  1h 12m | Hits: 212%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 16m 16s | Avg: 16m 16s | Max: 16m 16s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 35s | Avg: 14m 35s | Max: 14m 35s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 05m | Avg: 21m 55s | Max: 22m 41s
      🟩 TestGPU            Pass: 100%/2   | Total: 38m 22s | Avg: 19m 11s | Max: 20m 18s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 49m 50s | Avg: 24m 55s | Max: 27m 09s
      🟩 90a                Pass: 100%/1   | Total: 24m 17s | Avg: 24m 17s | Max: 24m 17s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  4h 06m | Avg: 49m 23s | Max: 52m 47s
      🟩 14                 Pass: 100%/4   | Total:  3h 35m | Avg: 53m 46s | Max: 58m 44s | Hits: 212%/783   
      🟩 17                 Pass: 100%/12  | Total: 12h 03m | Avg:  1h 00m | Max:  1h 11m | Hits: 213%/1566  
      🟩 20                 Pass: 100%/26  | Total: 19h 46m | Avg: 45m 37s | Max:  1h 12m | Hits: 211%/783   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 1d 02h | Avg: 35m 22s | Max: 1h 19m | Hits: 112%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 34s | Avg: 20m 47s | Max: 30m 56s
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 01h | Avg: 35m 30s | Max:  1h 19m | Hits: 112%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 32s | Max: 34m 24s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 50m | Avg: 32m 52s | Max:  1h 04m | Hits: 120%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  2h 33m | Avg:  1h 16m | Max:  1h 19m
      🟩 12.6               Pass: 100%/36  | Total: 20h 08m | Avg: 33m 34s | Max:  1h 14m | Hits: 110%/5556  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 32m 13s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 50m | Avg: 32m 52s | Max:  1h 04m | Hits: 120%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 33m | Avg:  1h 16m | Max:  1h 19m
      🟩 nvcc12.6           Pass: 100%/34  | Total: 19h 06m | Avg: 33m 44s | Max:  1h 14m | Hits: 110%/5556  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 32m 13s
      🟩 nvcc               Pass: 100%/43  | Total:  1d 01h | Avg: 35m 35s | Max:  1h 19m | Hits: 112%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 55m | Avg: 28m 48s | Max: 32m 03s
      🟩 Clang10            Pass: 100%/1   | Total: 37m 09s | Avg: 37m 09s | Max: 37m 09s
      🟩 Clang11            Pass: 100%/1   | Total: 31m 58s | Avg: 31m 58s | Max: 31m 58s
      🟩 Clang12            Pass: 100%/1   | Total: 33m 01s | Avg: 33m 01s | Max: 33m 01s
      🟩 Clang13            Pass: 100%/1   | Total: 34m 32s | Avg: 34m 32s | Max: 34m 32s
      🟩 Clang14            Pass: 100%/1   | Total: 33m 48s | Avg: 33m 48s | Max: 33m 48s
      🟩 Clang15            Pass: 100%/1   | Total: 36m 02s | Avg: 36m 02s | Max: 36m 02s
      🟩 Clang16            Pass: 100%/1   | Total: 37m 58s | Avg: 37m 58s | Max: 37m 58s
      🟩 Clang17            Pass: 100%/1   | Total: 36m 29s | Avg: 36m 29s | Max: 36m 29s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 59m | Avg: 25m 40s | Max: 34m 55s
      🟩 GCC6               Pass: 100%/2   | Total: 54m 45s | Avg: 27m 22s | Max: 31m 25s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 04s | Max: 34m 49s
      🟩 GCC8               Pass: 100%/1   | Total: 37m 08s | Avg: 37m 08s | Max: 37m 08s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 27m | Avg: 29m 10s | Max: 32m 54s
      🟩 GCC10              Pass: 100%/1   | Total: 38m 50s | Avg: 38m 50s | Max: 38m 50s
      🟩 GCC11              Pass: 100%/1   | Total: 37m 44s | Avg: 37m 44s | Max: 37m 44s
      🟩 GCC12              Pass: 100%/1   | Total: 38m 43s | Avg: 38m 43s | Max: 38m 43s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 05m | Avg: 23m 08s | Max: 36m 17s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 41m 15s | Avg: 41m 15s | Max: 41m 15s
      🟩 MSVC14.16          Pass: 100%/1   | Total:  1h 04m | Avg:  1h 04m | Max:  1h 04m | Hits: 120%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m | Hits: 110%/1852  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m | Hits: 110%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 33m | Avg:  1h 16m | Max:  1h 19m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  9h 35m | Avg: 30m 18s | Max: 37m 58s
      🟩 GCC                Pass: 100%/19  | Total:  9h 01m | Avg: 28m 31s | Max: 38m 50s
      🟩 Intel              Pass: 100%/1   | Total: 41m 15s | Avg: 41m 15s | Max: 41m 15s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 39m | Avg:  1h 09m | Max:  1h 14m | Hits: 112%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 33m | Avg:  1h 16m | Max:  1h 19m
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 28m | Avg: 18m 34s | Max: 36m 17s
      🟩 v100               Pass: 100%/37  | Total:  1d 00h | Avg: 39m 01s | Max:  1h 19m | Hits: 112%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 01h | Avg: 38m 38s | Max:  1h 19m | Hits: 112%/7408  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 32s | Avg:  7m 46s | Max:  7m 47s
      🟩 TestGPU            Pass: 100%/3   | Total: 30m 52s | Avg: 10m 17s | Max: 10m 38s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 22m 01s | Avg: 22m 01s | Max: 22m 01s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 05m | Avg: 25m 09s | Max: 27m 19s
      🟩 14                 Pass: 100%/4   | Total:  2h 42m | Avg: 40m 35s | Max:  1h 04m | Hits: 120%/1852  
      🟩 17                 Pass: 100%/12  | Total:  8h 45m | Avg: 43m 45s | Max:  1h 14m | Hits: 110%/3704  
      🟩 20                 Pass: 100%/22  | Total: 12h 17m | Avg: 33m 31s | Max:  1h 19m | Hits: 110%/1852  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 16m | Avg: 5m 16s | Max: 13m 20s | Hits: 574%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 02m | Avg:  5m 34s | Max: 13m 20s | Hits: 574%/312   
      🟩 arm64              Pass: 100%/4   | Total: 14m 10s | Avg:  3m 32s | Max:  3m 39s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 11s | Avg:  5m 43s | Max:  9m 22s | Hits: 574%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 10s
      🟩 12.6               Pass: 100%/21  | Total:  1h 47m | Avg:  5m 06s | Max: 13m 20s | Hits: 574%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 11s | Avg:  5m 43s | Max:  9m 22s | Hits: 574%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 10s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 47m | Avg:  5m 06s | Max: 13m 20s | Hits: 574%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 16m | Avg:  5m 16s | Max: 13m 20s | Hits: 574%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 21s | Avg:  4m 21s | Max:  4m 21s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 51s | Avg:  3m 51s | Max:  3m 51s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s
      🟩 Clang18            Pass: 100%/4   | Total: 24m 14s | Avg:  6m 03s | Max: 13m 09s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 43s | Avg:  3m 43s | Max:  3m 43s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s
      🟩 GCC12              Pass: 100%/2   | Total: 17m 25s | Avg:  8m 42s | Max: 13m 20s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 40s | Avg:  3m 25s | Max:  3m 39s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 22s | Avg:  9m 22s | Max:  9m 22s | Hits: 574%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 22s | Avg: 12m 22s | Max: 12m 22s | Hits: 574%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 10s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 00m | Avg:  4m 38s | Max: 13m 09s
      🟩 GCC                Pass: 100%/9   | Total: 42m 35s | Avg:  4m 43s | Max: 13m 20s
      🟩 MSVC               Pass: 100%/2   | Total: 21m 44s | Avg: 10m 52s | Max: 12m 22s | Hits: 574%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 10s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 34m 31s | Avg:  8m 37s | Max: 13m 20s
      🟩 v100               Pass: 100%/22  | Total:  1h 42m | Avg:  4m 39s | Max: 12m 22s | Hits: 574%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 50m | Avg:  4m 36s | Max: 12m 22s | Hits: 574%/312   
      🟩 Test               Pass: 100%/2   | Total: 26m 29s | Avg: 13m 14s | Max: 13m 20s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 24s | Avg:  3m 24s | Max:  3m 24s
      🟩 90a                Pass: 100%/1   | Total:  3m 14s | Avg:  3m 14s | Max:  3m 14s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 24m 21s | Avg:  4m 03s | Max:  6m 10s
      🟩 20                 Pass: 100%/20  | Total:  1h 52m | Avg:  5m 37s | Max: 13m 20s | Hits: 574%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 03s | Avg: 3m 31s | Max: 4m 44s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 03s | Avg:  3m 31s | Max:  4m 44s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s
      🟩 Test               Pass: 100%/1   | Total:  4m 44s | Avg:  4m 44s | Max:  4m 44s
    
  • 🟩 python: Pass: 100%/1 | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 25m 19s | Avg: 25m 19s | Max: 25m 19s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 169)

# Runner
125 linux-amd64-cpu16
14 windows-amd64-cpu16
10 linux-amd64-gpu-rtx2080-latest-1
10 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1

@miscco miscco merged commit fcc5205 into branch/2.8.x Jan 31, 2025
184 checks passed
@bernhardmgruber bernhardmgruber deleted the backport_ptx branch January 31, 2025 19:06