cuda.parallel: Minor perf improvements #3718
base: main
Conversation
```python
try:
    return np.dtype(arr.dtype)  # type: ignore
except Exception:
    typestr = arr.__cuda_array_interface__["typestr"]

    if typestr.startswith("|V"):
        # it's a structured dtype, use the descr field:
        return np.dtype(arr.__cuda_array_interface__["descr"])
    else:
        # a simple dtype, use the typestr field:
        return np.dtype(typestr)
```
The call to `arr.__cuda_array_interface__` was expensive, so it pays to save the interface into a temporary and reuse it within the function scope. With this approach, we still make repeated calls to `arr.__cuda_array_interface__` (in every one of these functions). We should instead create a class that encapsulates `arr` and provides a cached `__cuda_array_interface__` property. This is likely what the GpuMemoryView class is meant to be. @leofang for comments
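As a rough sketch of that idea (the class and attribute names here are hypothetical, not an existing API), such a wrapper could look like:

```python
from functools import cached_property

import numpy as np


class CachedArrayView:
    """Hypothetical wrapper that queries __cuda_array_interface__ once."""

    def __init__(self, arr):
        self._arr = arr

    @cached_property
    def cai(self) -> dict:
        # The expensive attribute access happens exactly once per array;
        # every later lookup is served from the cached dict.
        return self._arr.__cuda_array_interface__

    @property
    def data_ptr(self) -> int:
        return self.cai["data"][0]

    @property
    def shape(self) -> tuple:
        return self.cai["shape"]

    @property
    def dtype(self) -> np.dtype:
        typestr = self.cai["typestr"]
        if typestr.startswith("|V"):
            return np.dtype(self.cai["descr"])  # structured dtype
        return np.dtype(typestr)
```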
Yes, this PR simply adds a fast path for cupy arrays (and numba device arrays as well). If something else is passed (e.g., a torch tensor), we fall back to using `__cuda_array_interface__`.

I found that `StridedMemoryView` is expensive enough to construct that it doesn't give us sufficient speedup.
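The same fast-path-plus-fallback shape applies to the other attribute helpers; for example, a hedged sketch of the data-pointer lookup (illustrative, not necessarily the exact code in this PR):

```python
def get_data_pointer(arr) -> int:
    try:
        # Fast path: cupy arrays expose the device pointer via .data.ptr.
        return arr.data.ptr  # type: ignore
    except AttributeError:
        # Fallback for anything else: the "data" field of the CUDA Array
        # Interface is a (pointer, read_only) pair.
        return arr.__cuda_array_interface__["data"][0]
```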
> I found that StridedMemoryView is expensive enough to construct that it doesn't give us sufficient speedup.

How was the timing done? I would like to understand this better.
I really think the current implementation of `cuda/parallel/experimental/_utils/protocols.py` should be replaced by `StridedMemoryView`, because what happens now is that the call to `__cuda_array_interface__` is not cached: we repeatedly call it just to get one attribute at a time, instead of caching the result and reusing it for all attribute queries. But once we cache it, it is essentially a re-implementation (of sorts) of `StridedMemoryView`. I don't understand what is going on and would like to get to the bottom of it. `StridedMemoryView` is designed to support all single-CPU/GPU array libraries, beyond just CuPy, a capability that we will eventually need across all CUDA projects.
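For reference, direct use looks roughly like the following (a sketch based on cuda.core's documented experimental API; the import path may shift while the API is experimental):

```python
import cupy as cp
from cuda.core.experimental.utils import StridedMemoryView

arr = cp.arange(1024, dtype=cp.float32)

# One view construction gives access to all the attributes that
# protocols.py currently re-derives from __cuda_array_interface__.
view = StridedMemoryView(arr, stream_ptr=-1)  # -1: skip stream ordering

print(view.ptr)                   # device pointer as a Python int
print(view.shape, view.dtype)     # (1024,) float32
print(view.device_id)             # ordinal of the owning device
print(view.is_device_accessible)  # True for device memory
```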
> How was the timing done? I would like to understand this better.

Consider 9cc2902, where I convert the user-provided arrays to strided memory views and grab the data pointer from those. I even deleted all the contiguity checks and other validation code. The benchmark numbers are then:
```
Benchmark Results (input size, average time with first run, average time without first run):
Input size: 10        | Avg time with first run: 0.00102104 seconds | Avg time without first run: 0.00002196 seconds
Input size: 100       | Avg time with first run: 0.00002170 seconds | Avg time without first run: 0.00002169 seconds
Input size: 1000      | Avg time with first run: 0.00002289 seconds | Avg time without first run: 0.00002302 seconds
Input size: 10000     | Avg time with first run: 0.00002413 seconds | Avg time without first run: 0.00002405 seconds
Input size: 100000    | Avg time with first run: 0.00004833 seconds | Avg time without first run: 0.00004833 seconds
Input size: 1000000   | Avg time with first run: 0.00019740 seconds | Avg time without first run: 0.00019125 seconds
Input size: 10000000  | Avg time with first run: 0.00104487 seconds | Avg time without first run: 0.00104453 seconds
Input size: 100000000 | Avg time with first run: 0.01038157 seconds | Avg time without first run: 0.01038252 seconds
```
This is still a significant improvement over the existing benchmark numbers, but not quite good enough. The overhead here is all in the construction of the `StridedMemoryView`s from `d_in` and `d_out`.
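For anyone trying to reproduce this without the scripts (they are linked in the next comment), the reported output format suggests a harness along these lines; this is a hypothetical stand-in, not the actual bench.py:

```python
import time

import cupy as cp


def bench(run_reduce, sizes, iters=100):
    """Time run_reduce(d_in, d_out) per input size, averaged over iters.

    The first call typically includes JIT compilation, hence the two averages.
    """
    for n in sizes:
        d_in = cp.arange(n, dtype=cp.int32)
        d_out = cp.empty(1, dtype=cp.int32)
        times = []
        for _ in range(iters):
            t0 = time.perf_counter()
            run_reduce(d_in, d_out)
            cp.cuda.runtime.deviceSynchronize()  # include kernel completion
            times.append(time.perf_counter() - t0)
        avg_all = sum(times) / len(times)
        avg_warm = sum(times[1:]) / (len(times) - 1)
        print(
            f"Input size: {n} | Avg time with first run: {avg_all:.8f} seconds"
            f" | Avg time without first run: {avg_warm:.8f} seconds"
        )
```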
Thanks, Ashwin, this helps. Would you mind passing `stream_ptr=-1` to `StridedMemoryView` to bypass the stream ordering, and then pointing me to where `bench.py` is so that I can also try this out myself?
The scripts `bench.py` and `device_reduce.py` can be found here: #3213 (comment)

> Would you mind passing stream_ptr=-1 to StridedMemoryView

This improved things very slightly:
```
Benchmark Results (input size, average time with first run, average time without first run):
Input size: 10        | Avg time with first run: 0.00103634 seconds | Avg time without first run: 0.00002310 seconds
Input size: 100       | Avg time with first run: 0.00002255 seconds | Avg time without first run: 0.00002250 seconds
Input size: 1000      | Avg time with first run: 0.00002412 seconds | Avg time without first run: 0.00002272 seconds
Input size: 10000     | Avg time with first run: 0.00002496 seconds | Avg time without first run: 0.00002491 seconds
Input size: 100000    | Avg time with first run: 0.00005463 seconds | Avg time without first run: 0.00005463 seconds
Input size: 1000000   | Avg time with first run: 0.00018084 seconds | Avg time without first run: 0.00016899 seconds
Input size: 10000000  | Avg time with first run: 0.00106087 seconds | Avg time without first run: 0.00104471 seconds
Input size: 100000000 | Avg time with first run: 0.01038340 seconds | Avg time without first run: 0.01038483 seconds
```
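For readers following along, the tweak above is just the view construction with stream ordering bypassed (same caveats about the experimental cuda.core API as earlier):

```python
from cuda.core.experimental.utils import StridedMemoryView


def make_views(d_in, d_out):
    # stream_ptr=-1 asks the view not to establish stream order between
    # producer and consumer, trimming per-call overhead at the cost of
    # ordering guarantees (fine when the caller handles synchronization).
    return (
        StridedMemoryView(d_in, stream_ptr=-1),
        StridedMemoryView(d_out, stream_ptr=-1),
    )
```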
🟥 CI finished in 5m 58s: Pass: 0%/1 | Total: 5m 58s | Avg: 5m 58s | Max: 5m 58s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
Force-pushed from bf5b043 to 4dcfc6f
How do we compare to CuPy now?
🟥 CI finished in 5m 55s: Pass: 0%/1 | Total: 5m 55s | Avg: 5m 55s | Max: 5m 55s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Closer, but not quite there yet. We have ~15us of constant overhead versus CuPy's ~10us. I'll iterate on this PR until we reach parity.
btw I think you meant us (microseconds), not ms (milliseconds). I feel we are pushing to the limit where Python overhead could be something to worry about.
With the latest changes, which rip out all the validation checks we do between the call to […]

We are absolutely there already - this PR is trying to minimize the number of Python operations we're doing in the […]
/ok to test
🟥 CI finished in 6m 06s: Pass: 0%/1 | Total: 6m 06s | Avg: 6m 06s | Max: 6m 06s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
/ok to test
🟥 CI finished in 6m 05s: Pass: 0%/1 | Total: 6m 05s | Avg: 6m 05s | Max: 6m 05s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
/ok to test
🟩 CI finished in 33m 23s: Pass: 100%/1 | Total: 33m 23s | Avg: 33m 23s | Max: 33m 23s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
This reverts commit 9cc2902.
Force-pushed from 0f404e7 to 41f652d
/ok to test
🟩 CI finished in 28m 40s: Pass: 100%/1 | Total: 28m 40s | Avg: 28m 40s | Max: 28m 40s
Modified projects: +/- python
🏃 Runner counts (total jobs: 1): linux-amd64-gpu-rtx2080-latest-1
Description
This PR addresses some of the performance issues found by @oleksandr-pavlyk in #3213. Still WIP - there are broader questions regarding API/behaviour we need to answer before considering the changes in this PR.
Benchmark results before this PR, after this PR, and for CuPy were attached as charts.
Checklist