Skip dispatching to GPU for unimplemented metrics in UMAP #6224

betatim · 2025-01-14T16:16:19Z

This lists more metrics that the CPU umap library supports but cuml doesn't yet support. By listing them as not implemented we don't dispatch to the GPU when a user selects them.
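As a rough illustration of the idea (not cuml's actual dispatch code), the check can be thought of as a lookup of hyperparameter values that are known to lack a GPU implementation. The names `NOT_IMPLEMENTED_ON_GPU` and `choose_backend` are hypothetical, and only `russellrao` is taken from this thread:

```python
# Illustrative sketch only: these names are hypothetical, not cuml's API.
# Maps a hyperparameter name to the values that have no GPU implementation.
NOT_IMPLEMENTED_ON_GPU = {"metric": {"russellrao"}}


def choose_backend(hyperparams):
    """Return "cpu" if any string-valued hyperparameter is marked unsupported on GPU."""
    for name, value in hyperparams.items():
        unsupported = NOT_IMPLEMENTED_ON_GPU.get(name, set())
        # Only string values are checked; callables and numbers fall through.
        if isinstance(value, str) and value in unsupported:
            return "cpu"
    return "gpu"


print(choose_backend({"metric": "russellrao"}))
print(choose_backend({"metric": "euclidean", "n_neighbors": 15}))
```

Listing a value as unsupported is a conservative choice: the user's call still succeeds, just on the CPU library.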

Also added them to a test to check that they do not raise an error. It would be nice to check that they ran on the CPU when the accelerator is enabled, but I couldn't find a nice way to do it :-/ Ideas welcome.

betatim · 2025-01-15T12:53:17Z

I feel like I am missing something. The tests are still failing with

FAILED estimators_hyperparams/test_accel_umap.py::test_umap_metric[russelrao] - ValueError: metric is neither callable nor a recognised string

but russelrao is how the metric is spelt??

betatim · 2025-01-15T13:52:10Z

The answer is that the docs call it russelrao but the code actually looks for russellrao.
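A one-letter mismatch like this is easy to miss in a test log. As a sketch of how a validator could surface it, the snippet below raises the same kind of error but adds a spelling hint via the standard library's `difflib`. The function `check_metric` and the list `RECOGNISED_METRICS` are hypothetical and are not how umap validates metrics; only `russellrao` comes from this thread:

```python
import difflib

# Hypothetical subset of recognised names; only "russellrao" comes from the thread.
RECOGNISED_METRICS = ["euclidean", "manhattan", "russellrao"]


def check_metric(metric):
    """Return the metric if valid, else raise ValueError with a spelling hint if one is close."""
    if callable(metric) or metric in RECOGNISED_METRICS:
        return metric
    hint = difflib.get_close_matches(metric, RECOGNISED_METRICS, n=1)
    suffix = f" (did you mean {hint[0]!r}?)" if hint else ""
    raise ValueError(f"metric is neither callable nor a recognised string{suffix}")


try:
    check_metric("russelrao")  # spelling used in the docs
except ValueError as err:
    print(err)
```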

jameslamb

I recommend reverting that rapids-mamba-retry --quiet. I've explained the tradeoffs in a comment.

Marking this "approve" so you're not blocked waiting on ci-codeowners, and so you can merge it without further review if you all want to make a different choice than I would in the face of those tradeoffs.

jameslamb · 2025-01-16T14:47:34Z

ci/test_python_common.sh

@@ -17,7 +17,7 @@ rapids-dependency-file-generator \
  --prepend-channel "${CPP_CHANNEL}" \
  --prepend-channel "${PYTHON_CHANNEL}" | tee env.yaml

-rapids-mamba-retry env create --yes -f env.yaml -n test
+rapids-mamba-retry env create --quiet --yes -f env.yaml -n test


What is the goal of adding this --quiet?

For context... we'd recently tried to do that globally in CI images for RAPIDS, but found that it seemed to have the side effect of suppressing exception names from mamba, making rapids-mamba-retry ineffective:

rapidsai/ci-imgs#220

rapids-mamba-retry / rapids-conda-retry work by string-matching on Python exceptions:

https://github.com/rapidsai/gha-tools/blob/0558ffce255e4e7da5d5312e79f35dd81e444144/tools/rapids-conda-retry#L82

So adding --quiet here might mean you're trading quieter logs for more need to manually retry failures from conda.
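To make the tradeoff concrete, here is a minimal sketch of a retry wrapper that decides whether to retry by searching the captured output for known error names. It is written in Python for illustration only; the real tool is a shell script, and the marker strings and the `run_with_retry` name are assumptions rather than its actual contents:

```python
# Illustrative sketch; the marker strings and function name are assumptions,
# not the actual contents of rapids-conda-retry.
RETRYABLE_MARKERS = ("CondaHTTPError:", "ConnectionError:")


def run_with_retry(command, max_attempts=3):
    """Call command() -> (exit_code, output); retry only when output names a known error.

    Returns (final_exit_code, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        exit_code, output = command()
        if exit_code == 0:
            return exit_code, attempt
        if not any(marker in output for marker in RETRYABLE_MARKERS):
            # Nothing recognisable in the log, so treat the failure as permanent.
            return exit_code, attempt
    return exit_code, attempt
```

If a quiet mode strips the exception name from the output, a transient failure looks the same as a permanent one and the wrapper gives up after the first attempt.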

betatim · 2025-01-16

Mostly added it here to reduce the number of lines I have to scroll past to see the output from the pytest command. It is somewhat annoying that the progress bars seem to take up many many many many lines :-/ but yeah, not really interested in negotiating with the rest of RAPIDS about this change (I'll revert it when I'm done with this PR) :D

jameslamb · 2025-01-16

> to reduce the number of lines I have to scroll past to see the output from the pytest command

I agree, it's annoying :/

That's why we'd attempted to make quiet: true the global setting for RAPIDS CI: rapidsai/ci-imgs#217

But breaking the retry mechanism was just not worth it... I think having to scroll past some log lines is better than having to manually re-run CI.

Anyway, the work to actually find the root cause of these excessive empty lines is still something we should do. Put up rapidsai/ci-imgs#228 to track that.

betatim · 2025-01-17

I assumed that this is something we'd have to fix in conda/mamba? Somehow making it aware of the fact that no human is watching, so that it can either not output a progress bar or some such. At least it seems like there are CLI tools out there that somehow adjust the fanciness of their output. Alas, I have no idea how they do it :(
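One common way CLI tools do this is to ask whether the output stream is attached to a terminal. A minimal sketch of that check in Python follows; `progress_style` is a hypothetical helper, and whether conda or mamba use this exact mechanism is not established in this thread:

```python
import sys


def progress_style(stream=None):
    """Return "bar" for interactive terminals and "plain" for pipes, files, and similar."""
    stream = sys.stdout if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    # Streams without isatty() are treated as non-interactive.
    return "bar" if callable(isatty) and isatty() else "plain"


print(progress_style())
```

This check alone may not be enough in CI, since some runners allocate a pseudo-terminal. Tools often combine it with environment variables such as `CI`, which GitHub Actions sets.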

jameslamb · 2025-01-17

Maybe. It might also be a side effect of the RAPIDS-specific wrappers we have around those tools. I put up one idea in a comment on rapidsai/ci-imgs#228.

Anyway, that ci-imgs issue is a good tracking issue for this. Hopefully we can get more specific reproducible examples and make some progress there. I'll try to look into it when I can.

Skip dispatching to GPU for unimplemented metrics

2788304

betatim requested a review from a team as a code owner January 14, 2025 16:16

betatim requested review from teju85 and divyegala January 14, 2025 16:16

github-actions bot added the Cython / Python Cython or Python issue label Jan 14, 2025

dantegd reviewed Jan 14, 2025

python/cuml/cuml/tests/experimental/accel/estimators_hyperparams/test_accel_umap.py

Update python/cuml/cuml/tests/experimental/accel/estimators_hyperpara…

72d7e33

…ms/test_accel_umap.py Co-authored-by: Dante Gama Dessavre <[email protected]>

betatim added the non-breaking Non-breaking change label Jan 15, 2025

Typo fix

3f90e99

betatim changed the title from "Skip dispatching to GPU for unimplemented metrics" to "Skip dispatching to GPU for unimplemented metrics in UMAP" Jan 15, 2025

betatim requested a review from a team as a code owner January 15, 2025 16:14

betatim requested a review from jameslamb January 15, 2025 16:14

github-actions bot added the ci label Jan 15, 2025

Make conda quieter

18dc640

betatim force-pushed the blacklist-unsupported-metrics branch from df524b9 to 18dc640 Compare January 16, 2025 09:09

jameslamb approved these changes Jan 16, 2025

This was referenced Jan 16, 2025

conda logs have many empty lines rapidsai/ci-imgs#228

Open

Enable 'quiet: true' in condarc rapidsai/ci-imgs#217

Merged
