[BUG] UMAP transform throws illegal memory access error when data_on_host=True #6216

btepera commented Jan 10, 2025

When running UMAP with batched nn descent, fit is supported today but transform is not; transform falls back to brute-force knn (#6215). If fit is run with data_on_host set to True, the subsequent transform call throws an illegal memory access error.

import numpy as np
from cuml.manifold import UMAP

N = 10000
K = 32

rng = np.random.default_rng()
data = rng.random((N, K), dtype="float32")

# Batched nn descent: partition the data into 4 clusters when building the knn graph.
reducer = UMAP(
    n_components=2,
    n_neighbors=15,
    build_algo="nn_descent",
    build_kwds={"nnd_n_clusters": 4},
)

# Keeping the data on host during fit is what triggers the failure below.
fitted_umap = reducer.fit(data, data_on_host=True)
embeddings = fitted_umap.transform(data)  # raises cudaErrorIllegalAddress

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 9
      1 reducer = UMAP(
      2     n_components=2,
      3     n_neighbors=15,
      4     build_algo="nn_descent",
      5     build_kwds={"nnd_n_clusters": 4},
      6 )
      8 fitted_umap = reducer.fit(data, data_on_host=True)
----> 9 embeddings = fitted_umap.transform(data)

File /raid/btepera/miniforge3/envs/rapids-24.12/lib/python3.12/site-packages/cuml/internals/api_decorators.py:188, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    185     set_api_output_dtype(output_dtype)
    187 if process_return:
--> 188     ret = func(*args, **kwargs)
    189 else:
    190     return func(*args, **kwargs)

File /raid/btepera/miniforge3/envs/rapids-24.12/lib/python3.12/site-packages/cuml/internals/api_decorators.py:393, in enable_device_interop.<locals>.dispatch(self, *args, **kwargs)
    391 if hasattr(self, "dispatch_func"):
    392     func_name = gpu_func.__name__
--> 393     return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
    394 else:
    395     return gpu_func(self, *args, **kwargs)

File /raid/btepera/miniforge3/envs/rapids-24.12/lib/python3.12/site-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    188         ret = func(*args, **kwargs)
    189     else:
--> 190         return func(*args, **kwargs)
    192 return cm.process_return(ret)

File base.pyx:720, in cuml.internals.base.UniversalBase.dispatch_func()

File umap.pyx:841, in cuml.manifold.umap.UMAP.transform()

RuntimeError: CUDA error encountered at: file=/opt/conda/conda-bld/work/cpp/src/umap/fuzzy_simpl_set/naive.cuh line=257: call='cudaPeekAtLastError()', Reason=cudaErrorIllegalAddress:an illegal memory access was encountered
Obtained 39 stack frames
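
In the meantime, the obvious user-side workaround (assuming the crash is specific to fitting with data_on_host=True, as described above) is to keep the data on device during fit. Reusing data and reducer from the snippet above:

# Workaround sketch: fit with the training data on device (data_on_host left at
# its default), which avoids the failing code path described in this report.
fitted_umap = reducer.fit(data)
embeddings = fitted_umap.transform(data)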

Independent of adding transform support for batched nn descent (which I imagine is a larger effort), we should handle this fallback appropriately in cases where data_on_host was True during the fit.
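
One possible shape for that handling, purely as a sketch (the attribute and helper names below are hypothetical, not cuml's actual internals): before the brute-force knn fallback runs in transform, copy any host-resident training data back to device, or raise a clear error instead of hitting the illegal access.

import cupy as cp

# Hypothetical guard for the transform-side fallback; `_raw_data_on_host` and
# `_raw_data` are illustrative placeholder names, not cuml's real attributes.
def _prepare_bruteforce_fallback(model):
    if getattr(model, "_raw_data_on_host", False):
        # The GPU brute-force knn would otherwise dereference host memory,
        # which is the likely source of the cudaErrorIllegalAddress above.
        model._raw_data = cp.asarray(model._raw_data)  # host -> device copy
        model._raw_data_on_host = False
    return model._raw_data

Alternatively, transform could raise a clear NotImplementedError pointing at #6215 until batched nn descent is supported there, which would at least turn the illegal memory access into an actionable message.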
