Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] Support bitset filter for Brute Force #560

Merged
merged 21 commits into from
Jan 31, 2025

Conversation

rhdong
Copy link
Member

@rhdong rhdong commented Jan 8, 2025

No description provided.

@rhdong rhdong requested review from a team as code owners January 8, 2025 07:11
@rhdong rhdong added feature request New feature or request non-breaking Introduces a non-breaking change and removed CMake labels Jan 8, 2025
@rhdong rhdong requested review from cjnolet, lowener and benfred January 8, 2025 07:13
@rhdong rhdong requested a review from a team as a code owner January 8, 2025 20:11
@github-actions github-actions bot removed the CMake label Jan 15, 2025
@@ -67,8 +67,8 @@ void _search(cuvsResources_t res,
using queries_mdspan_type = raft::device_matrix_view<T const, int64_t, QueriesLayoutT>;
using neighbors_mdspan_type = raft::device_matrix_view<int64_t, int64_t, raft::row_major>;
using distances_mdspan_type = raft::device_matrix_view<float, int64_t, raft::row_major>;
using prefilter_mds_type = raft::device_vector_view<const uint32_t, int64_t>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to keep the filter immutable, don' we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is to be compatible with the bitset filter utilized in CAGRA and other algorithms. The immutable is implemented on the search API by restricting the filter to be const cuvs::neighbors::filtering::base_filter. Please refer to cagra, here

Copy link
Member

@cjnolet cjnolet Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using obj<const type> everywhere in the codebase for containers to immutable types so we should be doing that here as well. If the CAGRA API is not doing this, it should be.

struct base_filter {
virtual ~base_filter() = default;
virtual ~base_filter() = default;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice no changes have been made to brute_force.hpp. Ideally, we'll at at least be listing out in the docs which filters are supported, right? Otherwise this is going to be very confusing for users. Also, can we set the default to bitset filter? I suspect most users will want bitset.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve just added the comments. I believe using bitset as the default setting might not be ideal if we don't have enough input from end-users. Perhaps we should discuss this in the team group, as I noticed that the none filter is also set as the default in CAGRA.

Copy link
Member

@cjnolet cjnolet Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe using bitset as the default setting might not be ideal if we don't have enough input from end-users.

I think you may have misunderstood me. The none filter is fine as the default for the the search functions, but for the code example in the docs, we should make sure we use a bitset and leave bitmap to users who need it. FAISS doesn't even support a bitmap and users aren't asking for it generally. It's good to keep it exposed for users who might need it.

@rhdong rhdong requested a review from cjnolet January 29, 2025 03:40
@rhdong
Copy link
Member Author

rhdong commented Jan 30, 2025

Hey, @cjnolet @benfred @lowener, I've addressed all comments. I'd appreciate it if you could take a moment to re-review and approve. Thanks!

* be considered for each query.
*
* - Supports two types of filters:
* 1. **Bitmap Filter**: A per-query filter with a logical shape of `[n_queries, index->size()]`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reorder these please? I think we should be pushing bitset as the first option.


namespace cuvs::neighbors::detail {

using namespace cuvs::neighbors::filtering;
Copy link
Member

@cjnolet cjnolet Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never use using namespace in a header file. It cascades to all header files. If you need to do this, do it in a source file. Otherwise, just use the absolute namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All done! Thanks!

@rhdong rhdong requested a review from cjnolet January 30, 2025 20:48
@cjnolet
Copy link
Member

cjnolet commented Jan 31, 2025

/merge

@rapids-bot rapids-bot bot merged commit c778c88 into rapidsai:branch-25.02 Jan 31, 2025
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp feature request New feature or request non-breaking Introduces a non-breaking change Python
Projects
Development

Successfully merging this pull request may close these issues.

2 participants