-
Notifications
You must be signed in to change notification settings - Fork 123
Add build_from_args
for IVF-PQ
#1123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: branch-25.10
Are you sure you want to change the base?
Conversation
Signed-off-by: Mickael Ide <[email protected]>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
Nitpick- but can we please call this "build_from_codebooks" instead of "build_from_centroids"? The codebooks are really the important piece here. Another thing we might consider doing, which would probably be more general would be to just overload the standard build function to accept the codebooks, centroids, and rotation matrix, etc... That would be more precise and more clean IMO. |
Actually, it might be more preferable to call this |
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
build_from_centroids
for IVF-PQbuild_from_args
for IVF-PQ
cpp/include/cuvs/neighbors/ivf_pq.h
Outdated
@@ -267,18 +267,55 @@ uint32_t cuvsIvfPqIndexGetNLists(cuvsIvfPqIndex_t index); | |||
/** Get the dimensionality */ | |||
uint32_t cuvsIvfPqIndexGetDim(cuvsIvfPqIndex_t index); | |||
|
|||
/** Get the pq_dim */ | |||
uint32_t cuvsIvfPqIndexGetPqDim(cuvsIvfPqIndex_t index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should decide on the API convention for trivial getters like this in the C-API.
For ivf-pq we have the getters returning the value directly, but in cagra we have the getters returning an error code with the output being passed by a pointer :
cuvs/cpp/include/cuvs/neighbors/cagra.h
Line 359 in 699d32c
cuvsError_t cuvsCagraIndexGetDims(cuvsCagraIndex_t index, int* dim); |
I think we should be consistent here across index types, and while I like this version since the corresponding method in the C++ api should be noexcept
- I still think we maybe should be returning an error code here instead (just in case a different index can't guarantee noexcept
etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a good proposition, I made the changes to support error codes for the getters
|
||
void extract_codebook(raft::resources const& res, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void extract_codebook(raft::resources const& res, | |
/// @copydoc extract_codebook | |
void extract_codebook(raft::resources const& res, |
raft::device_mdspan<const float, pq_centers_extents, raft::row_major> pq_centers() const noexcept; | ||
/** Owning view of the PQ centers */ | ||
raft::device_mdspan<float, pq_centers_extents, raft::row_major> pq_centers_owning_view(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to expose both pq_centers
and pq_centers_owning_view
? I'm worried this is complicating the public API, and leaking implementation details about whether the underlying memory is owned or not to the user.
CAGRA handles this by returning the graph_view_
directly if its non-owning, and setting the graph_view_
to point at the owned graph if its owning the memory
cuvs/cpp/include/cuvs/neighbors/cagra.hpp
Line 512 in 699d32c
graph_view_ = graph_.view(); |
index.graph()
getter handles both cases transparently.
Would something similar work here? (and if not - why do we need to have two different views on potentially different pq_centers inside the index?).
(also the owning_view
naming convention here is a bit confusing to me - since I don't see how you can have a mdspan view that also owns the data )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is adding a bit of complexity to the API but it allows us to be able to get non-const pointers to those arrays that can be used in the building step.
If the only way to write to these owned array was by making a full copy, then operations like these that transpose the pq centers from a workspace buffer to the index would require an extra buffer of the same size to make the transpose into, then copy it into the index-owned array.
The user would not get two views of different pq-centers:
- If IVF-PQ was built with a dataset the 2 views (const and non-const) point to the same memory location
- If IVF-PQ was built from args, only the const view should be used, the non-const throws an exception
I think there is some value in being able to get a non-const pointer when needed.
What would be a better naming convention for those that would be less confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you have this as an overload with the same name? I think they should be distinguishable here at the compile time due to different const-ness of the index. This would also be more along the lines of a typical C++ code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be an overload with the same name, but it would be much more error-prone for the user.
index.pq_centers().size()
would be ambiguous and could return the empty mdarray size (0) instead of the mdspan size.
I just tested locally the overload and for a call to raft::linalg::gemm
(here) that requires a const pointer, index.rotation_matrix().data_handle()
is ambiguous as well since the compiler can take non-const view and get a const pointer from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that gemm should probably be replaced with an mdspan-based version instead of using raw pointers :) but I see your point. I guess, one could use std::enable_if_t<...> to disable the inappropriate version, or even make a single function out of the two with an if constexpr
inside?..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We discussed about it in a call, the problem is that having both under the same function name can lead to ambiguity at runtime and it would run into runtime exception)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just keep the const or not keep it and not opt to support both. In practice, I think most users are going to be getting the centers and not modifying them, but we could probably remove the const designator altogether. Would that solve this problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I agree w/ the others above- the user shouldn't be burdened with this- they should just be invoking an accessor method on the object, ideally named after the member they are accessing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified that code to use kind of the same mechanism as CAGRA with update_*
to only allow setting that memory through copy. The API is less confusing but users now don't have access to a non-const view. Let me know your opinions on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified that code to use kind of the same mechanism as CAGRA with update_* to only allow setting that memory through copy.
We need to find a way to allow zero copy- both for accessing and mutating. They should be able to do the copy themselves if they so need to. The general assumption should be "direct access, copies should be explicit"
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
Signed-off-by: Mickael Ide <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above we've been discussing the peculiarities of accessing the parts of the index depending on how it was constructed. It seems there's no perfect solution: before the PR, IVF-PQ index can be const or non-const, and that implies a user can modify/write its members, such as PQ centers. After the PR, it seems there's no way to override the pq_centers content if the user has built the index using the old api.
Also I'm not a fan of the idea to have IVF-PQ sometimes own its members and sometimes not own. We've tried that with CAGRA and I think that is a sub-optimal user experience.
What if instead we'd pass the mdarrays by rvalue references to the new build
overload and to the new constructor? Let the user pass the ownership to IVF-PQ. This would keep the index fully owning all its members and save the user from the burden of managing the lifetimes of multiple dependent objects.
And a small nitpick: could you please add a proper description to the PR? Just adding "closes #issue" doesn't look very nice in the main branch commit history and doesn't reflect how the PR addresses the issue. It is also funny that the build_from_args
in the title of the PR only ever appears in the changeset in a docsting :)
Java doesn't support unsigned types well, and this is leading to us only supporting 2**31 items in the cagra index - instead of 2**32. Fix by changing the getters in the c-api to return int64_t values for the size/dim etc, instead of uint32_t. We were also not consistent on returning error codes in ivf_flat/ivf_pq for these getters - with cagra returning an error code, and the ivf* methods just returning the value directly. Change to always return an error code to be consistent. (Note that rapidsai#1123 is also making this change for ivf-pq - and I'm skipping ivf-pq to avoid merge conflicts).
cpp/include/cuvs/neighbors/ivf_pq.h
Outdated
|
||
/** Get the dimensionality */ | ||
uint32_t cuvsIvfPqIndexGetDim(cuvsIvfPqIndex_t index); | ||
cuvsError_t cuvsIvfPqIndexGetDim(cuvsIvfPqIndex_t index, uint32_t* dim); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1272 is using int64_t for these getters for ivf-flat and CAGRA? Can we do the same?
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers() const noexcept; | ||
/** Setter for the centers [n_lists, dim_ext] or [n_lists, dim] */ | ||
void update_centers(raft::resources const& handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How exactly is this different from the set_centers
helper that we currently have? Can we re-use set_centers
in places where you are calling update_centers
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new update_*
functions are required because there is no possibility to obtain a non-const pointer to those data members. So any modification to those data members must go through an index member function.
So in the current implementation set_centers
can't modify the centers directly anymore, it has to go through update_centers
. There is still some value in keeping set_centers
because it will compute the rotated centers for the users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Feel free to mark this one as resolved.
Signed-off-by: Mickael Ide <[email protected]>
In this PR, a new way to build IVF-PQ is added:
build_from_args
in Python that use a new overload ofbuild
in C++.This new build method takes views on device memory to construct an IVF-PQ index that is extendable and searchable with as few data transfer as possible.
It is made available in C++ and Python, and will be re-used in a future preprocessing function for product quantization.
Closes #1107.