
Conversation

@jinsolp (Contributor) commented Nov 13, 2025

Closes #7143

This PR improves memory usage in UMAP when given a precomputed knn graph.
Previously, a user-provided knn graph would occupy GPU memory throughout the full UMAP pipeline even though it is not needed in the later steps of UMAP.

In this PR, if the user-provided knn graph is in host memory, we keep it there and copy it to the device at the C++ level, which allows better memory management.
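A minimal sketch of the idea at the C++ level (names and signatures below are illustrative, not the actual cuML internals): the host-resident knn graph is copied into temporary device buffers only for the step that consumes it, so the device memory is released as soon as that step finishes instead of living for the whole pipeline.

    #include <raft/core/resources.hpp>
    #include <raft/core/resource/cuda_stream.hpp>
    #include <raft/util/cudart_utils.hpp>
    #include <rmm/device_uvector.hpp>

    // Hypothetical helper: copy the user's host-resident knn graph to the
    // device just before the step that needs it.
    void consume_host_knn_graph(raft::resources const& res,
                                const int64_t* host_knn_indices,
                                const float* host_knn_dists,
                                size_t n_rows,
                                size_t n_neighbors)
    {
      auto stream = raft::resource::get_cuda_stream(res);
      size_t len  = n_rows * n_neighbors;
      rmm::device_uvector<int64_t> d_indices(len, stream);
      rmm::device_uvector<float> d_dists(len, stream);
      raft::copy(d_indices.data(), host_knn_indices, len, stream);
      raft::copy(d_dists.data(), host_knn_dists, len, stream);
      // ... build the fuzzy simplicial set from d_indices / d_dists ...
    }  // device buffers are freed here, not at the end of the UMAP pipeline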

This PR with precomputed knn graph on CPU:

[screenshot: memory usage with this PR]

Before with precomputed knn graph on CPU:

[screenshot: memory usage before this PR]

@jinsolp self-assigned this Nov 13, 2025
@jinsolp requested review from a team as code owners November 13, 2025 03:01
@jinsolp requested review from betatim and hcho3 November 13, 2025 03:01
@github-actions bot added the Cython / Python and CUDA/C++ labels Nov 13, 2025
@jinsolp added the non-breaking and improvement labels Nov 13, 2025
    bool is_device_pointer(const void* ptr) {  // signature reconstructed for context; the diff showed only the body below
      cudaPointerAttributes attr;
      if (cudaPointerGetAttributes(&attr, ptr) != cudaSuccess) {
        cudaGetLastError();  // clear the sticky CUDA error state
        return false;        // Assume host pointer if query fails
      }
      return attr.type == cudaMemoryTypeDevice || attr.type == cudaMemoryTypeManaged;
    }
Contributor
This pattern of checking whether it's a device pointer is being repeated way too much. In cuvs and raft we have abstracted it away into functions such as check_pointer_residency. Can you check if you can use those directly? If you cannot, we should create an issue for this -- ideally we'd have a function in raft that does this.

Contributor Author
Changed to using check_pointer_residency!
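For reference, a minimal sketch of what that switch looks like, assuming raft's check_pointer_residency and pointer_residency from raft/util/cudart_utils.hpp (the wrapper name here is illustrative):

    #include <raft/util/cudart_utils.hpp>

    // check_pointer_residency takes a variadic list of pointers and reports
    // host_only / device_only / host_and_device / mixed, replacing the
    // hand-rolled cudaPointerGetAttributes check above.
    template <typename... Args>
    bool all_device_accessible(const Args*... ptrs)
    {
      auto where = raft::check_pointer_residency(ptrs...);
      return where == raft::pointer_residency::device_only ||
             where == raft::pointer_residency::host_and_device;
    }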

@hcho3 (Contributor) left a comment
LGTM

@jinsolp changed the base branch from main to release/25.12 November 17, 2025 17:03
@csadorf (Contributor) left a comment
LGTM. Just a few minor suggestions for improved wording in the docs.

@jinsolp (Contributor Author) commented Nov 19, 2025

Had to change the default based on the changes in PR #7501.

@jinsolp (Contributor Author) commented Nov 21, 2025

/merge

@rapids-bot bot merged commit 45e220d into rapidsai:release/25.12 Nov 21, 2025
106 checks passed
@jinsolp deleted the opt-umap-precomputed-knn-mem branch November 21, 2025 02:21

Labels

CUDA/C++ · Cython / Python · improvement · non-breaking


Development

Successfully merging this pull request may close these issues.

Improve memory efficiency when using precomputed KNN graph in UMAP

4 participants