Update API to leverage cuVS Spectral clustering #5326
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
aamijar left a comment:
Hi @jnke2016, thanks for the PR! Looks good overall, just left some comments for you to check.
```cpp
unsigned long long seed2 = seed1 + 1;
// Use cuVS for float precision until double precision is supported
if constexpr (std::is_same_v<weight_t, float>) {
  // Convert CSR to COO using thrust
```
You might be able to use `raft::sparse::convert::csr_to_coo` here if it's more convenient.
Thanks. I leveraged this RAFT utility function to convert from CSR to COO.
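For context, a minimal sketch of what that conversion can look like with the RAFT utility (assuming the pointer-based overload declared in `raft/sparse/convert/coo.cuh`; the exact signature may differ between RAFT versions):

```cpp
#include <raft/core/resource/cuda_stream.hpp>
#include <raft/core/resources.hpp>
#include <raft/sparse/convert/coo.cuh>

// Expand CSR row offsets into explicit COO row indices.
// Column indices and values are shared by both formats, so only the
// row indices need to be materialized.
template <typename vertex_t>
void csr_offsets_to_coo_rows(raft::resources const& handle,
                             vertex_t const* csr_offsets,  // length: n_rows + 1
                             vertex_t n_rows,
                             vertex_t nnz,
                             vertex_t* coo_rows)  // preallocated, length: nnz
{
  auto stream = raft::resource::get_cuda_stream(handle);
  // Assumed signature: csr_to_coo(row_offsets, n_rows, out_rows, nnz, stream).
  raft::sparse::convert::csr_to_coo(csr_offsets, n_rows, coo_rows, nnz, stream);
}
```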
```cpp
    coo_matrix,
    raft::make_device_vector_view<vertex_t, vertex_t>(clustering, graph.number_of_vertices));
} else {
  CUGRAPH_FAIL("Double precision spectral clustering not yet supported with cuVS");
```
I added double-precision support in the cuVS PR, so we should be able to try with that.
Thanks. I updated the PR to support double precision, and after merging your latest changes it worked.
aamijar left a comment:
Thanks for incorporating the suggestions!
ChuckHastings left a comment:
Looks good, will do a final review after it passes CI
ChuckHastings left a comment:
Looks good once we solve the build problem.
/ok to test fed3f33

@robertmaynard @bdice I am still getting an …

I am still trying to figure out the source of the …

/ok to test d8f2df6

/ok to test b4117ba

/ok to test 2e677c8

Yeah I was wrong about the components. The …
ci/build_wheel.sh (outdated):
```sh
sccache --show-adv-stats

EXCLUDE_ARGS=(
  --exclude "libcuvs.so"
```
This shouldn't be needed. Wheels should always be statically linking, so shared libraries shouldn't be present in the first place. See cuML's wheel building script, which doesn't need this exclusion. cuGraph should be the same. https://github.com/rapidsai/cuml/blob/main/ci/build_wheel_libcuml.sh
I am confused. Aren't wheels using prebuilt packages and linking dynamically (hence we have libcuvs.so in the build artifact)? Am I missing something?
> Wheels should always be statically linking, so shared libraries shouldn't be present in the first place

In that case, should I update the wheels to compile and build from source instead of using prebuilt packages?
Got it now after reviewing cuML. Thanks
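For anyone following along, a rough sketch of the approach being described: fetch cuVS at build time and link it statically so no `libcuvs.so` lands in the wheel. The option, variable, and target names (`USE_CUVS_STATIC`, `RAPIDS_VERSION`, `cuvs::cuvs_static`) are illustrative assumptions; the real logic lives in the project's `get_cuvs.cmake`-style module and may differ.

```cmake
# Sketch only: build cuVS from source via rapids-cmake/CPM and link it
# statically into libcugraph so wheels carry no libcuvs.so dependency.
option(USE_CUVS_STATIC "Build and statically link cuVS" ON)

rapids_cpm_find(cuvs "${RAPIDS_VERSION}"
  GLOBAL_TARGETS cuvs::cuvs
  CPM_ARGS
    GIT_REPOSITORY https://github.com/rapidsai/cuvs.git
    GIT_TAG        branch-${RAPIDS_VERSION}
    OPTIONS        "BUILD_SHARED_LIBS OFF" "BUILD_TESTS OFF"
)

if(USE_CUVS_STATIC)
  # Assumed static target name; check cuVS's exported targets.
  target_link_libraries(cugraph PRIVATE cuvs::cuvs_static)
else()
  target_link_libraries(cugraph PRIVATE cuvs::cuvs)
endif()
```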
```python
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> df = cugraph.spectralBalancedCutClustering(G, 5)
>>> df = cugraph.spectralModularityMaximizationClustering(G, 5)
```
This is an example for the docs of spectralBalancedCutClustering. We shouldn't change the example code here. We could add a deprecation notice in the docs, though.
Suggested change:
```diff
->>> df = cugraph.spectralModularityMaximizationClustering(G, 5)
+>>> df = cugraph.spectralBalancedCutClustering(G, 5)
```
> We shouldn't change the example code here. We could add a deprecation notice in the docs, though.

Right, but this is what I did initially, and the pytest doctests would fail unless that warning failure gets skipped. @rlratzel any thoughts?
We should probably just remove the example entirely. I think an example is useful for someone trying to learn how to add this function to their code, which we/they don't want since it's going away. Make sure the docstring is updated to indicate it's deprecated and direct them to the new API they should call.
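A minimal sketch of the deprecation pattern being discussed, assuming a `FutureWarning` plus a docstring pointer to the replacement API (the message text and surrounding details are illustrative, not the PR's actual code):

```python
import warnings


def spectralBalancedCutClustering(G, num_clusters):
    """Deprecated: use spectralModularityMaximizationClustering instead.

    No usage example is included on purpose; this API is going away.
    """
    warnings.warn(
        "spectralBalancedCutClustering is deprecated; use "
        "spectralModularityMaximizationClustering instead.",
        FutureWarning,
    )
    ...  # existing implementation continues unchanged
```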
You should still add the ignores I mentioned above because this will apply to all deprecated APIs in the future.
> You should still add the ignores I mentioned above because this will apply to all deprecated APIs in the future.

I'm thinking that ignoring errors from deprecated APIs in doctests is not good though, and perhaps we should just adopt the policy we're doing here (remove the example docstring, ensure the new API is referenced) for all deprecated APIs in the future.

I'm concerned the ignores will mask other API deprecations we need to know about (e.g. an example docstring uses a deprecated cuDF API, or a cugraph docstring for function A uses deprecated cugraph API B as part of the example and should be updated).
> I'm concerned the ignores will mask other API deprecations we need to know about

If it helps, `ignore::FutureWarning:cugraph` will ignore just the deprecations in the cugraph module, or you can use `ignore:MESSAGE_REGEX:FutureWarning:cugraph` with some regex string to ignore specific (known) deprecations from cugraph.
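For reference, a sketch of how those filters could be written if the pytest configuration lives in `pyproject.toml` (the specific message regex is a made-up example, not a filter from this PR):

```toml
[tool.pytest.ini_options]
filterwarnings = [
    # Ignore only FutureWarnings raised from the cugraph module ...
    "ignore::FutureWarning:cugraph",
    # ... or narrow it to one known deprecation message (regex).
    "ignore:spectralBalancedCutClustering is deprecated:FutureWarning:cugraph",
]
```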
dependencies.yaml (outdated):
```yaml
includes:
  - common_build
  - depends_on_libcugraph
  - depends_on_libcuvs
```
Wheel builds shouldn't depend on libcuvs libraries; cuVS should be fetched and built statically by CMake.
Suggested change: remove the `- depends_on_libcuvs` line.
Got it. I understand what you mean here based on this
dependencies.yaml (outdated):
```yaml
includes:
  - common_build
  - depends_on_librmm
  - depends_on_libcuvs
```
Wheel builds shouldn't depend on libcuvs libraries; cuVS should be fetched and built statically by CMake.
Suggested change: remove the `- depends_on_libcuvs` line.
dependencies.yaml (outdated):
```yaml
extras:
  table: project
includes:
  - depends_on_libcuvs
```
We do not want a runtime dependency on libcuvs; that's the point of statically linking.
Suggested change: remove the `- depends_on_libcuvs` line.
dependencies.yaml (outdated):
```yaml
includes:
  - common_build
  - depends_on_libcugraph
  - depends_on_libcuvs
```
Wheel builds shouldn't depend on libcuvs libraries; cuVS should be fetched and built statically by CMake.
Suggested change: remove the `- depends_on_libcuvs` line.
dependencies.yaml (outdated):
```yaml
common:
  - output_types: conda
    packages:
      - &libcuvs_unsuffixed libcuvs==25.12.*,>=0.0.0a0
```
Here's what I think you want:
- conda developer environments: depend on shared library
- conda devcontainers: depend on shared library
- conda recipes: fetch with CMake, build and link libcuvs statically
  - No dependency in `host` or `run`
    - Note: this differs from cuML, which depends on a shared library here. This can go either way in my opinion.
- wheel builds: fetch with CMake, build and link libcuvs statically
- wheel at runtime: no dependency (it's statically linked)
- pip devcontainers: ???
  - This one we might want to discuss. Probably fetch with CMake, build and link libcuvs statically? I haven't checked what this PR does right now.
@jnke2016 Are you still planning on getting this into 25.12? If so please don't merge in …

Got it @robertmaynard. @ChuckHastings can you retarget it please?

Done

@bdice Good catch, thanks. I didn't have a devcontainers section. I just added it, and it installs pre-built libcuvs and links dynamically.

/ok to test 0adde0d

/ok to test db3dbea
dependencies.yaml (outdated):
```yaml
    - test_python_common
    - test_python_cugraph
    - test_python_pylibcugraph
devcontainers:
```
This devcontainers key isn't doing anything right now. To make it take effect, you'd need to update the devcontainers repo like this: https://github.com/rapidsai/devcontainers/blob/f9a516d1e4cd9b669845527d4ae13a4fdc720933/features/src/rapids-build-utils/opt/rapids-build-utils/manifest.yaml#L186-L187
However, why do we need this? I think there should be a solution that doesn't require a new dependency file key.
Look at how cuML is solving this: https://github.com/rapidsai/cuml/blob/6b42da514957a909a6d9d8921bad9d705161d8d8/dependencies.yaml#L684-L688
cuML only defines libcuvs for conda output types and not pyproject or requirements. I think that's probably what you want here too. Following the other discussion where we broke down the expected behavior by build type, you probably never want libcuvs wheels because you'll always build and statically link libcuvs. The only cases where you want libcuvs packages are for conda developer environments and devcontainers (but not conda recipes!).
I propose:
- Delete the `devcontainers` key here
- Modify `depends_on_libcuvs` below to only have conda outputs, matching cuML
So if I understand correctly, you are suggesting modifying depends_on_libcuvs to only output for conda (not pyproject or requirements) because:
- Conda developer environments & devcontainers automatically get libcuvs from the `all:` section
- Wheel builds & pip devcontainers don't get libcuvs because pyproject outputs won't include it (they statically link it)
Also:
- Conda recipes don't get libcuvs as a runtime dependency (they also statically link cuVS)
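Putting the above together, a sketch of what a conda-only `depends_on_libcuvs` entry could look like (the version pin is copied from the diff above; the exact layout in the final PR may differ):

```yaml
depends_on_libcuvs:
  common:
    - output_types: conda  # no pyproject/requirements outputs: wheels build cuVS statically
      packages:
        - &libcuvs_unsuffixed libcuvs==25.12.*,>=0.0.0a0
```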
bdice left a comment:
Perfect. Approved!
/ok to test 20dfd1d

/merge
Merged 0f1afef into rapidsai:release/25.12
This PR integrates cuVS spectral clustering from upstream into cuGraph while maintaining full backward compatibility. The integration deprecates the legacy balanced cut clustering C++ and Python APIs in favor of `spectral_modularity_maximization`. cuVS consolidated the spectral clustering API into a single entry point that assigns vertices to clusters and, by default, maximizes the modularity score.
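As a quick illustration of the consolidated path, here is the non-deprecated Python call from the docstring example discussed earlier (the karate dataset and cluster count are just the values used in that example):

```python
import cugraph
from cugraph.datasets import karate

# Load the karate-club graph used throughout the docstring examples.
G = karate.get_graph(download=True)

# Assign each vertex to one of 5 clusters; by default the clustering
# maximizes the modularity score.
df = cugraph.spectralModularityMaximizationClustering(G, 5)
print(df.head())
```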
Static Linking with Conda Fallback:
Note: The header files `raft/spectral/partition.cuh` and `raft/spectral/modularity_maximization.cuh` are still necessary in `spectral_clustering.cu` because APIs like `raft::spectral::analyzePartition` and `raft::spectral::analyzeModularity` are still being called in cuGraph and are not supported in cuVS. This is not an issue because those APIs are not deprecated in RAFT, unlike `kmeans` and the `solvers`.