Skip to content

Conversation

@jcrist
Copy link
Member

@jcrist jcrist commented Nov 26, 2025

This is a fairly substantial rewrite/refactor of our existing type reflection system. There should be no user-visible changes from this refactor (though in the future we do want to make changes). For now this is just trying to simplify the internals so the existing system is easier to understand, reason about, and modify.

  • Moved (almost) all logic to a single file cuml.internals.outputs instead of being strewn around 5+ files.
  • Removed all the contextmanagers in api_context_managers.py in favor a simpler, more readable mechanisms
  • Removed all the decorators in api_decorators.py in favor of a single reflect decorator with sane defaults and only a few configurable knobs
  • Removed set_api_output_type; this feature was unnecessary, the reflect decorator can handle everything without an escape hatch.
  • Reduced state management for type reflection decisions down to 3 places (a combination of GlobalSettings().output_type, Base.output_type, and an array input type, depending on the call). The decision around what output type to return is now entirely in one location, and the conversion is also encompassed within a single function. This should hopefully be much easier to understand.
  • Removed the auto-decorating Base metaclass in favor of explicit decorators. This was done by logging the original auto-decorated versions, then inspecting each one when adding explicit versions to ensure they were accurate. Not everything that was decorated before needed to be decorated.
  • Removed decorators on functions that don't need them. This is mostly functions that return non-arrays and don't make any nested calls requiring a CumlArray output
  • Fixed a few decorators that weren't applied properly (e.g. LinearRegression.predict). These are bugfixes.

Once this is in, we should have an easier time making behavior changes and deprecating features (#7426) since the new implementation is simpler and has fewer moving pieces.

Fixes #5022.

@jcrist jcrist self-assigned this Nov 26, 2025
@jcrist jcrist requested a review from a team as a code owner November 26, 2025 03:10
@jcrist jcrist requested a review from divyegala November 26, 2025 03:10
@jcrist jcrist added Tech Debt Issues related to debt improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Nov 26, 2025
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Nov 26, 2025
@jcrist jcrist requested review from csadorf and removed request for divyegala November 26, 2025 03:11
@jcrist
Copy link
Member Author

jcrist commented Nov 26, 2025

This still needs:

  • Docstring cleanups in all public functions/methods. The existing docstrings from before had bitrotted, and I want to improve them generally as developer docs. Same goes for Base
  • Another pass through for readability/comments where needed.
  • New tests to better specify the behavior. The existing tests still pass, but aren't good or sufficient, I plan to replace them with more thorough tests.
  • Probably some bugs have leaked through, pushed this up mostly to get a full CI run.

I'll ping when ready for review.

I recognize this is a large PR; most of the meat of it occurs before d6d8d21. The latter commits touch many many files, but are mostly mechanical changes applying explicit decorators where previously they were magically applied by BaseMetaClass.

If you pick through the changes in order they start off with fairly reasonable and small simplifications as I picked apart the existing complexity. Once things got simple enough to understand there are then a few more meaty commits that redo the implementation without the need for such complex contextmanagers/decorators.

These had no functional difference with
`api_return_array`/`api_base_return_array`. 2 fewer decorators to
understand.
- Simplify `ProcessReturn` from 3 classes into one
- Remove parametrizing the context manager by return processing
- Remove `__class_getitem__` usage entirely in favor of defining
  `ProcessEnter_Type` explicitly on subclasses

At this point, return handling for reflection is the same everywhere,
and the only switch is whether to enable it or not.
No need for second class at all at this point.
Some further trivial simplifications
These are now redundant and were aliases to their array counterparts.
Removing lets us simplify further.
Simplify, just move all logic to InternalContextAPIBase. Terrible name,
but we now have fewer moving pieces. All call logic is still the same as
before. Some of the branches don't make sense to me, but we're not
changing the logic, just the plumbing.
These should only be used on fits, but any fit should do all of them.
This can be easily inferred. Further simplifying.
Old implementation was scattered between too many files.

Also deletes some dead code.
Further simplifications - you can actually follow how output type is
inferred and propagated now! Now backdoor state.
@jcrist jcrist force-pushed the simplify-reflection branch from d07722d to a44e6df Compare November 26, 2025 20:43
Copy link
Member Author

@jcrist jcrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotated the diff for anything interesting/worth calling out.

Files without comments should contain uninteresting mechanical changes like swapping out decorator names (unless I missed something).

"""

bin_edges_internal_ = CumlArrayDescriptor()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before bin_edges_ existed but would reflect incorrectly since it contains a numpy array of cupy arrays (upstream sklearn does a numpy array of arrays 🤷, the port here made minimal changes so that's what you get). We now move it to be a non-reflected attribute. I consider the previous behavior to be a bug since it wasn't working correctly.

raise ValueError("batch_size must be > 0")

# Reflect the output type from global settings or the clusterer
cuml.internals.set_api_output_type(clusterer._get_output_type())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reflect decorator now handles cases like this automatically.

# Set the default output type to "cupy". This will be ignored if the user
# has set `cuml.global_settings.output_type`. Only necessary for array
# generation methods that do not take an array as input
cuml.internals.set_api_output_type("cupy")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reflect decorator now handles cases like this automatically.

return array_to_memory_order(arr) is not None


def cuda_ptr(X):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was in memory_utils.py, moved it here since that file went away.

self.shared_state = {
"root_cm": None,
"_output_type": None,
"_external_output_type": False,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State on GlobalSettings is:

  • _output_type: None or a valid output type string reflecting the current output type. Can also be "mirror" (this was term in the old implementation), which effectively means "return CumlArray values except for CumlArrayDescriptor where instead the original value set on the descriptor is returned". This could/should probably be merged with output_type="cuml" except there are too many places (mostly in _thirdparty where a non-CumlArray value is set. If we decide to simplify this and merge the terms this will have to be done in a followup.
  • _external_output_type: The originally configured output type from outside the internal API context, or False if not running in an internal API.

}


def determine_array_memtype(X):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be in cuml.internals.memory_utils, but was only used in this test file. Moved it here instead.

Same with the other change below - that was inlining a utility function from cuml.internals.memory_utils that was only used in that single test.

@@ -1,317 +0,0 @@
# SPDX-FileCopyrightText: Copyright (c) 2020-2025, NVIDIA CORPORATION.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests in this file passed fine in the new system (if you rollback a few commits you can run them).

I migrated the intentions (but not necessarily the implementation) to a more complete set of tests in test_reflection.py.

@@ -1,130 +0,0 @@
#
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests in this file passed fine in the new system (if you rollback a few commits you can run them).

I migrated the intentions (but not necessarily the implementation) to a more complete set of tests in test_reflection.py.

.. autofunction:: cuml.internals.memory_utils.using_output_type
.. autofunction:: cuml.set_global_output_type

.. autofunction:: cuml.using_output_type
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions were only ever demonstrated from the top-level namespace, but did indeed contain the old filename in the full path.

If necessary we can add a shim cuml.internals.memory_utils file to keep around the old import paths for a deprecation cycle. I doubt it's needed, but 🤷.

@@ -0,0 +1,383 @@
# SPDX-FileCopyrightText: Copyright (c) 2020-2025, NVIDIA CORPORATION.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file contains the bulk of the reflection tests. It's also worth reviewing.

@jcrist
Copy link
Member Author

jcrist commented Nov 26, 2025

Gah, everything was passing last night, but then I rebased before tossing in a new test file. I assume something snuck in in the rebase (or maybe an upstream C++ dep issue, looking at the failures), I'll take another look next week after the holiday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Tech Debt Issues related to debt

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Epic] Reduce type reflection system complexity

1 participant