
EXPLORATORY: Add a path that allows using new DType API #374

Draft
seberg wants to merge 6 commits into jax-ml:main from seberg:try-new-dtype-hack

Conversation

@seberg
Contributor

@seberg seberg commented Apr 20, 2026

gh-360 seems a little hard to land, since we'll need some cleanups in newer NumPy versions, and who knows what other small fallout might happen.

Thus, I am trying to figure out a middle ground: one that keeps better compatibility for the simple dtypes here, while in the longer term NumPy might deprecate more and more and force migration.

E.g. I didn't bother to migrate all casts. Of course it is better to do so, and NumPy should probably force ml_dtypes to do so eventually, but skipping it keeps this PR simpler.

Now, how does this work? This requires NumPy 2 (I'll test that it actually works). From NumPy 2.5 (hopefully, otherwise 2.6), we would then introduce a new "middle ground" API for transition purposes, meaning that the hack here is only used for older NumPy versions (where the ABI cannot change, so it is safe to do so. I could put this terrible hack into NumPy, but it seems easier to keep it here...).
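The version split described above can be sketched in a few lines of Python. This is only an illustration of the decision, not code from the PR: the real selection would live in the extension's C++ code, and the function names, the return labels, and the `(2, 5)` default are all assumptions (the comment itself leaves open whether the transition API lands in 2.5 or 2.6).

```python
def parse_version(text):
    """Return (major, minor) from a version string such as '2.4.1' or '2.5.0.dev0'."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def choose_dtype_path(numpy_version, transition_api_since=(2, 5)):
    """Pick which registration path to use (hypothetical helper and labels).

    transition_api_since is an assumption: the first NumPy version that would
    ship the "middle ground" API. Pass (2, 6) if it slips by one release.
    """
    version = parse_version(numpy_version)
    if version < (2, 0):
        # The approach in this PR requires NumPy 2.
        raise RuntimeError("this approach requires NumPy 2")
    if version >= transition_api_since:
        # Newer NumPy: call the (future) public transition API.
        return "transition-api"
    # Older NumPy 2.x: the ABI is frozen, so the in-package hack is safe.
    return "compat-hack"


if __name__ == "__main__":
    for v in ("2.0.0", "2.4.1", "2.5.0.dev0"):
        print(v, "->", choose_dtype_path(v))
```

The point of keeping the threshold a parameter is that only one number changes if the new API misses the hoped-for release.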

(Heavy use of Claude to spit out code, but of course with close micro-managing of the design in many relevant parts. I'll still need to go through it once more myself.)


What does it achieve? Right now, I implemented common_dtype, i.e. it would fix np.result_type(), which currently almost universally just fails. Another thing it will allow is implementing finfo on NumPy 2.5+ versions (although I need to add a small hack so that it also still works for complex).

The thing is that this reaches into things a bit too low-level for my liking. That is OK on non-released NumPy versions, but I need to add a path so that PyArrayInitDTypeMeta_FromSpec_WithLegacy can be a simple call to a NumPy API function (probably not quite as done here, but rather passing the proto as a slot).
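To make the role of a common-dtype hook concrete, here is a toy model in pure Python. It is a sketch of the general idea only: `ToyDType`, `toy_result_type`, and the promotion table are hypothetical, it does not use NumPy or ml_dtypes, and the specific rule that bfloat16 and float16 promote to float32 is an assumption chosen for illustration. Each dtype answers for the pairs it knows about and returns `NotImplemented` otherwise, so the other operand gets a chance.

```python
from functools import reduce

REGISTRY = {}  # name -> ToyDType, so hooks can look up their result by name


class ToyDType:
    """Hypothetical stand-in for a dtype that can take part in promotion."""

    def __init__(self, name, promotes_to=None):
        self.name = name
        # Maps the name of another dtype to the name of the common dtype.
        self._promotes_to = dict(promotes_to or {})
        REGISTRY[name] = self

    def common_dtype(self, other):
        if other is self:
            return self
        result = self._promotes_to.get(other.name)
        if result is None:
            return NotImplemented  # unknown pairing: let the other side try
        return REGISTRY[result]


def _common_pair(a, b):
    # Ask each operand in turn; fail only if neither knows the answer.
    for left, right in ((a, b), (b, a)):
        result = left.common_dtype(right)
        if result is not NotImplemented:
            return result
    raise TypeError(f"no common dtype for {a.name} and {b.name}")


def toy_result_type(*dtypes):
    """Reduce pairwise, loosely mimicking what a result_type call does."""
    return reduce(_common_pair, dtypes)


float32 = ToyDType("float32")
float16 = ToyDType("float16", {"float32": "float32"})
# Assumed rule for illustration: mixing the two 16-bit floats widens to float32.
bfloat16 = ToyDType("bfloat16", {"float16": "float32", "float32": "float32"})
opaque = ToyDType("opaque")  # a dtype with no promotion rules at all

if __name__ == "__main__":
    print(toy_result_type(float16, bfloat16).name)
```

Without such a hook, every mixed pairing behaves like the `opaque` case and raises, which is the kind of failure the comment above describes for np.result_type().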

@seberg seberg force-pushed the try-new-dtype-hack branch from 42f8fc5 to de3a5a8 Compare April 20, 2026 12:43
@seberg seberg force-pushed the try-new-dtype-hack branch from de3a5a8 to eff09d7 Compare April 20, 2026 12:58
@seberg
Contributor Author

seberg commented Apr 21, 2026

@hawkinsp just in case you have a quick thought here. I tried to rewrite things to just use the new API but keep it a "legacy" dtype to some degree, so that there should be no real regressions, while it still works fine all the way back to NumPy 2 (the only regression I noticed is that arrays print a bit less nicely).

However, this uses PyType_Ready. NumPy predates PyType_FromMetaclass and still implements e.g. tp_new, making its use incompatible.
So that is a bigger downside with this approach: adopting the Python stable API may be hard or indefinitely deferred, because one would need to hack around this for old NumPy versions.
(I am pretty sure it is possible to hack around it; NumPy effectively does it, and I think pybind11 probably does too. But it may be pretty ugly...)

The alternative I can currently think of is to just allow PyArrayInitDTypeMeta_FromSpec to amend the current legacy dtype.
That is less forward-looking. I also liked how well this approach backported, whereas the amending pattern is nice for new NumPy versions but backports worse, I expect (mainly because of cast definition patterns).

@hawkinsp
Collaborator

I haven't looked yet, but

> However, this uses PyType_Ready. NumPy predated PyType_FromMetaclass and it still implements e.g. tp_new making its use incompatible.
> So that is a bigger downside with this approach: adopting the Python stable API may be hard or indefinitely deferred, because one would need to hack around this for old NumPy versions.

It's not the end of the world to have to build ml_dtypes per Python version: it's a small enough package. I had previously abandoned trying to use the limited dtype API for similar reasons (#195). And eventually when the oldest supported NumPy ages off our support matrix, we can switch.

@seberg
Contributor Author

seberg commented Apr 22, 2026

OK, cool, then I think I'll pursue this. We need a better way to transition a package like ml_dtypes, and I think this is viable.

Long term for the stable API: I suspect the right thing will be to have a new DType creation function that creates the full heap type for you based on the spec.
(That way, even if we need slightly crazy things, they can live in NumPy. I.e. PyArrayDTypeMeta_FromSpecs(module, type_slots, dtype_slots), but that'll be a NumPy 2.6 thing at best. I am also very curious about the Python stable API developments around this. Back in the day, I stole their ideas, but they seem to have improved on them quite a lot!)


2 participants