build: add freethreaded python ci and wheels #284

Merged
nsmith- merged 4 commits into cms-nanoAOD:master from lgray:patch-5
Feb 27, 2026
Contributor

lgray commented Mar 12, 2025

  • ci
  • wheels
  • tests

Will add wheels when they're fixed up by #283

Contributor Author

lgray commented Mar 12, 2025

Ah - right we are blocked because pydantic is not updated to be compatible with 3.13t at all.

Nope it's in beta now!

lgray changed the title from "add freethreaded python ci and wheels" to "build: add freethreaded python ci and wheels" Mar 12, 2025
lgray closed this Mar 17, 2025
lgray reopened this Mar 17, 2025
Contributor Author

lgray commented Mar 17, 2025

refreshing this PR since there's now beta2 of the freethreaded pydantic

Contributor Author

lgray commented Mar 17, 2025

@mgorny @nsmith- @henryiii

I noticed when it's doing the freethreaded windows build that the MS linker is looking for python313.lib, rather than python313t.lib that's specified as PythonLib in CMake.

Results in the error (I think):

LINK : fatal error LNK1104: cannot open file 'python313.lib' [C:\Users\runneradmin\AppData\Local\Temp\tmpu4rnegbz\build\_core.vcxproj]

I think that's the only blocker on windows.

Contributor

mgorny commented Mar 18, 2025

That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

Contributor

mgorny commented Mar 18, 2025

> That looks like https://gitlab.kitware.com/cmake/cmake/-/issues/26016, which is supposedly fixed but I'm not sure in which CMake version.

Ah, sorry, now I see 3.30.3 — so supposedly it should work here.

Contributor

mgorny commented Mar 18, 2025

Sorry for thinking loudly. I see that CMake is doing the right thing:

2025-03-17T20:15:06.0037191Z   -- Found PythonInterp: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/python.exe (found suitable version "3.13.1", minimum required is "3.7")
2025-03-17T20:15:06.0038736Z   -- Found PythonLibs: C:/hostedtoolcache/windows/Python/3.13.1/x64-freethreaded/libs/python313t.lib

So looks like python313.lib is coming from elsewhere. pybind11 perhaps?
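
For reference, the two library names in this exchange differ only by the free-threaded "t" suffix. A minimal Python sketch, assuming the pythonXY[t].lib naming seen in the log above holds in general (the helper name expected_windows_import_lib is hypothetical), reports the name the running interpreter would be expected to link against:

```python
import sys
import sysconfig


def expected_windows_import_lib():
    """Guess the Windows import library name for the running interpreter.

    Assumption: free-threaded builds append "t" to the usual
    pythonXY.lib name, as in the python313t.lib log line above.
    """
    # Py_GIL_DISABLED is truthy only on free-threaded builds.
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    suffix = "t" if free_threaded else ""
    major, minor = sys.version_info[:2]
    return f"python{major}{minor}{suffix}.lib"


if __name__ == "__main__":
    print(expected_windows_import_lib())
```

If the linker asks for a name that differs from this one, the mismatch most likely comes from whichever component computed the library name, not from the interpreter itself.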

Contributor Author

lgray commented Mar 18, 2025

No problem - think out loud all you want. I'm just not sure where to look :-)

@henryiii

You need to use the modern FindPython, not the old one. FindPythonLibs / FindPythonInterp were "removed" (sort of) in CMake 3.27, so that's not what 3.30.3 is referring to.

@henryiii

Right above here

    include(FetchContent)
    FetchContent_Declare(pybind11
      SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/pybind11
      CMAKE_ARGS "-DBUILD_TESTING=OFF -DPYBIND11_NOPYTHON=ON"
    )
    FetchContent_MakeAvailable(pybind11)

you should set(PYBIND11_FINDPYTHON ON). pybind11 3.0 will change the default.

Contributor Author

lgray commented Mar 18, 2025

Thanks @henryiii, giving it a try.

Contributor Author

lgray commented Mar 18, 2025

3.13t manylinux wheels are dying on tests because of a missing awkward 3.13t wheel. Otherwise they build fine!

Contributor Author

lgray commented Mar 18, 2025

@henryiii I guess the best way to test this in freethreaded mode for races and such would be to first just try pytest-parallel? Probably mark the dask tests to not be run in parallel?

@henryiii

Yes, I believe so.

Contributor Author

lgray commented Mar 18, 2025

Nice - parallel tests seem to go, except on windows. Looks like some issue with parallel processing on windows to begin with.

lgray closed this Mar 19, 2025
lgray reopened this Mar 19, 2025
lgray closed this Mar 19, 2025
lgray reopened this Mar 19, 2025
lgray closed this Apr 3, 2025
lgray reopened this Apr 3, 2025
Contributor Author

lgray commented Apr 4, 2025

I don't really understand this failure in macos.

Must be a threadsafety thing, but then why not in ubuntu?

Contributor Author

lgray commented Apr 4, 2025

Oh wow, it's definitely a thread safety issue (it passed in this latest commit!). Yikes!

Contributor Author

lgray commented Apr 4, 2025

@nsmith- Any first ideas on what needs a mutex around it?

Contributor Author

lgray commented Apr 4, 2025

For reference the error that crops up in the MT tests is:

self = <correctionlib.highlevel.CorrectionSet object at 0x2a9ea3d0150>

    def __iter__(self) -> Iterator[str]:
>       return iter(self._base)
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

/Library/Frameworks/PythonT.framework/Versions/3.13/lib/python3.13t/site-packages/correctionlib/highlevel.py:401: TypeError
________________________________ test_evaluator ________________________________

    def test_evaluator():
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string("{}")
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": "blah"}')
    
        with pytest.raises(RuntimeError):
            cset = core.CorrectionSet.from_string('{"schema_version": 2, "description": 3}')
    
        cset = core.CorrectionSet.from_string(
            '{"schema_version": 2, "description": "something", "corrections": []}'
        )
        assert cset.schema_version == 2
        assert cset.description == "something"
    
        cset = wrap(
            schema.Correction(
                name="test corr",
                version=2,
                inputs=[],
                output=schema.Variable(name="a scale", type="real"),
                data=1.234,
            )
        )
>       assert set(cset) == {"test corr"}
E       TypeError: Object of type 'iterator' is not an instance of 'iterator'

tests/test_core.py:47: TypeError
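
A failure like this only appears when several threads hit __iter__ at the same time, so a small stress harness can help narrow it down. The sketch below is an illustration, not the project's test: FakeCorrectionSet is a hypothetical pure-Python stand-in and hammer_iter is a made-up helper. Swapping in a real correctionlib CorrectionSet on a free-threaded build is where the TypeError above would be expected to show up.

```python
import threading


class FakeCorrectionSet:
    """Hypothetical stand-in for a CorrectionSet; iterates over its keys."""

    def __init__(self, names):
        self._base = {name: None for name in names}

    def __iter__(self):
        return iter(self._base)


def hammer_iter(obj, n_threads=8, n_iterations=100):
    """Iterate over obj from many threads at once and collect exceptions."""
    barrier = threading.Barrier(n_threads)
    results = []
    errors = []

    def worker():
        barrier.wait()  # release all threads together to maximise contention
        for _ in range(n_iterations):
            try:
                results.append(frozenset(obj))
            except Exception as exc:  # record the error instead of losing it
                errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, errors


if __name__ == "__main__":
    results, errors = hammer_iter(FakeCorrectionSet(["test corr"]))
    print(f"{len(results)} successful iterations, {len(errors)} errors")
```

With the pure-Python stand-in no errors are expected; the harness is only useful once the object under test is the compiled binding.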

Collaborator

nsmith- commented Apr 15, 2025

My best guess would be something related to this:

correctionlib/src/python.cc

Lines 106 to 108 in 093ce46

    .def("__iter__", [](const CorrectionSet &v) {
        return py::make_key_iterator(v.begin(), v.end());
    }, py::keep_alive<0, 1>())

is not threadsafe.

Contributor Author

lgray commented Apr 18, 2025

Digging around, there seems to be something suspicious w.r.t. the py::keep_alive, but I've not found an accurate description. make_key_iterator itself seems to be fine.

Will continue digging.

Contributor Author

lgray commented Feb 27, 2026

OK - majorly thinned down on changes. Let's see how it rolls :-)

lgray force-pushed the patch-5 branch 8 times, most recently from 1d3491c to 0d0fb91 February 27, 2026 18:27
Contributor

ikrommyd commented Feb 27, 2026

A lot of packages are now not supporting 3.13t as that was experimental and are just testing 3.14t only. We should probably do 3.14t only?

Contributor Author

lgray commented Feb 27, 2026

Yeah features might drift significantly from the 3.13t version so probably a good idea. I'll take it out.


- name: Test package (parallel)
  if: endsWith(matrix.python-version , 't')
  run: python -Xgil=0 -m pytest -ra --parallel-threads=16 --iterations=10
Contributor

Maybe this is too many threads? Idk how many threads the CI workers have. Numpy tests with 4 but they have a much larger test suite. Maybe 8 would be good here?
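
One way to answer the question of how many threads a CI worker has is to ask the interpreter from within a job. A small sketch (usable_cpus is a hypothetical helper; os.sched_getaffinity only exists on some platforms such as Linux, hence the fallback):

```python
import os


def usable_cpus():
    """Best-effort count of the CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity limits where the platform exposes them.
        return len(os.sched_getaffinity(0))
    # os.cpu_count() may return None, so fall back to 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(usable_cpus())
```

Comparing this number with the --parallel-threads value shows whether the run oversubscribes the worker.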

Contributor Author

It all runs so fast it probably doesn't matter.

pyproject.toml Outdated

 [tool.cibuildwheel]
-skip = ["cp314t-*"]
+skip = []
Contributor

Can this field be removed altogether if we're not skipping anything?

Contributor Author

very well...

pyproject.toml Outdated
 test-groups = ["test"]
 test-command = "python -m pytest {package}/tests"
-test-skip = ["*-musllinux_*", "cp3{10,11,12}-win32"]
+test-skip = ["cp314t-*", "cp313t-*", "*-musllinux_*", "cp3{10,11,12}-win32", "cp3*t-*:x86_64"]
Contributor

I'm not an expert in cibuildwheel configuration, but this skips tests when building, correct? Is that what we want? We just trust that the regular CI testing is good here?

Contributor Author

I'll try removing it again, but the threaded tests were failing because of missing wheels, and building is not supported in the cibuildwheel environment (at least if you're a sane person). It's getting tested anyway in the basic CI, so this is largely benign.

Contributor Author

I'll remove it for now to see if it was largely people dropping 3.13t.

Contributor Author

lgray commented Feb 27, 2026

Ah yeah, it was because people were dropping 3.13t. Cool, we pass now :-)

Contributor Author

lgray commented Feb 27, 2026

@nsmith- ok - looks like this is good to go too and is thoroughly tested. Please review and merge as you see fit.


- name: Test package (parallel)
  if: endsWith(matrix.python-version , 't')
  run: python -Xgil=0 -m pytest -ra --parallel-threads=16 --iterations=10
Contributor

One final comment I have is that I don't know if we should have -Xgil=0 here. This forces the gil to be disabled even if a module requires it because it cannot run without it.

Contributor Author

That's awkward array and a few other packages. If you try to import without this, it switches away from no-gil mode, destroying the efficacy of the tests.

Contributor Author

lgray Feb 27, 2026

If it's a per-module re-instantiation of the gil, then that is a different matter. My understanding is that this globally re-enables the GIL behavior of all modules.

Contributor

Really? But we do test free-threaded awkward. This was my source:
https://py-free-threading.github.io/running-gil-disabled/#force-the-gil-to-be-disabled
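
Which mode the interpreter actually ends up in can be checked at runtime. A minimal sketch, assuming CPython 3.13 or newer for sys._is_gil_enabled() (the gil_status helper is hypothetical):

```python
import sys
import sysconfig


def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is truthy only on free-threaded builds.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on CPython 3.13+; on older versions
    # the GIL is always on, so default to True.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return free_threaded_build, gil_enabled


if __name__ == "__main__":
    build, enabled = gil_status()
    print(f"free-threaded build: {build}, GIL enabled: {enabled}")
```

Calling this after importing the packages under test shows whether an import switched the GIL back on. The switch applies to the whole process, not to a single module, and -Xgil=0 or PYTHON_GIL=0 keeps the GIL disabled regardless.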


Comment on lines +54 to +59
  if: ${{ !endsWith(matrix.python-version , 't') }}
  run: python -m pytest -ra

- name: Test package (parallel)
  if: endsWith(matrix.python-version , 't')
  run: python -m pytest -ra --parallel-threads=16 --iterations=10
Contributor

ikrommyd Feb 27, 2026

Ah, apologies for seeing these one by one, but one final thought here. It is now doing only parallel testing in 3.14t. We should probably do normal testing on all of them and parallel testing as an extra in free-threaded python. (just no if condition for python -m pytest -ra) basically.

Contributor Author

The parallel tests are strictly more stringent since we have deterministic outcomes here. Further testing is unnecessary.

Contributor

ikrommyd Feb 27, 2026

What I would try to catch with this is a pybind11 or cpython bug that isn't there in free-threaded mode in 3.14, that's all.

Contributor Author

lgray Feb 27, 2026

That sounds like tests the python core team would/should do, not us. The code on our side is the same.

Collaborator

nsmith- left a comment

Nice to see this converge, thanks @ikrommyd for the review

nsmith- added this pull request to the merge queue Feb 27, 2026
Merged via the queue into cms-nanoAOD:master with commit 0a570f9 Feb 27, 2026
26 checks passed