Skip to content

Make function record subinterpreter safe #5771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Aug 4, 2025

Conversation

b-pass
Copy link
Contributor

@b-pass b-pass commented Jul 27, 2025

Description

I think I have tracked down the sporadic unit test failure/crash.

The function record type is using a global static type definition, which is not subinterpreter safe. It needs to use a heap type which is registered with each (sub)interpreter.

Suggested changelog entry:

  • Fixed potential crash when using cpp_function objects with sub-interpreters.

📚 Documentation preview 📚: https://pybind11--5771.org.readthedocs.build/

@b-pass b-pass marked this pull request as ready for review July 27, 2025 19:53
@b-pass
Copy link
Contributor Author

b-pass commented Jul 28, 2025

I am sort of baffled by the graalpy failures ... looks like it is trying to run a closure during shutdown which tries to acquire the GIL, but fails because python is now null (having already shut down, I assume). Not sure how these changes would've altered this sort of behavior......

@rwgk
Copy link
Collaborator

rwgk commented Jul 28, 2025

Wow

I'm trying to get my head around this PR. Starting with a maybe simple question:

Is this unrelated to the function record changes?

I looked back, the code changed there was first introduced here, in August 2017:

It was changed recently (June 18, 2024) here:

Could it make sense to break out that commit as a separate PR?

(TBH I don't really understand the change. I see you're replacing static and thread_local with a lambda capture trick. I think it'll be best to have Sam look at that change.)

@rwgk
Copy link
Collaborator

rwgk commented Jul 28, 2025

I just edited my previous comment, to point here instead of your initial commit.

@b-pass
Copy link
Contributor Author

b-pass commented Jul 28, 2025

Is this unrelated to the function record changes?

Yes, I found looking for other statics which aren't sub-interpreter safe. It is not sub-interpreter safe (and I think the thread_local addition was incorrect (should be static thread_local).

Could it make sense to break out that commit as a separate PR?

Yes, I will split it out.

@rwgk rwgk mentioned this pull request Jul 28, 2025
3 tasks
@b-pass
Copy link
Contributor Author

b-pass commented Jul 28, 2025

Good call splitting them up @rwgk, turns out the function_record stuff wasn't the graal problem at all!

@rwgk
Copy link
Collaborator

rwgk commented Jul 29, 2025

I quizzed ChatGPT regarding this PR, could you please read? (I think it's a <5 minutes read.)

https://chatgpt.com/share/68886181-f5a8-8008-83b5-b30329e9aec5

The internals version bump is somewhat of a big ouch, but if ChatGPT's analysis is correct, trying to avoid that isn't easy.

WDYT?

@b-pass
Copy link
Contributor Author

b-pass commented Jul 29, 2025

The internals version bump is somewhat of a big ouch, but if ChatGPT's analysis is correct, trying to avoid that isn't easy.

ChatGPT seems pretty confused here to me. I've pushed a change to avoid the internals version bump.... this type was module local before this change, so keeping it local is fine, and avoids the version bump. It might even be better this way...

@rwgk
Copy link
Collaborator

rwgk commented Jul 30, 2025

ChatGPT seems pretty confused here to me. I've pushed a change to avoid the internals version bump.... this type was module local before this change, so keeping it local is fine, and avoids the version bump. It might even be better this way...

That's awesome, thanks! (I might need a couple days to find the time to carefully review the changes.)

Comment on lines 49 to 53
constexpr char tp_name_impl[]
constexpr char tp_qualname_impl[] = PYBIND11_INTERNAL_MODULE_NAME
"."
"pybind11_detail_function_record_" PYBIND11_DETAIL_FUNCTION_RECORD_ABI_ID
"_" PYBIND11_PLATFORM_ABI_ID;
constexpr char tp_plainname_impl[]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider adding

setattr((PyObject *) type, "__module__", str(PYBIND11_INTERNAL_MODULE_NAME));

in get_function_record_PyTypeObject(), instead of making this change here? That would seem more idiomatic and consistent within pybind11.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tried that but it doesn't work and gives:

DeprecationWarning: builtin type pybind11_detail_function_record_v1_system_libstdcpp_gxx_abi_1xxx_use_cxx11_abi_1 has no __module__ attribute

I don't really understand why.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm ... I think it's worth spending time getting to the bottom of this. I might be able to drill down in a few hours.

Wild guess: Maybe we need to explicitly add Py_TPFLAGS_HEAPTYPE?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly adding Py_TPFLAGS_HEAPTYPE does not help, I'm still seeing this:

<frozen importlib._bootstrap>:488: 321 warnings
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type pybind11_detail_function_record_v1_system_libstdcpp_gxx_abi_1xxx_use_cxx11_abi_1 has no __module__ attribute

I'll try to find out why.

@rwgk
Copy link
Collaborator

rwgk commented Aug 4, 2025

@b-pass It turns out, your approach is a great workaround. I added a writeup for the rational in commit 0551332.

I made a few small changes, but they are only of the polishing kind. Could you please review my changes and let me know what you think?

@b-pass
Copy link
Contributor Author

b-pass commented Aug 4, 2025

let me know what you think?

Everything looks good to me.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rwgk rwgk merged commit c5e8fec into pybind:master Aug 4, 2025
81 of 82 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Aug 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs changelog Possibly needs a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants