add internal asyncio implementation docs #135469
base: main
To be honest this looks more like a thorough description of and rationale for the changes to support free-threading (and `asyncio pstree`) than a description of asyncio internals. As such, I think it belongs somewhere, but I'm not sure if InternalDocs is the right place. Maybe @iritkatriel has an opinion?
Nit: Call me old-fashioned, but could you please break lines longer than 100 or 120 characters? (Preferably at commas, semicolons, colons and periods.)
[`asyncio`](https://docs.python.org/3/library/asyncio.html) module.

## Pre-Python 3.14 implementation
Why does internal documentation need a description of an old implementation that is no longer supported?
There were no docs at all for how this worked before 3.14, so I thought I would add them from my notes. I can remove this section if you prefer.
To address these issues, Python 3.14 implements several changes to improve the performance and thread safety of task management.

- **Per-thread doubly linked list for tasks**: Python 3.14 introduces a per-thread circular doubly linked list for storing tasks. Each thread maintains its own list of tasks, so tasks can be added and removed without locks. The design is efficient and thread-safe, and it scales well with the number of threads in free-threading. It also allows external introspection tools such as `python -m asyncio pstree` to inspect tasks running in all threads. It was implemented as part of [Audit asyncio thread safety](https://github.com/python/cpython/issues/128002).
Why not per-loop? That's what `all_tasks()` needs.
If you store tasks per loop, they cannot be accessed easily from external introspection. Implemented this way, it supports arbitrary loops such as uvloop, which would otherwise be non-trivial for external introspection. Per-loop storage would also be slower, because adding one task to the list on the loop would require an attribute lookup and several other calls.
When I initially implemented the doubly linked list for asyncio, it was a net 15% faster on pyperformance and used roughly 10% less memory, whereas when I tried implementing it per-loop, it was slower on pyperformance and didn't save much memory.
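The per-thread list discussed in this thread can be illustrated with a minimal Python sketch. It is not the CPython implementation (which is written in C and keeps the list head on the thread state); `TaskNode`, `TaskList`, `thread_tasks` and `demo` are hypothetical names, and `threading.local` is only a stand-in for per-thread storage. The point it shows is that adding and removing a node are constant-time pointer updates, and that only the owning thread ever mutates its own list.

```python
import threading


class TaskNode:
    """A list node; `name` stands in for a real asyncio task object."""

    def __init__(self, name=None):
        self.name = name
        self.prev = self
        self.next = self


class TaskList:
    """Circular doubly linked list with a sentinel head node."""

    def __init__(self):
        self.head = TaskNode()  # sentinel: an empty list points at itself

    def add(self, node):
        # Link the node in just before the sentinel (at the tail): O(1).
        tail = self.head.prev
        node.prev, node.next = tail, self.head
        tail.next = node
        self.head.prev = node

    def remove(self, node):
        # Unlink using only the node's own neighbours: O(1), no search.
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = node

    def names(self):
        out, cur = [], self.head.next
        while cur is not self.head:
            out.append(cur.name)
            cur = cur.next
        return out


_local = threading.local()  # hypothetical stand-in for per-thread state


def thread_tasks():
    """Return the calling thread's own task list, creating it on first use."""
    if not hasattr(_local, "tasks"):
        _local.tasks = TaskList()
    return _local.tasks


def demo():
    seen = {}

    def run(label):
        tasks = thread_tasks()
        nodes = [TaskNode(f"{label}{i}") for i in range(3)]
        for node in nodes:
            tasks.add(node)
        tasks.remove(nodes[1])  # unregister the middle task
        seen[label] = tasks.names()

    threads = [threading.Thread(target=run, args=(label,)) for label in "ab"]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(demo())
```

Because each thread touches only its own list, registration needs no lock in this sketch; reading other threads' lists is the part that needs extra coordination.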
- **Per-thread current task**: Python 3.14 stores the current task on the current thread state instead of a global dictionary. This allows for faster access to the current task without the need for a dictionary lookup. Each thread maintains its own current task, which is stored in the `PyThreadState` structure. This was implemented in https://github.com/python/cpython/issues/129898.
Again, why not on the loop?
Same as for per-thread tasks: if you store it on the loop, external introspection tools cannot access it (currently it is a simple fixed-offset lookup on the thread state).
I should elaborate on this in the docs.
Added this point to the docs
```
one --> two
```
`asyncio.all_tasks` now iterates over the per-thread task lists of all threads and the interpreter's task list to collect all tasks. In free-threading, this is done under a `stop-the-world` pause, which ensures that no tasks are added or removed while the lists are being iterated. This gives a consistent, thread-safe view of the task lists across all threads.
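From the caller's side, `asyncio.all_tasks()` still reports only the unfinished tasks of one loop, regardless of how the tasks are stored internally. The sketch below uses only public asyncio APIs and runs two event loops in two threads; the helper names `worker`, `main`, `run_in_thread` and `demo` are made up for this illustration.

```python
import asyncio
import threading


async def worker():
    await asyncio.sleep(0.05)


async def main(name, counts):
    workers = [asyncio.create_task(worker()) for _ in range(3)]
    await asyncio.sleep(0)  # let the workers start
    # all_tasks() with no argument uses the running loop of this thread.
    counts[name] = len(asyncio.all_tasks())
    await asyncio.gather(*workers)


def run_in_thread(name, counts):
    asyncio.run(main(name, counts))  # one event loop per thread


def demo():
    counts = {}
    threads = [
        threading.Thread(target=run_in_thread, args=(name, counts))
        for name in ("a", "b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print(demo())
```

Each loop sees four tasks (its own `main` task plus three workers) and none of the tasks belonging to the loop in the other thread.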
Isn't stop-the-world hugely expensive?
`stop-the-world` is expensive, but it's a tradeoff: we optimize for the frequently used APIs rather than the rare ones. In asyncio, registering and unregistering tasks is very performance critical, and making it lock-free is much more important than the performance of `asyncio.all_tasks`. Also, `asyncio.all_tasks` is only ever called by the event loop at shutdown, so its performance doesn't matter much compared to registering and unregistering tasks.
This design allows for lock-free execution and scales well in free-threading with multiple event loops running in different threads.
Yes, I agree this is a major accomplishment!
```
} PyThreadState;
```
When a task is entered or left, the current task is updated in the thread state by the `enter_task` and `leave_task` functions. When `current_task(loop)` is called and `loop` is the running event loop of the current thread (the general case), no locking is required: the current task is stored in the thread state and is returned directly. Otherwise, if `loop` is not the running event loop of the current thread, a `stop-the-world` pause stops all threads in free-threading, and the thread states are iterated to find one whose `tstate->asyncio_current_loop` matches `loop`; that thread's current task is returned. If no matching thread state is found, `None` is returned.
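The lookup described above can be sketched in Python under some loud assumptions: `ThreadState`, `_all_states`, `_pause` and `set_running_loop` are hypothetical stand-ins, a plain lock replaces the `stop-the-world` pause, and the real logic lives in C on `PyThreadState`. The sketch only shows the shape of the fast path (own thread state) and the slow path (scan of all thread states).

```python
import threading


class ThreadState:
    """Hypothetical stand-in for the asyncio fields of `PyThreadState`."""

    def __init__(self):
        self.running_loop = None
        self.current_task = None


_local = threading.local()
_all_states = []           # stand-in for the interpreter's list of threads
_pause = threading.Lock()  # stand-in for the stop-the-world pause


def _tstate():
    if not hasattr(_local, "state"):
        _local.state = ThreadState()
        with _pause:
            _all_states.append(_local.state)
    return _local.state


def set_running_loop(loop):
    _tstate().running_loop = loop


def enter_task(loop, task):
    state = _tstate()
    if state.running_loop is not loop:
        raise RuntimeError("loop is not running in this thread")
    if state.current_task is not None:
        raise RuntimeError("another task is already running in this thread")
    state.current_task = task


def leave_task(loop, task):
    state = _tstate()
    if state.current_task is not task:
        raise RuntimeError("leaving a task that is not the current one")
    state.current_task = None


def current_task(loop):
    state = _tstate()
    if state.running_loop is loop:
        return state.current_task  # fast path: no locking, no scan
    with _pause:  # slow path: inspect every thread's state
        for other in _all_states:
            if other.running_loop is loop:
                return other.current_task
    return None
```

In this sketch the common case, asking for the current task of the loop running in the calling thread, touches nothing shared; only a cross-thread query pays for the pause.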
Here I am really baffled why each loop can't just store its current task on the loop data structure itself.
Also, it sounds like this won't find the current task of a loop belonging to the current task that's not currently running (since it's not any thread's `tstate->asyncio_current_loop`).
> Also, it sounds like this won't find the current task of a loop belonging to the current task that's not currently running (since it's not any thread's `tstate->asyncio_current_loop`).

`asyncio.current_task` only returns a running task, and a running task always has a running loop, so it works. This behavior is the same as it was before 3.14.
https://docs.python.org/3/library/asyncio-task.html#asyncio.current_task
Co-authored-by: Guido van Rossum <[email protected]>
This documents the work I did in 3.14 for better performance in both free-threading and GIL builds, and for scaling asyncio in free-threading.