Fix CPU and database spinning when retrying sending events to servers whilst at the same time purging those events. #18499

Open
wants to merge 2 commits into
base: develop
1 change: 1 addition & 0 deletions changelog.d/18499.bugfix
@@ -0,0 +1 @@
+Fix CPU and database spinning when retrying sending events to servers whilst at the same time purging those events.
14 changes: 13 additions & 1 deletion synapse/federation/sender/per_destination_queue.py
@@ -129,6 +129,8 @@ def __init__(

         # The stream_ordering of the most recent PDU that was discarded due to
         # being in catch-up mode.
+        # Can be set to zero if no PDU has been discarded since the last time
+        # we queried for new PDUs during catch-up.
         self._catchup_last_skipped: int = 0

         # Cache of the last successfully-transmitted stream ordering for this
@@ -462,8 +464,18 @@ async def _catch_up_transmission_loop(self) -> None:
                 # of a race condition, so we check that no new events have been
                 # skipped due to us being in catch-up mode

-                if self._catchup_last_skipped > last_successful_stream_ordering:
+                if (
+                    self._catchup_last_skipped != 0
+                    and self._catchup_last_skipped > last_successful_stream_ordering
+                ):
                     # another event has been skipped because we were in catch-up mode
+                    # As an exception to this case: we can hit this branch if the
+                    # room has been purged whilst we have been looping.
+                    # In that case we avoid hot-looping by resetting the 'catch-up skipped
+                    # PDU' flag.
+                    # Then if there is still no progress to be made at the next iteration,
+                    # we can exit catch-up mode.
+                    self._catchup_last_skipped = 0
                     continue

                 # we are done catching up!
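
The effect of resetting the flag can be illustrated with a small standalone model of the loop. This is only a sketch and is not Synapse code: `CatchUpState`, `run_catch_up`, `fetch_stream_orderings`, `reset_skipped` and `max_iterations` are hypothetical names introduced here, the database query is replaced by a plain callable, and "sending" an event is reduced to advancing the last successful stream ordering.

```python
from typing import Callable, List


class CatchUpState:
    """Hypothetical stand-in for the catch-up fields of the per-destination queue."""

    def __init__(self, last_skipped: int, last_successful: int) -> None:
        self.catchup_last_skipped = last_skipped
        self.last_successful_stream_ordering = last_successful
        self.catching_up = True


def run_catch_up(
    state: CatchUpState,
    fetch_stream_orderings: Callable[[int], List[int]],
    reset_skipped: bool = True,
    max_iterations: int = 1000,
) -> int:
    """Run a simplified catch-up loop and return the number of iterations used.

    Raises RuntimeError if the loop has not exited after max_iterations,
    which is how this sketch detects a hot loop.
    """
    for iteration in range(1, max_iterations + 1):
        pending = fetch_stream_orderings(state.last_successful_stream_ordering)

        if not pending:
            if (
                state.catchup_last_skipped != 0
                and state.catchup_last_skipped
                > state.last_successful_stream_ordering
            ):
                # An event was skipped while catching up, or the room was
                # purged and the skipped event no longer exists.
                if reset_skipped:
                    # Clear the flag so the next empty query exits the loop.
                    state.catchup_last_skipped = 0
                continue

            # Nothing left to send: leave catch-up mode.
            state.catching_up = False
            return iteration

        # Model a successful send by advancing the stream ordering.
        state.last_successful_stream_ordering = max(pending)

    raise RuntimeError("catch-up loop did not terminate (hot loop)")
```

With a query that always returns nothing (modelling a purged room) and `catchup_last_skipped` ahead of the last successful stream ordering, the loop exits on the second iteration when the flag is reset. With `reset_skipped=False` it spins until `max_iterations` is reached, which corresponds to the spinning described in the changelog entry.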