Skip to content

Deregister cancelled broadcast recv futures from WaitSet #115

Description

@orthur2

Back in #94, while fixing broadcast, I mentioned that a cancelled recv future could leave its waker in WaitSet. This was discussed in #101. At that time, tison has pointed out that removing a waiter from every Drop would add a lock acquisition, while wake_all currently locks once and drains all waiters.

This came up again while I was implementing the unbounded policy from #95. A broadcast channel may remain alive and idle for a long time, while recv is commonly cancelled through select or timeout. In that case, cancelled receive futures remain registered until the next send and can accumulate during the idle period.

Tokio's broadcast uses the same general approach, although its waiter storage is different. It keeps an intrusive waiter list inside broadcast. Recv::drop first checks whether its waiter is still queued and only takes the lock when it needs to unlink it. The sender clears the queued state before releasing the lock and waking the task. See the Tokio implementation.

For mea, I think the cleanup should also avoid adding a lock to the normal completion path. Ready paths can clear the waiter id, and Recv::drop can return immediately when there is no registration to remove. The lock is then only needed when a registered recv future is dropped before completion.

Removing one waiter requires a stronger identity than the current slab index. As discussed in #103, an index can be reused after wake_all drains the slab. An old future could otherwise remove a newer waiter occupying the same slot. My current approach is WaiterId { index, generation }, with both values checked before removal.

I would limit this cleanup to broadcast receivers. Latch, Once, and WaitGroup can keep their existing behavior rather than adding a lock to every wait future's Drop.

I also looked at Barrier while checking the other WaitSet users. Cancelling Barrier::wait after it records an arrival can allow the generation to complete with fewer active callers. However, tokio::sync::Barrier explicitly documents wait as not cancel safe and also keeps the recorded arrival.

There are two reasonable choices here: keep the Tokio behavior and document Barrier::wait as not cancel safe, or make mea's Barrier stronger by withdrawing the arrival from BarrierWait::drop. I currently prefer keeping the Tokio behavior unless we intentionally want to provide stronger cancellation semantics.

@tisonkun, what do you think about limiting deregistration to broadcast recv futures, keeping the countdown primitives unchanged, and documenting Barrier::wait as not cancel safe? Do you think the generation-based waiter id is preferable to leaving tombstone entries in WaitSet? Or any other more elegant solution as well ?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions