JakeHillion
Contributor

Add futex delays to chaos. To best reproduce deadlocks and other futex issues we need to affect locking.

The approach here:

  • When a lock has contention, delays a waiter by up to futex_uncontended_delay_ns.
  • Swaps out the existing delayed waiter when another waiter comes along.
  • Delays the previous waiter by a random amount between futex_contended_delay_ns and futex_uncontended_delay_ns.

This approach is chosen over random delays because it flips futex conditions with minimal performance impact on a machine/process. If we had a futex and a pair of threads that have many idle seconds after a short period of contention, we would need huge random delays, on every task that touches the futex, to affect their ordering at all. Instead we can limit the delays to a solo waiter at any point, and use a much smaller delay when we know the mutex is already under contention. We'll see how this works in practice.
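The three steps above can be sketched in plain userspace C. This is only a rough model of the policy, not the BPF code in this PR: `futex_slot`, `delay_cfg`, `rand_between` and `on_futex_wait` are hypothetical names, `rand()` stands in for whatever random source the scheduler uses, and measuring the displaced waiter's delay from the current time is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-futex state: at most one waiter is parked at a time. */
struct futex_slot {
	int has_waiter;      /* 1 if a waiter is currently parked */
	int waiter_pid;      /* pid of the parked waiter */
	uint64_t wake_at_ns; /* absolute time at which it is released */
};

/* Hypothetical config mirroring the two delay settings named above. */
struct delay_cfg {
	uint64_t contended_delay_ns;
	uint64_t uncontended_delay_ns;
};

/* Value in [lo, hi]. rand() is for illustration only (small range, modulo bias). */
static uint64_t rand_between(uint64_t lo, uint64_t hi)
{
	assert(lo <= hi);
	return lo + (uint64_t)rand() % (hi - lo + 1);
}

/*
 * Called when `pid` starts waiting on the futex at `now_ns`.
 * The new waiter is parked with the long (uncontended) delay. If another
 * waiter was already parked, it is swapped out: its pid is returned and
 * *prev_wake_at_ns receives a shorter, randomised wake time (assumption:
 * measured from now_ns). Returns -1 if nobody was parked.
 */
static int on_futex_wait(struct futex_slot *slot, const struct delay_cfg *cfg,
			 int pid, uint64_t now_ns, uint64_t *prev_wake_at_ns)
{
	int prev = -1;

	if (slot->has_waiter) {
		prev = slot->waiter_pid;
		*prev_wake_at_ns = now_ns + rand_between(cfg->contended_delay_ns,
							 cfg->uncontended_delay_ns);
	}

	slot->has_waiter = 1;
	slot->waiter_pid = pid;
	slot->wake_at_ns = now_ns + cfg->uncontended_delay_ns;
	return prev;
}
```

The point of the sketch is that only one task per futex ever carries the long delay, so a burst of waiters pays at most the shorter randomised delay each.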

This is by far the most complicated chaos trait in terms of data structures. Currently we use a BPF hash map and a built-in DSQ to maintain the data. The hash map maps a specific futex (well, close: a tgid/uaddr pair) to an entry in a CPU's delay DSQ. The delay DSQ holds the task until its timeout, and the map stores how to find that entry in the DSQ to re-queue it with the uncontended timeout. As commented in the code, the complexity of a search in a native DSQ is hideous: it's O(n). We can change the implementation in the future while keeping the logic the same.
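To make the shape of this concrete, here is a small userspace C model of the two structures. It is a sketch under assumptions, not the PR's code: a plain array stands in for the BPF hash map, a sorted array stands in for the per-CPU delay DSQ, and every name (`futex_key`, `futex_entry`, `delay_queue`, `map_lookup`, `dq_insert`, `dq_requeue`) is hypothetical. It shows why re-queuing costs O(n): the entry has to be found by scanning the queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUED 64

/* A futex is identified (approximately) by a tgid/uaddr pair. */
struct futex_key { uint32_t tgid; uint64_t uaddr; };

/* Map value: enough information to find the parked task again. */
struct futex_entry { struct futex_key key; int cpu; int pid; int used; };

/* Stand-in for a delay DSQ: tasks kept in ascending wake_at_ns order. */
struct queued_task { int pid; uint64_t wake_at_ns; };
struct delay_queue { struct queued_task tasks[MAX_QUEUED]; size_t len; };

/* Array scan standing in for a hash map lookup. */
static struct futex_entry *map_lookup(struct futex_entry *tab, size_t n,
				      struct futex_key key)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].used && tab[i].key.tgid == key.tgid &&
		    tab[i].key.uaddr == key.uaddr)
			return &tab[i];
	return NULL;
}

/* Insert in timeout order. Returns 0 on success, -1 if the queue is full. */
static int dq_insert(struct delay_queue *q, int pid, uint64_t wake_at_ns)
{
	assert(q->len <= MAX_QUEUED);
	if (q->len == MAX_QUEUED)
		return -1;

	size_t i = q->len;
	while (i > 0 && q->tasks[i - 1].wake_at_ns > wake_at_ns) {
		q->tasks[i] = q->tasks[i - 1];
		i--;
	}
	q->tasks[i].pid = pid;
	q->tasks[i].wake_at_ns = wake_at_ns;
	q->len++;
	return 0;
}

/* Find `pid` by linear scan (the O(n) part), remove it, and re-insert it. */
static int dq_requeue(struct delay_queue *q, int pid, uint64_t new_wake_at_ns)
{
	for (size_t i = 0; i < q->len; i++) {
		if (q->tasks[i].pid != pid)
			continue;
		for (size_t j = i + 1; j < q->len; j++)
			q->tasks[j - 1] = q->tasks[j];
		q->len--;
		return dq_insert(q, pid, new_wake_at_ns);
	}
	return -1; /* task is no longer queued */
}
```

A different backing structure could replace the scan in `dq_requeue` without changing what the map stores or when a re-queue happens.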

Test plan:

  • Lightly tested. The futex hook attaches and sees many entries. Slow futex waiters are delayed. The hand-off between an old delayed waiter and a new delayed waiter is not reliable and likely has a bug.
  • This change is a no-op unless you provide new command line flags.

@JakeHillion
Contributor Author

Putting this up as a draft so I can work on it in the open now. It seems to have a stall bug even without the DSQ searching/re-queuing, and that path never gets hit because the searching isn't correct.

@JakeHillion force-pushed the jakehillion/chaos-futex branch from 3406cfe to 5144a4e on July 10, 2025 16:57
@JakeHillion
Contributor Author

Things are working substantially better with the recent chaos/p2dq fixes! The current issue is that we attempt to call scx_bpf_dsq_move_vtime from deep inside chaos_enqueue, which isn't allowed. Evaluating the workarounds at the minute.

$ nix run nixpkgs#stress-ng -- --futex 8

This triggers the crash case well; general use doesn't hit the contended case very often on my quiet machine.

chaos_stat_inc(CHAOS_STAT_TRAIT_FUTEX_DELAYS);
scx_bpf_dsq_insert_vtime(p, get_cpu_delay_dsq(cpu), 0, now + futex_uncontended_delay_ns, enq_flags);

// critical sections can't call kfuncs which makes this very complicated.
Contributor

@hodgesds hodgesds Jul 11, 2025

Nice explanation!
