Skip to content

libbpf-tools/klockstat: Support nested mutex_trylock#5461

Open
jeromemarchand wants to merge 2 commits intoiovisor:masterfrom
jeromemarchand:klockstat-mutex_trylock
Open

libbpf-tools/klockstat: Support nested mutex_trylock#5461
jeromemarchand wants to merge 2 commits intoiovisor:masterfrom
jeromemarchand:klockstat-mutex_trylock

Conversation

@jeromemarchand
Copy link
Copy Markdown
Contributor

Since kernel 6.16, mutex_trylock() also have a nested version. Add the support for it.

Moreover the nested version of mutex_lock_killable() is implemented differently and mutex_lock_killable_nested symbol is replaced by _mutex_lock_killable.

It fixes the following error of debug kernels:
libbpf: prog 'mutex_trylock_exit': failed to find kernel BTF type ID of 'mutex_trylock': -3 libbpf: prog 'mutex_trylock_exit': failed to prepare load attributes: -3 libbpf: prog 'mutex_trylock_exit': failed to load: -3 libbpf: failed to load object 'klockstat_bpf'
libbpf: failed to load BPF skeleton 'klockstat_bpf': -3 failed to load BPF object

/* Since v6.16 mutex_trylock also have a nested variant */

SEC("kprobe/_mutex_trylock_nest_lock")
int BPF_KPROBE(kprobe_mutex_trylock_nest_lock, struct mutex *lock)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate BPF programs unnecessary? How about reusing existing kprobe_mutex_trylock and kprobe_mutex_lock_killable programs by changing attach targets in klockstat.c, same as fentry approach.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll write a patch that does that and rebase that one on top of it.

"_mutex_trylock_nest_lock");

/* Since v6.16 mutex_lock_killable nested variant is implemented differently */
if (fentry_can_attach("_mutex_lock_killable", NULL)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_mutex_lock_killable handling is inside DEBUG-only section.
This function exists in all kernels >= 6.16 (not debug-specific) ?
Line 849 may also need to be adjusted depending on line 877.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it in /proc/kallsyms on a non-debug kernel and both definitions of _mutex_lock_killable(), in mutex.c and rtmutex_api.c are inside a #ifdef CONFIG_DEBUG_LOCK_ALLOC section.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You are right — _mutex_lock_killable handling belongs in the debug section. Thanks for the correction.
  2. The two assignments to mutex_lock_killable / mutex_lock_killable_exit are mutually exclusive alternatives depending on the kernel version, but the code expresses this as an unconditional default followed by a conditional update. An if/else would make the intent clearer.

Thank you.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. I'll update the pull request soon with your suggestion.

The BPF programs for the kprobe fallback are duplicated for their
nested version. This is unnecessary. We can reuse the original program
and reattach it to the nested function such as what is already done in
the fentry case. It simplifies the code quite a bit.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Since kernel 6.16, mutex_trylock() also have a nested version. Add the
support for it.

Moreover the nested version of mutex_lock_killable() is implemented
differently and mutex_lock_killable_nested symbol is replaced by
_mutex_lock_killable.

It fixes the following error of debug kernels:
libbpf: prog 'mutex_trylock_exit': failed to find kernel BTF type ID of 'mutex_trylock': -3
libbpf: prog 'mutex_trylock_exit': failed to prepare load attributes: -3
libbpf: prog 'mutex_trylock_exit': failed to load: -3
libbpf: failed to load object 'klockstat_bpf'
libbpf: failed to load BPF skeleton 'klockstat_bpf': -3
failed to load BPF object

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates klockstat to handle kernel 6.16+ lock symbol changes (notably nested mutex_trylock() and the replacement of mutex_lock_killable_nested with _mutex_lock_killable) to avoid BTF attach failures on debug kernels.

Changes:

  • Extend the list of lock-related ksym names to include _mutex_lock_killable and _mutex_trylock_nest_lock.
  • Adjust fentry/kprobe enablement logic to retarget attach points for newer kernel symbol layouts.
  • Remove the dedicated “nested” kprobe program variants from the BPF object.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
libbpf-tools/klockstat.c Adds new lock symbols and updates attach-target selection logic for fentry/kprobe modes.
libbpf-tools/klockstat.bpf.c Removes now-unneeded nested kprobe program variants (intended to be replaced by retargeting).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +850 to +854
} else {
bpf_program__set_attach_target(obj->progs.down_write_killable, 0,
"down_write_killable_nested");
bpf_program__set_attach_target(obj->progs.down_write_killable_exit, 0,
"down_write_killable_nested");
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In enable_fentry(), the fallback branch of the _mutex_lock_killable check is setting attach targets for down_write_killable(_exit), which is unrelated to mutex_lock_killable. This also means down_write_killable(_exit) won’t be retargeted to down_write_killable_nested when _mutex_lock_killable exists. Please retarget down_write_killable(_exit) to down_write_killable_nested unconditionally (for CONFIG_DEBUG_LOCK_ALLOC), and keep the mutex_lock_killable targets on mutex_lock_killable_nested when _mutex_lock_killable isn’t available.

Copilot uses AI. Check for mistakes.
if (!kprobe_exists("mutex_lock_nested"))
return;

bpf_program__set_attach_target(obj->progs.mutex_lock, 0,
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enable_kprobes() is retargeting obj->progs.mutex_lock (the fentry program that was just set to autoload=false) to mutex_lock_nested, instead of retargeting the kprobe entry program. This likely leaves kprobe_mutex_lock still attached to mutex_lock (non-nested), breaking DEBUG_LOCK_ALLOC nested tracing. Please set the attach target on obj->progs.kprobe_mutex_lock (and its kretprobe) instead of the fentry program.

Suggested change
bpf_program__set_attach_target(obj->progs.mutex_lock, 0,
bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock, 0,

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed, that's a typo. Will fix.

Comment on lines +928 to +930
bpf_program__set_attach_target(obj->progs.mutex_lock_killable, 0,
"_mutex_lock_killable");
bpf_program__set_attach_target(obj->progs.mutex_lock_killable_exit, 0,
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In enable_kprobes(), when _mutex_lock_killable exists you’re calling bpf_program__set_attach_target() on the fentry mutex_lock_killable programs, but those were set to autoload=false at the top of enable_kprobes(). This means kprobe_mutex_lock_killable(_exit) will still attach to the old symbol and miss nested killable locks on newer kernels. Please retarget the kprobe mutex_lock_killable entry/exit programs to _mutex_lock_killable instead.

Suggested change
bpf_program__set_attach_target(obj->progs.mutex_lock_killable, 0,
"_mutex_lock_killable");
bpf_program__set_attach_target(obj->progs.mutex_lock_killable_exit, 0,
bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock_killable, 0,
"_mutex_lock_killable");
bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock_killable_exit, 0,

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously some sloppy copy and paste business on my part. My bad! Will fix.

Comment on lines 932 to +936
} else {
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_exit_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_interruptible_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_interruptible_exit_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_killable_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_killable_exit_nested, false);

bpf_program__set_autoload(obj->progs.kprobe_down_read_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_read_exit_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_read_killable_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_read_killable_exit_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_write_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_write_exit_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_write_killable_nested, false);
bpf_program__set_autoload(obj->progs.kprobe_down_write_killable_exit_nested, false);
bpf_program__set_attach_target(obj->progs.kprobe_down_write_killable, 0,
"down_write_killable_nested");
bpf_program__set_attach_target(obj->progs.kprobe_down_write_killable_exit, 0,
"down_write_killable_nested");
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In enable_kprobes(), retargeting down_write_killable(_exit) to down_write_killable_nested is currently inside the else branch of the _mutex_lock_killable check. down_write_killable_nested selection shouldn’t depend on whether _mutex_lock_killable exists; on kernels where _mutex_lock_killable exists, down_write_killable(_exit) will remain attached to non-nested symbols. Please move the down_write_killable(_exit) retargeting out of this conditional (or duplicate it in both branches).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above. Same fix.

Comment on lines +940 to +942
if (kprobe_exists("_mutex_trylock_nest_lock"))
bpf_program__set_attach_target(obj->progs.mutex_trylock_exit, 0,
"_mutex_trylock_nest_lock");
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In enable_kprobes(), the nested mutex_trylock handling is retargeting obj->progs.mutex_trylock_exit (fexit program) even though enable_kprobes() disables autoload for that program. It also retargets only the exit side; for kprobe/kretprobe pairs you need both entry and exit attached to the same target to keep the per-thread lock map consistent. Please retarget kprobe_mutex_trylock and kprobe_mutex_trylock_exit to _mutex_trylock_nest_lock when present.

Suggested change
if (kprobe_exists("_mutex_trylock_nest_lock"))
bpf_program__set_attach_target(obj->progs.mutex_trylock_exit, 0,
"_mutex_trylock_nest_lock");
if (kprobe_exists("_mutex_trylock_nest_lock")) {
bpf_program__set_attach_target(obj->progs.kprobe_mutex_trylock, 0,
"_mutex_trylock_nest_lock");
bpf_program__set_attach_target(obj->progs.kprobe_mutex_trylock_exit, 0,
"_mutex_trylock_nest_lock");
}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. the *_trylock mutexes only have an exit version in the fentry case, not the kprobe one. Will fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants