libbpf-tools/klockstat: Support nested mutex_trylock#5461
jeromemarchand wants to merge 2 commits into iovisor:master
Conversation
libbpf-tools/klockstat.bpf.c
Outdated

	/* Since v6.16 mutex_trylock also have a nested variant */
	SEC("kprobe/_mutex_trylock_nest_lock")
	int BPF_KPROBE(kprobe_mutex_trylock_nest_lock, struct mutex *lock)
Aren't the duplicate BPF programs unnecessary? How about reusing the existing kprobe_mutex_trylock and kprobe_mutex_lock_killable programs by changing their attach targets in klockstat.c, same as the fentry approach?
Good point. I'll write a patch that does that and rebase this one on top of it.
			"_mutex_trylock_nest_lock");

	/* Since v6.16 mutex_lock_killable nested variant is implemented differently */
	if (fentry_can_attach("_mutex_lock_killable", NULL)) {
The _mutex_lock_killable handling is inside the DEBUG-only section. Doesn't this function exist in all kernels >= 6.16 (i.e., it's not debug-specific)?
Line 849 may also need to be adjusted depending on line 877.
I don't see it in /proc/kallsyms on a non-debug kernel, and both definitions of _mutex_lock_killable(), in mutex.c and rtmutex_api.c, are inside an #ifdef CONFIG_DEBUG_LOCK_ALLOC section.
- You are right — _mutex_lock_killable handling belongs in the debug section. Thanks for the correction.
- The two assignments to mutex_lock_killable / mutex_lock_killable_exit are mutually exclusive alternatives depending on the kernel version, but the code expresses this as an unconditional default followed by a conditional update. An if/else would make the intent clearer.
Thank you.
Alright. I'll update the pull request soon with your suggestion.
The BPF programs for the kprobe fallback are duplicated for their nested versions. This is unnecessary. We can reuse the original program and reattach it to the nested function, as is already done in the fentry case. This simplifies the code quite a bit.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Since kernel 6.16, mutex_trylock() also has a nested version. Add support for it. Moreover, the nested version of mutex_lock_killable() is implemented differently, and the mutex_lock_killable_nested symbol is replaced by _mutex_lock_killable.
This fixes the following errors on debug kernels:
libbpf: prog 'mutex_trylock_exit': failed to find kernel BTF type ID of 'mutex_trylock': -3
libbpf: prog 'mutex_trylock_exit': failed to prepare load attributes: -3
libbpf: prog 'mutex_trylock_exit': failed to load: -3
libbpf: failed to load object 'klockstat_bpf'
libbpf: failed to load BPF skeleton 'klockstat_bpf': -3
failed to load BPF object
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
b8a8fcb to a53eaab
Pull request overview
This PR updates klockstat to handle kernel 6.16+ lock symbol changes (notably nested mutex_trylock() and the replacement of mutex_lock_killable_nested with _mutex_lock_killable) to avoid BTF attach failures on debug kernels.
Changes:
- Extend the list of lock-related ksym names to include _mutex_lock_killable and _mutex_trylock_nest_lock.
- Adjust fentry/kprobe enablement logic to retarget attach points for newer kernel symbol layouts.
- Remove the dedicated "nested" kprobe program variants from the BPF object.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| libbpf-tools/klockstat.c | Adds new lock symbols and updates attach-target selection logic for fentry/kprobe modes. |
| libbpf-tools/klockstat.bpf.c | Removes now-unneeded nested kprobe program variants (intended to be replaced by retargeting). |
	} else {
		bpf_program__set_attach_target(obj->progs.down_write_killable, 0,
					       "down_write_killable_nested");
		bpf_program__set_attach_target(obj->progs.down_write_killable_exit, 0,
					       "down_write_killable_nested");
In enable_fentry(), the fallback branch of the _mutex_lock_killable check is setting attach targets for down_write_killable(_exit), which is unrelated to mutex_lock_killable. This also means down_write_killable(_exit) won’t be retargeted to down_write_killable_nested when _mutex_lock_killable exists. Please retarget down_write_killable(_exit) to down_write_killable_nested unconditionally (for CONFIG_DEBUG_LOCK_ALLOC), and keep the mutex_lock_killable targets on mutex_lock_killable_nested when _mutex_lock_killable isn’t available.
	if (!kprobe_exists("mutex_lock_nested"))
		return;

	bpf_program__set_attach_target(obj->progs.mutex_lock, 0,
enable_kprobes() is retargeting obj->progs.mutex_lock (the fentry program that was just set to autoload=false) to mutex_lock_nested, instead of retargeting the kprobe entry program. This likely leaves kprobe_mutex_lock still attached to mutex_lock (non-nested), breaking DEBUG_LOCK_ALLOC nested tracing. Please set the attach target on obj->progs.kprobe_mutex_lock (and its kretprobe) instead of the fentry program.
Suggested change:
-	bpf_program__set_attach_target(obj->progs.mutex_lock, 0,
+	bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock, 0,
Yes indeed, that's a typo. Will fix.
	bpf_program__set_attach_target(obj->progs.mutex_lock_killable, 0,
				       "_mutex_lock_killable");
	bpf_program__set_attach_target(obj->progs.mutex_lock_killable_exit, 0,
In enable_kprobes(), when _mutex_lock_killable exists you’re calling bpf_program__set_attach_target() on the fentry mutex_lock_killable programs, but those were set to autoload=false at the top of enable_kprobes(). This means kprobe_mutex_lock_killable(_exit) will still attach to the old symbol and miss nested killable locks on newer kernels. Please retarget the kprobe mutex_lock_killable entry/exit programs to _mutex_lock_killable instead.
Suggested change:
-	bpf_program__set_attach_target(obj->progs.mutex_lock_killable, 0,
-				       "_mutex_lock_killable");
-	bpf_program__set_attach_target(obj->progs.mutex_lock_killable_exit, 0,
+	bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock_killable, 0,
+				       "_mutex_lock_killable");
+	bpf_program__set_attach_target(obj->progs.kprobe_mutex_lock_killable_exit, 0,
Obviously some sloppy copy and paste business on my part. My bad! Will fix.
	} else {
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_exit_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_interruptible_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_interruptible_exit_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_killable_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_mutex_lock_killable_exit_nested, false);

		bpf_program__set_autoload(obj->progs.kprobe_down_read_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_read_exit_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_read_killable_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_read_killable_exit_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_write_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_write_exit_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_write_killable_nested, false);
		bpf_program__set_autoload(obj->progs.kprobe_down_write_killable_exit_nested, false);
		bpf_program__set_attach_target(obj->progs.kprobe_down_write_killable, 0,
					       "down_write_killable_nested");
		bpf_program__set_attach_target(obj->progs.kprobe_down_write_killable_exit, 0,
					       "down_write_killable_nested");
In enable_kprobes(), retargeting down_write_killable(_exit) to down_write_killable_nested is currently inside the else branch of the _mutex_lock_killable check. down_write_killable_nested selection shouldn’t depend on whether _mutex_lock_killable exists; on kernels where _mutex_lock_killable exists, down_write_killable(_exit) will remain attached to non-nested symbols. Please move the down_write_killable(_exit) retargeting out of this conditional (or duplicate it in both branches).
Same issue as above. Same fix.
	if (kprobe_exists("_mutex_trylock_nest_lock"))
		bpf_program__set_attach_target(obj->progs.mutex_trylock_exit, 0,
					       "_mutex_trylock_nest_lock");
In enable_kprobes(), the nested mutex_trylock handling is retargeting obj->progs.mutex_trylock_exit (fexit program) even though enable_kprobes() disables autoload for that program. It also retargets only the exit side; for kprobe/kretprobe pairs you need both entry and exit attached to the same target to keep the per-thread lock map consistent. Please retarget kprobe_mutex_trylock and kprobe_mutex_trylock_exit to _mutex_trylock_nest_lock when present.
Suggested change:
-	if (kprobe_exists("_mutex_trylock_nest_lock"))
-		bpf_program__set_attach_target(obj->progs.mutex_trylock_exit, 0,
-					       "_mutex_trylock_nest_lock");
+	if (kprobe_exists("_mutex_trylock_nest_lock")) {
+		bpf_program__set_attach_target(obj->progs.kprobe_mutex_trylock, 0,
+					       "_mutex_trylock_nest_lock");
+		bpf_program__set_attach_target(obj->progs.kprobe_mutex_trylock_exit, 0,
+					       "_mutex_trylock_nest_lock");
+	}
Right. The *_trylock mutexes only have an exit program in the fentry case, not the kprobe one. Will fix.