libbpf-tools: Add new feature doublefree#4286
libbpf-tools: Add new feature doublefree#4286Bojun-Seo wants to merge 1 commit intoiovisor:masterfrom
Conversation
| /* default log level */ | ||
| static enum log_level log_level = ERROR; | ||
|
|
||
| void sig_handler(int signo) |
There was a problem hiding this comment.
static ? Here and below.
There was a problem hiding this comment.
There is a reason why I use static here and below. (below means static functions, right?)
I could remove static but I'd like to use static in here and below, if there is no problem.
I use this tool and other tools in my company, and manage the source code bit differently.
I compile every tool into one binary for user to use easily. So symbol can conflict on compilation if I don't use static.
There was a problem hiding this comment.
I removed static on this enum value.
There was a problem hiding this comment.
I have restored the static that I had previously removed. If there is a reason why static needs to be removed, please let me know.
9125f3c to
e3dd3ca
Compare
eiffel-fl
left a comment
There was a problem hiding this comment.
Hi!
Thank you for it, this tool is cool!
I tested it and it worked fine:
$ ./a.out &
[2] 99365
$ sudo ./doublefree -p 99365
bar: 0x55c9bd2912a0
baz: 0x55c9bd2912a0
bazz: 84
free 84
free(): double free detected in tcache 2
[2] + abort (core dumped) ./a.out
Warn: Is this process alive? pid: 99365
Found double free...
Allocation happended on:
stack_id: 30033
#1 0x0055c9bc24724b foo
#2 0x0055c9bc2472a4 main
#3 0x007fabf0d8c083 __libc_start_main
First deallocation happended on:
stack_id: 9246
#1 0x007fabf0e026d0 free
#2 0x0055c9bc2472be main
#3 0x007fabf0d8c083 __libc_start_main
Second deallocation happended on:
stack_id: 3231
#1 0x007fabf0e026d0 free
#2 0x0055c9bc247236 baz
#3 0x0055c9bc2472d4 main
#4 0x007fabf0d8c083 __libc_start_mainI nonetheless have a question regarding the memptrs map.
Best regards.
a01519a to
aa8a849
Compare
|
Add feature to detect doublefree on kernel(currently support arm and arm64) |
2420682 to
0b01ca7
Compare
|
I found some issues on detecting kernel doublefree. So I remove it. |
|
I took a slightly new look at this tool and I am wondering if this can be merged with |
|
Sounds great to merging tools. In fact, it could be possible to merge |
Sure, this can definitely be done in a future PR! |
ce9f147 to
af89da4
Compare
|
I simplify the usage and the example, and I updated it on commit message as well as description. |
aae151a to
86e5525
Compare
eiffel-fl
left a comment
There was a problem hiding this comment.
Hi!
I have one comment regarding the macros which I think can be replaced by a function.
Otherwise, it looks good to me.
Let me know about my comment and I will test it later.
Best regards.
| #define ATTACH_URETPROBE_CHECKED(obj, lib_path, func_name, pid) \ | ||
| do { \ | ||
| off_t func_off = get_elf_func_offset(lib_path, #func_name); \ | ||
| _CHECK_OFFSET(func_off); \ | ||
| _ATTACH_UPROBE(obj, lib_path, func_name##_return, true, func_off, pid); \ | ||
| _CHECK_PROGRAM(obj, func_name##_return); \ | ||
| } while (false) |
There was a problem hiding this comment.
Cannot you use a function for this? Something like this:
int attach_uprobe(struct bpf_link **func_link, const struct bpf_program *func_prog, char *lib_path, char * func_name, int pid, bool is_ret, bool check)
{
off_t func_off = get_elf_func_offset(lib_path, func_name);
if (func_off < 0) {
p_warn("Failed to get func_offset");
return func_off;
}
*func_link = bpf_program__attach_uprobe(func_prog, is_ret, pid, lib_path, func_off);
if (check) {
int err = libbpf_get_error(func_link);
if (err) {
p_warn("Failed to attach %s: %d", #func_name, err);
return err;
}
}
return 0;
}You would call it like the following:
/* Think to check the return code. */
attach_uprobe(&obj->link.malloc_return, obj->progs.malloc_return, libc_path, "malloc" env.pid, true, true);
attach_uprobe(&obj->link.free_entry, obj->progs.free_entry, libc_path, "free", env.pid, false, true);
/* ... */
attach_uprobe(&obj->link.aligned_alloc_return, obj->progs.aligned_alloc_return, libc_path, "alligned_alloc", env.pid, true, false);
attach_uprobe(&obj->link.valloc_return, obj->progs.valloc_return, libc_path, "valloc", env.pid, true, false);You could even declare a struct to hold everything:
struct probe {
struct bpf_link **link;
struct bpf_program *prog;
const char *name;
bool is_ret;
bool check;
};
/* ... */
struct probe probes[] = {
{
.link = &obj->link.malloc_return,
.prog = obj->progs.malloc_return,
.name = "malloc",
.is_ret = true,
.check = true,
},
/* ... */
};
for (int i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
attach_uprobe(probes[i].link, probes[i].prog, libc_path, probes[i].name, env.pid, probes[i].is_ret, probes[i].check);
}What do you think?
There was a problem hiding this comment.
Sounds great! I always appreciate your valuable opinion.
I prefer the second code you suggested(using struct).
And I think it would be better to pass struct probe instance(using pointer) instead of each members to attach_uprobe function, like followings.
for (int i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
attach_uprobe(&probes[i], libc_path);
}What do you think?
There was a problem hiding this comment.
And I think it would be better to pass struct probe instance(using pointer) instead of each members to attach_uprobe function, like followings.
Far better approach, it is easier to read as we have less arguments and also easier to maintain!
There was a problem hiding this comment.
I changed uprobe attaching codes. I tried implementing it while incorporating your opinion, but in a slightly different way.
ekyooo
left a comment
There was a problem hiding this comment.
"ERROR Failed to poll perf_buffer" in usage example always displayed?
| p_err("Failed to poll perf_buffer"); | ||
| break; | ||
| } | ||
| if (getpgid(env.pid) < 0) { |
There was a problem hiding this comment.
I have a question. Is there no need to limit it to the ESRCH errno case?
There was a problem hiding this comment.
There is no need to limit it to the ESRCH errno case according to the following manual page.
https://linux.die.net/man/2/getpgid
getpgid returns -1 on error and errno is set appropriately.
And there is only one errno for getpgid, which is ESRCH.
Which means, if getpgid fails, the errno should be ESRCH.
Thanks for your question.
There was a problem hiding this comment.
And I found that calling getpgid is not necessary. So I remove that code.
|
|
||
| err = bpf_map_lookup_elem(fd, &key, &val); | ||
| if (err < 0) { | ||
| p_err("Failed to lookup elem on fd"); |
There was a problem hiding this comment.
Did you intend the following code?
p_err("Failed` to lookup elem on fd: %d", fd);
There was a problem hiding this comment.
You are right. I'll change the code. Thanks!
There was a problem hiding this comment.
I'd rather change the code to print strerror(errno) instead of fd
| ptr = strtok(NULL, delim); | ||
| } else { | ||
| p_err("failed to exec %s", cmd); | ||
| goto cleanup; |
There was a problem hiding this comment.
How about handling the error cases first?
This sentence duplicates line 196.
There was a problem hiding this comment.
I can and will remove duplicates but cannot handle the error cases first.
Because error case can be checked after tokenizing or calling fork.
| return; | ||
| } | ||
| syms = syms_cache__get_syms(syms_cache, env.pid); | ||
| if (syms != NULL) { |
There was a problem hiding this comment.
how about handling err case first?
There was a problem hiding this comment.
Sounds nice, that change could reduce one indent as well.
| int deallocs_fd = bpf_map__fd(obj->maps.deallocs); | ||
|
|
||
| if (e->err == -1) { | ||
| p_err("This message should not be printed.."); |
There was a problem hiding this comment.
How about removing code that shouldn't happen?
There was a problem hiding this comment.
It's a good point. However, thinking that something will never be executed is just the developer's assumption. In software development, it is not uncommon for code that should not be executed to actually be executed. Especially when contributing code to open source projects, where multiple developers are involved, there is a higher possibility of unintentionally executing code that should not be executed. Therefore, I believe that this code should be kept as a precaution for future mistakes by someone else. However, the message itself seems lengthy and lacks specific details. I will consider possible improvements.
There was a problem hiding this comment.
I didn't remove the message but changed it.
| p_err("Failed to poll perf_buffer"); | ||
| break; | ||
| } | ||
| if (getpgid(env.pid) < 0) { |
There was a problem hiding this comment.
There is no need to limit it to the ESRCH errno case according to the following manual page.
https://linux.die.net/man/2/getpgid
getpgid returns -1 on error and errno is set appropriately.
And there is only one errno for getpgid, which is ESRCH.
Which means, if getpgid fails, the errno should be ESRCH.
Thanks for your question.
e661c38 to
d451f55
Compare
eiffel-fl
left a comment
There was a problem hiding this comment.
Hi!
I just found some small nits, but I think we are ready to go:
$ sudo ./doublefree -c /tmp/doublefree_generator (remotes/bojun/doublefree) *%
[sudo] Mot de passe de francis :
Désolé, essayez de nouveau.
[sudo] Mot de passe de francis :
2024-Feb-29 16:20:13 INFO Execute command: /tmp/doublefree_generator(pid 94846)
Tracing doublefree... Hit Ctrl-C to stop
free(): double free detected in tcache 2
Allocation:
#1 0x0056477a65b19b foo+0x12
#2 0x0056477a65b1e3 main+0x27
#3 0x007f2259229d90 [unknown]
First deallocation:
#1 0x007f22592a53e0 free+0
#2 0x0056477a65b1fd main+0x41
#3 0x007f2259229d90 [unknown]
Second deallocation:
#1 0x007f22592a53e0 free+0
#2 0x0056477a65b213 main+0x57
#3 0x007f2259229d90 [unknown]Best regards.
| execve(filepath, argv, NULL); | ||
| free(argv); | ||
|
|
||
| return -1; |
There was a problem hiding this comment.
Even though the fork()/exec() is successful, you still return -1 here.
Is this what you want to achieve?
There was a problem hiding this comment.
Thank you for the good question.
When execve succeeds, it launches a new program and the current process(child)'s execution flow is not passed to subsequent lines of code. As a result, employing an if statement for error checking is unnecessary; if the execution flow reaches beyond the execve invocation, it inherently signifies an error has occurred.
There was a problem hiding this comment.
When execve succeeds, it launches a new program and the current process(child)'s execution flow is not passed to subsequent lines of code. As a result, employing an if statement for error checking is unnecessary; if the execution flow reaches beyond the execve invocation, it inherently signifies an error has occurred.
Indeed! Long time I did not write C code and fork()/exec(), seems I begin to rust 🤣!
eiffel-fl
left a comment
There was a problem hiding this comment.
Hi!
I just have one comment with your latest push, but you can ignore it.
Best regards.
| if (!err) { | ||
| if (sinfo.sym_name) | ||
| printf(" %s+0x%lx", sinfo.sym_name, sinfo.sym_offset); | ||
| else | ||
| printf(" [unknown]"); | ||
|
|
||
| printf(" (%s+0x%lx)\n", sinfo.dso_name, sinfo.dso_offset); | ||
| } |
There was a problem hiding this comment.
| if (!err) { | |
| if (sinfo.sym_name) | |
| printf(" %s+0x%lx", sinfo.sym_name, sinfo.sym_offset); | |
| else | |
| printf(" [unknown]"); | |
| printf(" (%s+0x%lx)\n", sinfo.dso_name, sinfo.dso_offset); | |
| } | |
| if (err) | |
| continue | |
| if (sinfo.sym_name) | |
| printf(" %s+0x%lx (%s+0x%lx)\n", sinfo.sym_name, sinfo.sym_offset, sinfo.dso_name, sinfo.dso_offset); | |
| else | |
| printf(" [unknown] (%s+0x%lx)\n", sinfo.dso_name, sinfo.dso_offset); |
There was a problem hiding this comment.
It seems more readable. I'll change the code as you suggested.
Thank you!
There was a problem hiding this comment.
I found that the location of newline is not proper in my code.
So I change it. I can find it thanks for your help!
8e3e55c to
f3bdaf3
Compare
|
I just rebased this PR and I have confirmed that this feature is still working well, and I hope it gets merged into the mainline. I will discuss this PR at OSS Korea 2025. |
f3bdaf3 to
c48c37d
Compare
Interesting. I come here from the OSS Korea event. |
|
Here is the video explaining this tool on OSS Korea. |
e7f5da2 to
2aace57
Compare
Add doublefree tool to detect double free. It could detect user level double
free error currently and can be expanded to detect kernel level double free
error. Followings are the usage and example.
Usage:
$ ./doublefree -h
Usage: doublefree [OPTION...]
Detect and report doublefree error.
-c or -p is a mandatory option
EXAMPLES:
doublefree -p 1234 # Detect doublefree on process id 1234
doublefree -c a.out # Detect doublefree on a.out
doublefree -c 'a.out arg' # Detect doublefree on a.out with argument
-c, --command=COMMAND Execute the command and detect doublefree
-p, --pid=PID Detect doublefree on the specified process
-v, --verbose Verbose debug output
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version
Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.
Report bugs to https://github.com/iovisor/bcc/tree/master/libbpf-tools.
Example:
$ cat doublefree_generator.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int* foo() {
printf("foo\n");
return (int*)malloc(sizeof(int));
}
void bar(int* p) {
printf("bar\n");
free(p);
}
int main(int argc, char* argv[]) {
sleep(10);
int *val = foo();
*val = 33;
bar(val);
*val = 84;
bar(val);
return 0;
}
$ gcc doublefree_generator.c
$ sudo ./doublefree -c a.out
2024-May-15 22:30:24 INFO Execute command: a.out(pid 99584)
Tracing doublefree... Hit Ctrl-C to stop
free(): double free detected in tcache 2
Allocation:
iovisor#1 0x00564d1455819b foo+0x12 (/home/bojun/doublefree/libbpf-tools/a.out+0x119b)
iovisor#2 0x00564d145581e3 main+0x27 (/home/bojun/doublefree/libbpf-tools/a.out+0x11e3)
iovisor#3 0x0070c4e9429d90 [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
First deallocation:
iovisor#1 0x0070c4e94a53e0 free+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0xa53e0)
iovisor#2 0x00564d145581fd main+0x41 (/home/bojun/doublefree/libbpf-tools/a.out+0x11fd)
iovisor#3 0x0070c4e9429d90 [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
Second deallocation:
iovisor#1 0x0070c4e94a53e0 free+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0xa53e0)
iovisor#2 0x00564d14558213 main+0x57 (/home/bojun/doublefree/libbpf-tools/a.out+0x1213)
iovisor#3 0x0070c4e9429d90 [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
|
This patch is an updated version of the original submission. The changes listed below are assisted by GitHub Copilot, and the full patch is reviewed by Copilot before posting. Changes compared to the original commit: Bug fixes:
Improvements:
Changes:
|
Add doublefree tool to detect double free. It could detect user level double
free error currently and can be expanded to detect kernel level double free
error. Followings are the usage and example.
Usage:
Example: