add a new tool in libbpf-tools named ext4File #5440
niebowen666 wants to merge 1 commit into iovisor:master
Here are my quick notes:
Thanks
Thanks for your reply.
When I said
7cc4f5e to a5912d8
Hi Bojun,
libbpf-tools/Makefile
Outdated
    tcptop \
    vfsstat \
    wakeuptime \
    ext4file \
Why do we need this new tool? We already have fsdist, fsslower, filelife, and filetop.

ext4file is a tool used to track file-level buffered or direct I/O.
If a certain file is expected to be accessed by direct I/O, ext4file can detect accesses that unexpectedly use buffered I/O.
I have read the source code of the tools you listed above.
- fsdist focuses on the execution time of operations like read, write, open, and sync, which is different from the issue we are concerned with (buffered I/O vs. direct I/O).
- Compared to fsdist, fsslower is more powerful. The information it traces includes file names and I/O sizes. It also sets a latency threshold and skips operations whose execution time is below it. Although it tracks file names, it does not achieve file-level tracking, because a file name does not uniquely identify a file. In addition, it tracks the size of I/O rather than the split between buffered and direct I/O, so the results of ext4file can complement those of fsslower.
- filelife ignores I/O and only focuses on the creation and deletion of files.
- To prevent excessive output, filetop only displays part of the data, and its I/O tracking cannot determine whether an access is buffered or direct.
ext4file can complement the tools mentioned above and can determine whether a file exhibits unexpected I/O under complex workloads.
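
The buffered-versus-direct distinction discussed in this reply can be sketched in plain userspace C. In the kernel, ext4's read and write paths test the IOCB_DIRECT bit in `kiocb->ki_flags`; the sketch assumes a tracer could classify each call the same way. The constant's value (taken from recent kernels), the `io_counters` struct, and `account_io` are illustrative assumptions, not code from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: mirrors IOCB_DIRECT in recent kernels' include/linux/fs.h. */
#define SKETCH_IOCB_DIRECT (1u << 17)

/* Hypothetical per-file counters, named after the tool's output columns. */
struct io_counters {
    uint64_t buffer_read;
    uint64_t direct_read;
    uint64_t buffer_write;
    uint64_t direct_write;
};

/* Bump the counter that matches the I/O direction and the direct flag. */
static void account_io(struct io_counters *c, uint32_t ki_flags, bool is_write)
{
    bool direct = (ki_flags & SKETCH_IOCB_DIRECT) != 0;

    assert(c != NULL);
    if (is_write) {
        if (direct)
            c->direct_write++;
        else
            c->buffer_write++;
    } else {
        if (direct)
            c->direct_read++;
        else
            c->buffer_read++;
    }
}
```

In a real BPF program the flags would have to be read from the traced `struct kiocb`, and the constant would need to match the running kernel.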
I'm someone who believes that each commit/patch should be self-contained and complete (a self-contained atomic unit). I think developers should be able to understand the full context and intent just by reading the commit message alone, without having to dig through the PR description or conversation thread. Therefore, it would be great if you could include the PR description in the commit message(s). Also, if you revise the patches so that each individual commit/patch maintains its own completeness (rather than scattering fixes across multiple small follow-up commits), it would make the review much easier. Additionally, it would be helpful to add the answer to @chenhengqi's question directly into the explanation under the
Bojun-Seo left a comment
- I don’t think we need to split this into two commits. How about combining them into one?
- The commit title is too long. Commit titles are usually shorter than 70 or 80 characters.
- I just quickly checked bpf.c for now.
libbpf-tools/ext4file.bpf.c
Outdated
    @@ -0,0 +1,196 @@
    // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
    // Copyright (c) 2025 Samsung Electronics Co., Ltd.
I have fixed the license header in the .bpf.c, .c, and .h files.
        __type(value, struct file_info_val);
    } file_info_map SEC(".maps");

    static __always_inline bool str_equal(const char *a, const char *b) {
Please check coding style consistency.
char* a vs char *a

I have unified the coding style: char *a
libbpf-tools/ext4file.bpf.c
Outdated
    int BPF_PROG(my_ext4_add_entry, handle_t* handle,
                 struct dentry* dentry, struct inode* inode)
    {
        bpf_printk("ext4_add_entry");
Using bpf_printk in BPF code can cause performance issues.
I recommend removing it, not just here but in all other places as well.
You'll see that no other tools except memleak use bpf_printk in their BPF code.

I have deleted the unnecessary bpf_printk calls.
libbpf-tools/ext4file.bpf.c
Outdated
    int BPF_PROG(my_ext4_file_read_iter,
                 struct kiocb *iocb, struct iov_iter *to)
    {
        //bpf_printk("ext4_file_read_iter\n");
Please remove unnecessary comments.
Thank you, got it.
@niebowen666
Thanks for your advice.
@Bojun-Seo
Bojun-Seo left a comment
First, please change the title of the commit message.
I noticed there are two maps in the bpf.c code. Conceptually, it seems like only one map would be sufficient. The ino_name_map uses the inode as the key and file_info_key as the value, while the file_info_map uses file_info_key as the key and file_info_value as the value. This effectively means that the inode is the ultimate key, and all other information is stored as part of the value in one map.
So, I’m wondering: was there a specific reason for separating them into two maps?
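
The two-hop lookup described in this comment (inode to file_info_key, then file_info_key to the per-file value) can be sketched in userspace C as follows. Plain arrays stand in for the two BPF hash maps, and the field layouts of `file_info_key` and `file_info_val`, the `SLOTS` capacity, and the helper names are assumptions made for illustration; the thread does not show the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FILE_NAME 255  /* value stated in the review thread */
#define SLOTS 16           /* arbitrary capacity for this sketch */

/* Hypothetical layouts: the thread names these structs but not their fields. */
struct file_info_key {
    uint64_t pa_inode;
    char name[MAX_FILE_NAME];
};

struct file_info_val {
    uint64_t buffer_read;
    uint64_t direct_read;
};

/* Plain arrays standing in for the two BPF hash maps. */
static struct {
    int used;
    uint64_t ino;
    struct file_info_key key;
} ino_name_map[SLOTS];

static struct {
    int used;
    struct file_info_key key;
    struct file_info_val val;
} file_info_map[SLOTS];

static int key_equal(const struct file_info_key *a,
                     const struct file_info_key *b)
{
    return a->pa_inode == b->pa_inode &&
           strncmp(a->name, b->name, MAX_FILE_NAME) == 0;
}

/* Insert a file into both maps; returns 0 on success, -1 when full. */
static int track_file(uint64_t ino, uint64_t pa_inode, const char *name)
{
    struct file_info_key key = { .pa_inode = pa_inode };

    assert(name != NULL);
    strncpy(key.name, name, MAX_FILE_NAME - 1); /* keeps a trailing NUL */

    for (int i = 0; i < SLOTS; i++) {
        if (ino_name_map[i].used)
            continue;
        /* Both arrays are filled in lockstep, so slot i is free in each. */
        ino_name_map[i].used = 1;
        ino_name_map[i].ino = ino;
        ino_name_map[i].key = key;
        file_info_map[i].used = 1;
        file_info_map[i].key = key;
        memset(&file_info_map[i].val, 0, sizeof(file_info_map[i].val));
        return 0;
    }
    return -1;
}

/* Hop 1: inode -> file_info_key. Hop 2: file_info_key -> file_info_val. */
static struct file_info_val *lookup_by_inode(uint64_t ino)
{
    for (int i = 0; i < SLOTS; i++) {
        if (!ino_name_map[i].used || ino_name_map[i].ino != ino)
            continue;
        for (int j = 0; j < SLOTS; j++) {
            if (file_info_map[j].used &&
                key_equal(&file_info_map[j].key, &ino_name_map[i].key))
                return &file_info_map[j].val;
        }
        return NULL;
    }
    return NULL;
}
```

Seen this way, every value is ultimately reached from an inode, which is the observation behind the reviewer's question.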
    } file_info_map SEC(".maps");

    static __always_inline bool str_equal(const char *a, const char *b) {
        for (size_t i = 0; i < MAX_FILE_NAME; i++) {
What is the value of MAX_FILE_NAME? Where is it defined?

The value is 255, which is defined in ext4file.h. I set this value based on the definition of NAME_MAX in the Linux kernel.

Is MAX_FILE_NAME the same as NAME_MAX?
Even if the macro names are different, does it get automatically converted or something?

Yes, I found that the NAME_MAX in Linux kernel is set to 255. So I set a new macro MAX_FILE_NAME to 255

> Yes, I found that the NAME_MAX in Linux kernel is set to 255.

OK.

> So I set a new macro MAX_FILE_NAME to 255

I cannot find the code that sets the new macro MAX_FILE_NAME. Could you tell me the line number of ext4file.h where it is defined?

Sorry, NAME_MAX in ext4file.h should be changed to MAX_FILE_NAME. I have fixed it.
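
For context, a bounded comparison in the shape of the `str_equal` loop quoted above might look like the following userspace sketch. Only the signature and the loop bound appear in the thread; the loop body here is an assumption, and `static inline` replaces the BPF-specific `__always_inline`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILE_NAME 255  /* same value as NAME_MAX, per the thread */

/* Compare at most MAX_FILE_NAME bytes; stop early at a NUL or a mismatch. */
static inline bool str_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < MAX_FILE_NAME; i++) {
        if (a[i] != b[i])
            return false;
        if (a[i] == '\0')
            return true;  /* both strings ended at the same position */
    }
    return true;          /* the first MAX_FILE_NAME bytes all matched */
}
```

A fixed upper bound like this is what lets the BPF verifier prove the loop terminates. The trade-off is that names differing only after the first MAX_FILE_NAME bytes compare as equal.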
As you can see, ext4file tracks the deletion of files. We expect users to create and delete files frequently, and because file descriptor (fd) resources in the kernel are limited, fd reuse may occur. In that case, an fd may not represent a specific file. For this reason, we believe that the existing method of uniquely representing a file is more reasonable.

Got it, that makes sense.
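
The fd-reuse concern raised in this exchange is easy to reproduce from userspace. The sketch below is an illustration only and is not part of the PR: it creates two temporary files one after the other, closes the first before opening the second, and compares fd numbers and inode numbers. It assumes a writable /tmp; `open_temp`, `demo_fd_reuse`, and `struct reuse_result` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Outcome of the experiment. */
struct reuse_result {
    bool ok;          /* false if any system call failed */
    bool same_fd;     /* did the second file get the same fd number? */
    bool same_inode;  /* did the second file get the same inode number? */
};

/* Create a temp file from the template; return its fd and report its inode. */
static int open_temp(char *path_template, ino_t *ino)
{
    struct stat st;
    int fd;

    assert(path_template != NULL && ino != NULL);
    fd = mkstemp(path_template);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        unlink(path_template);
        return -1;
    }
    *ino = st.st_ino;
    return fd;
}

static struct reuse_result demo_fd_reuse(void)
{
    char path_a[] = "/tmp/fd_reuse_a_XXXXXX";
    char path_b[] = "/tmp/fd_reuse_b_XXXXXX";
    struct reuse_result r = { false, false, false };
    ino_t ino_a = 0, ino_b = 0;
    int fd_a, fd_b;

    fd_a = open_temp(path_a, &ino_a);
    if (fd_a < 0)
        return r;
    close(fd_a);  /* the fd number is now free; the file itself still exists */

    fd_b = open_temp(path_b, &ino_b);
    if (fd_b < 0) {
        unlink(path_a);
        return r;
    }
    close(fd_b);

    r.ok = true;
    r.same_fd = (fd_a == fd_b);
    r.same_inode = (ino_a == ino_b);

    /* Clean up only after comparing, so both inodes were live together. */
    unlink(path_a);
    unlink(path_b);
    return r;
}
```

Because POSIX hands out the lowest unused descriptor, a single-threaded run typically shows the same fd number naming two different inodes, which is why an fd alone cannot identify a file over time.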
fa23d18 to d2773bb
Sure, I have already modified the commit message.
Bojun-Seo left a comment
Please add ext4file to the .gitignore file.
libbpf-tools/ext4file.c
Outdated
    #include "trace_helpers.h"
    #include "ext4file.h"

    #define BG_LIST_NUM 57232
… is used to monitor the I/O patterns (buffer or direct) of each file in the target ext4 filesystem, as well as the hint used by each file.

ext4file is used to monitor the I/O patterns (buffer or direct) of each file in the target ext4 filesystem, as well as the hint used by each file.

ext4file is a tool used to track file-level buffer or direct I/O. Currently, in the repository, there exist block-layer tools to monitor I/O patterns of whole disk (such as biopattern, biolatency, etc.) and VFS-layer tools to trace file lifecycle and I/O behavior of files throughout the entire VFS (such as filelife and vfsstat, etc.).

Below is a comparative summary:

| Tool Name | Layer | Main Function | Tracks Filename? | Differences |
|-----------|-------|---------------|------------------|-------------|
| biopattern | Block | Measures the proportion of random vs. sequential I/O on a storage device | No | Can not achieve file-layer tracing (The other bio tools all have this problem) |
| fsdist | VFS | Tracks latency distribution of operations like read, write, open, and sync | No | Focus on latency, not I/O pattern |
| fsslower | VFS | Traces slow file operations (e.g., long-latency reads/writes), focuses on I/O size | Yes | Trace the I/O size and latency, not I/O pattern |
| filelife | VFS | Monitors file lifecycle events (creation and deletion) | Yes | Only focus on file creation and deletion |
| filetop | VFS | Shows real-time I/O activity of active files (displays only top entries to avoid verbosity) | Yes | There exists no distinction between buffer and direct I/O |
| ext4file | ext4 Filesystem | Tracks **buffer vs. direct I/O patterns** per file, enables fine-grained file-level monitoring using inode | Yes | |

Run ext4file before executing your test. You can refer to ./ext4file -h to get the usage of the tool

    Show I/O pattern for every file in ext4 filesystem.

    Usage: ./ext4file [-h] [-d DIR] [-o FILE] [interval] [count]

    Options:
      -h, --help              Print this help message
      -d DIR, --dir=DIR       Trace the ext4 filesystem mounted on the specified directory
      -o FILE, --output=FILE  Write output to a file (optional; default: stdout)
      interval                Time interval (in seconds) between reports (default: unlimited)
      count                   Number of reports to generate (default: unlimited)

    Examples:
      ./ext4file -d /mnt/ext4                 # Trace I/O patterns of files on the ext4 filesystem mounted at /mnt/ext4
      ./ext4file -d /mnt/ext4 1 10            # Generate 10 reports, one per second
      ./ext4file -d /mnt/ext4 -o output 1 10  # Generate 10 reports at 1-second intervals, saving output to ./output

The output could be:

    root@server:/home/nbw/OpenSource/biohint/libbpf-tools# ./ext4file -d /mnt/ext4File/
    EXT4 Filesystem Info: blocks_count=3750232064 blocks_per_group=32768 bg_cnt=114448
    Tracing Ext4 read/write... Hit Ctrl-C to end.
    2026-01-14 13:58:21
    file_name  inode      pa_inode   hint  buffer_read  direct_read  buffer_write  direct_write  delete
    test3      83361794   83361793   0     0            0            0             0             False
    test2      34         2          2     8            0            1             0             False
    dir1       83361793   2          0     0            0            0             0             False
    dir2       440467457  2          0     0            0            0             0             True
    test3      33         2          3     8            0            1             0             False
    test1      33         2          5     8            0            1             0             True
    test3      440467458  440467457  0     0            0            0             0             True

Below is the detailed explanation of each field in the ext4file output. This tool traces per-file I/O patterns (buffered vs. direct) on ext4 filesystems, providing fine-grained visibility into application behavior.

| Field | Description |
|-------|-------------|
| file_name | The name of the file (without full path). Note: multiple files may share the same name. |
| inode | The inode number of the file. Inode is the unique identifier for a file within the filesystem, even across renames or hard links. This enables accurate tracking of I/O for specific files. |
| pa_inode | The inode number of the file’s parent directory. |
| hint | The FDP (Flexible Data Placement) hint value associated with the file. FDP is a new NVMe feature that enables the host to guide data placement on the SSD. |
| buffer_read | Number of buffered read operations performed on the file. Buffered I/O goes through the kernel page cache. |
| direct_read | Number of direct read operations performed on the file. Direct I/O bypasses the page cache. |
| direct_write | Number of direct write operations performed on the file. Like direct read, it skips the page cache and writes data directly from user space to storage. |
| buffer_write | Number of buffered write operations performed on the file. Data is first written to the page cache and later flushed to disk asynchronously by the kernel. |
| delete | Indicates whether the file has been unlinked (deleted). If True, the file was removed from the directory but may still be accessible if held open by a process. I/O on such files can indicate resource leaks or long-running file handles. |

This tool is intended for ext4 filesystem developers and performance engineers who need to analyze I/O behavior at the file level.

Was there a specific reason for separating them into two maps?
- ext4file tracks the deletion of files. We envision that users need to frequently create and delete files, and the file descriptor (fd) resources in the kernel are limited, so fd reuse may occur. In this case, an fd may not represent a specific file. Based on the idea, we believe that the existing method of uniquely representing a file is more reasonable.
Do you mean I should add the compiled binary file ext4file (not the source files) to .gitignore? Or something else?
I mean, compiled binary file(ext4file) should be added on |
ext4file introduction
Overview
ext4file is used to monitor the I/O patterns (buffer or direct) of each file in the target ext4 filesystem, as well as the hint used by each file.
Why ext4file
ext4file is a tool used to track file-level buffer or direct I/O. Currently, in the repository, there exist block-layer tools to monitor I/O patterns of whole disk (such as biopattern, biolatency, etc.) and VFS-layer tools to trace file lifecycle and I/O behavior of files throughout the entire VFS (such as filelife and vfsstat, etc.).
Below is a comparative summary:
How to use
Run ext4file before executing your test. You can refer to ./ext4file -h to get the usage of the tool.
The output could be:
Below is the detailed explanation of each field in the ext4file output. This tool traces per-file I/O patterns (buffered vs. direct) on ext4 filesystems, providing fine-grained visibility into application behavior.
True, the file was removed from the directory but may still be accessible if held open by a process. I/O on such files can indicate resource leaks or long-running file handles.
Target Audience
This tool is intended for ext4 filesystem developers and performance engineers who need to analyze I/O behavior at the file level.