fix: dynamically calculate max_pages based on system page size #189

Merged
stapelberg merged 1 commit into jacobsa:master from kislaykishore:master on Mar 2, 2026

@kislaykishore kislaykishore commented Feb 26, 2026

Hardcoding FUSE max_pages to 256 assumes a 4KiB page size, which yields a 1MiB maximum request (256 × 4KiB). On ARM64 systems with 64KiB pages (e.g., Grace CPU), 256 pages allow the kernel to send read-ahead requests of up to 16MiB (256 × 64KiB).
Such requests overflow the daemon's fixed 1MiB buffer pool, resulting in 0-byte reads, premature EOFs, and fatal SIGBUS errors in mmap-heavy applications like TensorRT.
This fix calculates the limit dynamically during the FUSE INIT phase:
max_pages = (1 MiB buffer capacity) / os.Getpagesize()

On 64KiB systems, this caps max_pages at 16 (1MiB / 64KiB). The kernel then splits large read-ahead requests into chunks of at most 1MiB, preventing buffer overflows and crashes.

@kislaykishore kislaykishore changed the title from "[Do not review] Fix memory-buffer limit" to "fix: dynamically calculate max_pages based on system page size" Feb 26, 2026
@kislaykishore kislaykishore marked this pull request as ready for review February 26, 2026 20:20
@kislaykishore kislaykishore requested a review from geertj February 26, 2026 21:02
@kislaykishore kislaykishore force-pushed the master branch 3 times, most recently from 570c3e1 to 247e587 Compare February 27, 2026 10:38
@kislaykishore (Contributor, Author) commented:

@geertj @mustvicky I've now enhanced the logging as well so that it's easier to identify these errors in the future.

@stapelberg stapelberg merged commit f1ba38d into jacobsa:master Mar 2, 2026
3 checks passed
4 participants