compress: transparent gzip-compression (#545) by iczelia · Pull Request #549 · Limine-Bootloader/Limine

iczelia · 2026-04-11T19:10:08Z

provides a port of embeddable pdgzip (CC0) with small performance tweaks to attain better-than-zlib decode times. we have a couple of unique tricks up our sleeve:

- extra validation of huffman trees via the Kraft-McMillan condition.
- multi-level huffman tables; a standard technique in accelerated zlib decompressors, here implemented in a low code volume.
- semi-space based window design to decrease overall latency; the hotspot moves to the memcpy() from the semi-space back to the history buffer.
- full support of the RFC quirks for deflate/gzip, including fixed huffman trees.
- fast crc32 implementation via runtime-computed tables and the slicing-by-4 algorithm.
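To illustrate the first item, here is a minimal sketch of a Kraft-McMillan check over a set of DEFLATE code lengths. It is not taken from pdgzip; the names `kraft_check`, `enum kraft_result` and `MAX_BITS` are invented for this example. The idea is that a prefix code with lengths l_i is complete exactly when the sum of 2^-l_i equals 1, so scaling every term by 2^15 keeps the arithmetic in integers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BITS 15 /* longest code length DEFLATE permits (RFC 1951) */

/* Outcome of the check; these names exist only in this sketch. */
enum kraft_result {
    KRAFT_COMPLETE,       /* sum == 1: every bit pattern decodes to a symbol */
    KRAFT_INCOMPLETE,     /* sum <  1: some bit patterns are unassigned      */
    KRAFT_OVERSUBSCRIBED, /* sum >  1: no prefix code has these lengths      */
    KRAFT_BAD_LENGTH      /* a length exceeds MAX_BITS                       */
};

/*
 * lengths[i] is the code length of symbol i; 0 means the symbol is unused.
 * Each used symbol contributes 2^(MAX_BITS - length) to the scaled sum,
 * and a complete code reaches exactly 2^MAX_BITS.
 */
enum kraft_result kraft_check(const uint8_t *lengths, size_t n)
{
    const uint32_t full = (uint32_t)1 << MAX_BITS;
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (lengths[i] > MAX_BITS)
            return KRAFT_BAD_LENGTH;
        if (lengths[i] != 0)
            sum += full >> lengths[i];
        if (sum > full) /* early exit also keeps sum far from overflow */
            return KRAFT_OVERSUBSCRIBED;
    }
    return sum == full ? KRAFT_COMPLETE : KRAFT_INCOMPLETE;
}
```

As a sanity check, the fixed literal/length code of RFC 1951 (152 symbols of length 8, 112 of length 9, 24 of length 7) sums to exactly 1 and is reported as complete. How an incomplete code should be treated is a decoder policy decision that this sketch leaves to the caller.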

comparisons:

- size: tinf ~2.5 kB of x86 code, this ~15 kB, zlib ~22 kB.
- code volume: tinf 639 sloc, this 594 sloc, zlib >=10k sloc, libdeflate >=7.7k sloc.
- performance (r7 pro 7840u; enwik8, 100 MB): tinf ~2.3 s, this ~409.7 ms, zlib ~417.7 ms.
- extra: fuzzed with afl++.

furthermore, the hand-over includes full feature validation:

- automated testing that validates gzip decompression in the UEFI path to quickly catch potential issues.
- an automated fuzzing script that looks for potential issues in the gzip decompressor. so far, after 24 hours of parallel fuzzing, no problems have been found.

pending addition of fuzzing scripts and automated ci-bound testing. thin wrapper over file streams, used automatically when the $-prefix is found, as per Limine-Bootloader#545.

- remove ISIZE parsing in gzip.c.
- add a streaming file_handle wrapper over blake2b.
- make ->read and fread return the actual number of bytes read (needed by downstream users of the gzip decoder to determine real EOF).
- add is_high_mem and load_addr_64 to file_handle (since the inlined body of freadall_mode in uri_open relocates to high memory).
- as a result of these changes, freadall and freadall_mode are no longer used anywhere in the code base, hence they were removed.
- update the signature of uri_open to accept memcpy functions to/from high memory and a boolean parameter for whether high memory allocations are acceptable for this specific resource.
- inline size-agnostic (stretchy-vector type) functionality into uri_open to facilitate streaming unknown-size decoding. uri_open now always returns memfiles, mimicking the behaviour prior to the streaming blake2b change commit.
- minimal patch in limine_asm to make limine_memcpy_64_asm take two wide pointers.
- update downstream callers of uri_open, as well as the gzip fuzzing suite.
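The combination of "read returns the actual number of bytes read" and a stretchy vector can be sketched as follows. This is a stand-alone illustration, not Limine code: `struct stream`, `struct mem_src`, `mem_read` and `read_all` are hypothetical names, and plain `malloc`/`realloc` stand in for whatever allocator the bootloader uses. The point is that once the decoded size is unknown up front, a read that returns 0 bytes is the only EOF signal.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream: read() returns the bytes actually produced, 0 at EOF. */
struct stream {
    size_t (*read)(struct stream *s, void *buf, size_t count);
    void *ctx;
};

/* Memory-backed source for demonstration; hands out at most `chunk` bytes. */
struct mem_src {
    const uint8_t *data;
    size_t size, pos, chunk;
};

size_t mem_read(struct stream *s, void *buf, size_t count)
{
    struct mem_src *m = s->ctx;
    size_t left = m->size - m->pos;
    if (count > left)
        count = left;
    if (count > m->chunk)
        count = m->chunk;
    memcpy(buf, m->data + m->pos, count);
    m->pos += count;
    return count;
}

/*
 * Drain a stream of unknown length into a buffer that doubles when full.
 * Returns the buffer (caller frees) and stores its length in *out_size,
 * or returns NULL on allocation failure.
 */
void *read_all(struct stream *s, size_t *out_size)
{
    size_t cap = 4096, len = 0;
    uint8_t *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len == cap) {               /* full: grow before reading more */
            if (cap > SIZE_MAX / 2) {
                free(buf);
                return NULL;
            }
            uint8_t *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        size_t got = s->read(s, buf + len, cap - len);
        if (got == 0)                   /* zero bytes read means real EOF */
            break;
        len += got;
    }
    *out_size = len;
    return buf;
}
```

Doubling keeps the total copying cost linear in the final size, which matters when the output length cannot be taken from the gzip trailer.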

iczelia added 7 commits April 11, 2026 20:55

contrib: check in the gzip-fuzzing harness

83688f3

this fuzzing harness is not intended to be run by end users or distro maintainers, hence it does not follow a lot of the standard "portability" kludges of mainline limine code.

compress: gzip.c/.h - document limitations.

f7078cc

document the behaviour of the ->size field in the public API of gzip.h; adhere to a more uniform style.

fix ci

5b05e90

replace the 2M image with a 32M UEFI image with a specific head/track/sector geometry.

test: limine.c now uses outw() for QEMU-specific fast path shutdown.

2e06543

gzip.c: more memory-frugal by constructing fixed tables on demand

64b9289

also move crc32 tables to ext_mem_alloc, document peak mem usage
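For context on the crc32 tables mentioned here: slicing-by-4 uses four 256-entry tables of 32-bit words, i.e. 4 KiB in total. The following is a generic sketch of runtime table construction and the slicing-by-4 update for the gzip CRC-32 (reflected polynomial 0xEDB88320). It is not copied from the PR; `struct crc32_tables`, `crc32_init` and `crc32_update` are names made up for the example, and the caller supplies the table storage so that any allocator could back it.

```c
#include <stddef.h>
#include <stdint.h>

/* Four slices of 256 entries each: 4 * 256 * 4 bytes = 4 KiB. */
struct crc32_tables {
    uint32_t t[4][256];
};

/* Compute the tables at runtime instead of embedding them in the binary. */
void crc32_init(struct crc32_tables *tb)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        tb->t[0][i] = c;
    }
    /* Slice s advances the CRC of one byte by s further zero bytes. */
    for (uint32_t i = 0; i < 256; i++)
        for (int s = 1; s < 4; s++)
            tb->t[s][i] = (tb->t[s - 1][i] >> 8)
                        ^ tb->t[0][tb->t[s - 1][i] & 0xFF];
}

/* Continue a CRC over buf; pass crc = 0 for the first chunk. */
uint32_t crc32_update(const struct crc32_tables *tb, uint32_t crc,
                      const void *buf, size_t len)
{
    const uint8_t *p = buf;
    crc = ~crc;

    while (len >= 4) {                  /* four bytes per iteration */
        crc ^= (uint32_t)p[0] | (uint32_t)p[1] << 8
             | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
        crc = tb->t[3][crc & 0xFF] ^ tb->t[2][(crc >> 8) & 0xFF]
            ^ tb->t[1][(crc >> 16) & 0xFF] ^ tb->t[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    while (len--)                       /* byte-at-a-time tail */
        crc = (crc >> 8) ^ tb->t[0][(crc ^ *p++) & 0xFF];

    return ~crc;
}
```

The bytes are assembled explicitly in little-endian order, so the loop does not depend on host endianness or on the alignment of the input.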

contrib: mechanical test for validating gzip compression

76f726e

iczelia changed the title ~~compress: transparent gzip-compression~~ compress: transparent gzip-compression (#545) Apr 12, 2026

iczelia and others added 11 commits April 12, 2026 17:32

Merge branch 'Limine-Bootloader:trunk' into trunk

8b9c90f

minor cosmetic

ca125e7

fs: fat32.s2.c: update a stale comment

0d1fad9

crypt: blake2b.h: opaque pointer to fs guts

d267f6b

lib: uri.c silence warning on non-i386

962d24f

lib: uri.c: remove stale comment.

af79ad6

Merge branch 'trunk' into trunk

0f59b4e

move to pdgzip

f361b31

drop old

ecd9bd3

Merge branch 'trunk' into trunk

6b47c05

iczelia requested a review from Mintsuki April 18, 2026 21:29

iczelia added 3 commits April 19, 2026 23:48

test/test.mk: rollback a merge artifact.

66a1c09

Merge remote-tracking branch 'origin/trunk' into trunk

922287e

test.mk: re-add extra cflags

c3e82ca

Mintsuki merged commit e9a95e0 into Limine-Bootloader:trunk Apr 19, 2026
2 checks passed

Mintsuki merged 21 commits into Limine-Bootloader:trunk from iczelia:trunk