compress: transparent gzip-compression (#545) #549
Merged
Mintsuki merged 21 commits into Limine-Bootloader:trunk from Apr 19, 2026
provides a port of embeddable pdgzip (CC0) with small performance tweaks to attain better-than-zlib decode times. we have a couple of unique tricks up our sleeve:

- extra validation of huffman trees via the kraft-mcmillan condition
- multi-level huffman tables; standard technique in accelerated zlib decompressors, here implemented in a low code volume.
- semi-space based window design to decrease overall latency and move a hotspot to semi-space memcpy() back to history buffer.
- full support of the RFC quirks for deflate/gzip, including fixed huffman trees.
- fast crc32 implementation via runtime-computed tables and the slicing-by-4 algorithm.

comparison (size): tinf ~2.5kb x86 code, this ~15kb x86 code, zlib ~22kb of x86 code.

comparison (code volume): tinf 639 sloc, this 594 sloc, zlib >=10k sloc, libdeflate >=7.7k sloc.

performance (r7 pro 7840u; enwik8 100MB): tinf ~2.3s, this ~409.7 ms, zlib ~417.7 ms.

extra: fuzzed with afl++. pending addition of fuzzing scripts and automated ci-bound testing.

thin wrapper over file streams automatically used when the $-prefix is found as per Limine-Bootloader#545.
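to make the kraft-mcmillan validation mentioned above concrete, here is a minimal sketch of how such a check can be written. it is not taken from the PR: the name `kraft_check`, the return convention, and the `MAX_BITS` constant are assumptions for illustration. the idea is that a set of code lengths describes a complete prefix code exactly when the sum of 2^(-len) over all used symbols equals 1; scaling by 2^15 keeps the arithmetic in integers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BITS 15 /* longest code length DEFLATE allows (RFC 1951) */

/*
 * Hypothetical helper, not the PR's code.
 * Returns  0 if the lengths form a complete prefix code,
 *          1 if the code is over-subscribed (or a length is out of range),
 *         -1 if the code is incomplete (some bit patterns are unused).
 * A length of 0 means "symbol not used" and contributes nothing.
 */
static int kraft_check(const uint8_t *lens, size_t n)
{
    uint32_t sum = 0; /* sum of 2^(MAX_BITS - len); equals 2^MAX_BITS when complete */

    for (size_t i = 0; i < n; i++) {
        if (lens[i] == 0)
            continue;
        if (lens[i] > MAX_BITS)
            return 1; /* reject lengths DEFLATE cannot express */
        sum += 1u << (MAX_BITS - lens[i]);
    }

    if (sum > (1u << MAX_BITS))
        return 1;
    if (sum < (1u << MAX_BITS))
        return -1;
    return 0;
}
```

note that RFC 1951 permits one deliberately incomplete case: a distance code with a single one-bit length. a caller would have to decide how to treat a `-1` result there; how the PR handles it is not stated in the description.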
this fuzzing harness is not intended to be run by end users or distro maintainers, hence it does not follow a lot of the standard "portability" kludges of mainline limine code.
document the behaviour of the ->size field in the public API of gzip.h; adhere to a more uniform style.
also move crc32 tables to ext_mem_alloc, document peak mem usage
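for context on what those tables hold, below is a minimal sketch of slicing-by-4 crc32 with tables computed at runtime. it is an illustration, not the PR's code: the names `crc_tab`, `crc32_init` and `crc32_update` are assumptions, and the sketch keeps the tables in a static array for simplicity, whereas the commit above allocates them through ext_mem_alloc. four tables of 256 32-bit entries come to 4 KiB.

```c
#include <stddef.h>
#include <stdint.h>

/* Four 256-entry tables: 4 * 256 * 4 bytes = 4 KiB. (Static here for brevity.) */
static uint32_t crc_tab[4][256];

/* Build the tables once, before the first crc32_update() call. */
static void crc32_init(void)
{
    /* Table 0 is the classic byte-at-a-time table for the reflected
     * polynomial 0xEDB88320 used by gzip. */
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        crc_tab[0][i] = c;
    }
    /* Table t advances table t-1 by one more zero byte. */
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = crc_tab[0][i];
        for (int t = 1; t < 4; t++) {
            c = crc_tab[0][c & 0xffu] ^ (c >> 8);
            crc_tab[t][i] = c;
        }
    }
}

/* Feed n bytes into a running CRC; start with crc = 0 and chain the result. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *p, size_t n)
{
    crc = ~crc;
    while (n >= 4) {
        /* Fold four input bytes (little-endian order) into the state,
         * then resolve all four with one lookup per table. */
        crc ^= (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        crc = crc_tab[3][crc & 0xffu] ^
              crc_tab[2][(crc >> 8) & 0xffu] ^
              crc_tab[1][(crc >> 16) & 0xffu] ^
              crc_tab[0][crc >> 24];
        p += 4;
        n -= 4;
    }
    while (n--) /* byte-at-a-time tail */
        crc = crc_tab[0][(crc ^ *p++) & 0xffu] ^ (crc >> 8);
    return ~crc;
}
```

because the state is inverted on entry and exit, the result of one call can be passed as `crc` to the next, which suits a streaming decoder that sees output in chunks.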
- remove ISIZE parsing in gzip.c
- add a streaming file_handle wrapper over blake2b.
- make ->read and fread return the actual # of bytes read (needed by the gzip decoder downstream users to determine real EOF)
- add is_high_mem and load_addr_64 to file_handle (after freadall_mode inlined body in uri_open relocates to high memory)
- as a result of the changes, freadall and freadall_mode are no longer used anywhere by the code base. hence they were removed.
- update the signature of uri_open to accept memcpy functions to/from high memory and a boolean parameter for whether high memory allocations are acceptable for this specific resource.
- inline size-agnostic (stretchy-vector type) functionality to uri_open to facilitate streaming unknown-size decoding. uri_open now returns memfiles, always, mimicking the behaviour prior to streaming blake2b change commit.
- minimum patch in limine_asm to make limine_memcpy_64_asm take two wide pointers.
- update downstream callers of uri_open, as well as the gzip fuzzing suite.
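the reason a byte-count-returning ->read matters for unknown-size decoding can be sketched as follows. this is a hedged illustration only: `struct stream`, `struct mem_src`, `mem_read` and `read_all` are hypothetical names, not limine's file_handle API, and the sketch uses malloc/realloc where the bootloader would use its own allocators. the loop grows a stretchy buffer and treats a read of 0 bytes as the real end of the stream.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream: read() returns the number of bytes actually
 * produced, which may be fewer than requested; 0 means end of stream. */
struct stream {
    size_t (*read)(struct stream *s, void *buf, size_t len);
    void *ctx;
};

/* Drain a stream of unknown length into a growable ("stretchy") buffer.
 * Returns a malloc'd buffer and stores its length, or NULL on allocation failure. */
static uint8_t *read_all(struct stream *s, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    uint8_t *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len == cap) { /* buffer full: double it */
            uint8_t *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        size_t got = s->read(s, buf + len, cap - len);
        if (got == 0) /* only a zero-byte read signals EOF */
            break;
        len += got;
    }
    *out_len = len;
    return buf;
}

/* Memory-backed source that hands out at most 1000 bytes per call,
 * to mimic a decoder that returns short reads. */
struct mem_src {
    const uint8_t *data;
    size_t len, pos;
};

static size_t mem_read(struct stream *s, void *buf, size_t len)
{
    struct mem_src *m = s->ctx;
    size_t left = m->len - m->pos;
    if (len > left)
        len = left;
    if (len > 1000)
        len = 1000;
    memcpy(buf, m->data + m->pos, len);
    m->pos += len;
    return len;
}
```

with a ->read that only reported success or failure, the caller could not tell a short read from the end of the decompressed data without knowing the size up front, which is what the removed ISIZE parsing would otherwise have supplied.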