
compress: transparent gzip-compression (#545) #549

Merged
Mintsuki merged 21 commits into Limine-Bootloader:trunk from iczelia:trunk
Apr 19, 2026


@iczelia
Member

@iczelia iczelia commented Apr 11, 2026

provides a port of embeddable pdgzip (CC0) with small performance tweaks to attain better-than-zlib decode times. we have a couple of unique tricks up our sleeve:

  • extra validation of huffman trees via the kraft-mcmillan condition
  • multi-level huffman tables; standard technique in accelerated zlib decompressors, here implemented in a low code volume.
  • semi-space based window design to decrease overall latency and move a hotspot to semi-space memcpy() back to history buffer.
  • full support of the RFC quirks for deflate/gzip, including fixed huffman trees.
  • fast crc32 implementation via runtime-computed tables and the slicing-by-4 algorithm.
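
A minimal sketch of the Kraft-McMillan check named in the first bullet, not the code from this PR. For code lengths l_i, a prefix code exists only if the sum of 2^(-l_i) over all used symbols is at most 1, and the code is complete when the sum equals 1. The names `kraft_check`, `kraft_result` and `MAX_CODE_BITS` are hypothetical; the 15-bit limit is the deflate maximum from RFC 1951.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Deflate never uses code lengths above 15 bits (RFC 1951). */
#define MAX_CODE_BITS 15

enum kraft_result {
    KRAFT_INCOMPLETE = -1,    /* sum < 1: some bit patterns decode to nothing */
    KRAFT_COMPLETE = 0,       /* sum == 1: every bit pattern is assigned */
    KRAFT_OVERSUBSCRIBED = 1  /* sum > 1: no prefix code has these lengths */
};

/* Evaluate sum(2^-len) in fixed point, scaled by 2^MAX_CODE_BITS.
 * A length of 0 marks an unused symbol and contributes nothing. */
enum kraft_result kraft_check(const uint8_t *lengths, size_t count)
{
    const uint64_t one = (uint64_t)1 << MAX_CODE_BITS;
    uint64_t sum = 0;

    for (size_t i = 0; i < count; i++) {
        assert(lengths[i] <= MAX_CODE_BITS);
        if (lengths[i] != 0)
            sum += one >> lengths[i];
    }

    if (sum > one)
        return KRAFT_OVERSUBSCRIBED;
    if (sum < one)
        return KRAFT_INCOMPLETE;
    return KRAFT_COMPLETE;
}
```

A decoder would typically reject oversubscribed trees outright. Incomplete trees need more care, since RFC 1951 permits a distance tree with a single one-bit code, which leaves one code unused.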

comparisons:

  • size: tinf ~2.5kb x86 code, this ~15kb x86 code, zlib ~22kb of x86 code.
  • code volume: tinf 639 sloc, this 594 sloc, zlib >=10k sloc, libdeflate >=7.7k sloc.
  • performance (r7 pro 7840u; enwik8 100MB): tinf ~2.3s, this ~409.7 ms, zlib ~417.7 ms. extra: fuzzed with afl++.

furthermore, the hand-over includes full feature validation:

  • automated testing that validates gzip decompression in the UEFI path to quickly catch potential issues.
  • automated fuzzing script that finds potential issues with the gzip decompressor. so far, after 24 hours of parallel fuzzing no problems have been found.

iczelia added 7 commits April 11, 2026 20:55
provides a port of embeddable pdgzip (CC0) with small performance tweaks to
attain better-than-zlib decode times. we have a couple of unique tricks up
our sleeve:
- extra validation of huffman trees via the kraft-mcmillan condition
- multi-level huffman tables; standard technique in accelerated zlib
  decompressors, here implemented in a low code volume.
- semi-space based window design to decrease overall latency and move a
  hotspot to semi-space memcpy() back to history buffer.
- full support of the RFC quirks for deflate/gzip, including fixed huffman
  trees.
- fast crc32 implementation via runtime-computed tables and the
  slicing-by-4 algorithm.
comparison (size):
  tinf ~2.5kb x86 code, this ~15kb x86 code, zlib ~22kb of x86 code.
comparison (code volume):
  tinf 639 sloc, this 594 sloc, zlib >=10k sloc, libdeflate >=7.7k sloc.
performance (r7 pro 7840u; enwik8 100MB):
  tinf ~2.3s, this ~409.7 ms, zlib ~417.7 ms.
extra: fuzzed with afl++. pending addition of fuzzing scripts and automated
ci-bound testing. thin wrapper over file streams automatically used when the
$-prefix is found as per Limine-Bootloader#545.
this fuzzing harness is not intended to be run by end users or distro maintainers, hence it does not follow a lot of the standard "portability" kludges of mainline limine code.
document the behaviour of the ->size field in the public API of gzip.h; adhere to a more uniform style.
replace the 2M image with a 32M UEFI image with a specific head/track/sector geometry.
also move crc32 tables to ext_mem_alloc, document peak mem usage
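
For reference, a sketch of what runtime-computed slicing-by-4 tables for the gzip CRC-32 can look like. This is an illustration of the general technique, not the code in this PR: the tables here live in a static array rather than in memory obtained from ext_mem_alloc, and the names `crc32_tables_init` and `crc32_update` are hypothetical. Four tables of 256 32-bit entries take 4 KiB in total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_tables[4][256];
static int crc32_tables_ready;

/* Build the tables once at runtime. Table 0 is the classic byte-wise table
 * for the reflected polynomial 0xEDB88320; table t holds the CRC of a byte
 * followed by t zero bytes. */
void crc32_tables_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc32_tables[0][i] = c;
    }
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = crc32_tables[0][i];
        for (int t = 1; t < 4; t++) {
            c = crc32_tables[0][c & 0xFF] ^ (c >> 8);
            crc32_tables[t][i] = c;
        }
    }
    crc32_tables_ready = 1;
}

/* Feed len bytes into a running CRC. Start with crc = 0 and chain calls
 * to checksum data that arrives in pieces. */
uint32_t crc32_update(uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;

    assert(crc32_tables_ready);
    crc = ~crc;

    /* Main loop: fold four input bytes per iteration. The bytes are
     * assembled explicitly so the result does not depend on host
     * endianness. */
    while (len >= 4) {
        crc ^= (uint32_t)p[0] | (uint32_t)p[1] << 8 |
               (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
        crc = crc32_tables[3][crc & 0xFF] ^
              crc32_tables[2][(crc >> 8) & 0xFF] ^
              crc32_tables[1][(crc >> 16) & 0xFF] ^
              crc32_tables[0][crc >> 24];
        p += 4;
        len -= 4;
    }

    /* Tail: remaining 0..3 bytes, one at a time. */
    while (len--)
        crc = crc32_tables[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);

    return ~crc;
}
```

After `crc32_tables_init()`, `crc32_update(0, "123456789", 9)` yields the standard CRC-32 check value `0xCBF43926`.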
@iczelia iczelia changed the title compress: transparent gzip-compression compress: transparent gzip-compression (#545) Apr 12, 2026
iczelia and others added 11 commits April 12, 2026 17:32
- remove ISIZE parsing in gzip.c
- add a streaming file_handle wrapper over blake2b.
- make ->read and fread return the actual # of bytes read (needed by the gzip decoder downstream users to determine real EOF)
- add is_high_mem and load_addr_64 to file_handle (after the inlined body of freadall_mode in uri_open relocates to high memory)
- as a result of the changes, freadall and freadall_mode are no longer used anywhere by the code base. hence they were removed.
- update the signature of uri_open to accept memcpy functions to/from high memory and a boolean parameter for whether high memory allocations are acceptable for this specific resource.
- inline size-agnostic (stretchy-vector type) functionality to uri_open to facilitate streaming unknown-size decoding. uri_open now always returns memfiles, mimicking the behaviour prior to the streaming blake2b commit.
- minimal patch in limine_asm to make limine_memcpy_64_asm take two wide pointers.
- update downstream callers of uri_open, as well as the gzip fuzzing suite.
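
The combination of a read call that reports the actual byte count and a stretchy-vector buffer can be sketched as follows. This is an assumption-laden illustration, not Limine's API: `stream_read_fn`, `read_until_eof` and `mem_source` are hypothetical names, the callback signature is invented for the example, and the sketch uses malloc/realloc where the bootloader would use its own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: returns the number of bytes actually written to
 * buf, which may be less than len; 0 means end of stream. */
typedef size_t (*stream_read_fn)(void *ctx, void *buf, size_t len);

/* Drain a stream of unknown size into a buffer that doubles when full.
 * Returns the buffer (caller frees it) or NULL on allocation failure. */
uint8_t *read_until_eof(stream_read_fn read, void *ctx, size_t *out_len)
{
    assert(read != NULL && out_len != NULL);

    size_t cap = 4096, len = 0;
    uint8_t *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (len == cap) {
            if (cap > SIZE_MAX / 2) { free(buf); return NULL; }
            uint8_t *grown = realloc(buf, cap * 2);
            if (grown == NULL) { free(buf); return NULL; }
            buf = grown;
            cap *= 2;
        }
        size_t got = read(ctx, buf + len, cap - len);
        if (got == 0)
            break;  /* real EOF: the only reliable signal when size is unknown */
        len += got;
    }

    *out_len = len;
    return buf;
}

/* Demo source: serves a memory block in chunks of at most max_chunk bytes,
 * standing in for a decoder that produces short reads. */
struct mem_source {
    const uint8_t *data;
    size_t size, pos, max_chunk;
};

size_t mem_source_read(void *ctx, void *buf, size_t len)
{
    struct mem_source *s = ctx;
    size_t n = s->size - s->pos;
    if (n > len)
        n = len;
    if (n > s->max_chunk)
        n = s->max_chunk;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}
```

Without the actual byte count, a caller of a gzip stream cannot tell a short read from the end of the decoded data, which is why the loop keys on a zero return rather than on a size known in advance.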
@iczelia iczelia requested a review from Mintsuki April 18, 2026 21:29
@Mintsuki Mintsuki merged commit e9a95e0 into Limine-Bootloader:trunk Apr 19, 2026
2 checks passed
2 participants