Multiple improvements: seekable tests, zstdless CLI, streaming dict example, seqBench hardening, DiB early termination #4617
Open
BhavyaBibra wants to merge 1 commit into facebook:dev
Summary
A batch of improvements across tests, examples, documentation, and contrib tools. Each change is independent and addresses existing TODOs or gaps in the codebase.
Changes
1. Seekable format: add unit tests & fix FIXME (contrib/seekable_format/tests/seekable_tests.c)
- checksumFlag=1 path
- getFrameCompressedSize() etc. return errors for bad indices
- /* Github issue #FIXME */ with descriptive regression test comment (variation of ZSTD_seekable_decompress() can hang #2335)
- /* TODO: Add more tests */
2. Improve zstdless script (programs/zstdless)
- --help / -h with usage message
- --version / -V to print zstd version
- ZSTDLESS_FLAGS environment variable for passing custom flags to zstd
3. Add streaming dictionary compression example (examples/)
- dictionary_compression and streaming_compression examples
4. Document memset engineering decisions (lib/compress/zstd_compress.c)
- /* TODO: avoid memset? */ comments (LDM hash table and bucket offsets) with documented analysis:
  - --long (rare default path)
5. Early termination in DiB_fileStats() (programs/dibio.c)
- break when totalSizeToLoad >= MAX_SAMPLES_SIZE (2GB)
- stat() syscalls when training dictionaries on large file sets
- /* TODO: there is opportunity to stop DiB_fileStats() early */
6. Harden contrib/seqBench (contrib/seqBench/seqBench.c)
- fopen() error check (previously segfaulted on missing files)
- malloc() error checks
- fread() return value validation
- goto cleanup
Testing
- make -C lib libzstd.a ✅
- make -C contrib/seekable_format/tests test: all 9 tests pass ✅
- make -C examples all: all 10 examples build ✅
- make -C programs zstd: full binary builds with dibio.c changes ✅
- seqBench compiles successfully ✅
- zstdless --help prints usage ✅
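For reviewers who want a concrete picture of the early-termination idea in change 5, here is a minimal sketch. It is not the code in programs/dibio.c: the names sketch_fileStats, sketch_size_fn, sketch_fakeFs, and SKETCH_MAX_SAMPLES_SIZE are hypothetical, and the size lookup is abstracted behind a callback that stands in for a stat()-based query so that the number of lookups can be counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap for this sketch; the description above gives 2GB. */
#define SKETCH_MAX_SAMPLES_SIZE ((uint64_t)2 << 30)

/* Hypothetical callback standing in for a stat()-based file size lookup. */
typedef uint64_t (*sketch_size_fn)(const char* path, void* ctx);

typedef struct {
    uint64_t totalSizeToLoad;  /* bytes accumulated so far */
    size_t   nbFilesVisited;   /* how many size lookups were performed */
} sketch_stats;

/* Sum file sizes, but stop as soon as the cap is reached: files beyond
 * that point could not be loaded anyway, so querying them is wasted work. */
static sketch_stats sketch_fileStats(const char* const* paths, size_t nbPaths,
                                     sketch_size_fn getSize, void* ctx)
{
    sketch_stats st = { 0, 0 };
    size_t n;
    assert(getSize != NULL);
    for (n = 0; n < nbPaths; n++) {
        st.totalSizeToLoad += getSize(paths[n], ctx);
        st.nbFilesVisited++;
        if (st.totalSizeToLoad >= SKETCH_MAX_SAMPLES_SIZE) break;  /* early exit */
    }
    return st;
}

/* Fake "file system" for demonstration: every file has the same size,
 * and each lookup is counted. */
typedef struct {
    uint64_t sizePerFile;
    size_t   nbCalls;
} sketch_fakeFs;

static uint64_t sketch_fakeSize(const char* path, void* ctx)
{
    sketch_fakeFs* const fs = (sketch_fakeFs*)ctx;
    (void)path;
    fs->nbCalls++;
    return fs->sizePerFile;
}
```

With four fake files of 1 GiB each, the loop in this sketch stops after the second lookup, because the running total has reached the cap; the remaining two files are never queried.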
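The hardening pattern listed under change 6 (fopen() check, malloc() check, fread() validation, single cleanup label) can be illustrated roughly as follows. This is a sketch under assumptions, not the actual contents of contrib/seqBench/seqBench.c: the helper name sketch_loadFile and its signature are hypothetical, and it uses only standard C library calls.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: load a whole file into a freshly allocated buffer.
 * Returns NULL on any failure and leaves *sizePtr untouched in that case.
 * On success the caller owns the buffer and must free() it. */
static void* sketch_loadFile(const char* path, size_t* sizePtr)
{
    FILE* f = NULL;
    void* buf = NULL;
    void* result = NULL;
    long fileSize;

    assert(path != NULL && sizePtr != NULL);

    f = fopen(path, "rb");
    if (f == NULL) { perror(path); goto cleanup; }   /* missing file: no crash */

    if (fseek(f, 0, SEEK_END) != 0) goto cleanup;
    fileSize = ftell(f);
    if (fileSize < 0) goto cleanup;
    if (fseek(f, 0, SEEK_SET) != 0) goto cleanup;

    /* Allocate at least 1 byte so an empty file still yields a valid pointer. */
    buf = malloc(fileSize > 0 ? (size_t)fileSize : 1);
    if (buf == NULL) { fprintf(stderr, "allocation failed\n"); goto cleanup; }

    /* Validate that the full file was read, not just part of it. */
    if (fread(buf, 1, (size_t)fileSize, f) != (size_t)fileSize) {
        fprintf(stderr, "short read on %s\n", path);
        goto cleanup;
    }

    *sizePtr = (size_t)fileSize;
    result = buf;
    buf = NULL;   /* ownership moves to the caller; cleanup must not free it */

cleanup:
    free(buf);                    /* free(NULL) is a no-op */
    if (f != NULL) fclose(f);
    return result;
}
```

The point of the single cleanup label is that every failure path releases exactly the resources acquired so far, so adding a new check later does not require duplicating fclose() or free() calls.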