v1.4.0
Tar v1.4.0
This release adds two new API functions:
- rewrite allows a tarball to be rewritten to the standard form that create produces without needing to fully extract and re-create the tarball. If the input stream is seekable, it makes one pass to index the tarball and uses seek to access file data in the correct order. If the input stream is not seekable, it first collects all the data in a buffer and then reads from that buffer.
- tree_hash computes the git tree hash (SHA1 and SHA256 supported) of a tarball without needing to extract it to disk. This is particularly useful since some file systems lack features that are significant to git when hashing a file tree (e.g. symlinks, case preservation, ability to set/get executable bits).
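As a rough illustration of how the two new functions fit together, here is a minimal sketch. It assumes the Tar package can be loaded with `using Tar`, that `Tar.create` and `Tar.rewrite` return the path of a temporary tarball when no output path is given, and that `Tar.tree_hash` accepts an `algorithm` keyword with the values "git-sha1" and "git-sha256". The directory contents and the file name hello.txt are made up for the example.

```julia
using Tar

# Hypothetical input: a temporary directory with a single small file.
dir = mktempdir()
write(joinpath(dir, "hello.txt"), "hello\n")

# Create a tarball from the directory; with no output path given,
# a temporary file is created and its path is returned.
tarball = Tar.create(dir)

# Compute the git tree hash straight from the tarball, without extracting it.
sha1   = Tar.tree_hash(tarball)                          # assumed default: "git-sha1"
sha256 = Tar.tree_hash(tarball, algorithm="git-sha256")

# Rewrite the tarball into the standard form that create produces.
rewritten = Tar.rewrite(tarball)

# Rewriting should not change the file tree, so the tree hash is expected to match.
println(sha1 == Tar.tree_hash(rewritten))
```

Since the hash is computed from the tarball's contents rather than from an extracted directory, the result should not depend on whether the local file system supports symlinks or executable bits.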
This release also includes significant refactoring of the internal {read,write}_tarball functions. This refactoring allows read_tarball to be used by extract, rewrite and tree_hash, while write_tarball is used by create and rewrite. In the future these internal functions may be promoted to official low-level APIs.
Merged pull requests:
- implement Tar.tree_hash (with other improvements) (#36) (@StefanKarpinski)
- tree_hash: more efficient file hashing (#37) (@StefanKarpinski)
- README: add note about reproducibility and tree_hash (#39) (@StefanKarpinski)
- read_data: fix premature eof logic (#40) (@StefanKarpinski)
- format change: sort purely by name, no '/' added to dirs (#42) (@StefanKarpinski)
- create.jl: factor out reusable write_tarball function (#43) (@StefanKarpinski)
- new API: Tar.rewrite([pred], old, [new]) (#44) (@StefanKarpinski)
- list_tarball: specialize on read_hdr function (#45) (@StefanKarpinski)
- skip(Process): use buffered I/O when skipping process output (#46) (@StefanKarpinski)
- open_{read,write}: use generic file argument name (#47) (@StefanKarpinski)