Libdeflate port #25


Open · wants to merge 18 commits into master
Conversation

@springmeyer springmeyer commented Apr 3, 2018

This ports gzip-hpp to use libdeflate. It is experimental for now: the goals are to test in production and to investigate the potential speed improvements of using libdeflate.

If this proves viable (fast and robust), this is exciting because:

  • currently node.js statically links zlib
  • this forces all node c++ addons (that get dynamically loaded into node) to use ABI compatible zlib (otherwise weird things can happen like symbol lookup error node-zipfile#63)
  • because libdeflate is a totally different API, if we use it, we could drop using zlib and then the problem of having node.js force a specific zlib version on us goes away!

Anyway, here are the first promising numbers. Locally I see (with this PR; smaller numbers are faster):

Run on (4 X 3500 MHz CPU s)
2018-04-03 15:43:13
----------------------------------------------------------------------------
Benchmark                                     Time           CPU Iterations
----------------------------------------------------------------------------
BM_compress                             1128545 ns    1118127 ns        661
BM_decompress                            202429 ns     201630 ns       3209
BM_compress_class                        778036 ns     777355 ns        979
BM_compress_class_no_reallocations       755331 ns     754569 ns        932
BM_decompress_class                      173623 ns     173489 ns       4284
BM_decompress_class_no_reallocations     162725 ns     162567 ns       4446

And with master:

Run on (4 X 3500 MHz CPU s)
2018-04-03 15:43:52
----------------------------------------------------------------------------
Benchmark                                     Time           CPU Iterations
----------------------------------------------------------------------------
BM_compress                             2124502 ns    2109067 ns        326
BM_decompress                            321913 ns     317480 ns       2122
BM_compress_class                       2181514 ns    2148765 ns        340
BM_compress_class_no_reallocations      2115537 ns    2102435 ns        329
BM_decompress_class                      317198 ns     315435 ns       2180
BM_decompress_class_no_reallocations     327788 ns     326020 ns       2243

TODO:

  • Find a better way than std::size_t uncompressed_size_guess = size * 3; to figure out the uncompressed size to allocate.
  • Is the same level of compression being used with libdeflate?
  • Is Z_DEFAULT_COMPRESSION producing roughly the same sizes?
  • Do the speed differences seen locally between zlib and libdeflate persist in production systems?

if (compressor_)
{
libdeflate_free_compressor(compressor_);
}
springmeyer (author):

Noting that this approach (initialize the C struct pointer in the constructor and free the memory in the destructor) applies RAII (https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization) to avoid a memory leak.

-#pragma GCC diagnostic ignored "-Wold-style-cast"
-if (deflateInit2(&deflate_s, level_, Z_DEFLATED, window_bits, mem_level, Z_DEFAULT_STRATEGY) != Z_OK)
+std::size_t max_compressed_size = libdeflate_gzip_compress_bound(compressor_, size);
+// TODO: sanity check this before allocating
springmeyer (author):

The reason for this comment is fear/lack of knowledge on my part. What happens if libdeflate_gzip_compress_bound is buggy and returns a really massive value? Is that possible? Would we end up trying to allocate so much memory the machine would crumble? Probably not possible, but I've also not looked inside libdeflate yet to figure out how much to worry about this.

@springmeyer springmeyer mentioned this pull request Jul 14, 2018
springmeyer pushed a commit to mapbox/vtcomposite that referenced this pull request Jul 28, 2018
@springmeyer springmeyer requested a review from artemp September 26, 2018 04:53
@@ -56,7 +56,7 @@ class Decompressor
 // https://github.com/kaorimatz/libdeflate-ruby/blob/0e33da96cdaad3162f03ec924b25b2f4f2847538/ext/libdeflate/libdeflate_ext.c#L340
 // https://github.com/ebiggers/libdeflate/commit/5a9d25a8922e2d74618fba96e56db4fe145510f4
 std::size_t actual_size;
-std::size_t uncompressed_size_guess = size * 3;
+std::size_t uncompressed_size_guess = size * 4;
 output.resize(uncompressed_size_guess);
springmeyer (author):

This probably needs to be rewritten to avoid guessing the size up front. @artemp could you tackle that?

artemp commented Sep 26, 2018:

`std::size_t uncompressed_size_guess = size * 4;` is a good starting point, but it definitely needs a way to dynamically increase capacity when size * 4 is not large enough. The approach from the Ruby link looks reasonable; here is my patch:

diff --git a/include/gzip/decompress.hpp b/include/gzip/decompress.hpp
index badd315..627e670 100644
--- a/include/gzip/decompress.hpp
+++ b/include/gzip/decompress.hpp
@@ -57,17 +57,23 @@ class Decompressor
         // https://github.com/ebiggers/libdeflate/commit/5a9d25a8922e2d74618fba96e56db4fe145510f4
         std::size_t actual_size;
         std::size_t uncompressed_size_guess = size * 4;
-        output.resize(uncompressed_size_guess);
-        enum libdeflate_result result = libdeflate_gzip_decompress(decompressor_,
-                                                                   data,
-                                                                   size,
-                                                                   static_cast<void*>(&output[0]),
-                                                                   uncompressed_size_guess, &actual_size);
-        if (result == LIBDEFLATE_INSUFFICIENT_SPACE)
+        output.reserve(uncompressed_size_guess);
+        enum libdeflate_result result;
+        for (;;)
         {
-            throw std::runtime_error("no space: did not succeed");
+            result = libdeflate_gzip_decompress(decompressor_,
+                                                data,
+                                                size,
+                                                const_cast<char*>(output.data()),
+                                                output.capacity(), &actual_size);
+            if (result != LIBDEFLATE_INSUFFICIENT_SPACE)
+            {
+                break;
+            }
+            output.reserve((output.capacity() << 1) - output.size());
         }
-        else if (result == LIBDEFLATE_SHORT_OUTPUT)
+
+        if (result == LIBDEFLATE_SHORT_OUTPUT)
         {
             throw std::runtime_error("short output: did not succeed");
         }

Couple notes:

  • Use std::vector::reserve() to increase capacity
  • Use std::vector::data() to access a pointer to the first element (C++11); it is valid to call on empty containers, while &vec[0] is not safe for empty containers and needs an extra size check.

/cc @springmeyer

springmeyer (author):

@artemp great fix - can you please commit directly to this branch? Two concerns to keep in mind:

  • Can you think of a situation where result stays == LIBDEFLATE_INSUFFICIENT_SPACE forever, so the loop runs infinitely? Can we protect against that by checking for other error states inside the loop?
  • I wonder if we should avoid calling output.reserve, because sometimes we may over-reserve, and the extra cost of over-reserving might outweigh the benefit of reserving when the guess is close. This concern comes from seeing how reserve is expensive and, when wrong, hurts perf over at Optimize coalesce carmen-cache#126

artemp commented Sep 27, 2018:

@springmeyer -

Can you think of a situation where result stays == LIBDEFLATE_INSUFFICIENT_SPACE forever, so the loop runs infinitely? Can we protect against that by checking for other error states inside the loop?

Good point, I'll add a conditional exception in the loop if the allocation size gets too big.

I wonder if we should avoid calling output.reserve, because sometimes we may over-reserve, and the extra cost of over-reserving might outweigh the benefit of reserving when the guess is close. This concern comes from seeing how reserve is expensive and, when wrong, hurts perf over at mapbox/carmen-cache#126

Let me investigate and fix this ^

I'm going to push updates and continue work on this branch

springmeyer (author):

I'm going to push updates and continue work on this branch

👍 I think you'll likely need an easy way to test larger, varied files and a CLI would allow that. So could you finish #29 to aid your testing of this branch?

artemp left a comment:

Looking good overall, and faster according to make bench, but see my comments with proposed changes. /cc @springmeyer

@@ -130,7 +130,7 @@ TEST_CASE("test decompression size limit")
                  std::istreambuf_iterator<char>());
 stream.close();

-std::size_t limit = 20 * 1024 * 1024; // 20 Mb
+std::size_t limit = 500 * 1024 * 1024; // 500 Mb
artemp:

@springmeyer - I've changed the logic to validate the output buffer size rather than the input, which makes more sense in my opinion.

springmeyer (author):

👌

 struct libdeflate_decompressor* decompressor_ = nullptr;

 public:
-Decompressor(std::size_t max_bytes = 1000000000) // by default refuse operation if compressed data is > 1GB
+Decompressor(std::size_t max_bytes = 2147483648u) // by default refuse operation if required uutput buffer is > 2GB
springmeyer (author):

@artemp typo: uutput

 // https://github.com/kaorimatz/libdeflate-ruby/blob/0e33da96cdaad3162f03ec924b25b2f4f2847538/ext/libdeflate/libdeflate_ext.c#L340
 // https://github.com/ebiggers/libdeflate/commit/5a9d25a8922e2d74618fba96e56db4fe145510f4
 std::size_t actual_size;
-std::size_t uncompressed_size_guess = size * 4;
+std::size_t uncompressed_size_guess = std::min(size * 4, max_);
 output.reserve(uncompressed_size_guess);
springmeyer (author):

What happens when size * 4 will not fit within std::numeric_limits<std::size_t>::max()?

springmeyer (author):

@artemp all the changes look good. I made a few minor comments inline.

One other high level thought: check out #27 which may be of interest as you find bugs or find ways to improve the code in general (non-specific to this libdeflate branch). If you do have generally applicable changes I think it would be ideal to keep track of these and attempt to land them in a separate PR.


artemp commented Sep 27, 2018

@springmeyer - thanks for the feedback, I'll reply asap. Going through #27 looks like a good plan.

springmeyer (author):

@artemp I noticed that the -Weffc++ warning that is used downstream is triggering due to code in this branch. See:

In file included from ../src/vtcomposite.cpp:7:0:
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/compress.hpp:13:7: error: ‘class gzip::Compressor’ has pointer data members [-Werror=effc++]
 class Compressor
       ^~~~~~~~~~
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/compress.hpp:13:7: error:   but does not override ‘gzip::Compressor(const gzip::Compressor&)’ [-Werror=effc++]
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/compress.hpp:13:7: error:   or ‘operator=(const gzip::Compressor&)’ [-Werror=effc++]
In file included from ../src/vtcomposite.cpp:8:0:
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/decompress.hpp:13:7: error: ‘class gzip::Decompressor’ has pointer data members [-Werror=effc++]
 class Decompressor
       ^~~~~~~~~~~~
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/decompress.hpp:13:7: error:   but does not override ‘gzip::Decompressor(const gzip::Decompressor&)’ [-Werror=effc++]
/home/travis/build/mapbox/vtcomposite/../gzip-hpp/include/gzip/decompress.hpp:13:7: error:   or ‘operator=(const gzip::Decompressor&)’ [-Werror=effc++]

at https://travis-ci.org/mapbox/vtcomposite/jobs/438830669#L986.

@artemp could you fix this warning/error? Perhaps it could be fixed by making the class noncopyable? You should be able to replicate locally with g++ and adding -Weffc++ (note: clang++ does not do anything with this option).

springmeyer (author):

@artemp one more need that I noticed: we have a downstream consumer mapbox-maps that needs to support zlib-coded data as well as gzip-coded data. So, could you:

  • add a test for zlib-coded data (we must be missing one?)
  • ensure that this branch works to decompress it. This will likely require dynamically switching between libdeflate_gzip_decompress or libdeflate_zlib_decompress (or maybe libdeflate_deflate_decompress_ex?)


artemp commented Oct 9, 2018

@artemp I noticed that the -Weffc++ warning that is used downstream is triggering due to code in this branch. ... could you fix this warning/error? Perhaps it could be fixed by making the class noncopyable?

Yep, just deleting copy constructor and copy assignment operator would do. Done in b4684ca


artemp commented Oct 9, 2018

@artemp one more need that I noticed: we have a downstream consumer mapbox-maps that needs to support zlib-coded data as well as gzip-coded data. So, could you:

* [ ]  add a test for zlib-coded data (we must be missing one?)

* [ ]  ensure that this branch works to decompress it. This will likely require dynamically switching between `libdeflate_gzip_decompress` or `libdeflate_zlib_decompress` (or maybe `libdeflate_deflate_decompress_ex`?)

@springmeyer - I have no idea why this module is called gzip-hpp.

I'm not buying 'for historical reasons'; let's make it right now and avoid perpetual confusion!
deflate-hpp? libdeflate-hpp? libcoagulate? libvilipend? Anything better? Just let's not call apples oranges.

Once we're on the same page I'll add support for handling zlib (de)compression. In terms of APIs, I see two high-level approaches. The first one is the correct one, and the second is a 'let's-make-life-easier-by-hiding-the-implementation-switch-inside-a-class' one. :)

  • Let the user decide (the user is smart)

// This is not working code, just an illustration
if (gzip) 
{
     deflate::gzip_decompressor obj{};
     // or
     deflate::decompressor<GZIP> obj{};
     // do the work
}
else if (zlib)
{
    deflate::zlib_decompressor obj{};
    // or
    deflate::decompressor<ZLIB> obj{};
    // do the work
}

This is how libdeflate is designed ^ offering maximum flexibility.

  • The second one
deflate::compressor obj{};
// do the work

Wondering if name choices here can be better:

Compressor::compress(...);
Decompressor::decompress(...);

?
