Description
pmproxy is based on libuv, which works as depicted in this diagram: http://docs.libuv.org/en/v1.x/design.html#the-i-o-loop
When a filesystem change event fires, the discovery code runs and submits Redis requests: up to 66500 of them after starting a single pmlogger. These requests are queued until the next loop iteration. Once the filesystem change callback returns, the libuv loop continues from the start and runs the Redis callbacks, which actually send all the queued Redis requests over the network (and process any incoming Redis replies).
This pattern is problematic when there are multiple pmloggers, each writing to disk at about the same time. All the filesystem change callbacks run first, so thousands of Redis requests are buffered (occupying a lot of memory) until they are sent in the next libuv loop iteration.
IMHO the problem is that the discovery code blocks the libuv loop and doesn't let the Redis client process any requests or replies. This can be solved either by running discovery on a dedicated libuv loop (in a different thread), or by partitioning the discovery function so that no single callback blocks the loop and discovery happens in parts.
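To illustrate why partitioning helps, here is a minimal sketch of a toy model, not libuv or the Redis client. All names (`toy_queue`, `toy_submit`, `toy_flush`, `peak_monolithic`, `peak_chunked`) are hypothetical, and `toy_flush` simply assumes that everything buffered gets sent whenever the loop gets a turn. It compares the peak number of buffered requests when one callback submits everything against submitting in chunks and yielding to the loop in between.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an outbound request queue (NOT libuv or a Redis client). */
typedef struct {
    size_t pending;  /* requests buffered but not yet sent */
    size_t peak;     /* highest value of pending seen so far */
    size_t sent;     /* requests handed to the network */
} toy_queue;

/* Buffer one request and track the high-water mark. */
static void toy_submit(toy_queue *q)
{
    q->pending++;
    if (q->pending > q->peak)
        q->peak = q->pending;
}

/* Stand-in for the loop's I/O phase: assume all buffered requests are sent. */
static void toy_flush(toy_queue *q)
{
    q->sent += q->pending;
    q->pending = 0;
}

/* One callback submits everything; the loop only gets a turn afterwards. */
size_t peak_monolithic(size_t total)
{
    toy_queue q = {0, 0, 0};
    for (size_t i = 0; i < total; i++)
        toy_submit(&q);
    toy_flush(&q);
    assert(q.sent == total);
    return q.peak;
}

/* Work is split into chunks; control returns to the loop after each chunk. */
size_t peak_chunked(size_t total, size_t chunk)
{
    assert(chunk > 0);
    toy_queue q = {0, 0, 0};
    size_t done = 0;
    while (done < total) {
        size_t n = (total - done < chunk) ? total - done : chunk;
        for (size_t i = 0; i < n; i++)
            toy_submit(&q);
        done += n;
        toy_flush(&q);  /* the loop gets a turn between chunks */
    }
    assert(q.sent == total);
    return q.peak;
}
```

In this model, `peak_monolithic(66500)` buffers all 66500 requests at once, while `peak_chunked(66500, 500)` never holds more than 500. A real loop may not drain the whole queue on every iteration, so the actual bound depends on how fast the Redis client can write.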
I have some rough example code ready for the latter solution, i.e. running each loop iteration of `series_cache_update` in a new libuv timer callback (andreasgerstmayr@27220a0). However, the discovery code currently frees the pmResult right after invoking `pmDiscoverInvokeValuesCallBack`, so running the code asynchronously will access freed memory (`process_logvol` in `discover.c`, and likely similar issues in other parts).
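One possible way around the lifetime problem is for the deferred work to hold its own reference to the result, so the producer can drop its reference immediately without invalidating the data. The following is a rough sketch under that assumption: `toy_result` is a hypothetical stand-in, not the real pmResult, and none of these helpers exist in PCP. The reference count is not thread-safe, which is only acceptable if everything runs on a single loop thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a decoded result (NOT the real pmResult). */
typedef struct {
    int refs;   /* number of owners; single-threaded use only */
    int value;  /* payload the deferred callback needs */
} toy_result;

/* Counts frees so the lifetime can be observed from outside. */
static int toy_frees = 0;

static toy_result *toy_result_new(int value)
{
    toy_result *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->refs = 1;
    r->value = value;
    return r;
}

/* Take an additional reference. */
static toy_result *toy_result_ref(toy_result *r)
{
    r->refs++;
    return r;
}

/* Drop a reference; the last owner frees the memory. */
static void toy_result_unref(toy_result *r)
{
    if (--r->refs == 0) {
        free(r);
        toy_frees++;
    }
}

/* Deferred work item, e.g. what a timer callback would receive. */
typedef struct {
    toy_result *result;
} toy_task;

/* Producer: hands a reference to the task, then drops its own right away. */
toy_task toy_produce(int value)
{
    toy_result *r = toy_result_new(value);
    assert(r != NULL);
    toy_task task = { toy_result_ref(r) };
    toy_result_unref(r);  /* safe: the task still holds a reference */
    return task;
}

/* Deferred callback: uses the result, then releases the task's reference. */
int toy_run_deferred(toy_task *task)
{
    int v = task->result->value;
    toy_result_unref(task->result);
    task->result = NULL;
    return v;
}
```

The alternative would be to deep-copy the result before scheduling the deferred callback, which avoids shared ownership but costs an extra allocation and copy per result.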