Consolidation issue fix. #7
base: master
Conversation
k0ushal commented on May 30, 2025
- Fixed the tier-based candidate selection
- Default tiers are powers of 4, with the first tier being 0-4M, followed by 4-16M, 16-64M, and so on (see the sketch below)
- Fixed consolidation window of size 4
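For illustration, a minimal sketch of the default tier bounds described above (the exact 4 MiB value for the first tier and the standalone program are assumptions for this example):

#include <cstddef>
#include <cstdio>

int main() {
  // Assumed first-tier upper bound of 4 MiB; every following tier is 4x
  // larger, giving the 0-4M, 4-16M, 16-64M, ... ranges described above.
  std::size_t bound = std::size_t{4} << 20;
  for (int tier = 1; tier <= 5; ++tier) {
    std::printf("tier %d upper bound: %zu bytes\n", tier, bound);
    bound <<= 2;  // next power of 4
  }
  return 0;
}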
Force-pushed from 57714c9 to c1e6ebb
Comments as we talked about. Looks good to me!
core/utils/index_utils.cpp (outdated)
mergeBytes += itrMeta->byte_size;
skew = static_cast<double>(itrMeta->byte_size) / mergeBytes;
delCount += (itrMeta->docs_count - itrMeta->live_docs_count);
mergeScore = skew + (1.0 / (1 + delCount));
cost = mergeBytes * mergeScore;

size_t size_before_consolidation = 0;
size_t size_after_consolidation = 0;
size_t size_after_consolidation_floored = 0;
for (auto& segment_stat : consolidation) {
  size_before_consolidation += segment_stat.meta->byte_size;
  size_after_consolidation += segment_stat.size;
  size_after_consolidation_floored +=
    std::max(segment_stat.size, floor_segment_bytes);
} while (itr++ != end);
Probably inconsequential, but it would suffice to calculate skew, mergeScore and cost once after the loop, for the last element.
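A minimal sketch of that suggestion, assuming a plain container of segment metas with the byte_size / docs_count / live_docs_count fields used above (the struct, the function name and the std::vector input are illustrative, not the PR's actual interface):

#include <cstddef>
#include <vector>

// Hypothetical stand-in for the segment meta fields used in the snippet above.
struct SegmentMeta {
  std::size_t byte_size;
  std::size_t docs_count;
  std::size_t live_docs_count;
};

// Accumulate inside the loop; compute skew, mergeScore and cost only once,
// after the loop, for the last candidate. Expects a non-empty candidate set.
double consolidationCost(const std::vector<SegmentMeta>& candidates) {
  std::size_t mergeBytes = 0;
  std::size_t delCount = 0;
  for (const auto& meta : candidates) {
    mergeBytes += meta.byte_size;
    delCount += meta.docs_count - meta.live_docs_count;
  }
  const double skew =
      static_cast<double>(candidates.back().byte_size) / mergeBytes;
  const double mergeScore = skew + 1.0 / (1 + delCount);
  return mergeBytes * mergeScore;  // cost
}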
core/utils/index_utils.cpp (outdated)
size_t nextTier = ConsolidationConfig::tier1;
while (nextTier < num)
  nextTier = nextTier << 2;
Minor: You could probably use std::countl_zero and get rid of the loop.
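A rough sketch of a loop-free variant (assuming num > 0 and that ConsolidationConfig::tier1 is a power of two; std::bit_width is used here, which is defined in terms of std::countl_zero):

#include <bit>
#include <cstddef>

// Loop-free equivalent of:
//   nextTier = tier1; while (nextTier < num) nextTier = nextTier << 2;
// Assumes tier1 is a power of two (e.g. the 4 MiB first tier) and num > 0.
std::size_t nextConsolidationTier(std::size_t num, std::size_t tier1) {
  if (num <= tier1) {
    return tier1;
  }
  // Exponent gap between the smallest power of two >= num and tier1.
  const int diff =
      static_cast<int>(std::bit_width(num - 1)) - std::countr_zero(tier1);
  // Round the shift up to an even number so the result stays a power of 4
  // relative to tier1.
  const int shift = (diff + 1) & ~1;
  return tier1 << shift;
}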
core/utils/index_utils.cpp (outdated)
mergeBytes = mergeBytes - removeMeta->byte_size + addMeta->byte_size;
skew = static_cast<double>(addMeta->byte_size) / mergeBytes;
delCount = delCount - getDelCount(removeMeta) + getDelCount(addMeta);
mergeScore = skew + (1 / (1 + delCount));
As already discussed: We should think about whether calculating the mergeScore this way is sensible. What seems strange is that while the skew is a ratio (of byte-sizes), the second summand is an inverse count. This seems off: intuitively I'd expect e.g. a ratio of live and total documents to be considered alongside the skew.
This is actually quite bad the way it is, worse than we noticed yesterday @k0ushal.
Note that 1 / (1 + delCount) is an integer division here, so the second summand is 0 for every delCount >= 1. So this way we are always allowed to consolidate if only one document has been deleted, no matter the size of the files or the number of documents therein.
Let us at least do
mergeScore = skew + live_docs_count / total_docs_count;
instead, as discussed - this has more reasonable properties.
And as a second observation @neunhoef made today while discussing this: adding these two values is probably not right, either. They should be multiplied instead; the maxMergeScore will need to be adjusted to 0.5 to get a similar effect. So we should actually do
mergeScore = skew * live_docs_count / total_docs_count;
(and adapt maxMergeScore accordingly).
To understand this better, we should still do some formal worst-case analysis and some tests (specifically unit tests of the consolidation algorithm that play out certain usage scenarios).
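A sketch of the two variants proposed above, side by side (the free-standing helpers and parameter names are illustrative; the real code maintains these counters incrementally while sliding the window):

#include <cstddef>

// Additive variant: skew plus the live/total document ratio.
double mergeScoreAdditive(double skew, std::size_t live_docs_count,
                          std::size_t total_docs_count) {
  return skew + static_cast<double>(live_docs_count) / total_docs_count;
}

// Multiplicative variant: both factors lie in (0, 1], so the product does
// too, and the acceptance threshold (maxMergeScore) has to be lowered
// accordingly (0.5 was suggested above).
double mergeScoreMultiplicative(double skew, std::size_t live_docs_count,
                                std::size_t total_docs_count) {
  return skew * static_cast<double>(live_docs_count) / total_docs_count;
}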
core/utils/index_utils.hpp (outdated)
for (auto idx = start; idx != sorted_segments.end();) {

  if (getSize(*idx) <= currentTier) {
    idx++;
    continue;
  }

  tiers.emplace_back(start, idx - 1);

  // The next tier may not necessarily be in the
  // next power of 4.
  // Consider this example,
  // [2, 4, 6, 8, 900]
  // While the 2, 4 fall in the 0-4 tier and 6, 8 fall
  // in the 4-16 tier, the last segment falls in
  // the [256-1024] tier.

  currentTier = getConsolidationTier(getSize(*idx));
  start = idx++;
}
As discussed: finding the tier boundaries could be done by binary search, possibly utilizing std::lower_bound / std::upper_bound.
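A minimal sketch of the binary-search lookup, assuming the segment sizes are available as an ascending std::vector (a simplification of the sorted_segments range above; the helper name is made up):

#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the first segment that no longer fits into the current tier,
// i.e. the first size strictly greater than the tier's upper bound.
// sorted_sizes must be sorted in ascending order.
std::vector<std::size_t>::const_iterator firstOfNextTier(
    const std::vector<std::size_t>& sorted_sizes,
    std::size_t currentTierUpperBound) {
  return std::upper_bound(sorted_sizes.begin(), sorted_sizes.end(),
                          currentTierUpperBound);
}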
Force-pushed from 07286d8 to 872d553
Force-pushed from fb73fcd to f6305e3
Force-pushed from f6305e3 to 21a2f95
Force-pushed from 21a2f95 to d91b909
Changed consolidation config defaults
Disabled irrelevant tests
Force-pushed from 79070ae to 5165f01
Force-pushed from 5165f01 to 9cfc1fc