Skip to content

Conversation

peteralfonsi
Copy link
Contributor

Description

Changes field data cache size setting to default to 35% rather than no limit (-1). Previously, by default the FD cache was only limited by the (separate) field breaker setting at 40% of heap size, which would confusingly stop further requests rather than just evicting. Also makes the size setting dynamic.

As mentioned in the original issue, I'm open to changing the actual values from 35% and 40% to whatever people think is reasonable. See original issue for reasoning behind the two settings having different values.

Related Issues

Resolves #19104

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc labels Aug 26, 2025
Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❌ Gradle check result for 285140f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@peteralfonsi peteralfonsi marked this pull request as draft August 26, 2025 19:19
Signed-off-by: Peter Alfonsi <[email protected]>
@peteralfonsi
Copy link
Contributor Author

Flaky test: #14407

Copy link
Contributor

❌ Gradle check result for cd7b00c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@peteralfonsi
Copy link
Contributor Author

Flaky test: #18157

Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❕ Gradle check result for f4ab279: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Aug 26, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.97%. Comparing base (f5d41fb) to head (3ec4e78).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...indices/fielddata/cache/IndicesFieldDataCache.java 28.57% 5 Missing ⚠️
...c/main/java/org/opensearch/common/cache/Cache.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19152      +/-   ##
============================================
+ Coverage     72.95%   72.97%   +0.01%     
- Complexity    69701    69711      +10     
============================================
  Files          5655     5655              
  Lines        319867   319882      +15     
  Branches      46337    46339       +2     
============================================
+ Hits         233364   233432      +68     
+ Misses        67584    67526      -58     
- Partials      18919    18924       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@peteralfonsi
Copy link
Contributor Author

Not sure why codecov thinks updateMaximumWeight() isn't covered, it is covered by the IT I added (confirmed this with debugger).

@peteralfonsi peteralfonsi marked this pull request as ready for review August 27, 2025 16:36
Copy link
Contributor

@jainankitk jainankitk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @peteralfonsi for this change. While the idea looks good, I am wondering if we can collect some stats on maximum field data entry size in terms of JVM percentage to determine if the 5% differential is really sufficient?

Signed-off-by: Peter Alfonsi <[email protected]>
@peteralfonsi
Copy link
Contributor Author

@jainankitk I don't have data on individual entry size, but I do have some data on overall FD cache size across ~200 domains which is an upper bound for that. Looks like the large majority of domains have <2.5% heap usage for the whole cache with only a few higher than that. (y-axis is node count, not domain count):
Screenshot 2025-08-11 at 2 38 48 PM
(below is the same graph with the y-axis zoomed in)
Screenshot 2025-08-11 at 2 39 03 PM

That said I think reducing cache.size further to like 25% is totally reasonable, giving us a delta of 15%. What do you think?


Copy link
Contributor

❕ Gradle check result for 3ec4e78: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Comment on lines 111 to 113
public static final Setting<ByteSizeValue> FIELDDATA_CIRCUIT_BREAKER_LIMIT_SETTING = Setting.memorySizeSetting(
"indices.breaker.fielddata.limit",
"40%",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this setting now? Considering we are now capping the field data cache size to 35% or similar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does make sense to keep it for 2 reasons. First is backwards compatibility. But more important is if a single request comes in that's trying to create a huge amount of field data, like 50% of the heap, this setting would cancel the request before it can happen, rather than having it complete, fill the cache, and then get evicted, which would be bad for performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor note: the size of the field data should be calculated after it is generated, rather than checked beforehand.
https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java#L85

Comment on lines 83 to +87
public static final Setting<ByteSizeValue> INDICES_FIELDDATA_CACHE_SIZE_KEY = Setting.memorySizeSetting(
"indices.fielddata.cache.size",
new ByteSizeValue(-1),
Property.NodeScope
MAX_SIZE_PERCENTAGE.toString(),
Property.NodeScope,
Property.Dynamic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also have a min and max value here, so that the user does not accidentally/erroneously update this setting to either a very low value or a high value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, might be good to have min as 0% and max as the value of the circuit breaker setting (if you can actually chain them together this way, I'm not sure, if you can't then we can just hardcode them both to the same 40% value that gets defined as a static variable somewhere). Do you think those are reasonable limits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there isn't a memory size setting ctor with min/max size (😮). I'll add one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this requires more changes than anticipated so I'll make that a separate PR, which will also include min/max values for this setting (assuming this PR goes in first)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Change default settings for field data cache size
4 participants