Add configurable limit for postings expansion in store-gateway /labels requests #12179

@dimitarvdimitrov

Problem

Store-gateway /labels requests with broad matchers (e.g., {__name__!=""}) can cause OOM kills by consuming excessive memory during postings expansion. This was observed in production where:

  • A single /labels request consumed ~10Gi memory expanding ~1 billion postings (8 bytes each)
  • Multiple concurrent requests can exhaust store-gateway memory and crash the process

The memory allocation occurs in bucketIndexReader.expandedPostings() when calling index.ExpandPostings(result), which loads all matching postings into memory before filtering label values.

Proposed Solution

Mainly proposed by @bboreham

Add a configurable limit on memory usage per request in store-gateway, similar to the ingester's -ingester.read-path-memory-utilization-limit. Initially this will track postings memory usage, but can be extended to track other memory allocations as reactive limiters are implemented.

Configuration:

  • New flag: -blocks-storage.bucket-store.max-memory-bytes-per-request, expressed as a percentage of GOMEMLIMIT
  • Suggested default: 10% of GOMEMLIMIT (e.g., a 4.8Gi limit for a 48Gi store-gateway, which is roughly 600M postings at 8 bytes each)
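
To make the arithmetic behind the suggested default concrete, here is a minimal Go sketch of deriving a per-request byte budget from GOMEMLIMIT. The helper name perRequestLimitBytes and the choice to treat an unset GOMEMLIMIT as "no limit" are assumptions for illustration, not existing Mimir code. The only runtime API used is debug.SetMemoryLimit, which returns the current limit without changing it when given a negative value.

```go
package main

import (
	"fmt"
	"math"
	"runtime/debug"
)

// bytesPerPosting is the size of one expanded posting (a 64-bit series reference).
const bytesPerPosting = 8

// perRequestLimitBytes returns percent% of memLimit. It returns 0, meaning
// "no limit", when GOMEMLIMIT is unset (math.MaxInt64) or percent is not positive.
// Hypothetical helper, not part of Mimir.
func perRequestLimitBytes(memLimit int64, percent float64) int64 {
	if memLimit <= 0 || memLimit == math.MaxInt64 || percent <= 0 {
		return 0
	}
	return int64(float64(memLimit) * percent / 100)
}

func main() {
	// A negative argument leaves the limit untouched and returns the current value.
	current := debug.SetMemoryLimit(-1)
	fmt.Println("limit from current GOMEMLIMIT:", perRequestLimitBytes(current, 10))

	// The example from the proposal: a 48Gi store-gateway with a 10% limit.
	limit := perRequestLimitBytes(48<<30, 10)
	fmt.Println("bytes:", limit, "postings:", limit/bytesPerPosting)
}
```

With these numbers the budget works out to a little over 640M postings, which is in line with the "600M postings" estimate above.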

Behavior:

  • Return gRPC status code ResourceExhausted when the limit would be exceeded, rather than OOMing
  • Error message should indicate the query is too broad and suggest more specific matchers

Implementation approach:

  • Implement a wrapper iterator around postings expansion (analogous to io.LimitedReader)
  • Add a limit check in bucketIndexReader.expandedPostings() before calling index.ExpandPostings()
  • Initially, this could be logging-only, to measure impact before enforcing

Consider starting with a warning-only mode to gather metrics on how often the limit would be hit before enforcing it.
