feat(observability): Phase 1 — jemalloc + comprehensive Prometheus metrics #405

Draft
spreston8 wants to merge 6 commits into rust/dev from feat/phase1-observability

@spreston8
Collaborator

Summary

  • Installs tikv-jemallocator 0.6 as the global allocator (with profiling + background_threads features) to enable jemalloc memory profiling and background thread management
  • Wires a MemoryReporter background task (10s interval) that emits RSS and peak virtual memory gauges on Linux via /proc/self/status
  • Adds 16 Prometheus gauges covering all major unbounded in-memory structures across the node, rspace++, block-storage, and comm crates
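The /proc/self/status read behind the two process gauges could look roughly like the sketch below. It assumes the reporter takes RSS from the `VmRSS` field and peak virtual memory from `VmPeak`; the PR text does not name the fields, and the function names here are hypothetical.

```rust
/// Extracts one field (for example `VmRSS` or `VmPeak`) from the text of
/// /proc/self/status and converts it from kB to bytes.
/// Returns None if the field is absent or not in the expected `<n> kB` form.
fn status_field_bytes(status: &str, field: &str) -> Option<u64> {
    status.lines().find_map(|line| {
        // Lines look like "VmRSS:\t   51200 kB".
        let rest = line.strip_prefix(field)?.strip_prefix(':')?;
        let mut parts = rest.split_whitespace();
        let value: u64 = parts.next()?.parse().ok()?;
        match parts.next() {
            Some("kB") => value.checked_mul(1024),
            _ => None,
        }
    })
}

/// Reads (rss_bytes, peak_virtual_bytes) for the current process.
/// Returns None on platforms without /proc or on any parse failure.
/// Assumption: VmRSS and VmPeak are the fields behind the two gauges.
fn read_memory_bytes() -> Option<(u64, u64)> {
    let status = std::fs::read_to_string("/proc/self/status").ok()?;
    Some((
        status_field_bytes(&status, "VmRSS")?,
        status_field_bytes(&status, "VmPeak")?,
    ))
}
```

Keeping the parser separate from the file read lets it be exercised with a fixed string on non-Linux hosts.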

Metrics added

| Metric | Crate | What it tracks |
| --- | --- | --- |
| process_memory_rss_bytes | node | Linux RSS |
| process_memory_peak_bytes | node | Linux peak virtual |
| transport_channels_count | node | gRPC peer channels map |
| casper_requested_blocks_count | node | In-flight block requests |
| rspace_hot_store_data_channels | rspace++ | HotStore data map |
| rspace_hot_store_continuations | rspace++ | HotStore continuations |
| rspace_hot_store_joins | rspace++ | HotStore joins |
| rspace_history_cache_continuations | rspace++ | History read-through cache (never cleared) |
| rspace_history_cache_datums | rspace++ | History read-through cache (never cleared) |
| rspace_history_cache_joins | rspace++ | History read-through cache (never cleared) |
| dag_blocks_total | block-storage | DAG total blocks |
| dag_finalized_blocks_total | block-storage | DAG finalized blocks |
| dag_height_map_entries | block-storage | DAG height index |
| casper_buffer_pending_blocks | block-storage | Blocks awaiting parent resolution |
| casper_buffer_dependency_free_blocks | block-storage | Buffer-ready blocks |
| comm_stream_cache_size | comm | Blob streaming cache |

Test plan

  • cargo check passes on the branch
  • Node starts and /metrics endpoint (or Prometheus scrape) shows all 16 gauge families
  • Under load, rspace_history_cache_* gauges grow monotonically (confirming the unbounded leak is now visible)
  • process_memory_rss_bytes tracks actual RSS on a Linux deployment

…trics

Switch to jemalloc as the global allocator for better memory return
behaviour and heap-profiling support (MALLOC_CONF=prof:true).

Add a background memory_reporter thread emitting:
  - process_memory_rss_bytes / process_memory_peak_bytes (Linux /proc)
  - transport_channels_count   — live gRPC peer-connection map size
  - casper_requested_blocks_count — in-flight block-request tracker size

Add inline metrics::gauge! calls covering all major unbounded structures:

  HotStore (rspace++):
  - rspace_hot_store_data_channels / _continuations / _joins
    (updated on every put, zeroed on clear)
  - rspace_history_cache_continuations / _datums / _joins
    (emitted on cache miss — the previously untracked HistoryStoreCache
    that grows without bound and is never cleared)

  BlockDagKeyValueStorage (block-storage):
  - dag_blocks_total / dag_finalized_blocks_total / dag_height_map_entries
    (updated after every insert_internal)

  CasperBufferKeyValueStorage (block-storage):
  - casper_buffer_pending_blocks — blocks awaiting parent resolution
  - casper_buffer_dependency_free_blocks — blocks ready to process
    (updated after every add_relation and remove)

  GrpcTransportClient / StreamObservable (comm):
  - comm_stream_cache_size — shared blob streaming cache
    (emitted on every enqueue attempt)
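
The "updated on every put, zeroed on clear" pattern used for the in-memory maps can be sketched as follows. This is a minimal illustration, not the PR's code: `Gauge` is a hypothetical stand-in built on an atomic so the example needs no external crates (the PR itself uses `metrics::gauge!`), and `TrackedMap` is an invented wrapper.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a Prometheus gauge.
#[derive(Default)]
struct Gauge(AtomicU64);

impl Gauge {
    fn set(&self, value: u64) {
        self.0.store(value, Ordering::Relaxed);
    }

    fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical map wrapper that republishes its entry count after
/// every mutation, so the gauge always reflects the current size.
#[derive(Default)]
struct TrackedMap {
    entries: HashMap<String, Vec<u8>>,
    size_gauge: Gauge,
}

impl TrackedMap {
    /// Inserts or overwrites an entry, then publishes the new length.
    fn put(&mut self, channel: &str, datum: Vec<u8>) {
        self.entries.insert(channel.to_string(), datum);
        self.size_gauge.set(self.entries.len() as u64);
    }

    /// Drops all entries and zeroes the gauge.
    fn clear(&mut self) {
        self.entries.clear();
        self.size_gauge.set(0);
    }
}
```

Publishing `len()` rather than incrementing on insert keeps the gauge correct when a put overwrites an existing key.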

Fix metrics crate version: workspace pinned to "0.23" to match
metrics-exporter-prometheus 0.15 (was "0.24", incompatible).

jemalloc's configure script uses an autoconf runtime test to detect the
strerror_r return type variant (POSIX int vs GNU char*). During cross-
compilation on an x86 host targeting aarch64-unknown-linux-gnu the test
binary cannot execute, causing configure to abort with:

  configure: error: cannot determine return type of strerror_r

Fix by setting the autoconf cache variable ac_cv_func_strerror_r_char_p=no
in the Dockerfile build step before xx-cargo build. This tells jemalloc's
configure to use the POSIX int-returning variant directly, bypassing the
runtime detection. On Linux aarch64 with glibc this is the correct answer.
… and docs

- Fix BLOCK_REQUESTS_TOTAL_METRIC name (was "block.requests.total", causing
  double _total suffix in Prometheus: block_requests_total_total)
- Rewrite prometheus-rules.yml for Rust label-selector pattern
  (metric_name{source="..."} instead of Kamon-style flat names)
- Update prometheus-grafana.md with Phase 1 memory growth analysis:
  ~10-17 MB/block linear growth confirmed, DAG non-pruning identified
  as root cause, block-retriever fetching 94% of blocks documented,
  process_memory_peak_bytes jemalloc virtual memory caveat noted,
  block_validation_step_*_time unit mismatch documented as Phase 2 fix
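
The double-suffix problem from the first item can be reproduced with a simplified model of name export. Both functions are hypothetical: the sketch assumes that characters outside the Prometheus name alphabet are replaced with `_` and that `_total` is appended to every counter, which is what the commit message implies but is not taken from the exporter's source.

```rust
/// Simplified model of name sanitisation: any character outside
/// [a-zA-Z0-9_:] becomes '_'.
fn sanitize(name: &str) -> String {
    name.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == ':' {
                c
            } else {
                '_'
            }
        })
        .collect()
}

/// Assumption: a `_total` suffix is appended unconditionally to counters,
/// so a registered name that already ends in "total" gets it twice.
fn exported_counter_name(registered: &str) -> String {
    format!("{}_total", sanitize(registered))
}
```

Under these assumptions, `exported_counter_name("block.requests.total")` yields `block_requests_total_total`, while a name without the trailing segment (for example the hypothetical `"block.requests"`) yields `block_requests_total`.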
…tation

Add jemalloc epoch stats reporter emitting allocated/active/mapped/resident/retained bytes.
Add LMDB data.mdb file-size gauges for rspace/history, rspace/cold, blockstorage, dagstorage, eval/history.
Add memory_metrics recording rules for jemalloc overhead and LMDB total size.
Depend on tikv-jemalloc-ctl with stats feature.
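
A file-size gauge of this kind only needs the file's metadata. The sketch below is an assumption about the approach, with hypothetical function names; it reports the apparent length of `data.mdb`, which may differ from the blocks actually allocated on disk.

```rust
use std::fs;
use std::path::Path;

/// Returns the apparent size in bytes of `<env_dir>/data.mdb`,
/// or None if the file is missing or unreadable.
fn lmdb_file_size_bytes(env_dir: &Path) -> Option<u64> {
    fs::metadata(env_dir.join("data.mdb")).ok().map(|m| m.len())
}

/// Sums the sizes of several LMDB environment directories,
/// silently skipping any whose data file cannot be read.
fn lmdb_total_size_bytes<P: AsRef<Path>>(env_dirs: &[P]) -> u64 {
    env_dirs
        .iter()
        .filter_map(|dir| lmdb_file_size_bytes(dir.as_ref()))
        .sum()
}
```

Returning `Option` from the per-file helper lets a reporter skip a store that has not been created yet instead of emitting a misleading zero.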
…oc_reporter

Move the SYSTEM_METRICS_SOURCE import inside the #[cfg(not(test))] block and
rename the parameter to _interval, since both are used only where jemalloc is
active. Fixes unused-imports and unused-variables errors in test builds.