Skip to content

GH-135: Store trace_id, span_id, and parent_span_id as FixedSizeBinary#136

Merged
frisbeeman merged 9 commits into
mainfrom
GH-135/optimize
May 7, 2026
Merged

GH-135: Store trace_id, span_id, and parent_span_id as FixedSizeBinary#136
frisbeeman merged 9 commits into
mainfrom
GH-135/optimize

Conversation

@frisbeeman
Copy link
Copy Markdown
Member

@frisbeeman frisbeeman commented May 3, 2026

Summary by CodeRabbit

  • New Features

    • Added cAdvisor DaemonSet for container metrics collection
  • Configuration

    • Increased default row-group size to 20,000
    • Added configurable Parquet data page size limit for writes
    • Per-column Parquet encoding overrides and writer tuning enabled
  • Schema Changes

    • Trace/span IDs switched to fixed-length binary storage (fixed-size bytes)
    • Query/formatters/export paths updated to handle and hex-encode fixed-size IDs where needed

@frisbeeman frisbeeman requested review from a team and s-prosvirnin May 3, 2026 02:31
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 3, 2026

Warning

Rate limit exceeded

@frisbeeman has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 44 minutes and 15 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 87fc17c7-f13a-409a-b5a0-4a2b011f4bc8

📥 Commits

Reviewing files that changed from the base of the PR and between a7d7fc9 and 2e04143.

📒 Files selected for processing (1)
  • crates/icegate-queue/src/config.rs

Walkthrough

Migrates trace/span IDs from hex strings to fixed-length binary across ingest, WAL, Parquet, and query; adds per-column Parquet encoding overrides and shared writer-properties builder; exposes Parquet data-page-size config; and adds a cAdvisor DaemonSet + ServiceMonitor.

Changes

Binary ID & Parquet encoding migration

Layer / File(s) Summary
Schema / Data shape
crates/icegate-common/src/schema.rs, crates/icegate-common/src/SCHEMA.md
Trace/span IDs (and nested links/exemplars) changed from string to fixed-length binary: trace_id = Fixed(16), span_id/parent_span_id = Fixed(8); SCHEMA.md updated with migration guide.
Parquet writer API / Encodings
crates/icegate-common/src/parquet_writer.rs
crates/icegate-common/src/parquet_encoding.rs
crates/icegate-common/src/lib.rs
New build_writer_properties(...) builder and exported ColumnEncoding alias; table-specific encoding constant slices (LOGS/SPANS/EVENTS/METRICS) added and documented.
Ingest transform (write-side)
crates/icegate-ingest/src/transform.rs
Transform writes IDs as raw bytes into FixedSizeBinaryBuilder; stops mirroring promoted indexed columns into the attributes MAP (adds skip list); tests updated.
WAL sorting / boundary / cache
crates/icegate-ingest/src/wal/columns.rs
crates/icegate-ingest/src/wal/boundary.rs
crates/icegate-ingest/src/wal/sorter.rs
crates/icegate-ingest/src/wal/test_utils.rs
Sort cache and comparisons now support Fixed(N) via FixedSizeBinaryArray; RowGroupBoundaryValue::FixedBytes added with width validation and lexicographic ordering; tests added/updated.
Shift config & Iceberg writer wiring
crates/icegate-ingest/src/shift/config.rs
crates/icegate-ingest/src/shift/mod.rs
crates/icegate-ingest/src/shift/iceberg_storage.rs
ShiftWriteConfig adds data_page_size_limit_bytes (default 2 MiB) and bumps row_group_size default to 20_000; ShiftJobSpec gets column_encodings; IcebergStorage::new extended to accept and thread encodings + data-page-size into build_writer_properties.
Queue writer & ingest CLI wiring
crates/icegate-queue/src/writer.rs
crates/icegate-queue/src/config.rs
crates/icegate-ingest/src/cli/commands/run.rs
QueueWriter gains per-topic column_encodings and with_column_encodings(...); max_row_group_size default → 20_000; CLI wires topic→encoding maps into QueueWriter and ShiftJobSpec.
Parquet metadata scan & stats pruning
crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
Dictionary decoding and stats-based pruning extended to handle FIXED_LEN_BYTE_ARRAY dictionary pages and Statistics::FixedLenByteArray; new tests added.
Query planner, predicates, formatters
crates/icegate-query/src/logql/datafusion/planner.rs
crates/icegate-query/src/loki/predicate.rs
crates/icegate-query/src/loki/formatters.rs
crates/icegate-query/src/tempo/formatters.rs
crates/icegate-query/src/tempo/trace_by_id.rs
Planner/predicate layers special-case trace_id/span_id: hex-decode and validate to produce typed FixedSizeBinary literals/predicates or fold/omit invalid matches; formatters read fixed-binary IDs and hex-encode for labels/JSON while proto paths pass raw bytes; trace-by-id decodes URL hex into binary filter.
TraceQL / search plumbing
crates/icegate-query/src/traceql/datafusion/planner.rs
crates/icegate-query/src/traceql/datafusion/selectors.rs
Stage-1 returns Vec<Vec<u8>>, stage-2 accepts binary trace IDs and builds FixedSizeBinary(16) IN predicates; literal_to_scalar decodes TraceID/SpanID hex into typed FixedSizeBinary scalars; root-span logic updated for null or all-zero parent sentinel.
Tests, benches, docs & config
crates/icegate-query/tests/**, crates/icegate-query/benches/common/harness.rs, crates/icegate-common/Cargo.toml, config/helm/icegate/values.yaml, config/helm/icegate/templates/configmap-ingest.yaml
Extensive test fixtures migrated to FixedSizeBinaryBuilder; new test helpers and tests added; two small Loki label-values tests ignored pending dict fallback; Helm values bumped (rowGroupSize→20000, maxRecordBatchesPerTask→192) and ConfigMap exposes data_page_size_limit_bytes.

Container metrics (cAdvisor) deployment

Layer / File(s) Summary
DaemonSet / Service / SA
config/kustomize/base/cadvisor/daemonset.yaml
.../service.yaml
.../serviceaccount.yaml
Adds cAdvisor DaemonSet (gcr.io/cadvisor:v0.55.1), headless Service exposing port 8080, and ServiceAccount in observability namespace.
ServiceMonitor / relabeling
config/kustomize/base/cadvisor/servicemonitor.yaml
New ServiceMonitor scrapes /metrics every 30s, applies relabeling to set node/instance, rebuilds core labels from cAdvisor container labels, drops low-value metrics, and includes CFS orphan UID extraction rules.
Kustomize & Prometheus values
config/kustomize/base/kustomization.yaml
config/kustomize/base/kustomization.yaml (base)
config/kustomize/base/values-prometheus.yaml
Adds cadvisor to base kustomization and disables kubelet cAdvisor scraping in Prometheus values (replaces earlier relabeling location).

Sequence Diagram

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • icegatetech/icegate#118 — overlaps on shift/Spans/WAL job plumbing that this change extends with column encodings.
  • icegatetech/icegate#45 — modifies trace/span ID handling and related schema/transform/planner code (potentially conflicting approaches).
  • icegatetech/icegate#138 — related Parquet writer/config plumbing and shift/ingest writer properties changes.

Suggested reviewers

  • s-prosvirnin
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch GH-135/optimize

@frisbeeman frisbeeman changed the title GH-135: Store trace_id, span_id, and parent_span_id as `FixedSi… GH-135: Store trace_id, span_id, and parent_span_id as FixedSizeBinary May 3, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/icegate-ingest/src/wal/boundary.rs (1)

136-145: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject mixed-width FixedBytes components during structure validation.

Two FixedBytes values with different byte lengths are currently treated as structurally compatible and then compared lexicographically. For fixed-width IDs that accepts impossible boundary metadata and can validate a corrupt range after deserialization. Please fail fast when both sides are FixedBytes but their lengths differ.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-ingest/src/wal/boundary.rs` around lines 136 - 145, The
structural compatibility check in the match over (&self.value, &other.value)
must reject cases where both values are FixedBytes with different lengths:
inside the branch that currently checks boundary_value_discriminant(left) !=
boundary_value_discriminant(right) add an additional early error for the case
when both variants are FixedBytes but left.len() != right.len() (or the
appropriate length accessor on the FixedBytes variant) and return
Err(IngestError::Shift(...)) describing mismatched fixed byte widths; keep using
the same error type/format as the surrounding code and reference the FixedBytes
variant, boundary_value_discriminant, and the same component index context in
the message.
crates/icegate-query/src/logql/datafusion/planner.rs (2)

1240-1247: ⚠️ Potential issue | 🟠 Major

Regex matchers still target the raw binary column without type conversion.

For trace_id and span_id (which are FixedSizeBinary columns), the code at lines 1240–1247 calls regexp_like(col_expr, ...) directly on the raw binary column without converting it to a hex string first. While equality matchers properly use the typed val variable, regex matchers bypass it entirely. This causes a type mismatch: regexp_like() expects string arguments, not binary. Either hex-encode/cast the binary columns before passing them to regexp_like(), or reject regex matchers (=~, !~) for these fixed-binary labels explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/src/logql/datafusion/planner.rs` around lines 1240 -
1247, Regex matchers (MatchOp::Re / MatchOp::Nre) call
datafusion::functions::regex::regexp_like() directly on col_expr (the raw
FixedSizeBinary columns like trace_id/span_id) which causes a type mismatch;
convert those binary columns to a string representation (hex-encode or cast to
UTF-8) before calling regexp_like, using the same typed value used by equality
matchers (the val variable / matcher.value) or explicitly reject =~ / !~ for
fixed-binary labels. Locate the MatchOp::Re and MatchOp::Nre branches, wrap
col_expr with the binary->string conversion (or add a guard that returns an
error for regex on FixedSizeBinary) and pass the converted expression and
matcher.value to regexp_like (and call .not() for Nre).

1225-1239: ⚠️ Potential issue | 🟠 Major

Fix != operator semantics on invalid trace_id/span_id values.

When hex decode/length validation fails, both = and != currently compare against FixedSizeBinary(..., None). In DataFusion, col != NULL evaluates to NULL and filters out all rows, making trace_id != "badhex" match nothing instead of all valid IDs. Move the invalid-input check before val construction and apply different fallback expressions: for Eq use NULL (match nothing), for Neq use a row-matching expression (match all valid rows). Also note that regex matchers on binary columns (trace_id/span_id) require handling separate from the lit(ScalarValue) path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/src/logql/datafusion/planner.rs` around lines 1225 -
1239, The current logic builds val first then uses it for both Eq and Neq,
causing `col != NULL` to filter out all rows; instead, for COL_TRACE_ID and
COL_SPAN_ID (identified by mapped_label, COL_TRACE_ID, COL_SPAN_ID) validate
hex::decode + length before constructing val and branch on MatchOp: if decode is
invalid then for MatchOp::Eq return an expression that matches nothing (i.e.
compare against NULL/other "no-match" expression) while for MatchOp::Neq return
an expression that matches all valid rows (e.g. a truthy row-matching expression
such as an is-not-null or always-true predicate for that binary column); only
when decode+length are valid construct lit(ScalarValue::FixedSizeBinary(...,
Some(bytes))) and use col_expr.eq(...) / col_expr.not_eq(...). Also handle regex
matchers for binary columns separately from the lit(ScalarValue) path.
🧹 Nitpick comments (5)
crates/icegate-query/src/engine/provider/scan.rs (1)

199-203: Confirm that row_selection overhead is justified across all workloads.

Enabling row selection parses the Parquet page index to filter row ranges within row groups. Per iceberg-rust docs, this overhead can outweigh the benefits: "parsing the parquet page index…can be slow enough that parsing the page index outweighs any gains from the reduced number of rows that need scanning."

Since with_row_selection_enabled(true) is unconditional (applies to every scan regardless of predicate selectivity), verify via benchmarks or profiling that the page index parsing overhead justifies enabling it for all queries.

Also confirm that iceberg-rust's page index evaluator correctly handles FixedSizeBinary column predicates. The codebase creates FixedSizeBinary predicates extensively for trace_id (16 B), span_id (8 B), and parent_span_id (8 B); ensure these benefit from page-level filtering rather than falling back to full row-group inclusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/src/engine/provider/scan.rs` around lines 199 - 203,
Unconditionally enabling row selection via
scan_builder.with_row_selection_enabled(true) may add page-index parsing
overhead that isn't always beneficial; run targeted benchmarks/profiling
comparing scans with and without row selection (use ICEBERG_SCAN_BATCH_SIZE as
in the current build) to determine overall benefit and gate the call if needed,
and separately verify iceberg-rust's page index evaluator correctly supports
FixedSizeBinary predicates by testing predicate pushdown for FixedSizeBinary
columns (trace_id, span_id, parent_span_id) to ensure they produce page-level
filtering rather than forcing full row-group reads; if benchmarks show
regressions, make row selection conditional (e.g., based on predicate
selectivity estimation or a config flag) and add unit/integration tests
confirming FixedSizeBinary predicate handling.
crates/icegate-query/tests/tempo/harness.rs (1)

219-225: ⚡ Quick win

Seed at least one non-null parent_span_id here.

parent_span_id is part of the fixed-size-binary migration and is also covered by the shared Parquet encoding overrides, but this fixture still writes three nulls. That means these Tempo tests never serialize a real parent span ID through the new path, so a regression there would slip by unnoticed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/tests/tempo/harness.rs` around lines 219 - 225, The test
currently creates parent_span_id as three null FixedSizeBinary entries using
FixedSizeBinaryArray::try_from_sparse_iter_with_size and repeat_n(None, 3);
change the iterator so at least one element is Some(&[u8;8]) (an 8-byte slice)
and the other entries remain None (e.g., iterator over [Some(bytes8), None,
None]) so the FixedSizeBinaryArray contains at least one real parent span ID;
update the call to FixedSizeBinaryArray::try_from_sparse_iter_with_size
accordingly (keeping the size 8) so the Tempo tests serialize a real
parent_span_id through the migration path.
crates/icegate-ingest/src/shift/config.rs (1)

290-294: ⚡ Quick win

Add a unit test for data_page_size_limit_bytes == 0.

This guard is new and the Helm chart now exposes the field, but the test module does not pin it yet. A tiny regression test here would keep the config validation and chart wiring from drifting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-ingest/src/shift/config.rs` around lines 290 - 294, Add a unit
test that exercises validation when write.data_page_size_limit_bytes == 0:
create a config instance (or builder) with write.data_page_size_limit_bytes set
to 0, call the validation method (the impl that returns Result, e.g., validate
or the method that produced Err(IngestError::Config)), and assert it returns
Err(IngestError::Config) and that the error message contains
"data_page_size_limit_bytes must be greater than zero"; name the test something
like test_data_page_size_limit_bytes_zero to live in the existing test module so
future changes to the Helm chart/field are caught.
crates/icegate-common/src/parquet_encoding.rs (1)

135-165: ⚡ Quick win

DELTA_BYTE_ARRAY on FIXED_LEN_BYTE_ARRAY is spec-valid; test coverage is the remaining gap

The official Apache Parquet encoding table explicitly lists DELTA_BYTE_ARRAY = 7 as supported for both BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY, so the encoding choices for trace_id, span_id, and parent_span_id are spec-compliant. The spec also notes that "even for FIXED_LEN_BYTE_ARRAY, all lengths are encoded despite the redundancy" — this is by-spec design and does not affect correctness.

The gap is that none of the test suites (neither this module nor writer.rs nor iceberg_storage.rs) exercise a FixedSizeBinary column with an active DELTA_BYTE_ARRAY override. Consider adding at least one test here (or in writer.rs) that writes a FixedSizeBinary column with dictionary disabled + DELTA_BYTE_ARRAY and reads it back, verifying parquet-rs's encoder and decoder agree on the binary payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-common/src/parquet_encoding.rs` around lines 135 - 165, Add a
regression test that exercises a FixedSizeBinary column using DELTA_BYTE_ARRAY
encoding so we verify parquet-rs can encode/decode that combination: create a
small Parquet write+read roundtrip (either as a new test in this module or in
writer.rs) that defines a FixedSizeBinary column (FixedSizeBinary type),
disables dictionary encoding for that column, sets the encoding override to
Encoding::DELTA_BYTE_ARRAY (matching the SPANS_COLUMN_ENCODINGS use of
DELTA_BYTE_ARRAY for trace_id/span_id/parent_span_id), writes several rows with
varying binary values, then reads them back and asserts the binary payloads
match; ensure the test fails if the encoder/decoder disagree.
crates/icegate-queue/src/writer.rs (1)

466-479: ⚡ Quick win

Add a roundtrip test for DELTA_BYTE_ARRAY on a FixedSizeBinary column

The encoding logic looks correct, but the existing writer tests (writer_emits_bloom_filter_for_columns_listed_for_topic, etc.) all use StringArray/Int32Array and the encoding map is never populated in any test. The central claim of this PR — that DELTA_BYTE_ARRAY compresses FIXED_LEN_BYTE_ARRAY trace_id/span_id columns more efficiently than dictionary — is not exercised at all in the test suite.

The Parquet spec confirms DELTA_BYTE_ARRAY (encoding 7) is valid for both BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY. However, the parquet-rs crate's encoder for FIXED_LEN_BYTE_ARRAY with DELTA_BYTE_ARRAY should be verified via a roundtrip test. A minimal test would:

  1. Build a schema with a FixedSizeBinary(16) column (Arrow DataType::FixedSizeBinary(16)).
  2. Populate it with representative 16-byte values.
  3. Write via QueueWriter with column_encodings set to &[("trace_id", Encoding::DELTA_BYTE_ARRAY)].
  4. Read back and assert the values match, confirming parquet-rs encodes/decodes the path without error or silent fallback.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-queue/src/writer.rs` around lines 466 - 479, Add a roundtrip
unit test that verifies parquet-rs actually encodes/decodes a FixedSizeBinary
column with DELTA_BYTE_ARRAY: create an Arrow schema containing
DataType::FixedSizeBinary(16) for a "trace_id" column, populate it with several
16-byte values, construct a QueueWriter with column_encodings mapping the topic
to &[("trace_id", Encoding::DELTA_BYTE_ARRAY)], write the record batch to an
in-memory sink, read the written Parquet back (using the same reader path the
codebase uses) and assert the decoded FixedSizeBinary values equal the
originals; put this under a clear test name (e.g.,
writer_roundtrip_delta_byte_array_fixed_size_binary) so CI covers the new
encoding path for QueueWriter and column_encodings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/icegate-common/src/schema.rs`:
- Around line 89-102: Update SCHEMA.md to reflect that trace_id and span_id are
stored as fixed-length binary types (PrimitiveType::Fixed(16) and
PrimitiveType::Fixed(8)) rather than VARCHAR/STRING to match the Rust schema
where NestedField::optional(..., "trace_id",
Type::Primitive(PrimitiveType::Fixed(16))) and NestedField::optional(...,
"span_id", Type::Primitive(PrimitiveType::Fixed(8))) are defined; add a
Migration Guide section that explains the current upgrade_schemas() behavior (it
detects schema diffs and returns an error) and instructs operators to perform
the manual destructive migration by dropping and recreating affected tables
using the documented command (icegate-maintain migrate create), plus a short
checklist of which tables (logs, spans, events, metrics) are affected and a
recommended backup step before dropping tables.

In `@crates/icegate-ingest/src/wal/columns.rs`:
- Around line 121-131: The FixedSizeBinary branch (when matching
PrimitiveType::Fixed) must validate the actual byte width: after downcasting to
FixedSizeBinaryArray (the current
as_any().downcast_ref::<FixedSizeBinaryArray>() call), read its data_type() to
extract the declared width (DataType::FixedSizeBinary(n_actual)) and compare
n_actual to the schema width (the PrimitiveType::Fixed(n_expected) value); if
they differ return an Err(IngestError::Shift(...)) similar to the existing error
message using column_name and context, otherwise proceed to construct
CachedSortColumn::FixedBytes with the cloned array, sort_column.descending and
sort_column.nulls_first. Ensure you reference PrimitiveType::Fixed,
FixedSizeBinaryArray, CachedSortColumn::FixedBytes, IngestError::Shift,
sort_column and column in the fix.

In `@crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs`:
- Around line 259-275: eval_binary_on_stats currently only inspects
Statistics::ByteArray causing FIXED_LEN_BYTE_ARRAY columns (trace_id/span_id) to
bypass stats pruning; update eval_binary_on_stats to also handle the
fixed-length case by matching Statistics::FixedLenByteArray (or whatever the
parquet Statistics variant for fixed-length bytes is) and decode/compare its
min/max values the same way you handle ByteArray literals (apply the same
hex/string conversion logic used for FIXED_LEN_BYTE_ARRAY decoding such as in
decode_plain_fixed_len_byte_array_values and the
PhysicalType::FIXED_LEN_BYTE_ARRAY branch), so predicates against
trace_id/span_id consult stats correctly and pruning does not regress when no
dictionary page is present.

In `@crates/icegate-query/src/loki/predicate.rs`:
- Around line 21-31: datum_for_column currently accepts any hex string and
converts to Datum::fixed, but doesn't enforce storage widths; update
datum_for_column to decode hex and then validate byte length: for COL_TRACE_ID
require exactly 16 bytes, and for COL_SPAN_ID and COL_PARENT_SPAN_ID require
exactly 8 bytes, returning None when decode fails or length is wrong; keep other
columns returning Datum::string as before and use the existing Datum::fixed(...)
constructor for valid fixed-length byte slices.

In `@crates/icegate-query/src/tempo/formatters.rs`:
- Around line 91-97: The parent_span_id value is being serialized even when it's
the all-zero sentinel for root spans; modify the parent handling so that when
evaluating parent_ids (used to build parent_hex passed to build_span_value) you
treat either null or an all-zero 8-byte value as "no parent". Concretely: in the
code that computes parent_hex (the parent_ids.filter(...).map(...) expression),
retrieve the raw parent byte slice (parent_ids.value(row)), check whether it is
all zeros (compare to an 8-bytes-of-zero array or use .iter().all(|b| *b == 0)),
and only call hex::encode and pass a Some(hex) to build_span_value when it is
non-zero; otherwise pass None (i.e. keep parent_hex as None) so build_span_value
and OTLP output will suppress the zero-filled parent_span_id.

In `@crates/icegate-query/src/traceql/datafusion/planner_tests.rs`:
- Around line 25-35: Tests currently use padded ASCII labels via pad_id which
produce non-hex IDs (e.g., "t1-trace-1"), preventing the hex-decode path from
being exercised; change the fixtures to use real hex-encoded trace/span IDs and
update helpers accordingly: replace uses of pad_id(...) for trace/span fixtures
with a new or modified helper (referenced as pad_id or hex_id) that accepts a
hex string and decodes it into a fixed-size [u8; N] byte array (or add a
separate hex_decode helper) so the test selectors like traceID = "..." are valid
hex and the end-to-end hex decode path (trace/span ID matching and tenant
isolation) is exercised. Ensure the new helper validates hex length matches N
and returns the decoded fixed-size array, and update any test vectors that
previously used ASCII labels to use the corresponding hex strings.

---

Outside diff comments:
In `@crates/icegate-ingest/src/wal/boundary.rs`:
- Around line 136-145: The structural compatibility check in the match over
(&self.value, &other.value) must reject cases where both values are FixedBytes
with different lengths: inside the branch that currently checks
boundary_value_discriminant(left) != boundary_value_discriminant(right) add an
additional early error for the case when both variants are FixedBytes but
left.len() != right.len() (or the appropriate length accessor on the FixedBytes
variant) and return Err(IngestError::Shift(...)) describing mismatched fixed
byte widths; keep using the same error type/format as the surrounding code and
reference the FixedBytes variant, boundary_value_discriminant, and the same
component index context in the message.

In `@crates/icegate-query/src/logql/datafusion/planner.rs`:
- Around line 1240-1247: Regex matchers (MatchOp::Re / MatchOp::Nre) call
datafusion::functions::regex::regexp_like() directly on col_expr (the raw
FixedSizeBinary columns like trace_id/span_id) which causes a type mismatch;
convert those binary columns to a string representation (hex-encode or cast to
UTF-8) before calling regexp_like, using the same typed value used by equality
matchers (the val variable / matcher.value) or explicitly reject =~ / !~ for
fixed-binary labels. Locate the MatchOp::Re and MatchOp::Nre branches, wrap
col_expr with the binary->string conversion (or add a guard that returns an
error for regex on FixedSizeBinary) and pass the converted expression and
matcher.value to regexp_like (and call .not() for Nre).
- Around line 1225-1239: The current logic builds val first then uses it for
both Eq and Neq, causing `col != NULL` to filter out all rows; instead, for
COL_TRACE_ID and COL_SPAN_ID (identified by mapped_label, COL_TRACE_ID,
COL_SPAN_ID) validate hex::decode + length before constructing val and branch on
MatchOp: if decode is invalid then for MatchOp::Eq return an expression that
matches nothing (i.e. compare against NULL/other "no-match" expression) while
for MatchOp::Neq return an expression that matches all valid rows (e.g. a truthy
row-matching expression such as an is-not-null or always-true predicate for that
binary column); only when decode+length are valid construct
lit(ScalarValue::FixedSizeBinary(..., Some(bytes))) and use col_expr.eq(...) /
col_expr.not_eq(...). Also handle regex matchers for binary columns separately
from the lit(ScalarValue) path.

---

Nitpick comments:
In `@crates/icegate-common/src/parquet_encoding.rs`:
- Around line 135-165: Add a regression test that exercises a FixedSizeBinary
column using DELTA_BYTE_ARRAY encoding so we verify parquet-rs can encode/decode
that combination: create a small Parquet write+read roundtrip (either as a new
test in this module or in writer.rs) that defines a FixedSizeBinary column
(FixedSizeBinary type), disables dictionary encoding for that column, sets the
encoding override to Encoding::DELTA_BYTE_ARRAY (matching the
SPANS_COLUMN_ENCODINGS use of DELTA_BYTE_ARRAY for
trace_id/span_id/parent_span_id), writes several rows with varying binary
values, then reads them back and asserts the binary payloads match; ensure the
test fails if the encoder/decoder disagree.

In `@crates/icegate-ingest/src/shift/config.rs`:
- Around line 290-294: Add a unit test that exercises validation when
write.data_page_size_limit_bytes == 0: create a config instance (or builder)
with write.data_page_size_limit_bytes set to 0, call the validation method (the
impl that returns Result, e.g., validate or the method that produced
Err(IngestError::Config)), and assert it returns Err(IngestError::Config) and
that the error message contains "data_page_size_limit_bytes must be greater than
zero"; name the test something like test_data_page_size_limit_bytes_zero to live
in the existing test module so future changes to the Helm chart/field are
caught.

In `@crates/icegate-query/src/engine/provider/scan.rs`:
- Around line 199-203: Unconditionally enabling row selection via
scan_builder.with_row_selection_enabled(true) may add page-index parsing
overhead that isn't always beneficial; run targeted benchmarks/profiling
comparing scans with and without row selection (use ICEBERG_SCAN_BATCH_SIZE as
in the current build) to determine overall benefit and gate the call if needed,
and separately verify iceberg-rust's page index evaluator correctly supports
FixedSizeBinary predicates by testing predicate pushdown for FixedSizeBinary
columns (trace_id, span_id, parent_span_id) to ensure they produce page-level
filtering rather than forcing full row-group reads; if benchmarks show
regressions, make row selection conditional (e.g., based on predicate
selectivity estimation or a config flag) and add unit/integration tests
confirming FixedSizeBinary predicate handling.

In `@crates/icegate-query/tests/tempo/harness.rs`:
- Around line 219-225: The test currently creates parent_span_id as three null
FixedSizeBinary entries using
FixedSizeBinaryArray::try_from_sparse_iter_with_size and repeat_n(None, 3);
change the iterator so at least one element is Some(&[u8;8]) (an 8-byte slice)
and the other entries remain None (e.g., iterator over [Some(bytes8), None,
None]) so the FixedSizeBinaryArray contains at least one real parent span ID;
update the call to FixedSizeBinaryArray::try_from_sparse_iter_with_size
accordingly (keeping the size 8) so the Tempo tests serialize a real
parent_span_id through the migration path.

In `@crates/icegate-queue/src/writer.rs`:
- Around line 466-479: Add a roundtrip unit test that verifies parquet-rs
actually encodes/decodes a FixedSizeBinary column with DELTA_BYTE_ARRAY: create
an Arrow schema containing DataType::FixedSizeBinary(16) for a "trace_id"
column, populate it with several 16-byte values, construct a QueueWriter with
column_encodings mapping the topic to &[("trace_id",
Encoding::DELTA_BYTE_ARRAY)], write the record batch to an in-memory sink, read
the written Parquet back (using the same reader path the codebase uses) and
assert the decoded FixedSizeBinary values equal the originals; put this under a
clear test name (e.g., writer_roundtrip_delta_byte_array_fixed_size_binary) so
CI covers the new encoding path for QueueWriter and column_encodings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 16436743-3325-4b81-b4a1-726ae55e7926

📥 Commits

Reviewing files that changed from the base of the PR and between b671fdd and d534780.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (32)
  • config/helm/icegate/templates/configmap-ingest.yaml
  • config/helm/icegate/values.yaml
  • crates/icegate-common/src/catalog/config.rs
  • crates/icegate-common/src/lib.rs
  • crates/icegate-common/src/parquet_encoding.rs
  • crates/icegate-common/src/schema.rs
  • crates/icegate-ingest/src/cli/commands/run.rs
  • crates/icegate-ingest/src/shift/config.rs
  • crates/icegate-ingest/src/shift/iceberg_storage.rs
  • crates/icegate-ingest/src/shift/mod.rs
  • crates/icegate-ingest/src/transform.rs
  • crates/icegate-ingest/src/wal/boundary.rs
  • crates/icegate-ingest/src/wal/columns.rs
  • crates/icegate-ingest/src/wal/sorter.rs
  • crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
  • crates/icegate-query/src/engine/provider/scan.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/loki/formatters.rs
  • crates/icegate-query/src/loki/predicate.rs
  • crates/icegate-query/src/tempo/formatters.rs
  • crates/icegate-query/src/tempo/trace_by_id.rs
  • crates/icegate-query/src/traceql/datafusion/planner.rs
  • crates/icegate-query/src/traceql/datafusion/planner_tests.rs
  • crates/icegate-query/src/traceql/datafusion/selectors.rs
  • crates/icegate-query/tests/loki/grouping.rs
  • crates/icegate-query/tests/loki/harness.rs
  • crates/icegate-query/tests/loki/trace.rs
  • crates/icegate-query/tests/loki/unwrap.rs
  • crates/icegate-query/tests/tempo/harness.rs
  • crates/icegate-query/tests/tempo/tags.rs
  • crates/icegate-queue/src/config.rs
  • crates/icegate-queue/src/writer.rs

Comment thread crates/icegate-common/src/schema.rs
Comment thread crates/icegate-ingest/src/wal/columns.rs Outdated
Comment thread crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
Comment thread crates/icegate-query/src/loki/predicate.rs
Comment thread crates/icegate-query/src/tempo/formatters.rs
Comment thread crates/icegate-query/src/traceql/datafusion/planner_tests.rs
…with `FixedSizeBinary` support for efficient Parquet storage, query optimization, and validation. Includes schema changes, migration guide, planner updates, and test coverage.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
crates/icegate-common/src/SCHEMA.md (1)

497-499: ⚡ Quick win

Add the Migration Guide to the Table of Contents.

The new section starts at Line 497 but isn’t discoverable from the ToC near Lines 11-20. Please add a Migration Guide entry there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-common/src/SCHEMA.md` around lines 497 - 499, The Table of
Contents in SCHEMA.md is missing an entry for the new "Migration Guide" section;
update the ToC near the top to include a "Migration Guide" link that points to
the "Migration Guide" heading (which contains the "Trace and Span ID storage
change (GH-135)" subsection) so readers can navigate to that new section from
the ToC. Ensure the ToC entry uses the same heading text and markdown link style
as other entries in the file.
crates/icegate-query/src/traceql/datafusion/planner_tests.rs (1)

566-570: ⚡ Quick win

Inline padded construction duplicates pad_id — use the helper instead.

Five places across three test functions manually reproduce the pad_id byte-copy logic instead of calling it. If pad_id semantics change, these diverge silently.

♻️ Replace all five occurrences (example shown for `child_only_filter_returns_full_trace_so_root_is_present`; same pattern applies to `limit_caps_traces_not_span_rows` and `spss_one_keeps_root_over_child`)
-    let trace_a = {
-        let mut padded = [0u8; 16];
-        padded[..7].copy_from_slice(b"trace-A");
-        padded.to_vec()
-    };
+    let trace_a = pad_id::<16>("trace-A").to_vec();
     // ...
-    let trace_b = {
-        let mut padded = [0u8; 16];
-        padded[..7].copy_from_slice(b"trace-B");
-        padded.to_vec()
-    };
+    let trace_b = pad_id::<16>("trace-B").to_vec();

Also applies to: 583-587, 625-629, 691-695, 696-700

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/src/traceql/datafusion/planner_tests.rs` around lines
566 - 570, Replace the manual 16-byte padding logic used to build test trace IDs
with the shared helper pad_id to avoid duplicated semantics: locate the
test-local constructions (e.g., the trace_a creation in
child_only_filter_returns_full_trace_so_root_is_present and the similar blocks
in limit_caps_traces_not_span_rows and spss_one_keeps_root_over_child) and call
pad_id(b"trace-A") (or the appropriate raw id bytes) instead of creating a local
padded array and copy_from_slice; ensure the return type matches (Vec<u8> or
&[u8]) expected by the tests and update the five occurrences accordingly so all
tests use pad_id consistently.
crates/icegate-query/benches/common/harness.rs (1)

245-259: ⚡ Quick win

Extract a shared helper to eliminate the triplicated trace/span builder block.

The identical ~11-line decode → build → append → finish pattern appears in all three writer functions. Since this schema is actively evolving (this PR is one step in a migration series), keeping it in three places means all three need updating on the next change. A small helper removes that risk and is also the right place to switch to with_capacity, which is consistent with how the production ingest path allocates these builders (FixedSizeBinaryBuilder::with_capacity(total_records, 16) in crates/icegate-ingest/src/transform.rs).

♻️ Proposed refactor
+// Helper used by all three writer functions
+fn build_trace_span_arrays(count: usize) -> (ArrayRef, ArrayRef) {
+    let trace_id_bytes = hex::decode("0102030405060708090a0b0c0d0e0f10").expect("trace_id hex");
+    let span_id_bytes  = hex::decode("0102030405060708").expect("span_id hex");
+    let mut trace_id_builder = FixedSizeBinaryBuilder::with_capacity(count, 16);
+    let mut span_id_builder  = FixedSizeBinaryBuilder::with_capacity(count, 8);
+    for _ in 0..count {
+        trace_id_builder.append_value(&trace_id_bytes).expect("trace_id length 16");
+        span_id_builder.append_value(&span_id_bytes).expect("span_id length 8");
+    }
+    (
+        Arc::new(trace_id_builder.finish()) as ArrayRef,
+        Arc::new(span_id_builder.finish()) as ArrayRef,
+    )
+}

Then in each writer:

-    // trace_id / span_id are FIXED_LEN_BYTE_ARRAY (16 / 8) in the logs
-    // schema; decode the hex fixtures into raw bytes once and reuse them
-    // for every row.
-    let trace_id_bytes = hex::decode("0102030405060708090a0b0c0d0e0f10").expect("trace_id hex");
-    let span_id_bytes = hex::decode("0102030405060708").expect("span_id hex");
-    let mut trace_id_builder = FixedSizeBinaryBuilder::new(16);
-    let mut span_id_builder = FixedSizeBinaryBuilder::new(8);
-
-    for _ in 0..count {
-        trace_id_builder.append_value(&trace_id_bytes).expect("trace_id length 16");
-        span_id_builder.append_value(&span_id_bytes).expect("span_id length 8");
-    }
-
-    let trace_id_arr: ArrayRef = Arc::new(trace_id_builder.finish());
-    let span_id_arr: ArrayRef = Arc::new(span_id_builder.finish());
+    let (trace_id_arr, span_id_arr) = build_trace_span_arrays(count);

Also applies to: 382-396, 509-523

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-query/benches/common/harness.rs` around lines 245 - 259,
Extract a small helper (e.g., fn build_fixed_binary_array(hex: &str, count:
usize, byte_width: usize) -> ArrayRef) to replace the repeated
decode→build→append→finish logic found around the trace/span builder blocks
(uses FixedSizeBinaryBuilder currently constructed via new in those writers);
inside the helper decode the hex once, create the builder with
FixedSizeBinaryBuilder::with_capacity(count, byte_width), append the decoded
bytes count times, finish and return Arc::new(builder.finish()), then call this
helper for both trace_id and span_id in each writer to remove duplication and
switch to with_capacity.
crates/icegate-ingest/src/wal/columns.rs (1)

534-552: ⚡ Quick win

Add a descending-sort test case for the FixedBytes path.

All fixed_bytes_descriptor calls in the new tests pass descending=false. The compare_option_ord inversion is verified for Utf8/TimestampMicrosecond elsewhere, but a single descending check here would pin the new plumbing end-to-end. The test helper already accepts descending: bool, so the addition is trivial.

✅ Suggested addition to sort_column_cache_supports_fixed_bytes_sort or as a new test
+    #[test]
+    fn sort_column_cache_supports_fixed_bytes_sort_descending() {
+        let trace_a = [b'a', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
+        let trace_b = [b'b', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
+
+        // Row 0 = trace_b, row 1 = trace_a; descending → b > a, so row 0 sorts first.
+        let batch = fixed_bytes_batch(&[Some(trace_b), Some(trace_a)]);
+        let descriptor = fixed_bytes_descriptor(true, false); // descending=true
+        let cache = SortColumnCache::try_new(&batch, &descriptor, "fixed-bytes-desc").expect("cache");
+
+        // In descending order, trace_b (row 0) ranks before trace_a (row 1).
+        assert_eq!(cache.compare_indices(0, 1), Ordering::Less);
+        assert_eq!(cache.compare_indices(1, 0), Ordering::Greater);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/icegate-ingest/src/wal/columns.rs` around lines 534 - 552, The test
sort_column_cache_supports_fixed_bytes_sort only covers ascending order; add a
descending-case (either extend the test or a new one) by creating a descriptor
with descending=true via fixed_bytes_descriptor(true, true), build a
SortColumnCache with SortColumnCache::try_new using the same fixed_bytes_batch
input, and assert that compare_indices yields reversed Ordering results (e.g.,
row 0 vs 1 is Less instead of Greater) and that stable tie-breaking still
behaves correctly for equal rows; reference the functions fixed_bytes_batch,
fixed_bytes_descriptor, SortColumnCache::try_new, and compare_indices to locate
and implement the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/icegate-ingest/src/shift/config.rs`:
- Line 210: ShiftWriteConfig::row_group_size default (20_000) is out of sync
with QueueCommonConfig::max_row_group_size (16_384); update the default for
QueueCommonConfig::max_row_group_size to 20_000 so both defaults match. Locate
the QueueCommonConfig struct and change its max_row_group_size default value to
20_000, and run/build tests to ensure no other code assumes the old limit.

In `@crates/icegate-query/src/traceql/datafusion/planner_tests.rs`:
- Around line 852-858: The test matching_tenant_trace_id_filter_returns_rows
uses a weak assertion assert!(total > 0) which should be tightened to assert the
exact expected row count; update the assertion in that test (which calls
fixture_session(), make_query_ctx("t1") and run_and_count(...)) to
assert_eq!(total, 1, "expected exactly one span for trace_id
pad_id::<16>(\"t1-trace-1\") in tenant t1") so it fails if extra rows (e.g.,
from tenant leakage) are returned.

---

Nitpick comments:
In `@crates/icegate-common/src/SCHEMA.md`:
- Around line 497-499: The Table of Contents in SCHEMA.md is missing an entry
for the new "Migration Guide" section; update the ToC near the top to include a
"Migration Guide" link that points to the "Migration Guide" heading (which
contains the "Trace and Span ID storage change (GH-135)" subsection) so readers
can navigate to that new section from the ToC. Ensure the ToC entry uses the
same heading text and markdown link style as other entries in the file.

In `@crates/icegate-ingest/src/wal/columns.rs`:
- Around line 534-552: The test sort_column_cache_supports_fixed_bytes_sort only
covers ascending order; add a descending-case (either extend the test or a new
one) by creating a descriptor with descending=true via
fixed_bytes_descriptor(true, true), build a SortColumnCache with
SortColumnCache::try_new using the same fixed_bytes_batch input, and assert that
compare_indices yields reversed Ordering results (e.g., row 0 vs 1 is Less
instead of Greater) and that stable tie-breaking still behaves correctly for
equal rows; reference the functions fixed_bytes_batch, fixed_bytes_descriptor,
SortColumnCache::try_new, and compare_indices to locate and implement the
change.

In `@crates/icegate-query/benches/common/harness.rs`:
- Around line 245-259: Extract a small helper (e.g., fn
build_fixed_binary_array(hex: &str, count: usize, byte_width: usize) ->
ArrayRef) to replace the repeated decode→build→append→finish logic found around
the trace/span builder blocks (uses FixedSizeBinaryBuilder currently constructed
via new in those writers); inside the helper decode the hex once, create the
builder with FixedSizeBinaryBuilder::with_capacity(count, byte_width), append
the decoded bytes count times, finish and return Arc::new(builder.finish()),
then call this helper for both trace_id and span_id in each writer to remove
duplication and switch to with_capacity.

In `@crates/icegate-query/src/traceql/datafusion/planner_tests.rs`:
- Around line 566-570: Replace the manual 16-byte padding logic used to build
test trace IDs with the shared helper pad_id to avoid duplicated semantics:
locate the test-local constructions (e.g., the trace_a creation in
child_only_filter_returns_full_trace_so_root_is_present and the similar blocks
in limit_caps_traces_not_span_rows and spss_one_keeps_root_over_child) and call
pad_id(b"trace-A") (or the appropriate raw id bytes) instead of creating a local
padded array and copy_from_slice; ensure the return type matches (Vec<u8> or
&[u8]) expected by the tests and update the five occurrences accordingly so all
tests use pad_id consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cfe6f8ca-ed2a-4092-be07-922d2a6bfac8

📥 Commits

Reviewing files that changed from the base of the PR and between d534780 and ea33118.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • crates/icegate-common/Cargo.toml
  • crates/icegate-common/src/SCHEMA.md
  • crates/icegate-common/src/parquet_encoding.rs
  • crates/icegate-ingest/src/shift/config.rs
  • crates/icegate-ingest/src/wal/boundary.rs
  • crates/icegate-ingest/src/wal/columns.rs
  • crates/icegate-query/benches/common/harness.rs
  • crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/loki/predicate.rs
  • crates/icegate-query/src/tempo/formatters.rs
  • crates/icegate-query/src/traceql/datafusion/planner_tests.rs
  • crates/icegate-query/tests/tempo/harness.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/icegate-common/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/icegate-ingest/src/wal/boundary.rs
  • crates/icegate-query/tests/tempo/harness.rs
  • crates/icegate-common/src/parquet_encoding.rs
  • crates/icegate-query/src/tempo/formatters.rs

Comment thread crates/icegate-ingest/src/shift/config.rs
Comment thread crates/icegate-query/src/traceql/datafusion/planner_tests.rs
…eamline `trace_id` handling, enhance tests, and update `max_row_group_size` config.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/kustomize/base/cadvisor/servicemonitor.yaml`:
- Around line 105-119: The fallback relabel rules that map cgroup `id`→`uid` are
unconditional and can overwrite an existing pod UID; change each of the three
rules so they only run when the real pod UID is absent/empty (the label from
container_label_io_kubernetes_pod_uid). Concretely, add
container_label_io_kubernetes_pod_uid as the first sourceLabel in each rule
(e.g. sourceLabels: [container_label_io_kubernetes_pod_uid, id]) and adjust the
regex to require the first capture to be empty before matching the cgroup path
(so the replacement only triggers when container_label_io_kubernetes_pod_uid is
empty), keeping the same targetLabel: uid and replacement for the second
capture.

In `@crates/icegate-common/src/SCHEMA.md`:
- Around line 539-552: Update the root-span sentinel example in SCHEMA.md so the
hex literal is portable across SQL dialects: replace the underscored form
`X'00000000_00000000'` with a plain 16-hex-digit literal `X'0000000000000000'`
in the sentence that suggests checking `parent_span_id IS NULL OR parent_span_id
= ...` (referencing the `parent_span_id` check in the "Root-span sentinel"
section) and ensure the surrounding wording still indicates this represents 8
zero bytes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc8cf97c-0825-432b-bfd1-bf1e01537cb6

📥 Commits

Reviewing files that changed from the base of the PR and between ea33118 and 71a20f8.

📒 Files selected for processing (12)
  • config/kustomize/base/cadvisor/daemonset.yaml
  • config/kustomize/base/cadvisor/kustomization.yaml
  • config/kustomize/base/cadvisor/service.yaml
  • config/kustomize/base/cadvisor/serviceaccount.yaml
  • config/kustomize/base/cadvisor/servicemonitor.yaml
  • config/kustomize/base/kustomization.yaml
  • config/kustomize/base/values-prometheus.yaml
  • crates/icegate-common/src/SCHEMA.md
  • crates/icegate-ingest/src/wal/columns.rs
  • crates/icegate-query/benches/common/harness.rs
  • crates/icegate-query/src/traceql/datafusion/planner_tests.rs
  • crates/icegate-queue/src/config.rs
✅ Files skipped from review due to trivial changes (4)
  • config/kustomize/base/cadvisor/serviceaccount.yaml
  • crates/icegate-queue/src/config.rs
  • crates/icegate-query/src/traceql/datafusion/planner_tests.rs
  • crates/icegate-ingest/src/wal/columns.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/icegate-query/benches/common/harness.rs

Comment thread config/kustomize/base/cadvisor/servicemonitor.yaml
Comment thread crates/icegate-common/src/SCHEMA.md
Comment thread crates/icegate-common/src/parquet_encoding.rs Outdated
Comment thread crates/icegate-common/src/parquet_encoding.rs Outdated
Comment thread crates/icegate-ingest/src/shift/config.rs
Comment thread crates/icegate-ingest/src/transform.rs Outdated
Comment thread crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs Outdated
…odule, consolidate `build_writer_properties`, and streamline column encoding definitions.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/icegate-common/src/parquet_encoding.rs (1)

102-112: 💤 Low value

Nested links[].trace_id and links[].span_id won't receive the DELTA_BYTE_ARRAY override.

build_writer_properties calls ColumnPath::from(col_name) which creates a single-segment path (e.g. ["trace_id"]). This matches the root-level trace_id (Iceberg field 4) correctly. However the links list struct contains a second trace_id (field ID 33) and span_id (field ID 34) whose physical Parquet paths are multi-segment (e.g. links.list.element.trace_id), so the override is silently skipped for those columns and they fall back to plain+RLE_DICTIONARY. Since span links are uncommon and typically low-volume this is a minor optimization gap, not a correctness issue.

If closing the gap becomes worthwhile, the nested paths can be added to SPANS_COLUMN_ENCODINGS using the appropriate dotted segment form that Iceberg emits for list-of-struct fields.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/icegate-common/src/parquet_encoding.rs` around lines 102 - 112,
SPANS_COLUMN_ENCODINGS currently only lists root-level columns so
build_writer_properties (which uses ColumnPath::from) won't match nested
list-of-struct fields; add entries for the nested links fields using the dotted
multi-segment form Iceberg emits (e.g. "links.list.element.trace_id" and
"links.list.element.span_id") to SPANS_COLUMN_ENCODINGS and assign them
Encoding::DELTA_BYTE_ARRAY so those nested columns receive the intended
override; keep the existing root-level entries intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/icegate-common/src/parquet_encoding.rs`:
- Around line 102-112: SPANS_COLUMN_ENCODINGS currently only lists root-level
columns so build_writer_properties (which uses ColumnPath::from) won't match
nested list-of-struct fields; add entries for the nested links fields using the
dotted multi-segment form Iceberg emits (e.g. "links.list.element.trace_id" and
"links.list.element.span_id") to SPANS_COLUMN_ENCODINGS and assign them
Encoding::DELTA_BYTE_ARRAY so those nested columns receive the intended
override; keep the existing root-level entries intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ca1e05f-fb32-4b46-8449-85fdcafa5ab8

📥 Commits

Reviewing files that changed from the base of the PR and between 71a20f8 and 6b573ab.

📒 Files selected for processing (11)
  • crates/icegate-common/src/lib.rs
  • crates/icegate-common/src/parquet_encoding.rs
  • crates/icegate-common/src/parquet_writer.rs
  • crates/icegate-common/src/schema.rs
  • crates/icegate-ingest/src/shift/iceberg_storage.rs
  • crates/icegate-ingest/src/shift/mod.rs
  • crates/icegate-ingest/src/transform.rs
  • crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
  • crates/icegate-query/tests/tempo/harness.rs
  • crates/icegate-query/tests/tempo/tags.rs
  • crates/icegate-queue/src/writer.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/icegate-queue/src/writer.rs
  • crates/icegate-query/src/engine/metadata_scan/parquet_reader.rs
  • crates/icegate-query/tests/tempo/tags.rs
  • crates/icegate-query/tests/tempo/harness.rs
  • crates/icegate-ingest/src/transform.rs

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'IceGate Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

Benchmark suite Current: 2da0853 Previous: 1972421 Ratio
write_performance/large_batches 2431235279 ns/iter (± 32966219) 501499979 ns/iter (± 211080) 4.85

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/icegate-queue/benches/queue_s3_bench.rs (1)

95-100: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the stale comment to match the new batch size.

The comment on line 95 still says 10,000 records but the actual argument on line 100 is now 21_000.

✏️ Proposed fix
-        // Benchmark 2: Large batches - 10 batches x 10,000 records
+        // Benchmark 2: Large batches - 10 batches x 21,000 records
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/icegate-queue/benches/queue_s3_bench.rs` around lines 95 - 100, Update
the stale comment for the "large_batches" benchmark to reflect the actual batch
size used: change the text that says "10,000 records" to "21,000 records" (or
otherwise indicate "10 batches x 21,000 records") so the comment matches the
call to generate_batches_for_throughput(10, 21_000) inside
group.bench_function("large_batches") and the surrounding benchmark setup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/icegate-queue/benches/queue_s3_bench.rs`:
- Around line 95-100: Update the stale comment for the "large_batches" benchmark
to reflect the actual batch size used: change the text that says "10,000
records" to "21,000 records" (or otherwise indicate "10 batches x 21,000
records") so the comment matches the call to generate_batches_for_throughput(10,
21_000) inside group.bench_function("large_batches") and the surrounding
benchmark setup.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f9a9c896-e50d-4462-b4c4-26c990f63e4b

📥 Commits

Reviewing files that changed from the base of the PR and between 6b573ab and a7d7fc9.

📒 Files selected for processing (14)
  • config/helm/icegate/templates/configmap-ingest.yaml
  • config/helm/icegate/values.yaml
  • crates/icegate-ingest/src/cli/commands/run.rs
  • crates/icegate-ingest/src/shift/config.rs
  • crates/icegate-ingest/src/shift/iceberg_storage.rs
  • crates/icegate-ingest/src/shift/mod.rs
  • crates/icegate-ingest/src/shift/shift_runner.rs
  • crates/icegate-ingest/src/wal/boundary.rs
  • crates/icegate-ingest/src/wal/columns.rs
  • crates/icegate-ingest/src/wal/sorter.rs
  • crates/icegate-ingest/src/wal/test_utils.rs
  • crates/icegate-queue/benches/queue_s3_bench.rs
  • crates/icegate-queue/src/config.rs
  • crates/icegate-queue/src/writer.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/icegate-ingest/src/shift/shift_runner.rs
🚧 Files skipped from review as they are similar to previous changes (10)
  • crates/icegate-queue/src/config.rs
  • config/helm/icegate/templates/configmap-ingest.yaml
  • config/helm/icegate/values.yaml
  • crates/icegate-ingest/src/wal/sorter.rs
  • crates/icegate-ingest/src/shift/mod.rs
  • crates/icegate-ingest/src/cli/commands/run.rs
  • crates/icegate-ingest/src/wal/test_utils.rs
  • crates/icegate-queue/src/writer.rs
  • crates/icegate-ingest/src/shift/config.rs
  • crates/icegate-ingest/src/shift/iceberg_storage.rs

@frisbeeman frisbeeman merged commit 98e67a3 into main May 7, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants