-
Notifications
You must be signed in to change notification settings - Fork 66
Add support for schema key aliases in query engine Parsers #1725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add support for schema key aliases in query engine Parsers #1725
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1725 +/- ##
==========================================
+ Coverage 83.96% 83.98% +0.01%
==========================================
Files 467 468 +1
Lines 135720 136391 +671
==========================================
+ Hits 113960 114544 +584
- Misses 21226 21313 +87
Partials 534 534
🚀 New features to boost your workflow:
|
| self | ||
| } | ||
|
|
||
| pub fn with_alias(mut self, alias: &str, canonical_key: &str) -> ParserMapSchema { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have key in the name with_key_alias?
Also might be nice for readability if you could do everything in one call...
let mut log_record_schema = ParserMapSchema::new()
.set_default_map_key("Attributes")
.with_key_definition("Timestamp", ParserMapKeySchema::DateTime)
.with_key_definition("ObservedTimestamp", ParserMapKeySchema::DateTime)
.with_key_definition("SeverityNumber", ParserMapKeySchema::Integer)
.with_key_definition("SeverityText", ParserMapKeySchema::String)
.with_key_definition("Body", ParserMapKeySchema::Any)
.with_key_definition("TraceId", ParserMapKeySchema::Array)
.with_key_definition("SpanId", ParserMapKeySchema::Array)
.with_key_definition("TraceFlags", ParserMapKeySchema::Integer)
.with_key_definition("EventName", ParserMapKeySchema::String)
.with_key_aliases([
("attributes", "Attributes"),
("time_unix_nano", "Timestamp"),
("observed_time_unix_nano", "ObservedTimestamp"),
("severity_number", "SeverityNumber"),
("severity_text", "SeverityText"),
("body", "Body"),
("trace_id", "TraceId"),
("span_id", "SpanId"),
("flags", "TraceFlags"),
("event_name", "EventName")
])| self.keys.get(key) | ||
| } | ||
|
|
||
| pub fn get_aliases_for_key(&self, canonical_key: &str) -> Vec<&str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this one and the one below for testing? pub(crate) if so.
| } | ||
|
|
||
| pub fn with_alias(mut self, alias: &str, canonical_key: &str) -> ParserMapSchema { | ||
| self.aliases.insert(alias.into(), canonical_key.into()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this error if the canonical_key hasn't been registered?
| "SpanId" => self.span_id.is_some(), | ||
| "TraceFlags" => self.flags.is_some(), | ||
| "EventName" => self.event_name.is_some(), | ||
| "Attributes" | "attributes" => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just musing here, this is kind of bloaty repeating itself a lot. Anything you can think of to try and consolidate/satisfy DRY principle?
It would be great if the parser could normalize everything in the tree so this code only need worry about the "canonical_key" for lookup. It could in some cases but I don't think we could do it always. Because it is possible to get some dynamic lookup at runtime think source[Attributes['some_key']]. Probably rare but not impossible 😄
Draft PR to open discussion - The current
otlp-bridgefor therecordsetengine uses the OpenTelemetry log data model spec for its initial schema keys (Attributes,Timestamp,ObservedTimestamp,SeverityText, etc).However, many well-versed in the OpenTelemetry space may be more used to the snake case representation (
attributes,time_unix_nano,observed_time_unix_nano,severity_text, etc) from the proto representation.Do we have any significant risks if we plan to support both? Inspired by
severity_textreference in #1722, been on the back of my mind for a while.This is still somewhat incomplete, could need more wiring for user-provided aliases in bridge.