Open
Labels
type: enhancement (Minor improvements)
Description
Right now, to ingest into perspective we go from `GatewayStruct` to JSON via `to_json`, then from JSON to plain Python objects via `orjson`, then we flatten, then we dump back to JSONL for ingestion into pyarrow, and finally from pyarrow into perspective. This is done to balance performance with the fact that we want to flatten things: pyarrow JSON loading and perspective ingestion of pyarrow are very fast, `GatewayStruct` to JSON is in C++ and so also faster than flattening as structured objects, and `orjson` is very fast. In increasing order of preference, we can:
- enhance `csp` to flatten
- try to emit record batches directly instead of flattening into JSON and moving back and forth between Python objects
- have perspective flatten for us
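
For reference, here is a rough sketch of the flatten step and of the difference between the JSONL round trip and building columns directly. It is not the actual implementation: the stdlib `json` module stands in for `orjson`, the payloads, the dotted key naming, and the helper names `flatten`, `to_jsonl`, and `to_columns` are all hypothetical, and only nested dicts are flattened (lists are left as-is).

```python
import json  # stand-in for orjson in this sketch


def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def to_jsonl(records):
    """Current style: flattened Python objects dumped back to JSON lines."""
    return "\n".join(json.dumps(flatten(r)) for r in records) + "\n"


def to_columns(records):
    """Alternative style: build column-oriented data, skipping the JSONL dump."""
    rows = [flatten(r) for r in records]
    names = []
    for row in rows:
        for name in row:
            if name not in names:
                names.append(name)
    # Rows that lack a column get None for that column.
    return {name: [row.get(name) for row in rows] for name in names}


# Hypothetical payloads, shaped like what a struct-to-JSON call might produce.
raw = [
    '{"id": 1, "px": {"bid": 9.5, "ask": 10.5}}',
    '{"id": 2, "px": {"bid": 9.0, "ask": 10.0}}',
]
records = [json.loads(line) for line in raw]
jsonl = to_jsonl(records)
columns = to_columns(records)
```

The `jsonl` string is the kind of input `pyarrow.json.read_json` can load from a file-like object, while the `columns` dict has the shape that `pyarrow.RecordBatch.from_pydict` accepts, which is roughly what the second option would produce without the extra serialization pass.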