-
Notifications
You must be signed in to change notification settings - Fork 1k
Basic (non-Arrow) Parquet native geometry type support #8520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
// ---------------------------------------------------------------------- | ||
// Mirrors `parquet::EdgeInterpolationAlgorithm` | ||
|
||
/// Edge interpolation algorithm for Geography logical type | ||
#[derive(Debug, Clone, PartialEq, Eq, Hash)] | ||
pub enum EdgeInterpolationAlgorithm { | ||
/// Edges are interpolated as geodesics on a sphere. | ||
SPHERICAL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that some enums are copied out here (e.g., Encoding
) and some are just references to the Thrift enum (e.g., TimeUnit
). I don't mind which one of these we use...I did this version because it makes it harder for a non-geospatial aware user to do the wrong thing (i.e., it applies the "default" of SPHERICAL whose value would otherwise be squirrelled away in the Parquet specification where probably nobody will ever look for it).
cc @kylebarron ...I'm not at all offended if you'd prefer to incorporate any of this into #8222 instead. It was mostly just slightly easier to write the tests like this and I wanted to ensure the breaking change portion of this was able to make it through while there was some attention on Variant/Thrift metadata encoding/decoding 🙂 . |
fa7a80c
to
f7e69fb
Compare
@paleolimbot I'm wondering how to proceed with this, as many of the changes here (minus the excellent documentation) have already been made in the thrift remodel branch (https://github.com/apache/arrow-rs/tree/gh5854_thrift_remodel). Could you try diffing against that branch and perhaps rebase this work? (he says shamelessly trying to move the merge burden elsewhere) 🙏 |
@etseidl I didn't know! I'm happy to wait until that merges (or feel free to lift any tests or documentation from here if that is easier). |
Thanks! I'll steal your doc strings (and hopefully tests) and then ask you to review 😁 |
Changes in #8528 |
…8528) # Which issue does this PR close? **Note: this targets a feature branch, not main** - Part of #5854. # Rationale for this change This brings over changes to handling of geo-spatial statistics introduced by @paleolimbot in #8520. # What changes are included in this PR? Primarily adds documentation and tests to changes already made. The only significant change is adding a `Default` implementation for `EdgeInterpolationAlgorithm`. # Are these changes tested? Yes # Are there any user-facing changes? Yes --------- Co-authored-by: Matthijs Brobbel <[email protected]>
Which issue does this PR close?
(The other issue, #7240 I think would cover the Arrow half that Kyle has prototyped in #8222).
This PR is a subset of #8222 (just the part where the
crs
andalgorithm
fields were added) that also includes tests.Rationale for this change
When the Geography and Geometry members of the LogicalType enum were added, they did not include the underlying parameters. Without the parameters, low-level clients to the Parquet library like SedonaDB can't read Parquet files that contain these types without loosing information.
What changes are included in this PR?
This PR adds the
crs
andalgorithm
parameters to theGeometry
andGeography
logical types and updatesmatch
statements with reasonable implementations that roundtrip these values.Are these changes tested?
Yes
Are there any user-facing changes?
Yes: any match statements written against the
LogicalType
will break. This is a breaking change.