paleolimbot
Member

Which issue does this PR close?

(The other issue, #7240, I think, would cover the Arrow half that Kyle has prototyped in #8222.)

This PR is a subset of #8222 (just the part where the crs and algorithm fields were added) that also includes tests.

Rationale for this change

When the Geography and Geometry members of the LogicalType enum were added, they did not include the underlying parameters. Without the parameters, low-level clients of the Parquet library like SedonaDB can't read Parquet files that contain these types without losing information.

What changes are included in this PR?

This PR adds the crs and algorithm parameters to the Geometry and Geography logical types and updates match statements with reasonable implementations that roundtrip these values.
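As a rough illustration of the change described above, here is a minimal sketch using stand-in types, not the parquet crate's actual definitions. The field names `crs` and `algorithm` come from this description; the `Option` wrappers, the `WireGeoType` struct, and the `to_wire`/`from_wire` helpers are hypothetical and exist only to show what "roundtrip these values" means.

```rust
/// Stand-in for the edge interpolation enum (illustrative subset only).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EdgeInterpolationAlgorithm {
    Spherical,
    Vincenty,
}

/// Stand-in for the two geospatial members of `LogicalType`,
/// now carrying their parameters (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GeoLogicalType {
    Geometry {
        crs: Option<String>,
    },
    Geography {
        crs: Option<String>,
        algorithm: Option<EdgeInterpolationAlgorithm>,
    },
}

/// Hypothetical flat form standing in for the serialized metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WireGeoType {
    pub is_geography: bool,
    pub crs: Option<String>,
    pub algorithm: Option<EdgeInterpolationAlgorithm>,
}

/// Writes every parameter out so nothing is dropped on the way to the file.
pub fn to_wire(t: &GeoLogicalType) -> WireGeoType {
    match t {
        GeoLogicalType::Geometry { crs } => WireGeoType {
            is_geography: false,
            crs: crs.clone(),
            algorithm: None,
        },
        GeoLogicalType::Geography { crs, algorithm } => WireGeoType {
            is_geography: true,
            crs: crs.clone(),
            algorithm: algorithm.clone(),
        },
    }
}

/// Reads the parameters back into the in-memory type.
pub fn from_wire(w: WireGeoType) -> GeoLogicalType {
    if w.is_geography {
        GeoLogicalType::Geography {
            crs: w.crs,
            algorithm: w.algorithm,
        }
    } else {
        GeoLogicalType::Geometry { crs: w.crs }
    }
}
```

With unit variants there would be nowhere to keep `crs` or `algorithm`, so a reader would see only "this is a Geography column" and lose the rest.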

Are these changes tested?

Yes

Are there any user-facing changes?

Yes: any match statements written against the LogicalType will break. This is a breaking change.
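To make the breakage concrete, here is a hedged sketch with a simplified stand-in enum (`LogicalTypeSketch` is hypothetical and omits the `algorithm` field for brevity). Once a variant gains fields, a pattern that names it as a unit variant no longer compiles; downstream code needs `{ .. }` or explicit bindings.

```rust
/// Simplified stand-in for `LogicalType` after the change (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LogicalTypeSketch {
    String,
    Geometry { crs: Option<String> },
    Geography { crs: Option<String> },
}

/// A bare `LogicalTypeSketch::Geometry` pattern would be rejected by the
/// compiler here; `{ .. }` ignores the new fields explicitly.
pub fn is_geospatial(t: &LogicalTypeSketch) -> bool {
    matches!(
        t,
        LogicalTypeSketch::Geometry { .. } | LogicalTypeSketch::Geography { .. }
    )
}

/// Code that wants the new information can bind the field instead.
pub fn crs_of(t: &LogicalTypeSketch) -> Option<&str> {
    match t {
        LogicalTypeSketch::Geometry { crs } | LogicalTypeSketch::Geography { crs } => {
            crs.as_deref()
        }
        LogicalTypeSketch::String => None,
    }
}
```

The fix on the caller's side is usually mechanical, but it does require a source change, which is why this is flagged as breaking.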

@github-actions github-actions bot added the parquet Changes to the parquet crate label Sep 30, 2025
Comment on lines +672 to +679
// ----------------------------------------------------------------------
// Mirrors `parquet::EdgeInterpolationAlgorithm`

/// Edge interpolation algorithm for Geography logical type
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EdgeInterpolationAlgorithm {
    /// Edges are interpolated as geodesics on a sphere.
    SPHERICAL,
Member Author

I see that some enums are copied out here (e.g., Encoding) and some are just references to the Thrift enum (e.g., TimeUnit). I don't mind which one of these we use... I did this version because it makes it harder for a non-geospatial-aware user to do the wrong thing (i.e., it applies the "default" of SPHERICAL whose value would otherwise be squirrelled away in the Parquet specification where probably nobody will ever look for it).
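A minimal sketch of the idea, assuming a copied-out enum that carries its own default. `EdgeAlgorithmSketch` and `resolve` are hypothetical names, not the crate's API, and the variant list follows my reading of the Parquet format and should be checked against the specification.

```rust
/// Copied-out stand-in enum; the default lives next to the type instead of
/// only in the specification text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
pub enum EdgeAlgorithmSketch {
    /// Edges are interpolated as geodesics on a sphere.
    #[default]
    Spherical,
    Vincenty,
    Thomas,
    Andoyer,
    Karney,
}

/// Resolves an algorithm that may be absent from the file metadata,
/// falling back to the default rather than leaving the choice to the caller.
pub fn resolve(algorithm: Option<EdgeAlgorithmSketch>) -> EdgeAlgorithmSketch {
    algorithm.unwrap_or_default()
}
```

A caller who has never read the specification still gets SPHERICAL when the field is missing, which is the "harder to do the wrong thing" property described above.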

@paleolimbot paleolimbot marked this pull request as ready for review September 30, 2025 21:40
@paleolimbot
Member Author

cc @kylebarron ...I'm not at all offended if you'd prefer to incorporate any of this into #8222 instead. It was mostly just slightly easier to write the tests like this and I wanted to ensure the breaking change portion of this was able to make it through while there was some attention on Variant/Thrift metadata encoding/decoding 🙂 .

@paleolimbot paleolimbot force-pushed the geospatial-logical-type-fix branch from fa7a80c to f7e69fb Compare October 1, 2025 03:09
@mbrobbel mbrobbel added the api-change Changes to the arrow API label Oct 1, 2025
@etseidl
Contributor

etseidl commented Oct 1, 2025

@paleolimbot I'm wondering how to proceed with this, as many of the changes here (minus the excellent documentation) have already been made in the thrift remodel branch (https://github.com/apache/arrow-rs/tree/gh5854_thrift_remodel). Could you try diffing against that branch and perhaps rebase this work? (he says shamelessly trying to move the merge burden elsewhere) 🙏

@paleolimbot
Member Author

@etseidl I didn't know! I'm happy to wait until that merges (or feel free to lift any tests or documentation from here if that is easier).

@etseidl
Contributor

etseidl commented Oct 1, 2025

> I'm happy to wait until that merges (or feel free to lift any tests or documentation from here if that is easier).

Thanks! I'll steal your doc strings (and hopefully tests) and then ask you to review 😁

@etseidl
Contributor

etseidl commented Oct 1, 2025

Changes in #8528

etseidl added a commit that referenced this pull request Oct 2, 2025
…8528)

# Which issue does this PR close?
**Note: this targets a feature branch, not main**

- Part of #5854.

# Rationale for this change

This brings over changes to handling of geo-spatial statistics
introduced by @paleolimbot in #8520.

# What changes are included in this PR?

Primarily adds documentation and tests to changes already made. The only
significant change is adding a `Default` implementation for
`EdgeInterpolationAlgorithm`.

# Are these changes tested?

Yes

# Are there any user-facing changes?

Yes

---------

Co-authored-by: Matthijs Brobbel <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Feature] geometry and geography logical type implementations