Skip to content

Add pathStyleAccess to AwsStorageConfigInfo #2012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dimas-b
Copy link
Contributor

@dimas-b dimas-b commented Jul 8, 2025

This change allows configuring the "path-style" access mode in S3 clients (both in Polaris Servers and Iceberg REST Catalog API clients).

This change is applicable both to AWS storage and to non-AWS S3-compatible storage (#1530).

The Management API change is backward-compatible.

Includes a fix to AWS config object round-trip via generated API classes.

Dev ML discussion: https://lists.apache.org/thread/oxy2yd4mbq06zr4z57lfptgy95c3lyhb

}

public AwsStorageConfigurationInfo(
@Nonnull StorageType storageType,
@Nonnull List<String> allowedLocations,
@Nonnull String roleARN,
@Nullable String region) {
this(storageType, allowedLocations, roleARN, null, region, null, null);
this(storageType, allowedLocations, roleARN, null, region, null, null, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use the default here, or just leave it null to leverage the default used in orElse. Better not to proliferate the constants around like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to use PATH_STYLE_ACCESS_DEFAULT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have a default at the API level, is this actually nullable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove the API default.

type: boolean
description: >-
Whether S3 requests to files in this catalog should use 'path-style addressing for buckets'.
Default: false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe OpenAPI actually supports default values, maybe we could just use that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added default: false, changed example to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I had to roll this back. Using Open API defaults breaks strict backward compatibility because is causes the attribute with a default value to be always returned to clients (even when not in use) - CI caught this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly was the failure in CI? The client would fail to parse a response because it didn't recognize the property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CatalogSerializationTest detected a change in the JSON text (when the new properties are not set)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eric-maynard : your previous comment made me realise that the new AwsStorageConfigurationInfo properties are not properly converted back to AwsStorageConfigInfo yet. I'll tackle that as a separate PR, if you do not mind.

I'll convert pathStyleAccess to object Boolean in AwsStorageConfigurationInfo to ensure unset values are distinct from defaults (and thus do not affect older catalogs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think that's okay to defer.

But...

As long as pathStyleAccess is null in AwsStorageConfigInfo clients will not see it.

That's only true if you don't create a new catalog. So take the following pseudocode:

String catalog = ...createNewCatalog().toString
assert(catalog == "{...}")

We don't have a test like this in CatalogSerializationTest, but if we had one this PR would break it. Personally I don't think this really constitutes a breaking change, but I do think the proper way to handle a default in the API is with a default in the spec rather than a blurb in the description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we do not have a specific test for new property values in REST responses. However, CatalogSerializationTest makes sure defaults in new properties in AwsStorageConfigInfo do not show up in JSON.

As I mentioned above, we're missing code for new properties to be propagated from AwsStorageConfigurationInfo to AwsStorageConfigInfo, so I cannot even add a test for that without extra changes, which is what I'm proposing to do in a follow-up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, never mind, I'll expand this PR a bit :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. PTAL.

snazy
snazy previously approved these changes Jul 9, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jul 9, 2025
singhpk234
singhpk234 previously approved these changes Jul 9, 2025
Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@dimas-b
Copy link
Contributor Author

dimas-b commented Jul 9, 2025

I had to remove one commit (discussed in the API yaml comment thread). PTAL.

Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clarify if this is a breaking change (does it need to target 2.0? Go into API version 2?) and plan accordingly before merging.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Jul 10, 2025
snazy
snazy previously approved these changes Jul 10, 2025
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

dimas-b added 3 commits July 10, 2025 18:03
This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

The Management API change is backward-compatible.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants