Skip to content

Add pathStyleAccess to AwsStorageConfigInfo #2012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -154,12 +154,27 @@ public void dropTable(String catalog, TableIdentifier id) {
}

public LoadTableResponse loadTable(String catalog, TableIdentifier id, String snapshots) {
return loadTable(catalog, id, snapshots, Map.of());
}

public LoadTableResponse loadTableWithAccessDelegation(
String catalog, TableIdentifier id, String snapshots) {
return loadTable(
catalog, id, snapshots, Map.of("X-Iceberg-Access-Delegation", "vended-credentials"));
}

public LoadTableResponse loadTable(
String catalog, TableIdentifier id, String snapshots, Map<String, String> headers) {
HashMap<String, String> allHeaders = new HashMap<>(defaultHeaders());
allHeaders.putAll(headers);

String ns = RESTUtil.encodeNamespace(id.namespace());
try (Response res =
request(
"v1/{cat}/namespaces/" + ns + "/tables/{table}",
Map.of("cat", catalog, "table", id.name()),
snapshots == null ? Map.of() : Map.of("snapshots", snapshots))
snapshots == null ? Map.of() : Map.of("snapshots", snapshots),
allHeaders)
.get()) {
if (res.getStatus() == Response.Status.OK.getStatusCode()) {
return res.readEntity(LoadTableResponse.class);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -47,14 +47,18 @@ public Invocation.Builder request(String path, Map<String, String> templateValue
return request(path, templateValues, Map.of());
}

public Invocation.Builder request(
String path, Map<String, String> templateValues, Map<String, String> queryParams) {
protected Map<String, String> defaultHeaders() {
Map<String, String> headers = new HashMap<>();
headers.put(endpoints.realmHeaderName(), endpoints.realmId());
if (authToken != null) {
headers.put("Authorization", "Bearer " + authToken);
}
return request(path, templateValues, queryParams, headers);
return headers;
}

public Invocation.Builder request(
String path, Map<String, String> templateValues, Map<String, String> queryParams) {
return request(path, templateValues, queryParams, defaultHeaders());
}

public Invocation.Builder request(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,9 @@ private StorageConfigInfo getStorageInfo(Map<String, String> internalProperties)
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setAllowedLocations(awsConfig.getAllowedLocations())
.setRegion(awsConfig.getRegion())
.setEndpoint(awsConfig.getEndpoint())
.setStsEndpoint(awsConfig.getStsEndpoint())
.setPathStyleAccess(awsConfig.getPathStyleAccess())
.build();
}
if (configInfo instanceof AzureStorageConfigurationInfo) {
Expand Down Expand Up @@ -275,7 +278,8 @@ public Builder setStorageConfigurationInfo(
awsConfigModel.getExternalId(),
awsConfigModel.getRegion(),
awsConfigModel.getEndpoint(),
awsConfigModel.getStsEndpoint());
awsConfigModel.getStsEndpoint(),
awsConfigModel.getPathStyleAccess());
awsConfig.validateArn(awsConfigModel.getRoleArn());
config = awsConfig;
break;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,10 @@ public EnumMap<StorageAccessProperty, String> getSubscopedCreds(
credentialMap.put(StorageAccessProperty.AWS_ENDPOINT, endpointUri.toString());
}

if (Boolean.TRUE.equals(storageConfig.getPathStyleAccess())) {
credentialMap.put(StorageAccessProperty.AWS_PATH_STYLE_ACCESS, Boolean.TRUE.toString());
}

if (storageConfig.getAwsPartition().equals("aws-us-gov")
&& credentialMap.get(StorageAccessProperty.CLIENT_REGION) == null) {
throw new IllegalArgumentException(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,10 @@ public class AwsStorageConfigurationInfo extends PolarisStorageConfigurationInfo
@JsonProperty(value = "stsEndpoint")
private @Nullable String stsEndpoint;

/** A flag indicating whether path-style bucket access should be forced in S3 clients. */
@JsonProperty(value = "pathStyleAccess")
private Boolean pathStyleAccess;

@JsonCreator
public AwsStorageConfigurationInfo(
@JsonProperty(value = "storageType", required = true) @Nonnull StorageType storageType,
Expand All @@ -71,21 +75,23 @@ public AwsStorageConfigurationInfo(
@JsonProperty(value = "externalId") @Nullable String externalId,
@JsonProperty(value = "region", required = false) @Nullable String region,
@JsonProperty(value = "endpoint") @Nullable String endpoint,
@JsonProperty(value = "stsEndpoint") @Nullable String stsEndpoint) {
@JsonProperty(value = "stsEndpoint") @Nullable String stsEndpoint,
@JsonProperty(value = "pathStyleAccess") @Nullable Boolean pathStyleAccess) {
super(storageType, allowedLocations);
this.roleARN = roleARN;
this.externalId = externalId;
this.region = region;
this.endpoint = endpoint;
this.stsEndpoint = stsEndpoint;
this.pathStyleAccess = pathStyleAccess;
}

public AwsStorageConfigurationInfo(
@Nonnull StorageType storageType,
@Nonnull List<String> allowedLocations,
@Nonnull String roleARN,
@Nullable String region) {
this(storageType, allowedLocations, roleARN, null, region, null, null);
this(storageType, allowedLocations, roleARN, null, region, null, null, null);
}

public AwsStorageConfigurationInfo(
Expand All @@ -94,7 +100,7 @@ public AwsStorageConfigurationInfo(
@Nonnull String roleARN,
@Nullable String externalId,
@Nullable String region) {
this(storageType, allowedLocations, roleARN, externalId, region, null, null);
this(storageType, allowedLocations, roleARN, externalId, region, null, null, null);
}

@Override
Expand Down Expand Up @@ -143,12 +149,27 @@ public void setRegion(@Nullable String region) {
this.region = region;
}

@Nullable
public String getEndpoint() {
return endpoint;
}

@JsonIgnore
@Nullable
public URI getEndpointUri() {
return endpoint == null ? null : URI.create(endpoint);
}

/** Returns a flag indicating whether path-style bucket access should be forced in S3 clients. */
public @Nullable Boolean getPathStyleAccess() {
return pathStyleAccess;
}

@Nullable
public String getStsEndpoint() {
return stsEndpoint;
}

/** Returns the STS endpoint if set, defaulting to {@link #getEndpointUri()} otherwise. */
@JsonIgnore
@Nullable
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,8 +29,13 @@
public class AwsStorageConfigurationInfoTest {

private static AwsStorageConfigurationInfo config(String endpoint, String stsEndpoint) {
return config(endpoint, stsEndpoint, false);
}

private static AwsStorageConfigurationInfo config(
String endpoint, String stsEndpoint, Boolean pathStyle) {
return new AwsStorageConfigurationInfo(
S3, List.of(), "role", null, null, endpoint, stsEndpoint);
S3, List.of(), "role", null, null, endpoint, stsEndpoint, pathStyle);
}

@Test
Expand All @@ -56,4 +61,11 @@ public void testStsEndpoint() {
AwsStorageConfigurationInfo::getStsEndpointUri)
.containsExactly(URI.create("http://s3.example.com"), URI.create("http://sts.example.com"));
}

@Test
public void testPathStyleAccess() {
assertThat(config(null, null, null).getPathStyleAccess()).isNull();
assertThat(config(null, null, false).getPathStyleAccess()).isFalse();
assertThat(config(null, null, true).getPathStyleAccess()).isTrue();
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
import org.apache.iceberg.io.PositionOutputStream;
import org.apache.iceberg.rest.RESTCatalog;
import org.apache.iceberg.rest.auth.OAuth2Properties;
import org.apache.iceberg.rest.responses.LoadTableResponse;
import org.apache.iceberg.types.Types;
import org.apache.polaris.core.admin.model.AwsStorageConfigInfo;
import org.apache.polaris.core.admin.model.Catalog;
Expand All @@ -68,9 +69,10 @@
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInfo;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;
Expand Down Expand Up @@ -145,13 +147,15 @@ public void before(TestInfo testInfo) {
catalogName = client.newEntityName(testInfo.getTestMethod().orElseThrow().getName());
}

private RESTCatalog createCatalog(Optional<String> endpoint, Optional<String> stsEndpoint) {
private RESTCatalog createCatalog(
Optional<String> endpoint, Optional<String> stsEndpoint, boolean pathStyleAccess) {
AwsStorageConfigInfo.Builder storageConfig =
AwsStorageConfigInfo.builder()
.setRoleArn("arn:aws:iam::123456789012:role/polaris-test")
.setExternalId("externalId123")
.setUserArn("arn:aws:iam::123456789012:user/polaris-test")
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setPathStyleAccess(pathStyleAccess)
.setAllowedLocations(List.of(storageBase.toString()));

endpoint.ifPresent(storageConfig::setEndpoint);
Expand Down Expand Up @@ -190,9 +194,11 @@ public void cleanUp() {
client.cleanUp(adminCredentials);
}

@Test
public void testCreateTable() throws IOException {
try (RESTCatalog restCatalog = createCatalog(Optional.of(endpoint), Optional.empty())) {
@ParameterizedTest
@ValueSource(booleans = {true, false})
public void testCreateTable(boolean pathStyle) throws IOException {
try (RESTCatalog restCatalog =
createCatalog(Optional.of(endpoint), Optional.empty(), pathStyle)) {
catalogApi.createNamespace(catalogName, "test-ns");
TableIdentifier id = TableIdentifier.of("test-ns", "t1");
Table table = restCatalog.createTable(id, SCHEMA);
Expand All @@ -212,14 +218,25 @@ public void testCreateTable() throws IOException {
.response();
assertThat(response.contentLength()).isGreaterThan(0);

LoadTableResponse loadTableResponse =
catalogApi.loadTableWithAccessDelegation(catalogName, id, "ALL");
assertThat(loadTableResponse.config()).containsKey("s3.endpoint");

if (pathStyle) {
assertThat(loadTableResponse.config())
.containsEntry("s3.path-style-access", Boolean.TRUE.toString());
}

restCatalog.dropTable(id);
assertThat(restCatalog.tableExists(id)).isFalse();
}
}

@Test
public void testAppendFiles() throws IOException {
try (RESTCatalog restCatalog = createCatalog(Optional.of(endpoint), Optional.of(endpoint))) {
@ParameterizedTest
@ValueSource(booleans = {true, false})
public void testAppendFiles(boolean pathStyle) throws IOException {
try (RESTCatalog restCatalog =
createCatalog(Optional.of(endpoint), Optional.of(endpoint), pathStyle)) {
catalogApi.createNamespace(catalogName, "test-ns");
TableIdentifier id = TableIdentifier.of("test-ns", "t1");
Table table = restCatalog.createTable(id, SCHEMA);
Expand All @@ -228,7 +245,11 @@ public void testAppendFiles() throws IOException {
@SuppressWarnings("resource")
FileIO io = table.io();

URI loc = URI.create(table.locationProvider().newDataLocation("test-file1.txt"));
URI loc =
URI.create(
table
.locationProvider()
.newDataLocation(String.format("test-file-%s.txt", pathStyle)));
OutputFile f1 = io.newOutputFile(loc.toString());
try (PositionOutputStream os = f1.create()) {
os.write("Hello World".getBytes(UTF_8));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,12 @@
*/
package org.apache.polaris.service.quarkus.entity;

import static org.assertj.core.api.Assertions.assertThat;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
import java.util.stream.Stream;
import org.apache.polaris.core.PolarisCallContext;
import org.apache.polaris.core.PolarisDefaultDiagServiceImpl;
import org.apache.polaris.core.admin.model.AwsStorageConfigInfo;
Expand All @@ -37,9 +42,12 @@
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;
import org.junit.jupiter.params.provider.ValueSource;

public class CatalogEntityTest {
private static final ObjectMapper MAPPER = new ObjectMapper();

private CallContext callContext;

Expand Down Expand Up @@ -286,7 +294,7 @@ public void testCatalogTypeDefaultsToInternal() {
.build();

Catalog catalog = catalogEntity.asCatalog();
Assertions.assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.INTERNAL);
assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.INTERNAL);
}

@Test
Expand All @@ -309,7 +317,7 @@ public void testCatalogTypeExternalPreserved() {
.build();

Catalog catalog = catalogEntity.asCatalog();
Assertions.assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.EXTERNAL);
assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.EXTERNAL);
}

@Test
Expand All @@ -332,6 +340,61 @@ public void testCatalogTypeInternalExplicitlySet() {
.build();

Catalog catalog = catalogEntity.asCatalog();
Assertions.assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.INTERNAL);
assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.INTERNAL);
}

@Test
public void testAwsConfigJsonPropertiesPresence() throws JsonProcessingException {
AwsStorageConfigInfo.Builder b =
AwsStorageConfigInfo.builder()
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setRoleArn("arn:aws:iam::012345678901:role/test-role");
assertThat(MAPPER.writeValueAsString(b.build())).contains("roleArn");
assertThat(MAPPER.writeValueAsString(b.build())).doesNotContain("endpoint");
assertThat(MAPPER.writeValueAsString(b.build())).doesNotContain("stsEndpoint");
assertThat(MAPPER.writeValueAsString(b.build())).doesNotContain("pathStyleAccess");

b.setEndpoint("http://s3.example.com");
b.setStsEndpoint("http://sts.example.com");
b.setPathStyleAccess(false);
assertThat(MAPPER.writeValueAsString(b.build())).contains("roleArn");
assertThat(MAPPER.writeValueAsString(b.build())).contains("endpoint");
assertThat(MAPPER.writeValueAsString(b.build())).contains("stsEndpoint");
assertThat(MAPPER.writeValueAsString(b.build())).contains("pathStyleAccess");
}

@ParameterizedTest
@MethodSource
public void testAwsConfigRoundTrip(AwsStorageConfigInfo config) throws JsonProcessingException {
String configStr = MAPPER.writeValueAsString(config);
CatalogEntity catalogEntity =
new CatalogEntity.Builder()
.setName("testAwsConfigRoundTrip")
.setDefaultBaseLocation(config.getAllowedLocations().getFirst())
.setCatalogType(Catalog.TypeEnum.INTERNAL.name())
.setStorageConfigurationInfo(
callContext,
MAPPER.readValue(configStr, StorageConfigInfo.class),
config.getAllowedLocations().getFirst())
.build();

Catalog catalog = catalogEntity.asCatalog();
assertThat(catalog.getStorageConfigInfo()).isEqualTo(config);
assertThat(MAPPER.writeValueAsString(catalog.getStorageConfigInfo())).isEqualTo(configStr);
}

public static Stream<Arguments> testAwsConfigRoundTrip() {
AwsStorageConfigInfo.Builder b =
AwsStorageConfigInfo.builder()
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setAllowedLocations(List.of("s3://example.com"))
.setRoleArn("arn:aws:iam::012345678901:role/test-role");
return Stream.of(
Arguments.of(b.build()),
Arguments.of(b.setExternalId("ex1").build()),
Arguments.of(b.setRegion("us-west-2").build()),
Arguments.of(b.setEndpoint("http://s3.example.com:1234").build()),
Arguments.of(b.setStsEndpoint("http://sts.example.com:1234").build()),
Arguments.of(b.setPathStyleAccess(true).build()));
}
}
6 changes: 6 additions & 0 deletions spec/polaris-management-service.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1056,6 +1056,12 @@ components:
type: string
description: endpoint for STS requests (optional). If not set, defaults to 'endpoint'.
example: "https://sts.example.com:1234"
pathStyleAccess:
type: boolean
description: >-
Whether S3 requests to files in this catalog should use 'path-style addressing for buckets'.
Default: false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe OpenAPI actually supports default values, maybe we could just use that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added default: false, changed example to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I had to roll this back. Using Open API defaults breaks strict backward compatibility because is causes the attribute with a default value to be always returned to clients (even when not in use) - CI caught this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly was the failure in CI? The client would fail to parse a response because it didn't recognize the property?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CatalogSerializationTest detected a change in the JSON text (when the new properties are not set)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we do not have a specific test for new property values in REST responses. However, CatalogSerializationTest makes sure defaults in new properties in AwsStorageConfigInfo do not show up in JSON.

As I mentioned above, we're missing code for new properties to be propagated from AwsStorageConfigurationInfo to AwsStorageConfigInfo, so I cannot even add a test for that without extra changes, which is what I'm proposing to do in a follow-up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, never mind, I'll expand this PR a bit :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. PTAL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the changes look better, but my concern about the default remains

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I do not understand the whole discussion about "is this a breaking change".

As Polaris is primarily an Iceberg REST catalog, we have to let JSON parsing ignore unknown properties, because that is a requirement from Iceberg REST. There is also exactly one JSON parser (ObjectMapper) for all endpoints. The situation is unfortunate but a consequence of generating the OpenAPI code. Jackson itself allows much more fine grained control over how types are (de)serialized and allows back/forwards compatibility targeting functionality like @JsonView - but that's not usable w/ the code generation approach.

So fact is that unknown properties in JSON objects in any Polaris REST payload have to be ignored. As that fact leaks to all Polaris APIs, it also leaks to clients. This means, that a new property is not a breaking change. If clients fail on unknown properties, it's a bug in those client.

example: false
required:
- roleArn

Expand Down
Loading