Skip to content

Rename SUPPORTED_CATALOG_CONNECTION_TYPES #1959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

poojanilangekar
Copy link
Contributor

@poojanilangekar poojanilangekar commented Jun 26, 2025

To ensure backward compatibility, process to rename SUPPORTED_CATALOG_CONNECTION_TYPES will consist of two steps.

  1. Add SUPPORTED_EXTERNAL_CATALOG_CONNECTION_TYPES.
  2. Remove the legacy SUPPORTED_CATALOG_CONNECTION_TYPES.

As discussed in PR #1931 , for a while Polaris will initially support both SUPPORTED_CATALOG_CONNECTION_TYPES and SUPPORTED_EXTERNAL_CATALOG_CONNECTION_TYPES. Later, we will remove the SUPPORTED_CATALOG_CONNECTION_TYPES to make it consistent with AUTHENTICATION_TYPES.

To support the rename, this PR introduces a new PolarisConfiguration parameter called legacyKey.

@poojanilangekar
Copy link
Contributor Author

CC @dimas-b @eric-maynard

@dimas-b
Copy link
Contributor

dimas-b commented Jun 27, 2025

Let's hold this until #1952 merges so that we could properly track the property rename in CHANGELOG.

@dimas-b dimas-b requested review from eric-maynard and dimas-b June 27, 2025 00:20
@@ -266,10 +266,18 @@ public static void enforceFeatureEnabledOrThrow(
public static final FeatureConfiguration<List<String>> SUPPORTED_CATALOG_CONNECTION_TYPES =
PolarisConfiguration.<List<String>>builder()
.key("SUPPORTED_CATALOG_CONNECTION_TYPES")
.description("The list of supported catalog connection types for federation")
.description("(Deprecated) The list of supported catalog connection types for federation")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it would be more maintainable to have one FeatureConfiguration object which supported both the old and the new key names (cf. catalogConfigUnsafe()).

Copy link
Contributor

@eric-maynard eric-maynard Jun 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a few options:

  1. Introduce a framework to support multiple config names, with some of them marked for deprecation, similar to catalogConfigUnsafe
  2. Just leave the name alone
  3. Just make the breaking change and switch the config name
  4. Change the variable name but not the config name
  5. Add a second config with a new name, change the logic to take the union of the two, and mark the old one deprecated

I think all of these are fine but (1) is definitely the most work. Looks like this is (5) but this random (Deprecated) comment is doing a lot of heavy lifting. We don't get logging like we would get with (1) and the code is less clean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I've implemented 1.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase and add a CHANGELOG note about the config property rename (probably under "Deprecations")

SUPPORTED_EXTERNAL_CATALOG_CONNECTION_TYPES =
PolarisConfiguration.<List<String>>builder()
.key("SUPPORTED_EXTERNAL_CATALOG_CONNECTION_TYPES")
.deprecatedKey("SUPPORTED_CATALOG_CONNECTION_TYPES")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a name like .legacyKey() or .compatibilityKey() for the sake of clarity. "Deprecated" key as a builder parameter does not convey a reason for this key to exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we're really doing this we need to take multiple keys, so like
PolarisConfiguration.<>builder().keys(...). Maybe that should take some ConfigurationKey, or maybe we should have some other mechanism for marking which are deprecated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple keys SGTM.... but from the builder API POV, we could just accept multiple calls to .key(), .deprecatedKey() (or .legacyKey()), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make the changes. Just so we are on the same page, does the following approach make sense to both of you?:

  1. Support a single key per config, because ideally at any given time, we want at most one active key per configuration.
  2. Support multiple legacyKeys per config. So we can make multiple calls to legacyKey() while building a config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether we use one key call that takes N strings or we allow N chained calls to key / deprecatedKey, the point is we should support N different keys rather than 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -112,6 +112,7 @@ polaris.features."ENFORCE_PRINCIPAL_CREDENTIAL_ROTATION_REQUIRED_CHECKING"=false
polaris.features."SUPPORTED_CATALOG_STORAGE_TYPES"=["S3","GCS","AZURE"]
# polaris.features."ENABLE_CATALOG_FEDERATION"=true
polaris.features."SUPPORTED_CATALOG_CONNECTION_TYPES"=["ICEBERG_REST"]
polaris.features."ENABLE_EXTERNAL_CATALOG_CONNECTION_TYPES"=["ICEBERG_REST"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change relevant to current PR's scope (and title)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that. It was added by auto complete. I have removed it.

@poojanilangekar poojanilangekar changed the title Rename SUPPORTED_CATALOG_CONNECTION_TYPES (1/2) Rename SUPPORTED_CATALOG_CONNECTION_TYPES Jun 30, 2025
@poojanilangekar poojanilangekar force-pushed the supported-connections branch from fe026fc to c51edb7 Compare June 30, 2025 19:49
dimas-b
dimas-b previously approved these changes Jun 30, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current changes in this PR LGTM 👍 I'd be ok with merging now and enhancing later, but I'm also fine with implementing @eric-maynard 's suggestions in this PR.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jun 30, 2025
@poojanilangekar
Copy link
Contributor Author

Hi @eric-maynard and @dimas-b can you please take a look at the change to support multiple legacyKeys?

@@ -41,6 +43,7 @@ public abstract class PolarisConfiguration<T> {

public final String key;
public final String description;
public final Set<String> legacyKeys;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this value used for property config value lookup? Sorry, if I missed it in the diff 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is used when registering a new configuration.

Previously (before this change), we would go over existingConfigurations check if it matched any existingConfiguration's key. Now, we check if the new configuration (both key and legacyKeys) match either an existing key (line 61) or existing legacyKeys (line 65 - 76)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct... but we should also use it to find config values in case the key is not found, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I have implemented it, any configuration always has a key(I think that's a valid requirement). Is there a reason we would we keep a configuration with just legacyKeys?

Copy link
Contributor Author

@poojanilangekar poojanilangekar Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think I get what you mean. I am guessing you meant the aspect of reading in the properties. If so I will send a fix shortly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx! To clarify: old config entries should keep working until we drop support for them (e.g. in 2.0). Let's add a WARN log message when we have to fall back to the "legacy" property name so that users are notified in runtime and are more aware of the need to migrating to the new key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@poojanilangekar poojanilangekar requested a review from dimas-b July 2, 2025 20:58
dimas-b
dimas-b previously approved these changes Jul 2, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks, @poojanilangekar !

}

@Test
public void testLegacyKeyWarningIsLogged() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this test ensure the warning is logged?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be ok without this test too :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my mistake, I should have removed it, I was trying to get the log warning but couldn't capture it. I've removed it now. (The functionality is still tested in testLegacyKeyLookup)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm 👍

@@ -41,6 +43,7 @@ public abstract class PolarisConfiguration<T> {

public final String key;
public final String description;
public final Set<String> legacyKeys;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I did not notice this before, but I think having public fiends in general is not a good idea. Since you're toughing this code in current PR, would you mind following up with another PR to convert them to private fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will send out a follow up PR.

@dimas-b
Copy link
Contributor

dimas-b commented Jul 2, 2025

@eric-maynard : since you had previous comments, PTAL.

dimas-b
dimas-b previously approved these changes Jul 2, 2025
public String legacyKeys() {
if (legacyKeys.isEmpty()) {
throw new IllegalStateException(
"Attempted to read legacy keys from a configuration that doesn't have any.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right to me; the check is there for catalogConfig because it's not correct to call the variant of getConfiguration which takes a catalog when a config doesn't have a catalogConfig. However we should always be checking legacyKeys every time we load any config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it.

return !legacyKeys.isEmpty();
}

public String legacyKeys() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not expect this to return a String either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 104 to 105
if (result == null && config.hasLegacyKeys()) {
for (String legacyKey : config.legacyKeys) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If legacyKeys just returned a set (or there was a keys that returned a set) then the if statement here goes away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"Legacy configuration key '{}' is in use. Please use '{}' instead.",
legacyKey,
config.key);
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we want a break here

  1. If multiple legacy keys really are in use, we should want to log for each of them
  2. Barring the above, you could just early return rather than break

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can use an early return directly because we need to try and cast the value to its defined type. We could duplicate the code here but I think this is cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cast is just tryCast(config, result) which is not a crazy amount of logic to replicate. But I wouldn't advocate for the early return either, because of reason (1) above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding (1), I don't think that is accurate, even with the default key, there is an order of precedence in which properties are applied. When multiple keys are found, Quarkus picks the first property that is encounters. So I'm unsure why we'd want to apply multiple legacy keys. Unless you're suggesting simply logging multiple legacy keys while only accepting the fist legacy key. Please let me know.

Copy link
Contributor

@eric-maynard eric-maynard Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If multiple legacy keys really are in use, we should want to log for each of them

Yes, I'm referring to the logging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -34,11 +35,12 @@ public class BehaviorChangeConfiguration<T> extends PolarisConfiguration<T> {

protected BehaviorChangeConfiguration(
String key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can you have multiple legacyKeys but not multiple keys? It seems like maybe both should be Sets that support multiple calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is my understanding: A PolarisConfiguration or a configuration for any quarkus app has a name (key in PoalrisConfiguration) that is used to identify it. The key/name is the identity of the configuration. The point of adding support for legacyKeys was to enable identity changes wherein we can for certain time support deprecated keys. I don't see why we should have multiple identifying keys mapping to a single entity. If you ever need multiple identities, then the legacy keys do support the need but eventually each entity must converge to a single identity.

Please let me know if I am missing something here.

Copy link
Contributor

@eric-maynard eric-maynard Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just seems odd that we would support multiple invalid keys but not multiple valid keys. In my original comment here I wrote:

Introduce a framework to support multiple config names, with some of them marked for deprecation, similar to catalogConfigUnsafe

rather than ... with all but one marked for deprecation ... because I had this in mind.

I do agree that the situation where you'd have multiple valid configs seems contrived, but so does the situation where you'd have multiple invalid configs or really even one invalid config. So if we're overhauling the whole key system, I think it makes sense to go all the way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we support multiple invalid keys. We track multiple legacy keys.

From a semantic perspective, a PrimaryKey in a RDBMS, a keyin a key-value store or a Java Map (or any other programming language) maps to a single row, value or entity. In all these examples, two keys never map to the same entity.

What we have with the legacyKeys is a way to track historic versions of the key-value pair. In any DBMS or KV store that supports multi-versioning, it's pretty normal to have multiple entries (multiple versions) for a single key. When the primaryKey/key changes, the DBMS/KV store adds a tombstone for the old record and creates a new record with the new key. While changing keys in PolarisConfig, we are essentially creating a new key-value pair in itself but its nice to track the older key, hence the support for legacy keys. Essentially one could rewrite the entire PolarisConfig class with a Map where the key is the active key and the value is everything else. In which case, we'd track the legacyKeys as a part of the value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is not a primary key though. There's no RDBMS or KV store storing these; they describe entries in a config file. It seems reasonable to expect that multiple different entries could be used to configure the same feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider, too, that the current code needs to deal with a situation where the key has value true but the legacyKey has value false. Or two legacy keys have different values. Clearly, the key(s) is not unique and the code needs to handle multiple keys for a given feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this:

It seems reasonable to expect that multiple different entries could be used to configure the same feature.

You can have two configurations that store the same logical value and then let the specific module decide upon how to reconcile any conflicting values. But the config service should not register two configurations for the same feature.

Regarding the case where the key is true and legacyKey is false. I don't see how it is any different from a default value being defined say FeatureConfiguration.java, a value being defined in application.properties and then a JVM system property. In this case the system property takes precedence. Similarly the key takes precedence over legacyKey.

Copy link
Contributor

@eric-maynard eric-maynard Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this:

It seems reasonable to expect that multiple different entries could be used to configure the same feature.

Your current PR has this functionality... except only partially:

Keys Legacy Keys Multiple Entries OK?
0 0 No Yes
0 1 No Yes
0 2+ Yes Yes
1 0 No Yes
1 1 Yes Yes
1 2+ Yes Yes
2 0 Yes No
2 1 Yes No
2 2+ Yes No

What is the rationale behind this asymmetry? If, as stated before, it is because the key is thought of as a primary key then this is just not correct. You can write a valid config that violates the "primary key constraint" by duplicating an entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case with 0 keys is not possible. We can't define a PolarisConfig without a key. That was always the case. Unless you'd prefer if we annotate the field with @NonNull in the constructor.

You can write a valid config that violates the "primary key constraint" by duplicating an entry.

The registerConfiguration function will actually disallow duplicate entries and throw an IllegalArgumentException. That check is a part of the existing code, this PR is not change it. This PR goes further and checks that any existing key or legacyKey does not overlap with a new configuration that is being registered (again both key and legacyKeys).

I think we're looking at two entirely different aspects of a configuration.

  1. Declaring the configuration in FeatureConfiguration or BehaviorChangeConfiguration.
  2. Assigning values to a configuration which is done in application.properties / JVM arguments.

The two are orthogonal to each other. We should impose constraints on how PolarisConfigs are declared. We can't however impose constraints on how the user defines these configs. If the user does have multiple definitions of the same config, then we apply the one with the highest priority and discard the rest. If indeed we should disallow multiple definitions, then any config that has a default value defined should essentially disallow any overwrites in application.properties. AFAIK, Quarkus doesn't do that and there isn't a way for the app to determine such a condition.

T result = getConfiguration(realmContext, config.key, config.defaultValue);
T result = getConfiguration(realmContext, config.key);

if (result == null && config.hasLegacyKeys()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the key overrides the legacy keys?

T result = getConfiguration(realmContext, config.key);

if (result == null && config.hasLegacyKeys()) {
for (String legacyKey : config.legacyKeys()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will lead to nondeterministic behavior if two legacy keys for a feature have different values

}

@Test
public void testMainKeyTakesPrecedenceOverLegacyKeys() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this happens, we should probably actually error

@@ -186,4 +185,78 @@ public void testEntityOverrides() {
Assertions.assertEquals(
"entity-legacy4", store.getConfiguration(testRealmContext, entity, cfg.apply(4)));
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a test for conflicting legacy keys

Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is now doing more than just a rename, and we should figure out what model we actually want to do for configs before going ahead with the change.

There's also a bug when multiple legacy keys are used for one config.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Jul 10, 2025
@poojanilangekar
Copy link
Contributor Author

Hey @eric-maynard, I agree that this change has gotten larger than what I had initially planned. I am planning to close this PR because I think it's orthogonal to the goal of the PR.

If @dimas-b and you are amenable to making the breaking change and doing the rename, I can update this PR.

Otherwise, we can leave the config SUPPORTED_CATALOG_CONNECTION_TYPES as is, so I can spend more time on other tasks.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board Jul 10, 2025
@poojanilangekar
Copy link
Contributor Author

I'd prefer doing something similar to #2000. (The PR adds a deprecated comment on the existing property and creates a new one.)

@dimas-b
Copy link
Contributor

dimas-b commented Jul 10, 2025

I'd prefer using the old config name (for now) to making a breaking change.

Appreciate your effort on this PR @poojanilangekar !

@dimas-b
Copy link
Contributor

dimas-b commented Jul 10, 2025

This PR deals with Polaris-specific config, so I'm not sure #2000 is applicable to this rename "as is". The tricky part is that Polaris config can be realm-specific and we cannot provide redirects in those cases (realm names are unknown at compile time).

@dimas-b
Copy link
Contributor

dimas-b commented Jul 10, 2025

It is possible to have custom code to deal with Smallrye configs at a low level and provide property redirects, but I think it'll be much more complex in the end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants