@yingjianwu98 yingjianwu98 commented Jun 18, 2025

Set Up Aurora Storage for Metacat

Overview

This PR sets up Aurora storage for Metacat, introducing several key changes to ensure compatibility and performance.

Changes

  1. JDBC Template for Aurora Reader

    • Uses a JDBC template for the Aurora reader, because Spring JPA does not appear to support two entityManagers pointing to the same entity. It looks like the primary entityManager is always used, which means the replica would never be used in this case.
    • Context:
      • With CRDB, a single entityManager is sufficient, using read_follower in SQL queries to route to replicas.
      • Aurora requires separate data sources for primary and replica connections.
    • Files Affected: Changes in metacat-connector-polaris.
  2. Aurora Compatibility and Control Flag

    • Adds compatibility for Aurora with a control flag to manage storage switching easily.
    • Note: this is more than adding a control flag, as we need to make sure the already-established bean can handle both storages.
    • Motivation: Allows quick toggling between storage systems without cluster redeployment during rollout phases.
    • Files Affected: Updates in metacat-common-server/src.
  3. Functional Test Refactoring

    • Refactors tests to use real CRDB/Aurora containers instead of H2 in-memory DB containers.
    • Files Affected: Changes in functionalTest under metacat-connector-polaris.
  4. Functional Tests for CRDB and Aurora

    • Ensures tests run against both CRDB and Aurora.
    • Files Affected: Updates to the GitHub workflows (to run the tests in parallel) and to metacat-functional-tests.
  5. Minor improvements to make the test report downloadable for debugging

    • Files Affected: See the GitHub workflows.
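
The control flag from change 2 could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: DatabaseStore, CrdbStore, AuroraStore and SwitchableStore are hypothetical names, and the flag is modelled as a plain BooleanSupplier rather than whatever property source Metacat actually uses. The point it shows is that the flag is read on every call, so the backing storage can change without redeploying.

```java
import java.util.function.BooleanSupplier;

// Hypothetical storage abstraction; the real Metacat beans are not shown here.
interface DatabaseStore {
    String name();
}

final class CrdbStore implements DatabaseStore {
    @Override
    public String name() {
        return "crdb";
    }
}

final class AuroraStore implements DatabaseStore {
    @Override
    public String name() {
        return "aurora";
    }
}

// One long-lived bean that can serve both storages.
// The flag is evaluated per call, so toggling it takes effect immediately.
final class SwitchableStore implements DatabaseStore {
    private final DatabaseStore crdb;
    private final DatabaseStore aurora;
    private final BooleanSupplier auroraEnabled;

    SwitchableStore(final DatabaseStore crdb,
                    final DatabaseStore aurora,
                    final BooleanSupplier auroraEnabled) {
        this.crdb = crdb;
        this.aurora = aurora;
        this.auroraEnabled = auroraEnabled;
    }

    private DatabaseStore delegate() {
        return auroraEnabled.getAsBoolean() ? aurora : crdb;
    }

    @Override
    public String name() {
        return delegate().name();
    }
}
```

Resolving the delegate lazily is one way to keep a single established bean valid for both storages; an alternative would be to rebuild beans on a flag change, which this sketch does not attempt.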

@yingjianwu98 yingjianwu98 force-pushed the yingjianw/auroraJDBC branch 8 times, most recently from 5ff874e to 5712246 Compare June 21, 2025 00:30
@ldsantos0911 ldsantos0911 left a comment

All in all LGTM! Just a few questions and nits.

* @param env Spring Environment
* @param isAuroraEnabled isAuroraEnabled
*/
public MetacatProperties(final Environment env, final boolean isAuroraEnabled) {

nit: It might be good to add a new constructor rather than changing the existing one. I think you can keep the original and just proxy that through to this one, with isAuroraEnabled set to null. I think that would decrease the change's footprint.
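
A rough sketch of the delegating-constructor idea, under a few assumptions: Environment below is a hypothetical stand-in for Spring's Environment so the example compiles on its own, and the class body is reduced to the two fields needed for illustration. Because isAuroraEnabled is a primitive boolean in the snippet above, it cannot be null, so this sketch assumes false as the default.

```java
// Hypothetical stand-in for Spring's Environment, only so the sketch is self-contained.
interface Environment {
    String getProperty(String key);
}

// Simplified illustration of MetacatProperties; the real class has more state.
class MetacatProperties {
    private final Environment env;
    private final boolean auroraEnabled;

    // Original signature kept, so existing callers compile unchanged.
    // Assumption: Aurora is disabled unless a caller opts in explicitly.
    MetacatProperties(final Environment env) {
        this(env, false);
    }

    // New constructor introduced for the Aurora flag.
    MetacatProperties(final Environment env, final boolean isAuroraEnabled) {
        this.env = env;
        this.auroraEnabled = isAuroraEnabled;
    }

    boolean isAuroraEnabled() {
        return auroraEnabled;
    }
}
```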

facets {
functionalTest {
parentSourceSet = "test"
includeInCheckLifecycle = true
testTaskName = 'functionalTestWithDifferentDB'
includeInCheckLifecycle = false

Why are we switching the value of includeInCheckLifecycle?

Contributor Author

The reason is that ./gradlew clean build would otherwise run the functional tests, but we already have a GitHub Action that runs ./gradlew clean build functionalTest.

@@ -61,21 +64,29 @@ public void deleteDatabase(final String dbName) {
}

@Override
- @Transactional(propagation = Propagation.SUPPORTS)
+ @Transactional(propagation = Propagation.REQUIRED)

Why are we shifting these to REQUIRED?

*/
protected <T> Slice<T> getAllDatabasesForCurrentPage(
final String dbNamePrefix, final Pageable page, final boolean selectAllColumns) {
return null;

nit: Might be good to throw a NotImplemented exception here.
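
One way to do this with the standard library is UnsupportedOperationException (Java has no built-in NotImplemented exception). The sketch below assumes hypothetical stand-ins for Spring Data's Slice and Pageable and a made-up base class name, purely so it compiles on its own.

```java
// Hypothetical minimal stand-ins for Spring Data's Slice and Pageable.
interface Slice<T> { }

interface Pageable { }

// Made-up base class name; only the method signature mirrors the snippet above.
abstract class BaseReplicaRepositorySketch {

    // Fails loudly instead of returning null, so a missing override
    // is caught at the call site rather than as a later NullPointerException.
    protected <T> Slice<T> getAllDatabasesForCurrentPage(
        final String dbNamePrefix, final Pageable page, final boolean selectAllColumns) {
        throw new UnsupportedOperationException(
            "getAllDatabasesForCurrentPage must be overridden by a storage-specific repository");
    }
}
```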

return (List<PolarisDatabaseEntity>) dbRepo.getAllDatabases(dbNamePrefix, sort, pageSize, true);
final int pageSize,
final boolean isAuroraEnabled) {
return (List<PolarisDatabaseEntity>)

nit: I'd recommend adding a function like getDBRepo(isAuroraEnabled) and getTblRepo(isAuroraEnabled) that return BasePolarisDatabaseReplicaRepository and BasePolarisTableReplicaRepository respectively. Then you can do the ternary operator there and return the appropriate value and just call getAllDatabases on the return value. It might simplify the code a little bit.
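
A minimal sketch of that suggestion, with assumptions called out: the repository interface here is a simplified stand-in that reuses the name BasePolarisDatabaseReplicaRepository, its getAllDatabases signature is reduced to two parameters and returns strings for illustration, and StoreConnectorSketch is a hypothetical class name.

```java
import java.util.List;

// Simplified stand-in; the real interface has a different getAllDatabases signature.
interface BasePolarisDatabaseReplicaRepository {
    List<String> getAllDatabases(String dbNamePrefix, int pageSize);
}

// Hypothetical caller that holds one repository per storage.
final class StoreConnectorSketch {
    private final BasePolarisDatabaseReplicaRepository crdbRepo;
    private final BasePolarisDatabaseReplicaRepository auroraRepo;

    StoreConnectorSketch(final BasePolarisDatabaseReplicaRepository crdbRepo,
                         final BasePolarisDatabaseReplicaRepository auroraRepo) {
        this.crdbRepo = crdbRepo;
        this.auroraRepo = auroraRepo;
    }

    // The ternary lives in one place instead of at every call site.
    BasePolarisDatabaseReplicaRepository getDBRepo(final boolean isAuroraEnabled) {
        return isAuroraEnabled ? auroraRepo : crdbRepo;
    }

    List<String> getAllDatabases(final String dbNamePrefix,
                                 final int pageSize,
                                 final boolean isAuroraEnabled) {
        return getDBRepo(isAuroraEnabled).getAllDatabases(dbNamePrefix, pageSize);
    }
}
```

A getTblRepo(isAuroraEnabled) helper for the table repository would follow the same shape.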

Yingjian Wu and others added 5 commits August 23, 2025 10:05
wip
rebase

2 participants