Skip to content

[SPARK-56296][SQL] Pivot createTableLike to pass full TableInfo including schema, partitioning, constraints, and owner#55101

Open
viirya wants to merge 1 commit intoapache:masterfrom
viirya:create-table-like-prop-owner-fix
Open

[SPARK-56296][SQL] Pivot createTableLike to pass full TableInfo including schema, partitioning, constraints, and owner#55101
viirya wants to merge 1 commit intoapache:masterfrom
viirya:create-table-like-prop-owner-fix

Conversation

@viirya
Copy link
Copy Markdown
Member

@viirya viirya commented Mar 30, 2026

What changes were proposed in this pull request?

Previously createTableLike(ident, sourceTable, userSpecifiedOverrides) only passed user-specified TBLPROPERTIES to the connector via TableInfo, requiring connectors to call CurrentUserContext.getCurrentUser (a Catalyst internal) to set the owner.

We should not expose Catalyst internal to connectors. But putting owner to the TableInfo means that we will have to rename userProvidedOverrides to something else, meaning it no longer contains only user overrides.

This change pivots to createTableLike(ident, tableInfo, sourceTable) where tableInfo contains all explicit information for the new table:

  • columns and partitioning copied from the source
  • constraints copied from the source
  • user-specified TBLPROPERTIES, LOCATION, and USING provider (if given)
  • PROP_OWNER set to the current user

Source table properties are intentionally excluded from tableInfo; connectors receive sourceTable to clone any format-specific or custom state they need. This matches the pattern used by REPLACE TABLE in DSv2.

Update InMemoryTableCatalog, CatalogSuite, and CreateTableLikeSuite accordingly.

Why are the changes needed?

This pivot is necessary to keep a balance between API consistency and internal exposure.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

…ding schema, partitioning, constraints, and owner

Previously createTableLike(ident, sourceTable, userSpecifiedOverrides) only
passed user-specified TBLPROPERTIES to the connector via TableInfo, requiring
connectors to (1) call CurrentUserContext.getCurrentUser (a Catalyst internal)
to set the owner, and (2) read columns/partitioning/constraints themselves from
sourceTable.

This change pivots to createTableLike(ident, tableInfo, sourceTable) where
tableInfo contains all explicit information for the new table:
- columns and partitioning copied from the source
- constraints copied from the source
- user-specified TBLPROPERTIES, LOCATION, and USING provider (if given)
- PROP_OWNER set to the current user

Source table properties are intentionally excluded from tableInfo; connectors
receive sourceTable to clone any format-specific or custom state they need.
This matches the pattern used by REPLACE TABLE in DSv2.

Update InMemoryTableCatalog, CatalogSuite, and CreateTableLikeSuite accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@viirya viirya requested a review from aokolnychyi March 31, 2026 00:14
sourceTable.columns()
}

// Build TableInfo with all explicit information for the new table:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how useful this comment is as the code is pretty obvious on its own.

I would consider adding two helper methods and adding comments there:

// Derive target columns from source; for V1Table sources apply CharVarcharUtils to preserve
// CHAR/VARCHAR types as declared rather than collapsed to StringType.
private def targetColumns: Array[Column] = sourceTable match {
  case v1: V1Table =>
    CatalogV2Util.structTypeToV2Columns(CharVarcharUtils.getRawSchema(v1.catalogTable.schema))
  case _ =>
    sourceTable.columns
}

// Source table properties are intentionally excluded; connectors read sourceTable
// to clone any additional format-specific or custom state they need.
private def targetProperties: Map[String, String] = {
  properties ++
    provider.map(TableCatalog.PROP_PROVIDER -> _) ++
    location.map(uri => TableCatalog.PROP_LOCATION -> CatalogUtils.URIToString(uri)) +
    (TableCatalog.PROP_OWNER -> CurrentUserContext.getCurrentUser)
}

Then you can drop this larger comment and call it like this:

val tableInfo = new TableInfo.Builder()
  .withColumns(targetColumns)
  .withPartitions(sourceTable.partitioning)
  .withConstraints(sourceTable.constraints)
  .withProperties(targetProperties.asJava)
  .build()
targetCatalog.createTableLike(targetIdent, tableInfo, sourceTable)

Up to you, though.

val userSpecifiedOverrides = new TableInfo.Builder()
val tableInfo = new TableInfo.Builder()
.withColumns(sourceColumns)
.withPartitions(sourceTable.partitioning())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Drop () for partitioning and constraints as you simply access getters?

Copy link
Copy Markdown
Contributor

@aokolnychyi aokolnychyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with one minor style suggestion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants