@linliu-code linliu-code commented Nov 21, 2025

Change Logs

As of PR #9743, InternalSchema had only one timestamp logical type, which represents timestamp-micros, so the logical type system could not represent columns whose logical type should be timestamp-millis. As a result, values in a timestamp-millis column were treated as microseconds, which caused data corruption.
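
To illustrate the kind of corruption described above, here is a minimal sketch in plain Java (no Hudi classes involved). The class and method names are hypothetical and exist only for this example; it just shows what happens when a raw long written as epoch milliseconds is read back as epoch microseconds.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical demo class, not part of Hudi.
final class TimestampUnitDemo {

  // Interpret a raw long as milliseconds since the epoch (the intended meaning).
  static Instant asMillis(long raw) {
    return Instant.ofEpochMilli(raw);
  }

  // Interpret the same raw long as microseconds since the epoch (the wrong meaning
  // when the column is actually timestamp-millis).
  static Instant asMicros(long raw) {
    return Instant.EPOCH.plus(raw, ChronoUnit.MICROS);
  }

  public static void main(String[] args) {
    long raw = 1_700_000_000_000L; // value stored by a timestamp-millis column
    System.out.println(asMillis(raw)); // 2023-11-14T22:13:20Z
    System.out.println(asMicros(raw)); // 1970-01-20T16:13:20Z
  }
}
```

The stored long is unchanged in both cases; only the assumed unit differs, which is why the error is off by a factor of 1000 and lands the value in January 1970.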

PR #13711 introduced a more complete logical type system to handle schema evolution for column stats. It fixes the logical timestamp issue by introducing multiple logical timestamp types, such as timestamp-millis, timestamp-micros, local-timestamp-millis and local-timestamp-micros.
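
As a rough sketch of why distinct logical timestamp types help, the enum below carries the precision and the UTC-adjusted flag alongside each type name, so a reader can normalize a raw value without guessing its unit. This is not the type system from #13711; the enum, its fields, and its methods are assumptions made up for illustration, and only the four logical type names come from the description above.

```java
import java.time.temporal.ChronoUnit;

// Hypothetical sketch, not Hudi's actual classes.
final class LogicalTimestampSketch {

  enum LogicalTimestamp {
    TIMESTAMP_MILLIS("timestamp-millis", ChronoUnit.MILLIS, true),
    TIMESTAMP_MICROS("timestamp-micros", ChronoUnit.MICROS, true),
    LOCAL_TIMESTAMP_MILLIS("local-timestamp-millis", ChronoUnit.MILLIS, false),
    LOCAL_TIMESTAMP_MICROS("local-timestamp-micros", ChronoUnit.MICROS, false);

    final String logicalName;    // name as written in the schema
    final ChronoUnit unit;       // precision of the stored long
    final boolean adjustedToUtc; // false for the local-* variants

    LogicalTimestamp(String logicalName, ChronoUnit unit, boolean adjustedToUtc) {
      this.logicalName = logicalName;
      this.unit = unit;
      this.adjustedToUtc = adjustedToUtc;
    }

    // Look up a type by its schema name; unknown names are rejected.
    static LogicalTimestamp fromName(String name) {
      for (LogicalTimestamp t : values()) {
        if (t.logicalName.equals(name)) {
          return t;
        }
      }
      throw new IllegalArgumentException("Unknown logical timestamp type: " + name);
    }

    // Normalize a raw stored value to microseconds using the declared unit.
    long toMicros(long raw) {
      return unit == ChronoUnit.MILLIS ? Math.multiplyExact(raw, 1000L) : raw;
    }
  }

  public static void main(String[] args) {
    long raw = 1_700_000_000_000L;
    System.out.println(LogicalTimestamp.fromName("timestamp-millis").toMicros(raw));
    System.out.println(LogicalTimestamp.fromName("timestamp-micros").toMicros(raw));
  }
}
```

With a single micros-only type, the first call would have no way to know it needs to scale the value; keeping the unit on the type makes the conversion explicit.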

For branch-0.x, we aim to fix only the logical timestamp issue rather than bring over the entire logical type system introduced in #13711.

This PR fixes the logical timestamp issue by picking the relevant changes from #13711 and leaving out the unrelated ones.

Impact

Medium.

Risk level (write none, low, medium or high below)

Medium.

Documentation Update

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@github-actions github-actions bot added the size:XL PR with lines of changes > 1000 label Nov 21, 2025
@linliu-code linliu-code force-pushed the branch-0.x branch 2 times, most recently from 1905a37 to 3efea4c Compare November 21, 2025 16:13
@linliu-code linliu-code changed the title fix: Fix logical timestamp issue [HUDI-6872]Fix logical timestamp issue Nov 24, 2025
public abstract MessageType convert(Schema schema);

public abstract Schema convert(MessageType schema);
}
\ No newline at end of file
Contributor Author

Need to add the missing newline at the end of the file.

}

private void logicalAssertions(Schema tableSchema, String tableBasePath, Map<String, String> hudiOpts, int tableVersion) {
if (tableVersion > 8) {
Contributor Author

We could remove these tableVersion > 8 checks, since they will never be executed.

@linliu-code linliu-code changed the title [HUDI-6872]Fix logical timestamp issue [HUDI-6872] Fix logical timestamp issue Nov 24, 2025
@linliu-code linliu-code changed the title [HUDI-6872] Fix logical timestamp issue fix(HUDI-6872): Fix logical timestamp issue Nov 24, 2025
@linliu-code linliu-code changed the title fix(HUDI-6872): Fix logical timestamp issue [MINIOR] Fix logical timestamp issue Nov 24, 2025
@linliu-code linliu-code changed the title [MINIOR] Fix logical timestamp issue [MINOR] Fix logical timestamp issue Nov 24, 2025
@hudi-bot
CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build
