@CTTY CTTY commented Mar 10, 2025

Describe the issue this Pull Request addresses

Since #12529, BaseCommitActionExecutor updates col stats by default. However, the bootstrap operation doesn't support col stats, so bootstrapping fails.

This PR allows bootstrap to complete without updating col stats.

Caused by: org.apache.hudi.exception.HoodieNotSupportedException: col stats is not supported with bootstrap operation
	at org.apache.hudi.table.action.bootstrap.SparkBootstrapCommitActionExecutor.updateColumnsToIndexForColumnStats(SparkBootstrapCommitActionExecutor.java:225)
	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.lambda$commit$b950a45b$1(BaseCommitActionExecutor.java:239)
	at org.apache.hudi.client.HoodieColumnStatsIndexUtils.updateColsToIndex(HoodieColumnStatsIndexUtils.java:73)
	... 63 more

Summary and Changelog

  • Catch the exception thrown when col stats are updated during bootstrap and log a warning, so the commit can complete
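
A minimal sketch of the idea, not the actual patch: the col stats update is wrapped in a try/catch that swallows only a dedicated "not supported" exception, records a warning, and lets the commit finish. All names here (`BootstrapColStatsSketch`, `ColStatsNotSupportedException`, `updateColumnStats`, `commit`) are hypothetical stand-ins, and warnings are collected in a list rather than sent to a real logger.

```java
import java.util.ArrayList;
import java.util.List;

class BootstrapColStatsSketch {

  /** Hypothetical stand-in for the "not supported" exception seen in the stack trace. */
  static class ColStatsNotSupportedException extends RuntimeException {
    ColStatsNotSupportedException(String message) {
      super(message);
    }
  }

  /** Warnings are collected here instead of being written by a real logger. */
  static final List<String> WARNINGS = new ArrayList<>();

  /** Hypothetical col stats update that the bootstrap operation rejects. */
  static void updateColumnStats(boolean isBootstrap) {
    if (isBootstrap) {
      throw new ColStatsNotSupportedException(
          "col stats is not supported with bootstrap operation");
    }
  }

  /** Returns true once the commit completes, with or without a col stats update. */
  static boolean commit(boolean isBootstrap) {
    try {
      updateColumnStats(isBootstrap);
    } catch (ColStatsNotSupportedException e) {
      // Only the dedicated exception type is swallowed; anything else propagates.
      WARNINGS.add("Skipping col stats update: " + e.getMessage());
    }
    return true;
  }
}
```

Because a Java catch clause matches only the named type and its subclasses, a narrow exception type keeps unrelated failures in the col stats update visible instead of silently downgrading them to warnings.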

Impact

None

Risk Level

None

Documentation Update

None

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Mar 10, 2025
updateColumnsToIndexForColumnStats(metaClient, columnsToIndex);
return null;
});
} catch (UnsupportedOperationException uoe) {
Contributor

Could we catch a specific exception for bootstrap here?

Contributor

Have you checked that, in this case, there is no failed deltacommit in the MDT, the files partition is intact, and the list of ready-to-use indexes in the table config does not contain col_stats?

Contributor

Would be good to add this validation in the test.
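
One part of the validation discussed above could be sketched as follows. This is an illustration under stated assumptions, not Hudi's API: the property key `example.ready.indexes`, the class `ReadyIndexCheckSketch`, and the index names `files` and `col_stats` are placeholders, and the table config is modeled as a plain `java.util.Properties`. The check for failed deltacommits is not covered here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

class ReadyIndexCheckSketch {

  // Hypothetical property key; not the real table config key.
  static final String READY_INDEXES_KEY = "example.ready.indexes";

  /** Parses a comma-separated list of ready indexes, ignoring blanks. */
  static List<String> readyIndexes(Properties tableConfig) {
    String raw = tableConfig.getProperty(READY_INDEXES_KEY, "");
    return Arrays.stream(raw.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());
  }

  /** True when the files index is ready and col_stats is not listed as ready. */
  static boolean isValidAfterBootstrap(Properties tableConfig) {
    List<String> ready = readyIndexes(tableConfig);
    return ready.contains("files") && !ready.contains("col_stats");
  }
}
```

A test could call `isValidAfterBootstrap` on the config read back after the bootstrap commit and assert that it returns true.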

return null;
});
} catch (UnsupportedOperationException uoe) {
LOG.warn("Failed to update col stats, bootstrap doesn't support col stats", uoe);
Contributor

Similarly, let’s add or modify a test case to increase branch coverage.

@CTTY CTTY force-pushed the ctty/fix-ba-colstats branch from 315f643 to cd644f1 Compare March 26, 2025 04:12

CTTY commented Mar 26, 2025

#12977 seems to fix this issue, but I have not tested it yet. I'll test it later.

CTTY commented May 20, 2025

This issue also exists in the released Hudi 1.0.2, so this fix is still needed.

@CTTY CTTY force-pushed the ctty/fix-ba-colstats branch from d29bc45 to 74e0ae3 Compare November 25, 2025 00:34
@CTTY CTTY changed the title from "[HUDI-9154] Allow bootstrap to complete without updating col stats" to "fix: Allow bootstrap to complete without updating col stats" Nov 25, 2025
