Migrating tables to generic partitioning support
vinoth chandar edited this page Mar 27, 2017 · 1 revision
Until 0.3.1, we implicitly assumed that data is partitioned by date (by far the most common layout we observed), i.e. all partitions can be found three levels down from the base path, at `basePath/year/month/day`. With PR121, we plan to generalize this by maintaining a `.hoodie_partition_metadata` file under each partition.
- Before rolling out the new `hoodie-client` jar with these changes, set `withAssumeDatePartitioning(true)` in your HoodieWriteConfig. Without this, hoodie-client looks for partitions based on the metadata, and if it cannot find any, it will not be able to write data.
- Use the CLI tool with `repair addpartitionmeta` to add this metadata to existing partitions/tables.
- Roll out the new `hoodie-client` with `withAssumeDatePartitioning(false)` [default]. All new partitions will have the metadata going forward.
- You can upgrade query engines with the new `hoodie-hadoop-mr` jar if you plan to have non-date-partitioned tables. The old input format will continue to work on date-partitioned tables.
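
To make the difference between the two discovery modes concrete, here is a minimal, self-contained sketch using only the JDK. It is not Hoodie's actual implementation: the class `PartitionDiscovery` and its methods are hypothetical names made up for illustration, and only the `.hoodie_partition_metadata` file name and the three-level `year/month/day` layout come from the text above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of hoodie-client.
class PartitionDiscovery {
    // Marker file name as described in the text above.
    static final String PARTITION_METAFILE = ".hoodie_partition_metadata";

    // Date-based assumption: every directory exactly three levels
    // below basePath (year/month/day) is treated as a partition.
    public static List<String> listDatePartitions(Path basePath) {
        try (Stream<Path> paths = Files.walk(basePath, 3)) {
            return paths.filter(Files::isDirectory)
                    .filter(p -> basePath.relativize(p).getNameCount() == 3)
                    .map(p -> relative(basePath, p))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Generic mode: a directory at any depth is a partition
    // if it contains the marker file.
    public static List<String> listMetadataPartitions(Path basePath) {
        try (Stream<Path> paths = Files.walk(basePath)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals(PARTITION_METAFILE))
                    .map(p -> relative(basePath, p.getParent()))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Partition path relative to basePath, with '/' separators.
    private static String relative(Path basePath, Path partition) {
        return basePath.relativize(partition).toString().replace('\\', '/');
    }

    // Builds a throwaway table: one date partition without the marker file,
    // one date partition with it, and one non-date partition with it.
    public static Path createSampleTable() {
        try {
            Path base = Files.createTempDirectory("hoodie_table");
            Files.createDirectories(base.resolve("2017/03/27"));
            Path migrated = Files.createDirectories(base.resolve("2017/03/28"));
            Files.createFile(migrated.resolve(PARTITION_METAFILE));
            Path nonDate = Files.createDirectories(base.resolve("region/us"));
            Files.createFile(nonDate.resolve(PARTITION_METAFILE));
            return base;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path base = createSampleTable();
        System.out.println("date-based:     " + listDatePartitions(base));
        System.out.println("metadata-based: " + listMetadataPartitions(base));
    }
}
```

In the sample table, the partition without the marker file is visible only to the date-based listing, and the two-level `region/us` partition is visible only to the metadata-based listing. This is why existing partitions need the metadata added before switching off the date-partitioning assumption.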