RFC-100: Lance File Format support in Hudi #14128
rahil-c started this conversation in New Feature Ideas
✅ Lance File Format Integration Tasks
See the following feature for more context: #14127
This tracks the new feature for supporting unstructured data in Hudi via formats like Lance that are focused on AI/ML use cases. Here is the initial scope of what we are targeting. (Note: this list will continue to grow as we get deeper into the integration; for now it aims to first support the Hudi Spark client.)
- HoodieFileWriter for Lance with a Spark implementation, PR (merged in feature branch)
- HoodieFileReader for Lance with a Spark implementation, PR
- SparkColumnarFileReader implementation for Spark Datasource Integration with Lance, PR
- ColumnarBatch vectorized reading

Changes will be raised on the following open source feature branch: https://github.com/apache/hudi/tree/feature-branch-rfc100-unstructured-data
Improving File Format Integration Points
Changes that modify the existing Hudi master (1.2.0) to allow better integration with file formats in the future.