-
Notifications
You must be signed in to change notification settings - Fork 2.4k
feat(schema): Internal Schema System Integration with HoodieSchema #14314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(schema): Internal Schema System Integration with HoodieSchema #14314
Conversation
| * @param type the non-null type | ||
| * @return new HoodieSchema representing a nullable union | ||
| */ | ||
| public static HoodieSchema createNullableSchema(HoodieSchema type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another method createNullable that covers similar functionality so I am deduping
| */ | ||
| public Option<String> getFullName() { | ||
| return Option.ofNullable(avroSchema.getFullName()); | ||
| public String getFullName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is always meant to be non-empty in avro and I think we should follow a similar pattern here or we end up with a lot of option handling code in the callers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a precondition in the constructor of HoodieSchema to verify this assumption
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hudi-common/src/main/java/org/apache/hudi/internal/schema/convert/InternalSchemaConverter.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/internal/schema/convert/InternalSchemaConverter.java
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/internal/schema/convert/InternalSchemaConverter.java
Outdated
Show resolved
Hide resolved
| import org.apache.hudi.common.schema.HoodieJsonProperties; | ||
| import org.apache.hudi.common.schema.HoodieSchema; | ||
| import org.apache.hudi.common.schema.HoodieSchemaField; | ||
| import org.apache.hudi.common.schema.HoodieSchemaType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I start to think that it's better to remove the Hoodie prefix from these class names in org.apache.hudi.common.schema packages, so it's easier to read, as eventually we are going to only maintain one schema system. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is implied by the package but I think we should do that after we have removed the avro usage. Otherwise the difference is more subtle and likely to be missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
| return getTypes().stream() | ||
| .filter(schema -> schema.getType() != HoodieSchemaType.NULL) | ||
| .findFirst() | ||
| .orElseThrow(() -> new IllegalArgumentException("No non-null type found in Union")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use HoodieSchema#getNonNullType instead? Or is this only intended to get the type, so missing an element of a union is fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch to getNonNullType. I forgot to update this branch after the other one was merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
hudi-common/src/main/java/org/apache/hudi/internal/schema/convert/InternalSchemaConverter.java
Outdated
Show resolved
Hide resolved
hudi-common/src/main/java/org/apache/hudi/internal/schema/convert/InternalSchemaConverter.java
Outdated
Show resolved
Hide resolved
.../hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java
Show resolved
Hide resolved
...spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
Show resolved
Hide resolved
.../hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestHoodieInternalRowUtils.scala
Show resolved
Hide resolved
85f648a to
5b63f28
Compare
12cfee8 to
970cc42
Compare
2159f34 to
df96269
Compare
yihua
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| import org.apache.hudi.common.schema.HoodieJsonProperties; | ||
| import org.apache.hudi.common.schema.HoodieSchema; | ||
| import org.apache.hudi.common.schema.HoodieSchemaField; | ||
| import org.apache.hudi.common.schema.HoodieSchemaType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
| return getTypes().stream() | ||
| .filter(schema -> schema.getType() != HoodieSchemaType.NULL) | ||
| .findFirst() | ||
| .orElseThrow(() -> new IllegalArgumentException("No non-null type found in Union")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
Describe the issue this Pull Request addresses
Addresses #14269
The
AvroInternalSchemaConvertercurrently directly relies on an Avro schema. This PR switches the class to operate onHoodieSchemaas part of the migration to Hoodie's own type system.Summary and Changelog
AvroInternalSchemaConverteris renamed toInternalSchemaConverterInternalSchemaConverternow takeHoodieSchemainstead of an avro schemaHoodieSchemawith thefromAvroSchemamethodHoodieSchemadirectly when possibleImpact
This is laying the groundwork for moving the codebase over to Hudi's own type system.
Risk Level
Low, this maintains parity with the existing code
Documentation Update
Contributor's checklist