[SPARK-45720][BUILD][DSTREAM][KINESIS] Upgrade KCL to 2.7.1 and remove AWS SDK for Java 1.x dependency #53256
- Upgrade code related to IAM, DDB and Kinesis clients
- Shade SDKv2 dependencies
- Use Spark's default protobuf version in the kinesis-asl-assembly module
- Import mbknor-jackson-jsonschema_2.13 for glue-schema-registry

Jira: https://issues.apache.org/jira/browse/SPARK-45720
Fix jaxb version
| <!-- Should be consistent with SparkBuild.scala and docs --> | ||
| <avro.version>1.12.1</avro.version> | ||
| <aws.kinesis.client.version>1.15.3</aws.kinesis.client.version> | ||
| <aws.kinesis.client.version>2.7.1</aws.kinesis.client.version> |
2.7.1 is not the latest version, but it works with AWS SDK for Java 2.29.52, which Spark currently depends on.
2.7.2 depends on SDK 2.33.0, and I noticed that it doesn't work with 2.29.52.
| val testData3 = 21 to 30 | ||
| eventually(timeout(1.minute), interval(10.seconds)) { | ||
| eventually(timeout(2.minute), interval(10.seconds)) { |
The Scheduler takes about 1 minute to initialize by default, so the timeout is increased here.
| * @param shardEndedInput provides access to a checkpointer method for completing processing of | ||
| * the shard. | ||
| */ | ||
| override def shutdown( |
shutdown is split into three functions, one for each shutdown reason:
- TERMINATE => shardEnded
- REQUESTED => shutdownRequested
- ZOMBIE => leaseLost
The current code does not handle REQUESTED, but a checkpoint is possible in that case, so shutdownRequested now performs a checkpoint.
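The mapping above can be sketched as a small, dependency-free model. This is only an illustration, not the code in this PR: `ShutdownSketch`, `RecordingCheckpointer`, `ShutdownCallbacks`, `SketchProcessor` and `dispatch` are hypothetical names, and the only things taken from the comment are the three reasons and the three callback names. The sketch assumes that a checkpoint is attempted in shardEnded and shutdownRequested and that none is attempted in leaseLost.

```scala
// Hypothetical model of the reason-to-callback mapping; none of these types are KCL classes.
object ShutdownSketch {
  // The three shutdown reasons named in the comment above.
  sealed trait ShutdownReason
  case object Terminate extends ShutdownReason // TERMINATE
  case object Requested extends ShutdownReason // REQUESTED
  case object Zombie extends ShutdownReason    // ZOMBIE

  // Stand-in for a checkpointer: it only counts how often checkpoint() was called.
  final class RecordingCheckpointer {
    private var count = 0
    def checkpoint(): Unit = count += 1
    def checkpoints: Int = count
  }

  // Mirrors the three callback names that replace the single shutdown method.
  trait ShutdownCallbacks {
    def shardEnded(cp: RecordingCheckpointer): Unit
    def shutdownRequested(cp: RecordingCheckpointer): Unit
    def leaseLost(): Unit
  }

  object SketchProcessor extends ShutdownCallbacks {
    // Shard end: checkpoint to mark the shard as completely processed.
    def shardEnded(cp: RecordingCheckpointer): Unit = cp.checkpoint()
    // Requested shutdown: the worker is still healthy here, so checkpoint as well.
    def shutdownRequested(cp: RecordingCheckpointer): Unit = cp.checkpoint()
    // Lease lost: assumed here that no checkpoint is attempted.
    def leaseLost(): Unit = ()
  }

  // Routes a reason to its callback and returns the callback's name.
  def dispatch(
      reason: ShutdownReason,
      callbacks: ShutdownCallbacks,
      cp: RecordingCheckpointer): String = reason match {
    case Terminate => callbacks.shardEnded(cp); "shardEnded"
    case Requested => callbacks.shutdownRequested(cp); "shutdownRequested"
    case Zombie    => callbacks.leaseLost(); "leaseLost"
  }
}
```

Dispatching all three reasons against one `RecordingCheckpointer` leaves its `checkpoints` count at 2, since only the lease-lost path skips the checkpoint in this model.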
Thank you so much, @sarutak.
What changes were proposed in this pull request?
This PR proposes to upgrade KCL to 2.7.1, based on @junyuc25's PR with some updates.
Upgrading KCL lets us remove the AWS SDK for Java 1.x dependency.
Why are the changes needed?
Does this PR introduce any user-facing change?
The behavior is not expected to change.
How was this patch tested?
Confirmed that all Kinesis tests passed with the following commands.
Also confirmed that the existing examples work.
Was this patch authored or co-authored using generative AI tooling?
No.