Skip to content

[SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to connect-client-integration-tests #50463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Mar 31, 2025

What changes were proposed in this pull request?

This pr moves the test cases related to RemoteSparkSession from the connect-client-jvm module to a new module named connect-client-integration-tests. The new module depends on both the connect-client-jvm and assembly modules, aiming to resolve the circular dependency issue between connect-client-jvm and assembly:

  1. The test cases related to RemoteSparkSession in the connect-client-jvm module rely on the output from the target/scala-2.13/jars directory of the assembly module during testing.
  2. The assembly module depends on the compilation output of the connect-client-jvm module during packaging.

The main changes of this pr are as follows:

  1. Create a new module connect-client-integration-tests, which depends on both the connect-client-jvm and assembly modules. Move the test cases and related utility code for RemoteSparkSession from the connect-client-jvm module to the new module.
  2. Revert the changes made for SPARK-51603, as the assembly module should already be built when connect-client-integration-tests is being tested.
  3. Adjust the assembly/pom.xml to execute install and deploy to allow the connect-client-integration-tests module to depend on the assembly module.
  4. Modify the ShadeRule.rename rule for com.google.protobuf in SparkBuild.scala to align with Maven, ensuring that UDFClassLoadingE2ESuite passes tests in both Maven and SBT.
  5. Remove the ShadeRule.rename rule for javax.annotation to prevent compilation failures.
  6. Update the relevant parts in dev/test-jars.txt, dev/sparktestsupport/modules.py, dev/lint-scala, and .github/workflows/maven_test.yml related to the new module and tests.

Why are the changes needed?

Resolve the issue of logical circular dependency between connect-client-jvm and assembly.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions
  • locally check:
build/mvn clean install -pl sql/connect/client/integration-tests -am -Phive
build/mvn test -pl sql/connect/client/integration-tests

and

build/sbt clean "connect-client-integration-tests/test" -Phive

All passed

Was this patch authored or co-authored using generative AI tooling?

No

…starting `SparkConnectServer` when the directory `assembly/target/scala-2.13/jars/` not exist"

This reverts commit 38a1958.
@LuciferYang

This comment was marked as outdated.

@LuciferYang
Copy link
Contributor Author

cc @HyukjinKwon I'm trying to do this refactoring. Do you have any better suggestions?

ShadeRule.rename("io.netty.**" -> "org.sparkproject.connect.client.io.netty.@1").inAll,
ShadeRule.rename("org.checkerframework.**" -> "org.sparkproject.connect.client.org.checkerframework.@1").inAll,
ShadeRule.rename("javax.annotation.**" -> "org.sparkproject.connect.client.javax.annotation.@1").inAll,
// ShadeRule.rename("javax.annotation.**" -> "org.sparkproject.connect.client.javax.annotation.@1").inAll,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for fix:

[warn] javac exited with exit code 1
[error] (connect-client-integration-tests / Test / compileIncremental) javac returned non-zero exit code

will take another look at it later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems we can only delete this ShadeRule.rename

@github-actions github-actions bot added the INFRA label Apr 1, 2025
@LuciferYang LuciferYang changed the title [SPARK-51653][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to a standalone module. [SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to a standalone module. Apr 1, 2025
@LuciferYang LuciferYang changed the title [SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to a standalone module. [SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to a standalone module Apr 1, 2025
@LuciferYang LuciferYang changed the title [SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to a standalone module [SPARK-51653][CONNECT][TESTS] Migrate the tests related to RemoteSparkSession from the connect-client-jvm module to connect-client-integration-tests Apr 1, 2025
@@ -198,7 +198,7 @@ jobs:
if [[ "$INCLUDED_TAGS" != "" ]]; then
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} -Dtest.include.tags="$INCLUDED_TAGS" test -fae
elif [[ "$MODULES_TO_TEST" == "connect" ]]; then
./build/mvn $MAVEN_CLI_OPTS -Dtest.exclude.tags="$EXCLUDED_TAGS" -Djava.version=${JAVA_VERSION/-ea} -pl sql/connect/client/jvm,sql/connect/common,sql/connect/server test -fae
./build/mvn $MAVEN_CLI_OPTS -Dtest.exclude.tags="$EXCLUDED_TAGS" -Djava.version=${JAVA_VERSION/-ea} -pl sql/connect/client/jvm,sql/connect/client/integration-tests,sql/connect/common,sql/connect/server package test -fae
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it is necessary to use package test. If only test is specified, then client-integration-tests will use the client-jvm's classes directory instead of a jar file as the build classpath, which will result in a failure of testCompile.

@LuciferYang LuciferYang marked this pull request as ready for review April 1, 2025 13:54
@LuciferYang
Copy link
Contributor Author

cc @HyukjinKwon This pr is basically completed, and we can discuss its feasibility. There might still be some duplicate code that can be cleaned up, but I think we can leave that for a follow-up if this plan is feasible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant