
CI: Select Spark PR matrix by changed version #16946

Open
ajantha-bhat wants to merge 2 commits into apache:main from ajantha-bhat:codex/ci-spark-pr-version-matrix

@ajantha-bhat ajantha-bhat commented Jun 24, 2026

Member

Depends on #16945.

Summary

This updates Spark CI to build an incremental PR matrix based on the changed Spark version path.

For ordinary PRs, Spark CI uses the JVM policy from the Java CI PR and narrows the Spark matrix when changes are scoped to a known Spark version path:

  • spark/v3.5/** runs Spark 3.5 only
  • spark/v4.0/** runs Spark 4.0 only
  • spark/v4.1/** runs Spark 4.1 only
  • unknown Spark version paths, unknown Spark paths, missing changed-file detection, full-ci, and push/main/release/tag runs fall back to the full Spark version matrix
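The selection rules above can be sketched in a few lines of Bash. This is a minimal illustration, not the PR's `plan-spark-ci.sh`: the function name `plan_spark_versions`, its argument order, and the comma-separated output are all hypothetical. It also assumes that a PR touching several known version paths runs each of them, and that any changed file outside a `spark/v<version>/` path triggers the full matrix. The JVM matrix is not modeled here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not the PR's actual planner).
# Usage: plan_spark_versions "<known,versions>" <full_ci: true|false> [changed files...]
# Prints the comma-separated Spark versions that should be tested.
plan_spark_versions() {
  local known="$1" full_ci="$2"
  shift 2

  # Conservative fallbacks: explicit full run, or no changed-file information.
  if [[ "$full_ci" == "true" || $# -eq 0 ]]; then
    printf '%s\n' "$known"
    return 0
  fi

  local selected="" file version
  for file in "$@"; do
    # Only paths shaped like spark/v<major>.<minor>/... can narrow the matrix.
    if [[ "$file" =~ ^spark/v([0-9]+\.[0-9]+)/ ]]; then
      version="${BASH_REMATCH[1]}"
    else
      printf '%s\n' "$known"
      return 0
    fi
    # A version that is not in the known list also falls back to everything.
    if [[ ",$known," != *",$version,"* ]]; then
      printf '%s\n' "$known"
      return 0
    fi
    # Record each selected version once, in order of first appearance.
    if [[ ",$selected," != *",$version,"* ]]; then
      selected="${selected:+$selected,}$version"
    fi
  done
  printf '%s\n' "$selected"
}
```

Under these assumptions, `plan_spark_versions "3.5,4.0,4.1" false spark/v4.1/build.gradle` prints `4.1`, while an unknown path such as `spark/v9.9/build.gradle` prints the full list `3.5,4.0,4.1`. Every early exit widens the matrix, so a planner bug or a stale input can only cause more CI to run.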

Known Spark versions are read from knownSparkVersions in gradle.properties, so adding/removing Spark versions does not require hard-coding the version list in the planner.
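As a rough sketch of how a planner script might read that property, the snippet below scans a properties file for the `knownSparkVersions` key. The function name `read_known_spark_versions` is hypothetical, and the exact line format is an assumption: the sketch accepts a plain `knownSparkVersions=...` line as well as one carrying Gradle's `systemProp.` prefix, and it is not the PR's actual implementation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the comma-separated value of knownSparkVersions.
# Assumes a "key=value" line, optionally prefixed with "systemProp.".
read_known_spark_versions() {
  local props_file="$1" key="" value=""

  # The "|| [[ -n ... ]]" clause also handles a last line without a trailing newline.
  while IFS='=' read -r key value || [[ -n "$key" ]]; do
    key="${key//[[:space:]]/}"
    if [[ "$key" == "knownSparkVersions" || "$key" == "systemProp.knownSparkVersions" ]]; then
      # Strip all whitespace so "3.5, 4.0" becomes "3.5,4.0".
      printf '%s\n' "${value//[[:space:]]/}"
      return 0
    fi
  done < "$props_file"

  # Fail loudly so the caller can fall back to a full run or abort.
  echo "knownSparkVersions not found in $props_file" >&2
  return 1
}
```

A caller could then use `known="$(read_known_spark_versions gradle.properties)"` and pass the result to the matrix selection step. Returning a non-zero status when the key is missing lets the caller decide how to stay conservative instead of silently planning an empty matrix.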

This keeps the PR path conservative: stale or unknown planner inputs run more Spark CI, not less.

Testing

  • actionlint .github/workflows/spark-ci.yml .github/workflows/java-ci.yml
  • zizmor --min-severity medium --min-confidence medium --no-progress .github/workflows/spark-ci.yml .github/workflows/java-ci.yml
  • bash -n .github/scripts/ci-plan-common.sh .github/scripts/plan-spark-ci.sh
  • git diff --check codex/ci-java-pr-jdk17..HEAD
  • Local planner checks:
    • spark/v4.1/** -> Spark 4.1 only on the regular PR JVM matrix
    • unknown Spark version/path -> all known Spark versions
    • full-ci -> all known Spark versions with the full JVM matrix

AI Disclosure

  • Model: GPT-5 Codex
  • Platform/Tool: Codex
  • Human Oversight: partially reviewed
  • Prompt Summary: Update Spark CI to select the PR matrix incrementally by changed Spark version path.

@github-actions github-actions Bot added the INFRA label Jun 24, 2026
@ajantha-bhat ajantha-bhat force-pushed the codex/ci-spark-pr-version-matrix branch 2 times, most recently from d07a80c to 350a854, June 24, 2026 08:49
@ajantha-bhat ajantha-bhat force-pushed the codex/ci-spark-pr-version-matrix branch from 350a854 to 29459b7, June 24, 2026 15:21
cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:

Member


Is this intentional?

cache-read-only: true
- run: echo -e "$(ip addr show eth0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)\t$(hostname -f) $(hostname -s)" | sudo tee -a /etc/hosts
- run: ./gradlew -DsparkVersions= -DkafkaVersions= -DflinkVersions=${{ matrix.flink }} :iceberg-flink:iceberg-flink-${{ matrix.flink }}:check :iceberg-flink:iceberg-flink-runtime-${{ matrix.flink }}:check -Pquick=true -x javadoc -DtestParallelism=auto
- env:

Member


Do we need to change the Flink CI in this PR?

Member Author


This change comes from the dependent PR (#16945) mentioned in the description. It will go away after a rebase.
