[SPARK-52509][K8S] Cleanup shuffles from fallback storage #51199

Open: wants to merge 1 commit into base: master


EnricoMi
Contributor

What changes were proposed in this pull request?

Shuffle data of individual shuffles is now deleted from the fallback storage during regular shuffle cleanup.

Why are the changes needed?

Currently, shuffle data is only removed from the fallback storage when the Spark context shuts down. Long-running Spark jobs therefore accumulate shuffle data in the fallback storage even though Spark no longer uses it. Those shuffles should be cleaned up while the Spark context is still running.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests and manual test via reproduction example.

Run the reproduction example without the <<< "$scala". In the Spark shell, execute this code:

import org.apache.spark.sql.SaveMode

val n = 100000000
val j = spark.sparkContext.broadcast(1000)
val x = spark.range(0, n, 1, 100).select($"id".cast("int"))
x.as[Int]
  .mapPartitions { it => if (it.hasNext && it.next < n / 100 * 80) Thread.sleep(2000); it }
  .groupBy($"value" % 1000).as[Int, Int]
  .flatMapSortedGroups($"value"){ case (m, it) => if (it.hasNext && it.next == 0) Thread.sleep(10000); it }
  .write.mode(SaveMode.Overwrite).csv("/tmp/spark.csv")

This writes some data of shuffle 0 to the fallback storage.

Invoking System.gc() removes that shuffle directory from the fallback storage. Exiting the Spark shell removes the whole application directory.
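
The two cleanup granularities mentioned above (one shuffle directory versus the whole application directory) can be illustrated with a minimal, self-contained sketch. It is not the code of this PR: it assumes a directory layout of `<root>/<appId>/<shuffleId>/`, works on the local filesystem through `java.io.File` rather than the Hadoop FileSystem API that Spark's fallback storage uses, and all names (`FallbackCleanupSketch`, `cleanupShuffle`, `cleanupApplication`, `app-123`) are hypothetical.

```scala
import java.io.File
import java.nio.file.Files

// Illustrative model only; names and layout are assumptions, not Spark's API.
object FallbackCleanupSketch {

  // Delete a file or a whole directory tree; does nothing if the path is absent.
  def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      // listFiles() may return null on I/O errors, so guard with Option.
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  // Assumed layout: <root>/<appId>/<shuffleId>/
  def shuffleDir(root: File, appId: String, shuffleId: Int): File =
    new File(new File(root, appId), shuffleId.toString)

  // Per-shuffle cleanup: removes only the directory of one shuffle.
  def cleanupShuffle(root: File, appId: String, shuffleId: Int): Unit =
    deleteRecursively(shuffleDir(root, appId, shuffleId))

  // Shutdown cleanup: removes the whole application directory.
  def cleanupApplication(root: File, appId: String): Unit =
    deleteRecursively(new File(root, appId))

  def main(args: Array[String]): Unit = {
    val root = Files.createTempDirectory("fallback").toFile
    val appId = "app-123" // hypothetical application id

    // Create two shuffle directories, each holding one dummy data file.
    for (shuffleId <- Seq(0, 1)) {
      val dir = shuffleDir(root, appId, shuffleId)
      dir.mkdirs()
      Files.write(new File(dir, s"shuffle_${shuffleId}_0_0.data").toPath, Array[Byte](1, 2, 3))
    }

    cleanupShuffle(root, appId, 0)   // shuffle 0 is gone, shuffle 1 remains
    println(shuffleDir(root, appId, 0).exists())
    println(shuffleDir(root, appId, 1).exists())

    cleanupApplication(root, appId)  // everything for the application is gone
    println(new File(root, appId).exists())

    deleteRecursively(root)
  }
}
```

In this model, removing a single shuffle leaves the other shuffles of the same application untouched, which is the behaviour a long-running job needs to avoid accumulating stale data.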

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot added the CORE label on Jun 17, 2025