Spark: Avoid real-time waits in streaming read timestamp tests#16965
Draft
huan233usc wants to merge 1 commit into
Draft
Spark: Avoid real-time waits in streaming read timestamp tests#16965huan233usc wants to merge 1 commit into
huan233usc wants to merge 1 commit into
Conversation
TestStructuredStreamingRead3.testReadingStreamFromFutureTimetsamp and testReadingStreamFromTimestampFutureWithExistingSnapshots picked a start timestamp of "now + 10s" / "now + 2s" and then waited for the wall clock to reach it via waitUntilAfter(). That wait is pure idle time on every run (the future test alone averaged ~16s/run in CI). Anchor the stream's start timestamp to the committed snapshot's timestampMillis() + 1 instead. Snapshots committed before that point are still excluded and those committed after are still included, so the semantics are unchanged, but no synthetic future instant has to elapse. waitUntilAfter() also busy-spun on System.currentTimeMillis(), pegging a core for the entire wait and starving other parallel test forks on a busy CI box. Sleep once for the remaining time instead; a past timestamp returns immediately. Applied identically to spark v3.5, v4.0 and v4.1.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
TestStructuredStreamingRead3has two timestamp-based streaming read tests that pick a stream start timestamp in the future and then wait for the wall clock to reach it:testReadingStreamFromFutureTimetsampusesnow + 10000mstestReadingStreamFromTimestampFutureWithExistingSnapshotsusesnow + 2000msThe wait goes through
TestBase.waitUntilAfter(), which busy-spins onSystem.currentTimeMillis(). So the future-timestamp test idles for ~10s on every run (it averaged ~16s/run in CI), and because the wait is a spin it pegs a core for the whole duration, contending with the other parallel Gradle test forks.Change
Anchor the stream's start timestamp to the committed snapshot's
timestampMillis() + 1instead of a synthetic future instant:waitUntilAfter(startTimestamp)returns are still included.The read semantics are unchanged, but no real time has to elapse —
waitUntilAfteris now handed a near-current timestamp and returns within ~1ms. This matches the existingtestReadingStreamFromTimestamp/testReadingStreamFromTimestampOfExistingSnapshotidiom in the same class.waitUntilAfter()is also changed to sleep once for the remaining time instead of busy-spinning, so any remaining future wait no longer burns a core.Applied identically to spark v3.5, v4.0 and v4.1.
Note on coverage
The original
testReadingStreamFromFutureTimetsampasserted emptiness after each insert while the stream was running (the inserts landed before the future timestamp elapsed). The rewrite commits those snapshots first and starts the stream from a timestamp just after them, so it exercises start-timestamp filtering rather than a running stream crossing the boundary. Verifying the latter deterministically would require controlling snapshot commit timestamps, which has no clean test hook; both paths hit the same start-timestamp boundary check.Testing