
test: harden retry delay timing assertions#78

Merged
tmustier merged 1 commit into main from fix/ci-min-retry-interval-flake
Apr 8, 2026

Conversation

@tmustier
Owner

@tmustier tmustier commented Apr 8, 2026

Summary

  • replace fixed 50ms sleeps in retry-delay assertions with polling helpers that wait for the retry entry
  • assert retry delay from the immediate observed remaining time instead of measuring after an arbitrary sleep
  • cover the continuation, failure backoff, min retry interval, and rate-limit floor cases with the same helper
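A polling helper along these lines would fit the description above (the name `wait_for_retry_attempt/2`, the `retry_attempts` state key, and the `due_at_ms` field come from this PR's discussion; the exact shapes in `core_test.exs` may differ):

```elixir
# Sketch only: poll the GenServer state until the retry entry appears,
# then return it together with the timestamp at which it was observed.
defp wait_for_retry_attempt(pid, issue_id, timeout_ms \\ 2_000) do
  deadline = System.monotonic_time(:millisecond) + timeout_ms
  do_wait(pid, issue_id, deadline)
end

defp do_wait(pid, issue_id, deadline) do
  state = :sys.get_state(pid)

  case state.retry_attempts[issue_id] do
    nil ->
      if System.monotonic_time(:millisecond) >= deadline do
        flunk("retry for #{inspect(issue_id)} was never scheduled")
      else
        Process.sleep(10)
        do_wait(pid, issue_id, deadline)
      end

    attempt ->
      # Record when the entry was observed, so callers can compute the
      # remaining delay without relying on a fixed sleep.
      {attempt, System.monotonic_time(:millisecond)}
  end
end
```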

Why

CI run 24133141504 failed on a flaky lower-bound assertion in "min_retry_interval_ms enforces a floor on failure retry delays". The test measured the remaining time after an arbitrary sleep, so scheduler latency on a busy runner could drop the observed value below the threshold even when the retry was scheduled correctly.
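The pre-fix shape of the assertion was roughly the following (field and variable names are illustrative, reconstructed from the description above):

```elixir
# Flaky: the remaining delay is measured after a fixed sleep, so any
# scheduler latency on the runner is subtracted from the observed value.
Process.sleep(50)
%{retry_attempts: %{^issue_id => %{due_at_ms: due_at_ms}}} = :sys.get_state(pid)
assert due_at_ms - System.monotonic_time(:millisecond) >= min_retry_interval_ms
```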

Validation

  • make check-elixir
  • mise exec -- mix test test/symphony_elixir/core_test.exs
  • repeated mix test test/symphony_elixir/core_test.exs:1003 (10x)

@tmustier tmustier force-pushed the fix/ci-min-retry-interval-flake branch from d10285c to 42f3529 on April 8, 2026 12:16
@tmustier
Owner Author

tmustier commented Apr 8, 2026

Findings

  • [P2] Scheduling-window assertion can miss wrong retry delays
    orchestrator/elixir/test/symphony_elixir/core_test.exs:1378
    assert_due_scheduled_between/5 compares due_at_ms to the broad [window_start_ms, window_end_ms] interval rather than to the time the retry was actually scheduled. Any latency while handling {:DOWN, ...} can mask a bad delay on either side: for example, a continuation retry scheduled for 500ms will still satisfy the new 1000ms lower bound if the handler spends ~500ms before updating state.
    Assert against the observation time returned by wait_for_retry_attempt/2 (due_at_ms - observed_at_ms) or capture the schedule timestamp explicitly, so handler latency cannot hide a retry-delay regression.
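Concretely, the suggestion is to derive the delay from the observation timestamp the helper already has, rather than from a broad window (a sketch; `wait_for_retry_attempt/2` returning `{attempt, observed_at_ms}` is an assumption):

```elixir
# Measure the delay against the moment the retry entry was observed,
# so handler latency before the state update cannot widen the window.
{attempt, observed_at_ms} = wait_for_retry_attempt(pid, issue_id)
remaining_ms = attempt.due_at_ms - observed_at_ms

assert remaining_ms >= expected_delay_ms - slack_ms
assert remaining_ms <= expected_delay_ms + slack_ms
```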

Verdict

needs attention — the new helper removes the flaky sleep, but it also weakens the timing checks enough that incorrect retry delays can now pass.

@tmustier
Owner Author

tmustier commented Apr 8, 2026

Findings

  • [P2] Retry-delay tests no longer verify when the retry was scheduled
    orchestrator/elixir/test/symphony_elixir/core_test.exs:689
    The new pattern waits until retry_attempts[issue_id] exists and only then measures due_at_ms - observed_at_ms. That means a regression that delays scheduling by hundreds of milliseconds still passes, because the assertion starts the clock after the late state update instead of at the {:DOWN, ...} event.
    Capture a timestamp immediately around send(pid, {:DOWN, ...}) and assert due_at_ms relative to that, or add a separate upper bound on how long wait_for_retry_attempt/2 is allowed to take.
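A sketch of the first suggestion (variable names are illustrative):

```elixir
# Start the clock at the event itself, not at the later state update.
sent_at_ms = System.monotonic_time(:millisecond)
send(pid, {:DOWN, ref, :process, worker_pid, :crashed})

{attempt, _observed_at_ms} = wait_for_retry_attempt(pid, issue_id)

# due_at_ms is fixed when the retry is scheduled, so bounding it against
# sent_at_ms catches a scheduling delay that a post-update clock would hide.
assert attempt.due_at_ms - sent_at_ms >= expected_delay_ms
assert attempt.due_at_ms - sent_at_ms <= expected_delay_ms + slack_ms
```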

  • [P2] The new 250ms tolerance is still tight enough to keep these tests flaky
    orchestrator/elixir/test/symphony_elixir/core_test.exs:1363
    assert_remaining_delay/3 allows only ±250ms, but observed_at_ms is taken after a synchronous :sys.get_state/1 round-trip inside a polling loop. A correct implementation will still fail whenever CI takes >250ms to observe the updated state, especially on the 1s continuation-retry case.
    Widen the slack materially, or make these assertions one-sided floor checks instead of exact-delay checks.
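The one-sided variant could look like this (sketch; note the asymmetry between the two bounds):

```elixir
{attempt, observed_at_ms} = wait_for_retry_attempt(pid, issue_id)
remaining_ms = attempt.due_at_ms - observed_at_ms

# Observation latency can only shrink remaining_ms, never grow it,
# so a tight upper bound is safe to keep.
assert remaining_ms <= expected_delay_ms

# The lower bound is the latency-sensitive side: keep it a loose floor
# (or assert it against an event timestamp) instead of a +/-250ms match.
assert remaining_ms >= min_retry_interval_ms - generous_slack_ms
```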

Verdict

needs attention — the follow-up change weakens the timing coverage and still leaves the tests sensitive to scheduler jitter.

@tmustier tmustier merged commit ef98021 into main Apr 8, 2026
3 checks passed
@tmustier tmustier deleted the fix/ci-min-retry-interval-flake branch April 8, 2026 12:25
