
CI: emit synthetic JUnit XML when rake task fails before tests run#5502

Draft
p-datadog wants to merge 1 commit into master from ci-synthetic-junit-on-rake-failure

Conversation

@p-datadog (Member) commented Mar 25, 2026

What does this PR do?

Wraps the sh call in run_batch_tests with a rescue block that writes a synthetic
JUnit XML when a rake task fails before RSpec starts. The existing artifact upload step
picks it up — no changes needed to the upload or the dd/junit merge job.

The original exception is re-raised, so the job still fails as before. This only adds
visibility.
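A minimal sketch of the rescue wrapper described above. The method and file names here are illustrative, not the exact ones in the PR; `sh` is assumed to be Rake's shell helper, available inside rake tasks.

```ruby
require 'cgi'
require 'fileutils'
require 'time'

# Hypothetical wrapper; the real method in the PR is run_batch_tests.
# `sh` is assumed to be Rake's FileUtils#sh.
def run_rake_task_with_fallback(task_name)
  sh("bundle exec rake #{task_name}")
rescue StandardError => e
  write_synthetic_junit(task_name, e)
  raise # re-raise so the CI job still fails as before
end

# Write a one-testcase JUnit file carrying the rake error message,
# using the same directory and naming convention as real RSpec output
# so the existing artifact upload picks it up unchanged.
def write_synthetic_junit(task_name, error)
  FileUtils.mkdir_p('tmp/rspec')
  safe_name = task_name.gsub(/[^\w]+/, '-')
  message = CGI.escapeHTML(error.message)
  File.write("tmp/rspec/#{safe_name}.xml", <<~XML)
    <?xml version="1.0" encoding="UTF-8"?>
    <testsuite name="#{task_name}" tests="1" failures="1" errors="0" timestamp="#{Time.now.utc.iso8601}">
      <testcase name="rake #{task_name}" classname="rake">
        <failure message="rake task failed before RSpec ran">#{message}</failure>
      </testcase>
    </testsuite>
  XML
end
```

Escaping the error message (here via `CGI.escapeHTML`) matters because rake errors routinely contain backticks and angle brackets that would otherwise break the XML.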

Motivation:

I noticed that when a rake task in a CI batch fails before RSpec runs (wrong task name,
LoadError, syntax error), the JUnit artifacts from that batch only contain results from
the tasks that succeeded. The failing task produces no XML at all.

What caught my eye is that this makes the failure invisible to anything that relies on
JUnit — test failure analysis, Datadog CI Visibility — since the artifacts exist but
show 0 failures. The only way to find the actual error is log parsing.

Ran into this on PR #5111 where 8 jobs failed with "Don't know how to build task
`spec:di_with_ext`" but JUnit showed all green. The synthetic XML would have surfaced
the rake error directly in the structured data.

Change log entry

None

Additional Notes:

Not yet verified locally — `bundle exec rake standard` requires full gem setup.
CI will validate.

Worth noting: the synthetic XML uses the same directory (`tmp/rspec/`) and naming
convention as real JUnit output, so it flows through the existing pipeline without
any special handling.

How to test the change?

  • Verify synthetic XML is well-formed
  • Verify existing tests still pass (no behavioral change on success path)
  • To test the failure path: temporarily change a Matrixfile entry to a nonexistent
    task, push, and confirm the synthetic XML appears in artifacts
  • Confirm `datadog-ci junit upload` accepts the synthetic XML format
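The first bullet can be checked locally with a quick REXML pass over the artifact directory (the `tmp/rspec/` path is taken from the note above; this is a sanity check, not part of the PR):

```ruby
require 'rexml/document'

# Parse every JUnit file under tmp/rspec/. REXML raises
# REXML::ParseException on malformed XML, so a clean run means
# every file is well-formed.
Dir.glob('tmp/rspec/*.xml').each do |path|
  REXML::Document.new(File.read(path))
  puts "OK: #{path}"
end
```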

When a rake task fails before RSpec starts (wrong task name, missing
gem, syntax error), no JUnit XML is produced for that task. CI
artifacts then contain only results from other tasks in the batch,
making the failure invisible through the normal JUnit pipeline.

Write a synthetic JUnit XML with the rake error message so the failure
shows up in JUnit artifact analysis without log parsing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 25, 2026

Thank you for updating Change log entry section 👏

Visited at: 2026-03-25 04:45:24 UTC

@p-datadog added the AI Generated label (Largely based on code generated by an AI or LLM; this label is the same across all dd-trace-* repos) on Mar 25, 2026

datadog-official bot commented Mar 25, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 95.16% (-0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 44c14b8 | Docs | Datadog PR Page


pr-commenter bot commented Mar 25, 2026

Benchmarks

Benchmark execution time: 2026-03-25 05:20:43

Comparing candidate commit 44c14b8 in PR branch ci-synthetic-junit-on-rake-failure with baseline commit d170f39 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 46 metrics, 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
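The threshold rule above can be restated in a few lines of Ruby. This is a sketch of the rule as described, not the benchmarking platform's actual implementation, and the 1% threshold used in the examples is assumed from the second diagram:

```ruby
# A change is significant only when the whole confidence interval
# lies outside the impact threshold, on either side of zero.
def significant?(ci_lower, ci_upper, threshold)
  ci_lower > threshold || ci_upper < -threshold
end

# The two diagrams above, assuming a 1% threshold:
significant?(-0.006, 0.012, 0.01) # first diagram: CI straddles 0, not significant
significant?(0.013, 0.031, 0.01)  # second diagram: CI entirely above +1%, significant
```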


Labels

AI Generated: Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos.
