fix(test): surface error lines for generic runners by axelray-dev · Pull Request #2427 · rtk-ai/rtk

axelray-dev · 2026-06-13T10:53:31Z

What

For unrecognized test runners (anything besides cargo/pytest/jest/go), extract_test_summary() previously showed only the last 5 lines of output. If failures appeared earlier in the output, they were hidden.

Now it scans all lines against the existing ERROR_PATTERNS regexes (errors, failures, panics, tracebacks, etc.) and surfaces matching lines under a [FAIL] ERRORS: header. When no error lines are found, the original "last 5 lines" fallback is preserved.

Follow-up: error-pattern scanning is now gated on non-zero exit code. ERROR_PATTERNS are substring-based and can match benign summary lines (e.g. "Summary: 0 failed, 10 passed") in passing runners. Passing runners (exit 0) always get the tail fallback regardless of keyword matches.

Changes

src/cmds/rust/runner.rs: In extract_test_summary(), the generic runner fallback now applies ERROR_PATTERNS matching before falling back to tail output, but only when the exit code is non-zero. Switched run_test from run_filtered to run_filtered_with_exit to receive the exit code. Added 3 tests covering error surfacing, no-error fallback, and the false-positive case.

Tests

cargo fmt --all && cargo clippy --all-targets && cargo test --all

All 2153 tests pass. The three new tests verify:

Generic runner output with error lines at the top surfaces those errors instead of the last 5 lines
Generic runner output with no errors preserves the existing last-5-lines behavior
Passing runner (exit 0) with "failed" keyword in output shows last 5 lines, not [FAIL] ERRORS:

Notes

Only the generic runner fallback path is changed. Cargo, pytest, jest, and go runners are unaffected.
Uses the existing ERROR_PATTERNS lazy_static, no new patterns added.
DCO signed.

Signed-off-by: axelray-dev <110029405+axelray-dev@users.noreply.github.com>

hgunduzoglu

Checked this out and tested locally (macOS arm64). The core behavior is right and the truncation hint (... +N more) is a nice touch:

generic runner output	before	this PR
error at top, then 6 passing lines	last 5 lines (error hidden)	`[FAIL] ERRORS:` surfaces the error ✓
no error lines	last 5 lines	last 5 lines ✓ (fallback preserved)

cargo fmt/clippy/test clean, 16 runner tests pass. Scoping it to is_generic (leaving cargo/pytest/jest/go untouched) is the right call.

One real edge case worth handling: false positives on passing runs. ERROR_PATTERNS is substring-based and broad — r"(?i)^.*failed.*$", .*failure.*, .*warning.*, .*exception.* match those words anywhere on a line, including benign summary lines. So a runner that passes (exit 0) but prints a summary surfaces a misleading FAIL header:

$ rtk test ./runner.sh     # script prints "Summary: 0 failed, 10 passed", exit 0
[FAIL] ERRORS:
  Summary: 0 failed, 10 passed

An agent reading [FAIL] ERRORS: … 0 failed, 10 passed would conclude the suite failed when it passed — the opposite of the accurate-signal goal. Before this PR that same run showed a neutral OUTPUT (last 5 lines):, so it's a behavior change specifically in the passing case.

Since #2420 is really about surfacing failures when the runner fails, gating the error-surfacing on a non-zero exit code would fix both the original issue and this false positive. extract_test_summary doesn't currently receive the exit code, but runner.rs already has a RunMode::FilteredWithExit variant, so it can be threaded through without much churn — then: non-zero exit → [FAIL] ERRORS: (with the pattern scan), exit 0 → the existing tail fallback.

Core fix is solid; this is the one thing I'd tighten before merge. Could also add a test for the exit-0-with-"failed"-in-summary case to lock it down.

ERROR_PATTERNS are substring-based and can match benign summary lines (e.g. 'Summary: 0 failed, 10 passed') in passing runners. Only scan for error patterns when the exit code is non-zero to avoid false-positive [FAIL] ERRORS: headers on successful runs.

axelray-dev · 2026-06-18T06:52:42Z

Good catch, thanks. Pushed a fix that gates the error-pattern scan on non-zero exit code. Now uses run_filtered_with_exit so the exit code is threaded through to extract_test_summary -- passing runners (exit 0) always get the tail fallback regardless of keyword matches in the output.

Added a test for the exact false-positive case you described (exit 0 with "0 failed, 10 passed" in output). All 2153 tests pass.

luccinmasirika · 2026-06-18T18:29:01Z

Thanks @axelray-dev for picking this one up, and good call gating the scan on the exit code.

I pulled the branch and ran the exact repro from the issue (script prints FAIL: test_login broke, then 50 PASS: lines, exits 1). It still comes back as:

OUTPUT (last 5 lines):
  PASS: case_46
  ...
  PASS: case_50

So the FAIL: line is still hidden. The scan itself works, but ERROR_PATTERNS only has failed and failure, no bare fail/FAIL, so a line like FAIL: test_login broke never matches and falls through to the tail. Easy to confirm: changing the repro to print FAILED: surfaces it under [FAIL] ERRORS:, but FAIL: doesn't.

Adding one pattern covers it:

Regex::new(r"(?i)^.*\bfail\b.*$").unwrap(),

With that, the repro surfaces FAIL: test_login broke on exit 1, and since the scan is still gated on a non-zero exit, the passing 0 failed, 10 passed case stays on the tail fallback. Runner tests stay green. Might be worth a test using the literal repro line so this stays locked down.

luccinmasirika · 2026-06-18T18:40:35Z

        Regex::new(r"(?i)^.*\berr\b.*$").unwrap(),
        Regex::new(r"(?i)^.*warning[\s:\[].*$").unwrap(),
        Regex::new(r"(?i)^.*\bwarn\b.*$").unwrap(),
        Regex::new(r"(?i)^.*failed.*$").unwrap(),


Concretely, this is the one-line addition that covers it. Putting it just above
the failed pattern keeps that one intact:

Suggested change

Regex::new(r"(?i)^.*\bfail\b.*$").unwrap(),

Regex::new(r"(?i)^.*failed.*$").unwrap(),

Add Regex pattern for 'fail' as a whole word before the existing 'failed' pattern to catch more error cases while keeping the existing pattern intact.

axelray-dev · 2026-06-29T12:23:29Z

Pushed the word-boundary pattern as suggested. Added a regression test using the exact repro line ("FAIL: something went wrong" with exit 1) to lock it down. All runner tests pass.

fix: surface generic runner errors

2bb1a86

Signed-off-by: axelray-dev <110029405+axelray-dev@users.noreply.github.com>

axelray-dev marked this pull request as ready for review June 13, 2026 14:24

hgunduzoglu reviewed Jun 17, 2026

View reviewed changes

luccinmasirika reviewed Jun 18, 2026

View reviewed changes

Apply code review suggestion: add word-boundary fail pattern

f88618a

Add Regex pattern for 'fail' as a whole word before the existing 'failed' pattern to catch more error cases while keeping the existing pattern intact.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(test): surface error lines for generic runners#2427

fix(test): surface error lines for generic runners#2427
axelray-dev wants to merge 3 commits into
rtk-ai:developfrom
axelray-dev:fix/2420-test-output-truncation

axelray-dev commented Jun 13, 2026 •

edited

Loading

Uh oh!

hgunduzoglu left a comment

Uh oh!

axelray-dev commented Jun 18, 2026

Uh oh!

luccinmasirika commented Jun 18, 2026 •

edited

Loading

Uh oh!

luccinmasirika Jun 18, 2026

Uh oh!

axelray-dev commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


	Regex::new(r"(?i)^.\bfail\b.$").unwrap(),
	Regex::new(r"(?i)^.failed.$").unwrap(),

Uh oh!

Conversation

axelray-dev commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Changes

Tests

Notes

Uh oh!

hgunduzoglu left a comment

Choose a reason for hiding this comment

Uh oh!

axelray-dev commented Jun 18, 2026

Uh oh!

luccinmasirika commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

luccinmasirika Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

axelray-dev commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

axelray-dev commented Jun 13, 2026 •

edited

Loading

luccinmasirika commented Jun 18, 2026 •

edited

Loading