fix(core): avoid workspace-wide walk in expand_outputs #35464

Open
marco2216 wants to merge 2 commits into nrwl:master from marco2216:perf/expand-outputs-skip-walk


marco2216 commented Apr 27, 2026

Current Behavior

When a cached target's outputs array contains any entry with a glob (*, **), the native expand_outputs(workspaceRoot, outputs[]) (called from NxCache.put / copyFilesFromCache via _expandOutputs in packages/nx/src/tasks-runner/cache.ts) walks the entire workspace once per call - regardless of the glob's anchor depth or how few files it matches. Even a glob anchored to a leaf directory containing one file pays the full cost.

On a 50k-file workspace (warm OS cache, M-series Mac SSD) this is ~55 ms per call; on production workspaces with several hundred thousand tracked files / build artifacts it scales to multiple seconds per cached task. Concrete file paths and pure directory paths take a fast path inside the same Rust function (<1 ms) - the cost is specifically tied to the workspace-rooted walk triggered by glob entries.

The sibling get_files_for_outputs (in the same module, exposed via getFilesForOutputsBatch and already used by outputs-tracking.ts) already does this correctly: it partitions globs by their anchored root using partition_glob and walks only those subtrees. So the underlying primitive is already fast on the same inputs.
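To make the "anchored root" idea concrete, here is a minimal std-only sketch of splitting a glob into its literal directory prefix and the remaining pattern. The helper name `split_anchored_root` and its return type are hypothetical; this is not the actual `partition_glob` implementation or signature, only an illustration of why a walk can start below the workspace root.

```rust
use std::path::PathBuf;

/// Returns true if a path segment contains a common glob metacharacter.
fn has_glob_meta(segment: &str) -> bool {
    segment.chars().any(|c| matches!(c, '*' | '?' | '[' | '{'))
}

/// Hypothetical helper (NOT the real `partition_glob`): splits a glob into
/// the longest literal directory prefix and the remaining pattern.
fn split_anchored_root(glob: &str) -> (PathBuf, String) {
    let segments: Vec<&str> = glob.split('/').collect();
    // Index of the first segment that contains a glob metacharacter;
    // if there is none, the whole input is treated as a literal path.
    let first_meta = segments
        .iter()
        .position(|s| has_glob_meta(s))
        .unwrap_or(segments.len());
    let root: PathBuf = segments[..first_meta].iter().copied().collect();
    let pattern = segments[first_meta..].join("/");
    (root, pattern)
}
```

Under these assumptions, `split_anchored_root("dist/app/**/*.js")` yields the root `dist/app` and the pattern `**/*.js`, so only that subtree would need to be walked. A glob with no literal prefix, such as `**/*.js`, yields an empty root and still requires a walk from the top.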

Expected Behavior

expand_outputs should resolve glob entries without walking the entire workspace, matching the behavior of get_files_for_outputs. On the same inputs the two functions return the same flat list of resolved files; routing the glob branch through get_files_for_outputs makes expand_outputs ~27x faster on globs in the 50k-file reproducer (54.80 ms to 2.02 ms for a single * entry), and its cost no longer depends on workspace size, only on the size of the anchored subtree.

Patch

packages/nx/src/native/cache/expand_outputs.rs: in the glob branch of _expand_outputs, when there are no negated globs (the common case), delegate to get_files_for_outputs instead of building a workspace-rooted nx_walker_sync. Negated globs keep the existing workspace-rooted walk so that ignore semantics (!path/to/exclude) are preserved exactly. No JS or napi-rs surface change - all native callers and the JS expandOutputs binding benefit transparently.
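The branching described above can be sketched as follows. This is a simplified illustration, not the code in the patch: the `GlobStrategy` enum and `choose_strategy` function are hypothetical names, and the sketch assumes a negated glob is recognised by a leading `!`, as in the `!path/to/exclude` example.

```rust
/// Which resolution path a set of glob outputs would take (hypothetical type).
#[derive(Debug, PartialEq)]
enum GlobStrategy {
    /// Walk only the subtrees the globs are anchored to
    /// (the role played by `get_files_for_outputs` in the PR).
    AnchoredWalk,
    /// Walk from the workspace root so negations keep their existing semantics.
    WorkspaceWalk,
}

/// Picks the strategy: any negated glob forces the workspace-rooted walk.
/// Assumption: negation is signalled by a leading '!'.
fn choose_strategy(globs: &[&str]) -> GlobStrategy {
    if globs.iter().any(|g| g.starts_with('!')) {
        GlobStrategy::WorkspaceWalk
    } else {
        GlobStrategy::AnchoredWalk
    }
}
```

With this shape, `choose_strategy(&["dist/app/**/*.js"])` selects the anchored walk, while adding a single entry such as `!dist/app/cache` falls back to the workspace-rooted walk for the whole call.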

Reproducer & numbers

Standalone reproducer: https://github.com/marco2216/nx-expand-outputs-glob-perf-repro (run with pnpm install && bash bench.sh). Default seed is 50k synthetic files, generated locally on first run; scale up via SEED_FILES=200000 bash bench.sh to amplify the bug. nx is pinned to current latest.

$ bash bench.sh   # before
concrete (1 entry):    mean=0.00 ms
glob (1 entry, *):     mean=54.80 ms
glob (1 entry, **):    mean=55.44 ms
glob (5 entries, *):   mean=53.77 ms   # single walk per call, batched across entries
directory (1 entry):   mean=0.00 ms

$ bash bench.sh   # after this PR (verified against the equivalent JS-side patch
                  # routing _expandOutputs through getFilesForOutputsBatch)
concrete (1 entry):    mean=0.00 ms
glob (1 entry, *):     mean=2.02 ms
glob (1 entry, **):    mean=1.88 ms
glob (5 entries, *):   mean=1.82 ms
directory (1 entry):   mean=1.76 ms

In a large monorepo we maintain, applying the equivalent change as a workspace-side workaround (removing globs from outputs) drops our prebuild from 22 s to 9 s serial (~2.4x) and from 15 s to 3.7 s parallel (~4x). The reproducer is a synthetic minimum; the real-world win on production codebases is substantially larger.

Correctness verification

Ran a cache miss -> cache hit cycle for targets with both concrete and glob outputs configurations in the reproducer; both restore output files byte-correct on cache hit. The existing Rust test suite for this module (should_expand_outputs, should_handle_multiple_extensions, should_handle_multiple_outputs_with_negation, should_expand_outputs_with_symlinks_and_globs, should_handle_cache_put_and_restore_with_symlinks) was reviewed against the new code path:

  • Non-glob entries: unchanged fast path.
  • Glob entries without negation: now flow through get_files_for_outputs, which produces the same result set on these tests.
  • Glob entries with negation: unchanged; still walks workspace root with nx_walker_sync so !apps/web/.next/cache semantics are preserved exactly.

New test

Added should_not_walk_outside_glob_anchored_root in expand_outputs.rs that:

  • builds a temp workspace where the glob's anchored root contains 2 matching files,
  • adds decoy files outside the anchored root (src/, dist/other-app/, noise/file-{0..49}.txt) that the previous workspace-wide walker would visit,
  • calls expand_outputs("dist/app/**/*.js") and asserts the result matches get_files_for_outputs on the same inputs and contains exactly the two anchored-root files.

This catches both functional drift between the two functions and any future regression that reintroduces a workspace-rooted walk in the glob fast path.
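The fixture idea behind the test can be illustrated with a std-only sketch. It does not call `expand_outputs` or `get_files_for_outputs`; instead a plain recursive walker stands in for both, to show how many files a workspace-rooted walk visits compared with a walk that starts at the anchored root. The file names `main.js` and `nested/chunk.js` are made up for illustration, and only some of the decoys listed above are recreated.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every file under `dir` (a stand-in for a real walker).
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_files(&path, out)?;
        } else {
            out.push(path);
        }
    }
    Ok(())
}

/// Builds a throwaway workspace: two files under dist/app plus decoys
/// outside it (dist/other-app and 50 files under noise/).
fn build_fixture(workspace: &Path) -> io::Result<()> {
    fs::create_dir_all(workspace.join("dist/app/nested"))?;
    fs::write(workspace.join("dist/app/main.js"), "")?;
    fs::write(workspace.join("dist/app/nested/chunk.js"), "")?;
    fs::create_dir_all(workspace.join("dist/other-app"))?;
    fs::write(workspace.join("dist/other-app/main.js"), "")?;
    fs::create_dir_all(workspace.join("noise"))?;
    for i in 0..50 {
        fs::write(workspace.join(format!("noise/file-{i}.txt")), "")?;
    }
    Ok(())
}
```

Calling `build_fixture` on an empty temporary directory and then `collect_files` on `dist/app` visits only the two anchored files, whereas `collect_files` on the workspace root visits all 53. That gap is what a regression to a workspace-rooted walk would reintroduce.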

Related Issue(s)

Fixes #

When a cached target's `outputs` array contains any glob entry
(`*`, `**`), the native `expand_outputs(workspaceRoot, outputs[])`
called from `NxCache.put` / `copyFilesFromCache` walked the entire
workspace once per call - regardless of how deeply the glob is
anchored or how few files it matches. On a 50k-file workspace this
is ~55 ms per call; on production workspaces it scales to several
seconds per cached task.

The sibling `get_files_for_outputs` already partitions globs by their
anchored root (via `partition_glob`) and walks only those subtrees, so
the underlying primitive is fast. Delegate `expand_outputs`'s glob
branch to `get_files_for_outputs` whenever no negated globs are
present, preserving negation semantics by keeping the existing
workspace-rooted walk in that path. On the same inputs this is ~27x
faster for a single-glob entry and constant-time independent of
workspace size.

Add a regression test that exercises a glob anchored to a deep
subdirectory with decoy files outside that subtree, asserting that
`expand_outputs` and `get_files_for_outputs` produce the same result
and that nothing outside the anchored root leaks in.

netlify Bot commented Apr 27, 2026

👷 Deploy request for nx-docs pending review.

Visit the deploys page to approve it

Latest commit: 46e01da


netlify Bot commented Apr 27, 2026

👷 Deploy request for nx-dev pending review.

Visit the deploys page to approve it

Latest commit: 46e01da

marco2216 marked this pull request as ready for review April 27, 2026 13:28
marco2216 requested a review from a team as a code owner April 27, 2026 13:28
marco2216 requested a review from JamesHenry April 27, 2026 13:28
marco2216 (Author) commented

Hey @FrozenPandaz, would you mind having a look at this and giving me some feedback on whether this issue and the solution approach make sense? Of course, it's not optimal that the issue remains when negated globs are present, so maybe there's a better way. Thank you!
