feat(tree-shaking): recognize pure global constructors and calls#13701
feat(tree-shaking): recognize pure global constructors and calls#13701
Conversation
Teach the side-effect detector about built-in pure globals so that unused `new Set()`, `new Map()`, `new Uint8Array()`, `Object.keys(x)`, `Array.isArray(x)`, `String(x)`, etc. can be tree-shaken. SWC's `may_have_side_effects` only recognized `Date` and empty fn/class callees; everything else was kept unnecessarily. The callee must resolve to an unresolved global (ctxt check) so shadowed bindings like `const Set = ...; new Set()` are preserved. Arguments are still recursively checked for purity via the existing `are_pure_args` / `is_pure_call_args` helpers.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0ce1c562a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| | "freeze" | ||
| | "fromEntries" | ||
| | "is" | ||
| | "assign" |
There was a problem hiding this comment.
Remove Object.assign/freeze from pure member-call allowlist
Object.assign and Object.freeze mutate the object passed in, so they are observable even when the return value is unused. Because is_pure_member_call now classifies both as pure and is_pure_call_args only checks argument expression purity, tree-shaking can incorrectly drop calls like Object.assign(config, defaults) or Object.freeze(exportsObj), changing later reads/exports of that object.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR expands Rspack’s side-effect analysis to treat certain built-in global constructors and global/member calls as pure (when their arguments are considered pure), enabling more aggressive tree-shaking of unused expressions like new Map() and Object.keys(x). It also adds a new tree-shaking fixture to validate the intended behavior.
Changes:
- Add a “pure globals” fast-path for call expressions (direct globals and selected
Object.*/Array.*member calls). - Add a “pure globals” fast-path for
newexpressions with selected built-in constructors. - Introduce a new tree-shaking test case (
global-pure-new) to validate unused pure constructions/calls are eliminated while keeping shadowed/impure cases.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs |
Adds allowlists and fast-path checks for treating certain global calls/constructors as pure. |
tests/rspack-test/treeShakingCases/global-pure-new/app.js |
Adds a fixture module meant to test tree-shaking of pure globals + guards. |
tests/rspack-test/treeShakingCases/global-pure-new/index.js |
Imports/uses an exported marker to ensure the module is included in the bundle. |
tests/rspack-test/treeShakingCases/global-pure-new/rspack.config.js |
Enables optimization.sideEffects for the new tree-shaking fixture. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const ShadowedSet = sideEffect(); | ||
| let shadowed = new ShadowedSet(); |
There was a problem hiding this comment.
The “Shadowed — must NOT be treated as pure” test doesn’t actually shadow a built-in global: ShadowedSet is a new identifier and will never match the Set/Map allowlist, so this doesn’t exercise the ctxt == unresolved_ctxt shadowing guard. Consider shadowing the real name instead (e.g. const Set = sideEffect(); new Set()) so this case fails if the unresolved-global check is broken.
| const ShadowedSet = sideEffect(); | |
| let shadowed = new ShadowedSet(); | |
| const Set = sideEffect(); | |
| let shadowed = new Set(); |
Rsdoctor Bundle Diff AnalysisFound 6 projects in monorepo, 6 projects with changes. 📊 Quick Summary
📋 Detailed Reports (Click to expand)📁 popular-libsPath:
📦 Download Diff Report: popular-libs Bundle Diff 📁 react-10kPath:
📦 Download Diff Report: react-10k Bundle Diff 📁 react-1kPath:
📦 Download Diff Report: react-1k Bundle Diff 📁 romePath:
📦 Download Diff Report: rome Bundle Diff 📁 react-5kPath:
📦 Download Diff Report: react-5k Bundle Diff 📁 ui-componentsPath:
📦 Download Diff Report: ui-components Bundle Diff Generated by Rsdoctor GitHub Action |
📦 Binary Size-limit
❌ Size increased by 14.00MB from 48.72MB to 62.72MB (⬆️28.74%) |
Merging this PR will degrade performance by 1.99%
Performance Changes
Comparing Footnotes |
…calls
Review feedback: the previous pass was too aggressive — it classified
calls like `Object.assign(a, b)`, `Object.freeze(x)`, `Array.from(iter)`,
and `String(obj)` as pure, which can drop observable effects (mutation,
iterator protocol, `valueOf`/`toString` coercion).
Tighten the fast path to two forms:
1. Args must be *trivially safe* — literals (number/string/bool/null/regex),
empty template literals, unary of safe, or unresolved `undefined`/
`NaN`/`Infinity`. No objects, no arrays, no identifiers.
2. `Boolean(x)`, `Array.isArray(x)`, `Object.is(a, b)`,
`Number.isInteger`/`isFinite`/`isNaN`/`isSafeInteger`, and
`Array.of(...)` are pure regardless of arg values — they don't coerce,
iterate, or read properties.
Remove from allowlist: `Object.{keys,values,entries,assign,freeze,
create,fromEntries,getOwnPropertyNames,getOwnPropertySymbols,
getOwnPropertyDescriptor,getOwnPropertyDescriptors,getPrototypeOf}`,
`Array.from`.
Also fix the shadow test case to actually shadow the real `Set`/`Map`
globals (previously used `ShadowedSet` which never matched the
allowlist). Split shadow test into its own module so the `const Set`
declaration doesn't pull the non-shadowed uses into TDZ.
Regenerate `runtime-specific-used-exports` stats snapshot — `module.js`
containing `export const identity = Symbol();` is now correctly
recognized as side-effect-free, enabling module concatenation.
Move all known-pure-global logic out of side_effects_parser_plugin.rs
(−144 lines) into a new parser_plugin/pure_globals.rs module (+210
lines, including 16 unit tests).
Key design changes:
* Single entry point: `classify_pure_global(callee, ctxt, position)`
returns `Option<ArgGate>` — the caller just matches on the gate.
* `ArgGate` enum: `AnyPureArgs` (callee never coerces, e.g. Boolean)
vs `TriviallySafeArgs` (callee may coerce, e.g. new Set).
* `CalleePosition` enum: `New` vs `Call` — so the same name can have
different purity in each position (e.g. `Symbol()` pure as call,
but `new Symbol()` throws → not in the list).
* All classification tables, `is_trivially_safe_expr`, and
`are_args_trivially_safe` live in `pure_globals.rs`.
* 16 table-driven unit tests covering ident/member/negative cases.
* Call sites in `is_pure_new_expr` / `is_pure_call_expr` reduced to
a single `match gate { … }` each.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3e79f5237
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub fn is_trivially_safe_expr(expr: &Expr, unresolved_ctxt: SyntaxContext) -> bool { | ||
| match expr { | ||
| // Number, string, boolean, null, regex, bigint literals. | ||
| Expr::Lit(_) => true, |
There was a problem hiding this comment.
Preserve throwing built-ins in trivial-arg purity checks
Treating every literal as trivially safe here makes the new fast path unsound for several globals classified as TriviallySafeArgs (e.g. Set, Map, Array, ArrayBuffer): literals can still trigger synchronous throws, such as new Set(1) (TypeError) or new Array(-1) (RangeError). Because is_pure_new_expr/is_pure_call_expr now return pure when this check passes, unused expressions that should throw can be tree-shaken away, changing runtime behavior during module evaluation.
Useful? React with 👍 / 👎.
Reviewer caught a real soundness bug: literal args can still trigger
synchronous throws in `Set`/`Map`/`Array`/TypedArray constructors and
`Number`/`Date`. With the previous `TriviallySafeArgs` gate, calls like
`new Set(1)` (TypeError), `new Array(-1)` (RangeError), and
`new Uint8Array(1.5)` (RangeError) were incorrectly classified as pure
and could be tree-shaken away — silently dropping the throw.
Split `ArgGate` into four variants by argument-shape requirement:
* `AnyPureArgs` — callee neither coerces nor throws on values
(e.g. `Boolean(x)`, `Array.isArray(x)`, `Object.is(a, b)`).
* `TriviallySafeArgs` — callee handles every literal safely
(e.g. `String("x")`, `Object("y")`, `Symbol("desc")`).
* `NullishOrNoArgs` — callee throws on non-iterable / non-numeric
values, so only `null`/`undefined`/no-args are safe (e.g.
`new Set()`, `new Map()`, `new Date()`, `new Number()`,
`new Array()`).
* `NullishOrNonNegativeIntLiteral` — `RangeError` on negative,
fractional, or out-of-range lengths; only nullish or `[0, 2^32)`
integer literals pass (e.g. `new Uint8Array(16)`,
`new ArrayBuffer(0)`).
Reclassification:
* `Set`/`Map`/`WeakSet`/`WeakMap`: `NullishOrNoArgs`.
* `Array` (both new and call): `NullishOrNoArgs`.
* `Date`/`Number` (both): `NullishOrNoArgs` (BigInt → ToNumber throws).
* TypedArrays / `ArrayBuffer` / `SharedArrayBuffer`:
`NullishOrNonNegativeIntLiteral`.
* `String`/`Object`: stays `TriviallySafeArgs` (ToString/ToObject
accept any literal incl BigInt).
* `Boolean`: promoted to `AnyPureArgs` in both positions (ToBoolean
never throws and never invokes user code).
* `Symbol` (call only): `TriviallySafeArgs`.
New helpers in `pure_globals.rs`:
* `is_nullish_expr`: null literal / `undefined` ident / `void <safe>`.
* `is_non_negative_int_literal`: finite, integer, `[0, 2^32)`,
handles `+N` UnaryExpr; rejects BigInt, negative, fractional,
NaN/Infinity.
* `check_arg_gate`: dispatches arg-shape check based on gate.
Snapshot test extended with 9 new "MUST be kept" cases covering each
throw category. All cases verified in the regenerated
`treeshaking.snap.txt` — pure forms are wrapped in `null && (…)`,
throwing forms are preserved verbatim.
Unit tests: 25 total (was 16), covering each gate, both positions for
each global, and direct arg-shape checks against synthetic Lit/Num
expressions.
|
Good catch — that was a real soundness bug. Pushed a85c293. The previous
Fix: split
Verification. Snapshot test extended with 9 new "MUST be kept" cases covering each throw category. Regenerated snapshot confirms:
Unit tests bumped from 16 → 25, including direct arg-shape checks against synthetic |
|
The 14MB size delta is not from this PR — it's a CI infrastructure mismatch introduced by #13726 (`d79172e7`). What's happening:
Evidence this isn't the PR's code:
Fix options (need maintainer action):
Happy to rebase or trigger a re-run once that's resolved. |
Summary
new Set(),new Map(),new Uint8Array(),Object.keys(x),Array.isArray(x),String(x), etc. are now tree-shaken.may_have_side_effectsonly recognizedDateas a pure callee and only empty-fn / pure-class expressions as purenewcallees — soexport const cache = new Map()was always retained even whencachewas unused. Rolldown handles this viaoxc_ecmascript::side_effects::MayHaveSideEffects, which maintains a pure-globals list; this PR mirrors that approach in Rspack.ctxt == unresolved_ctxt), soconst Set = sideEffect(); new Set()is still kept. Arguments are recursively checked via the existingare_pure_args/is_pure_call_argshelpers, sonew Set([impureArg()])is still kept.Lists
newpure constructors:Set,Map,WeakSet,WeakMap,Object,Array,String,Number,Boolean,Date,ArrayBuffer,SharedArrayBuffer, and all 11 TypedArrays (Uint8Array,Int8Array,Uint8ClampedArray,Uint16Array,Int16Array,Uint32Array,Int32Array,Float32Array,Float64Array,BigInt64Array,BigUint64Array).Array,Object,String,Number,Boolean,Symbol,Date.Object.{keys,values,entries,getOwnPropertyNames,getOwnPropertySymbols,getOwnPropertyDescriptor,getOwnPropertyDescriptors,getPrototypeOf,create,freeze,fromEntries,is,assign},Array.{isArray,from,of}.BigInt,RegExp,Error/TypeError/…,Promise,Reflect.*,JSON.*. These can be added later with argument validation.Test plan
cargo build -p rspack_plugin_javascriptcargo test -p rspack_plugin_javascript --lib side_effectstests/rspack-test/treeShakingCases/global-pure-new/(snapshot will be generated by CI — includes happy path, shadowed-binding guard, and impure-arg guard).tests/rspack-test/treeShakingCases/pure_comments_new_expr/still passes (unchanged behavior: innernew Set()inside a user function is still tied to the function's lifetime).