feat(tree-shaking): recognize pure global constructors and calls #13701

Open
JSerFeng wants to merge 4 commits into main from fy/vigorous-wilbur

@JSerFeng
Contributor

Summary

  • Teach Rspack's side-effect detector about built-in pure globals so unused new Set(), new Map(), new Uint8Array(), Object.keys(x), Array.isArray(x), String(x), etc. are now tree-shaken.
  • Previously, SWC's may_have_side_effects only recognized Date as a pure callee and only empty-fn / pure-class expressions as pure new callees — so export const cache = new Map() was always retained even when cache was unused. Rolldown handles this via oxc_ecmascript::side_effects::MayHaveSideEffects, which maintains a pure-globals list; this PR mirrors that approach in Rspack.
  • Shadowing-safe: the callee must resolve to an unresolved global (ctxt == unresolved_ctxt), so const Set = sideEffect(); new Set() is still kept. Arguments are recursively checked via the existing are_pure_args / is_pure_call_args helpers, so new Set([impureArg()]) is still kept.
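A minimal sketch of why the unresolved-global check matters. The names `shadowedSetDemo` and `log` are illustrative and not taken from the PR's fixture; the point is only that a local binding called `Set` can run arbitrary user code when constructed.

```javascript
// Hypothetical illustration; not taken from the PR's fixture.
function shadowedSetDemo() {
  const log = [];
  // Local binding named `Set`: it shadows the built-in inside this function.
  const Set = class {
    constructor() {
      log.push("user constructor ran");
    }
  };
  new Set(); // result unused, yet user code executed
  return log;
}
```

If `new Set()` were dropped here because of its name alone, the log entry would disappear, which is the behaviour the `ctxt == unresolved_ctxt` guard is meant to prevent.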

Lists

  • Pure constructors (called with new): Set, Map, WeakSet, WeakMap, Object, Array, String, Number, Boolean, Date, ArrayBuffer, SharedArrayBuffer, and all 11 TypedArrays (Uint8Array, Int8Array, Uint8ClampedArray, Uint16Array, Int16Array, Uint32Array, Int32Array, Float32Array, Float64Array, BigInt64Array, BigUint64Array).
  • Direct-call pure globals: Array, Object, String, Number, Boolean, Symbol, Date.
  • Member-call pure globals: Object.{keys,values,entries,getOwnPropertyNames,getOwnPropertySymbols,getOwnPropertyDescriptor,getOwnPropertyDescriptors,getPrototypeOf,create,freeze,fromEntries,is,assign}, Array.{isArray,from,of}.
  • Conservatively omitted (conditional on args): BigInt, RegExp, Error/TypeError/…, Promise, Reflect.*, JSON.*. These can be added later with argument validation.

Test plan

  • cargo build -p rspack_plugin_javascript
  • cargo test -p rspack_plugin_javascript --lib side_effects
  • CI: new treeshaking case at tests/rspack-test/treeShakingCases/global-pure-new/ (snapshot will be generated by CI — includes happy path, shadowed-binding guard, and impure-arg guard).
  • CI: verify existing tests/rspack-test/treeShakingCases/pure_comments_new_expr/ still passes (unchanged behavior: inner new Set() inside a user function is still tied to the function's lifetime).

Teach the side-effect detector about built-in pure globals so that
unused `new Set()`, `new Map()`, `new Uint8Array()`, `Object.keys(x)`,
`Array.isArray(x)`, `String(x)`, etc. can be tree-shaken. SWC's
`may_have_side_effects` only recognized `Date` and empty fn/class
callees; everything else was kept unnecessarily.

The callee must resolve to an unresolved global (ctxt check) so
shadowed bindings like `const Set = ...; new Set()` are preserved.
Arguments are still recursively checked for purity via the existing
`are_pure_args` / `is_pure_call_args` helpers.
Copilot AI review requested due to automatic review settings April 14, 2026 07:36
@github-actions github-actions Bot added the team (The issue/pr is created by the member of Rspack.) and release: feature (release: feature related release(mr only)) labels Apr 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0ce1c562a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1165 to +1168
| "freeze"
| "fromEntries"
| "is"
| "assign"
P1: Remove Object.assign/freeze from pure member-call allowlist

Object.assign and Object.freeze mutate the object passed in, so they are observable even when the return value is unused. Because is_pure_member_call now classifies both as pure and is_pure_call_args only checks argument expression purity, tree-shaking can incorrectly drop calls like Object.assign(config, defaults) or Object.freeze(exportsObj), changing later reads/exports of that object.
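A small sketch of the concern, assuming a hypothetical `config` object: both calls discard their return value, yet later reads of `config` see their effects.

```javascript
// Return values are discarded on purpose; the effects land on `config`.
const config = { retries: 1 };
const defaults = { retries: 3, verbose: true };

Object.assign(config, defaults); // copies own enumerable props onto `config`
Object.freeze(config);           // makes `config` non-extensible and read-only

// What a later reader of `config` observes.
const observed = {
  retries: config.retries,         // overwritten by Object.assign
  verbose: config.verbose,         // added by Object.assign
  frozen: Object.isFrozen(config), // changed by Object.freeze
};
```

Dropping either call as "unused" would change `observed`, so neither can be classified as pure based on argument purity alone.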


Comment thread crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

This PR expands Rspack’s side-effect analysis to treat certain built-in global constructors and global/member calls as pure (when their arguments are considered pure), enabling more aggressive tree-shaking of unused expressions like new Map() and Object.keys(x). It also adds a new tree-shaking fixture to validate the intended behavior.

Changes:

  • Add a “pure globals” fast-path for call expressions (direct globals and selected Object.* / Array.* member calls).
  • Add a “pure globals” fast-path for new expressions with selected built-in constructors.
  • Introduce a new tree-shaking test case (global-pure-new) to validate unused pure constructions/calls are eliminated while keeping shadowed/impure cases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs | Adds allowlists and fast-path checks for treating certain global calls/constructors as pure. |
| tests/rspack-test/treeShakingCases/global-pure-new/app.js | Adds a fixture module meant to test tree-shaking of pure globals + guards. |
| tests/rspack-test/treeShakingCases/global-pure-new/index.js | Imports/uses an exported marker to ensure the module is included in the bundle. |
| tests/rspack-test/treeShakingCases/global-pure-new/rspack.config.js | Enables optimization.sideEffects for the new tree-shaking fixture. |

Comment thread crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs Outdated
Comment on lines +15 to +16
const ShadowedSet = sideEffect();
let shadowed = new ShadowedSet();
Copilot AI Apr 14, 2026

The “Shadowed — must NOT be treated as pure” test doesn’t actually shadow a built-in global: ShadowedSet is a new identifier and will never match the Set/Map allowlist, so this doesn’t exercise the ctxt == unresolved_ctxt shadowing guard. Consider shadowing the real name instead (e.g. const Set = sideEffect(); new Set()) so this case fails if the unresolved-global check is broken.

Suggested change
- const ShadowedSet = sideEffect();
- let shadowed = new ShadowedSet();
+ const Set = sideEffect();
+ let shadowed = new Set();

Comment thread crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs Outdated
Comment thread crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs Outdated
Comment thread crates/rspack_plugin_javascript/src/parser_plugin/side_effects_parser_plugin.rs Outdated
@github-actions
Contributor

github-actions Bot commented Apr 14, 2026

Rsdoctor Bundle Diff Analysis

Found 6 projects in monorepo, 6 projects with changes.

📊 Quick Summary
| Project | Total Size | Change |
| --- | --- | --- |
| popular-libs | 1.7 MB | -152.0 B (-0.0%) |
| react-10k | 5.7 MB | -17.0 B (-0.0%) |
| react-1k | 826.0 KB | -59.0 B (-0.0%) |
| rome | 984.1 KB | -9.0 B (-0.0%) |
| react-5k | 2.7 MB | -17.0 B (-0.0%) |
| ui-components | 5.0 MB | -204.0 B (-0.0%) |

📋 Detailed Reports

📁 popular-libs

Path: ../build-tools-performance/cases/popular-libs/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 1.7 MB | 1.7 MB | -152.0 B (-0.0%) |
| 📄 JavaScript | 1.7 MB | 1.7 MB | -152.0 B (-0.0%) |
| 🎨 CSS | 0 B | 0 B | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: popular-libs Bundle Diff

📁 react-10k

Path: ../build-tools-performance/cases/react-10k/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 5.7 MB | 5.7 MB | -17.0 B (-0.0%) |
| 📄 JavaScript | 5.7 MB | 5.7 MB | -17.0 B (-0.0%) |
| 🎨 CSS | 21.0 B | 21.0 B | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: react-10k Bundle Diff

📁 react-1k

Path: ../build-tools-performance/cases/react-1k/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 826.0 KB | 826.1 KB | -59.0 B (-0.0%) |
| 📄 JavaScript | 826.0 KB | 826.1 KB | -59.0 B (-0.0%) |
| 🎨 CSS | 0 B | 0 B | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: react-1k Bundle Diff

📁 rome

Path: ../build-tools-performance/cases/rome/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 984.1 KB | 984.1 KB | -9.0 B (-0.0%) |
| 📄 JavaScript | 984.1 KB | 984.1 KB | -9.0 B (-0.0%) |
| 🎨 CSS | 0 B | 0 B | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: rome Bundle Diff

📁 react-5k

Path: ../build-tools-performance/cases/react-5k/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 2.7 MB | 2.7 MB | -17.0 B (-0.0%) |
| 📄 JavaScript | 2.7 MB | 2.7 MB | -17.0 B (-0.0%) |
| 🎨 CSS | 21.0 B | 21.0 B | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: react-5k Bundle Diff

📁 ui-components

Path: ../build-tools-performance/cases/ui-components/dist/rsdoctor-data.json

📌 Baseline Commit: 2bf2cd41ed | PR: #13735

| Metric | Current | Baseline | Change |
| --- | --- | --- | --- |
| 📊 Total Size | 5.0 MB | 5.0 MB | -204.0 B (-0.0%) |
| 📄 JavaScript | 4.7 MB | 4.7 MB | -204.0 B (-0.0%) |
| 🎨 CSS | 291.6 KB | 291.6 KB | 0 |
| 🌐 HTML | 0 B | 0 B | 0 |
| 📁 Other Assets | 0 B | 0 B | 0 |

📦 Download Diff Report: ui-components Bundle Diff

Generated by Rsdoctor GitHub Action

@github-actions
Contributor

github-actions Bot commented Apr 14, 2026

📦 Binary Size-limit

Comparing a85c293 to fix(esm_library): extract TLA shared modules to break circular dependency (#13606) by Fy

❌ Size increased by 14.00MB from 48.72MB to 62.72MB (⬆️28.74%)

@codspeed-hq

codspeed-hq Bot commented Apr 14, 2026

Merging this PR will degrade performance by 1.99%

❌ 2 regressed benchmarks
✅ 32 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | rust@persistent_cache_restore@basic-react-development | 25.6 ms | 26.1 ms | -1.82% |
| Simulation | rust@create_chunk_assets | 3.6 ms | 3.7 ms | -1.99% |

Comparing fy/vigorous-wilbur (a85c293) with main (b962256) [1]

Footnotes

  1. No successful run was found on main (2bf2cd4) during the generation of this report, so b962256 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@JSerFeng JSerFeng enabled auto-merge (squash) April 15, 2026 11:23
@JSerFeng JSerFeng disabled auto-merge April 15, 2026 11:23
…calls

Review feedback: the previous pass was too aggressive — it classified
calls like `Object.assign(a, b)`, `Object.freeze(x)`, `Array.from(iter)`,
and `String(obj)` as pure, which can drop observable effects (mutation,
iterator protocol, `valueOf`/`toString` coercion).

Tighten the fast path to two forms:
1. Args must be *trivially safe*: literals (number/string/bool/null/regex),
   empty template literals, a unary operator applied to a safe operand, or
   unresolved `undefined`/`NaN`/`Infinity`. No objects, no arrays, no other
   identifiers.
2. `Boolean(x)`, `Array.isArray(x)`, `Object.is(a, b)`,
   `Number.isInteger`/`isFinite`/`isNaN`/`isSafeInteger`, and
   `Array.of(...)` are pure regardless of arg values — they don't coerce,
   iterate, or read properties.

Remove from allowlist: `Object.{keys,values,entries,assign,freeze,
create,fromEntries,getOwnPropertyNames,getOwnPropertySymbols,
getOwnPropertyDescriptor,getOwnPropertyDescriptors,getPrototypeOf}`,
`Array.from`.
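To make the distinction concrete, here is a small sketch with hypothetical objects (`obj`, `iterable`, `proxy`) that record every time user code is reached. It only illustrates the runtime behaviour the commit message refers to; it is not part of the PR.

```javascript
const calls = [];
const obj = { toString() { calls.push("toString"); return "x"; } };
const iterable = { *[Symbol.iterator]() { calls.push("iterator"); yield 1; } };
const proxy = new Proxy({}, { ownKeys() { calls.push("ownKeys"); return []; } });

// No longer on the fast path for arbitrary args: each call reaches user code.
String(obj);          // coercion invokes obj.toString()
Array.from(iterable); // runs the iterator protocol
Object.keys(proxy);   // triggers the proxy's ownKeys trap
const afterCoercingCalls = calls.slice();

// Kept as pure for any argument value: none of these invoke the hooks above.
Boolean(obj);
Array.isArray(obj);
Object.is(obj, iterable);
Number.isInteger(obj);
Array.of(obj, iterable);
const afterNonCoercingCalls = calls.slice();
```

The second group leaves `calls` unchanged, which is the property that lets those callees accept any pure argument expression.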

Also fix the shadow test case to actually shadow the real `Set`/`Map`
globals (previously used `ShadowedSet` which never matched the
allowlist). Split shadow test into its own module so the `const Set`
declaration doesn't pull the non-shadowed uses into TDZ.
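The TDZ problem can be reproduced in isolation. This is a hypothetical reduction (`tdzDemo` is an illustrative name, not the fixture's code): a `const Set` later in the same scope captures every earlier reference to `Set` in that scope.

```javascript
// Hypothetical reduction of the fixture problem.
function tdzDemo() {
  let outcome;
  try {
    new Set(); // resolves to the local `Set` below, which is not initialized yet
    outcome = "constructed";
  } catch (e) {
    outcome = e.name;
  }
  const Set = class {}; // shadows the global for the whole function body
  return outcome;
}
```

Calling `tdzDemo()` should report a ReferenceError rather than constructing the built-in, which is why the shadowed and non-shadowed cases cannot share one module scope.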

Regenerate `runtime-specific-used-exports` stats snapshot — `module.js`
containing `export const identity = Symbol();` is now correctly
recognized as side-effect-free, enabling module concatenation.

Move all known-pure-global logic out of side_effects_parser_plugin.rs
(−144 lines) into a new parser_plugin/pure_globals.rs module (+210
lines, including 16 unit tests).

Key design changes:

  * Single entry point: `classify_pure_global(callee, ctxt, position)`
    returns `Option<ArgGate>` — the caller just matches on the gate.
  * `ArgGate` enum: `AnyPureArgs` (callee never coerces, e.g. Boolean)
    vs `TriviallySafeArgs` (callee may coerce, e.g. new Set).
  * `CalleePosition` enum: `New` vs `Call` — so the same name can have
    different purity in each position (e.g. `Symbol()` pure as call,
    but `new Symbol()` throws → not in the list).
  * All classification tables, `is_trivially_safe_expr`, and
    `are_args_trivially_safe` live in `pure_globals.rs`.
  * 16 table-driven unit tests covering ident/member/negative cases.
  * Call sites in `is_pure_new_expr` / `is_pure_call_expr` reduced to
    a single `match gate { … }` each.
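A rough JavaScript model of the lookup described above, for readers who do not want to open the Rust. Everything here is an assumption made for illustration: the real entry point is `classify_pure_global(callee, ctxt, position)` in Rust, whereas this sketch takes a plain name and a boolean, and its tables contain only the examples named in the commit message.

```javascript
// Hypothetical tables; entries follow the examples in the commit message.
const GATES = new Map([
  ["New", new Map([
    ["Set", "TriviallySafeArgs"],
    ["Boolean", "AnyPureArgs"],
    // no "Symbol" entry: `new Symbol()` throws
  ])],
  ["Call", new Map([
    ["Boolean", "AnyPureArgs"],
    ["Symbol", "TriviallySafeArgs"], // assumption: gate used for Symbol() here
  ])],
]);

// Returns a gate name, or null when the callee is not a known pure global.
function classifyPureGlobal(name, isUnresolvedGlobal, position) {
  if (!isUnresolvedGlobal) return null; // shadowed binding: never pure
  const table = GATES.get(position);
  if (!table) return null;
  const gate = table.get(name);
  return gate === undefined ? null : gate;
}
```

Keying the tables by position is what allows `Symbol` to be accepted as a call while `new Symbol()` falls through to the default (impure) path.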
@JSerFeng
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3e79f5237

pub fn is_trivially_safe_expr(expr: &Expr, unresolved_ctxt: SyntaxContext) -> bool {
  match expr {
    // Number, string, boolean, null, regex, bigint literals.
    Expr::Lit(_) => true,
P1: Preserve throwing built-ins in trivial-arg purity checks

Treating every literal as trivially safe here makes the new fast path unsound for several globals classified as TriviallySafeArgs (e.g. Set, Map, Array, ArrayBuffer): literals can still trigger synchronous throws, such as new Set(1) (TypeError) or new Array(-1) (RangeError). Because is_pure_new_expr/is_pure_call_expr now return pure when this check passes, unused expressions that should throw can be tree-shaken away, changing runtime behavior during module evaluation.


Reviewer caught a real soundness bug: literal args can still trigger
synchronous throws in `Set`/`Map`/`Array`/TypedArray constructors and
`Number`/`Date`. With the previous `TriviallySafeArgs` gate, calls like
`new Set(1)` (TypeError), `new Array(-1)` (RangeError), and
`new Uint8Array(1.5)` (RangeError) were incorrectly classified as pure
and could be tree-shaken away — silently dropping the throw.

Split `ArgGate` into four variants by argument-shape requirement:

  * `AnyPureArgs` — callee neither coerces nor throws on values
    (e.g. `Boolean(x)`, `Array.isArray(x)`, `Object.is(a, b)`).
  * `TriviallySafeArgs` — callee handles every literal safely
    (e.g. `String("x")`, `Object("y")`, `Symbol("desc")`).
  * `NullishOrNoArgs` — callee throws on non-iterable / non-numeric
    values, so only `null`/`undefined`/no-args are safe (e.g.
    `new Set()`, `new Map()`, `new Date()`, `new Number()`,
    `new Array()`).
  * `NullishOrNonNegativeIntLiteral` — `RangeError` on negative,
    fractional, or out-of-range lengths; only nullish or `[0, 2^32)`
    integer literals pass (e.g. `new Uint8Array(16)`,
    `new ArrayBuffer(0)`).

Reclassification:
  * `Set`/`Map`/`WeakSet`/`WeakMap`: `NullishOrNoArgs`.
  * `Array` (both new and call): `NullishOrNoArgs`.
  * `Date`/`Number` (both): `NullishOrNoArgs` (BigInt → ToNumber throws).
  * TypedArrays / `ArrayBuffer` / `SharedArrayBuffer`:
    `NullishOrNonNegativeIntLiteral`.
  * `String`/`Object`: stays `TriviallySafeArgs` (ToString/ToObject
    accept any literal incl BigInt).
  * `Boolean`: promoted to `AnyPureArgs` in both positions (ToBoolean
    never throws and never invokes user code).
  * `Symbol` (call only): `TriviallySafeArgs`.

New helpers in `pure_globals.rs`:
  * `is_nullish_expr`: null literal / `undefined` ident / `void <safe>`.
  * `is_non_negative_int_literal`: finite, integer, `[0, 2^32)`,
    handles `+N` UnaryExpr; rejects BigInt, negative, fractional,
    NaN/Infinity.
  * `check_arg_gate`: dispatches arg-shape check based on gate.
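The gate dispatch can be modelled in a few lines of JavaScript. This is a sketch under stated assumptions, not a port of `pure_globals.rs`: the toy AST shape, the function names, the `isPureExpr` callback, and the set of unary operators accepted by `isTriviallySafe` are all hypothetical.

```javascript
// Toy AST for illustration only (hypothetical shape, not SWC's):
//   { type: "Null" | "Number" | "String" | "Boolean" | "BigInt" | "RegExp", value? }
//   { type: "Identifier", name, unresolved }
//   { type: "Unary", operator, argument }
const LITERAL_TYPES = new Set(["Null", "Number", "String", "Boolean", "BigInt", "RegExp"]);

function isTriviallySafe(node) {
  if (LITERAL_TYPES.has(node.type)) return true;
  if (node.type === "Identifier") {
    return node.unresolved === true && ["undefined", "NaN", "Infinity"].includes(node.name);
  }
  // Assumption: only operators that cannot throw on a safe operand
  // (unary `+` is left out because `+1n` throws a TypeError).
  if (node.type === "Unary" && ["void", "!", "-", "typeof"].includes(node.operator)) {
    return isTriviallySafe(node.argument);
  }
  return false;
}

function isNullish(node) {
  if (node.type === "Null") return true;
  if (node.type === "Identifier") return node.name === "undefined" && node.unresolved === true;
  return node.type === "Unary" && node.operator === "void" && isTriviallySafe(node.argument);
}

function isNonNegativeIntLiteral(node) {
  // `+N` is accepted by unwrapping the unary plus.
  if (node.type === "Unary" && node.operator === "+") return isNonNegativeIntLiteral(node.argument);
  return node.type === "Number" && Number.isInteger(node.value)
    && node.value >= 0 && node.value < 2 ** 32;
}

// `isPureExpr` stands in for the existing purity helpers (assumed callback).
function checkArgGate(gate, args, isPureExpr = isTriviallySafe) {
  switch (gate) {
    case "AnyPureArgs": return args.every(isPureExpr);
    case "TriviallySafeArgs": return args.every(isTriviallySafe);
    case "NullishOrNoArgs": return args.every(isNullish);
    case "NullishOrNonNegativeIntLiteral":
      return args.every((a) => isNullish(a) || isNonNegativeIntLiteral(a));
    default: return false; // unknown gate: treat as impure
  }
}
```

Note that an empty argument list passes every gate, since `every` on an empty array is true; that matches the "no args" half of `NullishOrNoArgs`.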

Snapshot test extended with 9 new "MUST be kept" cases covering each
throw category. All cases verified in the regenerated
`treeshaking.snap.txt` — pure forms are wrapped in `null && (…)`,
throwing forms are preserved verbatim.

Unit tests: 25 total (was 16), covering each gate, both positions for
each global, and direct arg-shape checks against synthetic Lit/Num
expressions.
@JSerFeng
Contributor Author

Good catch — that was a real soundness bug. Pushed a85c293.

The previous TriviallySafeArgs gate was too permissive for callees that throw based on argument values. Specifically, all of these were being incorrectly classified as pure (and could be tree-shaken, silently dropping the throw):

  • new Set(1) → throws TypeError (not iterable)
  • new Map("foo") → throws TypeError (string iterates chars, not [k,v] pairs)
  • new Array(-1) → throws RangeError
  • new Array(1.5) → throws RangeError
  • new Uint8Array(-1) / new Uint8Array(1.5) → throws RangeError
  • new Date(1n) / new Number(1n) → throws TypeError (BigInt → Number)
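A quick way to check this class of behaviour in any engine is to run each expression and record the error name. The helper `outcome` and the `results` table are illustrative only, and the set of expressions is a subset chosen for this sketch.

```javascript
// Runs a thunk and reports the error name, or "ok" if nothing was thrown.
function outcome(thunk) {
  try {
    thunk();
    return "ok";
  } catch (e) {
    return e.name;
  }
}

const results = {
  // Literal arguments that still throw synchronously.
  "new Set(1)": outcome(() => new Set(1)),
  'new Map("foo")': outcome(() => new Map("foo")),
  "new Array(-1)": outcome(() => new Array(-1)),
  "new Array(1.5)": outcome(() => new Array(1.5)),
  "new Uint8Array(-1)": outcome(() => new Uint8Array(-1)),
  "new Date(1n)": outcome(() => new Date(1n)),
  // Forms that are safe to drop when unused.
  "new Set(null)": outcome(() => new Set(null)),
  "new Map(undefined)": outcome(() => new Map(undefined)),
  "new Uint8Array(16)": outcome(() => new Uint8Array(16)),
  "new ArrayBuffer(0)": outcome(() => new ArrayBuffer(0)),
};
```

Printing `results` gives a compact reference when deciding which gate a new global belongs in.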

Fix: split ArgGate into four variants by argument-shape requirement:

| Gate | Allowed args | Used by |
| --- | --- | --- |
| AnyPureArgs | any pure expression | Boolean, Array.isArray, Object.is, Number.isXxx, Array.of |
| TriviallySafeArgs | any literal (number/string/bool/null/regex/bigint) + undefined/NaN/Infinity | String, Object, Symbol (call only) |
| NullishOrNoArgs | null, undefined, `void <safe>`, or absent | Set, Map, WeakSet, WeakMap, Array, Date, Number |
| NullishOrNonNegativeIntLiteral | nullish or non-negative integer literal in [0, 2^32) (excluding BigInt, fractional, negative) | TypedArrays, ArrayBuffer, SharedArrayBuffer |

Boolean got promoted to AnyPureArgs in both positions — ToBoolean never throws and never invokes user code, so even Boolean(myVar) is pure.

Verification. Snapshot test extended with 9 new "MUST be kept" cases covering each throw category. Regenerated snapshot confirms:

  • Pure forms (new Set(), new Set(null), new Uint8Array(16), Boolean(marker), etc.) → wrapped in null && (…)
  • Throwing forms (new Set(1), new Array(-1), new Uint8Array(1.5), new Date(1n), etc.) → preserved verbatim ✓
  • Impure-arg forms (Boolean(impureArg()), new Set([impureArg()])) → preserved ✓
  • Shadow cases → preserved ✓

Unit tests bumped from 16 → 25, including direct arg-shape checks against synthetic Lit::Num/Lit::Str/UnaryExpr expressions.

@JSerFeng
Contributor Author

The 14MB size delta is not from this PR — it's a CI infrastructure mismatch introduced by #13726 (`d79172e7`).

What's happening:

  • The size-limit script fetches the baseline from `rspack-ecosystem-benchmark/data`. The most recent baseline is for `b962256bb8` at 51,084,280 bytes (51MB), uploaded 2026-04-17T03:27Z.
  • That baseline is built by the external benchmark workflow with the `release` profile (`lto = "fat"`, `codegen-units = 1`, `opt-level = 3`, `strip = true`).
  • After chore(ci): use ci profile for binary size-limit job to speed up build #13726, this PR's CI uses `size-limit.yml` which now downloads the artifact from `Build Linux` (built with the `ci` profile — `lto = false`, `codegen-units = 256`, `opt-level = 2`) and runs `strip` manually.
  • After strip, my binary is 65,767,832 bytes (65MB). The 14MB delta is the cost of disabling LTO + lower codegen-units, not the ~600 lines of Rust this PR adds.
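For anyone reconciling these byte counts with the size-limit bot's message, the arithmetic can be sketched as follows. The assumption here is that the bot's "MB" are binary megabytes (1024 × 1024 bytes); the two byte counts are the ones quoted above.

```javascript
const MIB = 1024 * 1024;
const baselineBytes = 51084280; // benchmark baseline quoted above
const prBytes = 65767832;       // stripped binary quoted above

// Format a byte count as binary megabytes with two decimals.
const toMiB = (bytes) => (bytes / MIB).toFixed(2);

const report = {
  baseline: toMiB(baselineBytes),
  pr: toMiB(prBytes),
  delta: toMiB(prBytes - baselineBytes),
  percent: (((prBytes - baselineBytes) / baselineBytes) * 100).toFixed(2),
};
```

Under that assumption the values come out to 48.72, 62.72, 14.00 and 28.74, matching the figures in the bot's comment, so the "51MB" and "65MB" above are the same measurements expressed in decimal megabytes.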

Evidence this isn't the PR's code:

  • Cargo unit-test count went 16 → 25 (one new module, ~500 lines)
  • Other PRs with this size-limit check passed because they ran before d79172e landed (they used the `release`-profile build path, which matched the benchmark baseline)

Fix options (need maintainer action):

  1. Update the external benchmark workflow to also build with the `ci` profile so baselines match.
  2. Or revert d79172e and keep the `release`-profile build in size-limit.yml.
  3. Or override this check for now and re-run after benchmark data catches up with apples-to-apples profiles.

Happy to rebase or trigger a re-run once that's resolved.

Labels

release: feature (release: feature related release(mr only)), team (The issue/pr is created by the member of Rspack.)

2 participants