[pull] canary from vercel:canary#1043
Merged
Merged
Conversation
…:snapshots (#93788) ### What? Fix a latent deadlock between `track_modification_internal` / `SnapshotShardIter::next` and `Storage::end_snapshot` in `turbo-tasks-backend`, and harden `dash_map_multi::RefMut` so that holding a `StorageWriteGuard` across an `.await` can no longer compile. ### Why? `Storage` has three sharded `DashMap`s — `task_cache`, `map`, and `snapshots`. The two call sites that touch both `map` and `snapshots` (`track_modification_internal` at `storage.rs:685/703` and `SnapshotShardIter::next` at `storage.rs:845/851`) hold a `map` shard write lock and then take a `snapshots` shard write lock — i.e. order `map → snapshots`. `end_snapshot` did the opposite: it called `parallel::for_each(self.snapshots.shards(), ...)`, took a `snapshots` shard write lock, drained it, and then called `self.map.get_mut(&key)` to promote flags — order `snapshots → map`. The existing comment argued this was safe because `track_modification` only inserts into `snapshots` when `snapshot_mode == true` and `end_snapshot` first stores `snapshot_mode = false`. That argument is incorrect. `track_modification_internal` loads `snapshot_mode` (line 616) and inserts into `snapshots` later (line 685 or 703); the two are not atomic. The interleaving: 1. T1 in `track_modification_internal`: holds `map[N]`, reads `snapshot_mode() → true`. 2. T2 in `end_snapshot`: stores `snapshot_mode = false`, takes `snapshots[N]` write lock. 3. T2: tries `map.get_mut(&key)` → blocks on `map[N]` (T1 holds it). 4. T1: tries `snapshots.insert(...)` → blocks on `snapshots[N]` (T2 holds it). deadlocks both threads. Separately, `dash_map_multi::RefMut` (which backs `StorageWriteGuard`) had `unsafe impl Send`. The wrapped `parking_lot` `RwLockWriteGuard` is intentionally `!Send` upstream (via `lock_api::GuardNoSend(*mut ())`); the manual override let callers compile code that holds a `StorageWriteGuard` across an `.await`. In that pattern the guard pins a shard write lock from a parked future and every other tokio worker piles up trying to take the same shard — a deadlock that the borrow checker would otherwise catch for free. ### How? **`Storage::end_snapshot`** (`storage.rs:383`): - Zip `map.shards()` with `snapshots.shards()` by index and lock each pair in the documented order: `map_shard.write()` first, then `snap_shard.write()`. - For every key drained from `snapshots[N]`, resolve it directly in the held `map_shard` guard via `RawTable::find`, rather than calling `self.map.get_mut` (which would re-enter the same shard). - `debug_assert_eq!` on the shard counts. - Update the obsolete "this is fine" comment to describe the prior race and the new pattern. **Shard pairing invariant** is now documented on the `map` and `snapshots` field declarations: both `DashMap`s are constructed with the same `shard_amount`, the same `TaskId` key type, and the same stateless `FxBuildHasher`, so shard `N` in `snapshots` corresponds exactly to shard `N` in `map`. Every key present in `snapshots.shards()[N]` (if present in `map`) lives in `map.shards()[N]`, so the zipped acquisition covers every drainable entry. **`dash_map_multi::RefMut`** (`dash_map_multi.rs:32`): - Removed the `unsafe impl<K: Eq + Hash + Sync, V: Sync> Send for RefMut<'_, K, V> {}`. - `RwLockWriteGuard` already contains `GuardNoSend(*mut ())`, so the auto-trait derivation correctly marks `RefMut` (and therefore `StorageWriteGuard`, `TaskGuardImpl`, `OccupiedEntry`, `VacantEntry`) `!Send`. Verified by compile-fail probes on all four guard types. - Kept `unsafe impl Sync` — sharing `&RefMut` between threads is still sound (the write guard provides exclusive access, `K: Sync + V: Sync`). - Replaced the old SAFETY comment with one that explains why `Send` is intentionally omitted. ### Audit summary After the fix, the full lock-order graph for the three `Storage` dashmaps is: ``` task_cache ─→ map ─→ snapshots ▲ └── (non-blocking try_lock_and_remove + defer in evict_after_snapshot) ``` Total order: `task_cache → map → snapshots`. The only reverse edge is `evict_after_snapshot`'s non-blocking `try_lock_and_remove` on `task_cache` while holding `map`, which defers on contention — no hold-and-wait possible. Within `map`, the only multi-guard path is `access_pair_mut`, whose retry loop in `get_multiple_mut` holds zero shard guards at every blocking acquire. Per-thread nested task guards are prevented at runtime in debug builds by `TaskLockCounter::acquire` (`operation/mod.rs:128–166`). ### Testing - `cargo check --all-targets` (turbopack workspace): clean. - `cargo test -p turbo-tasks-backend --lib storage::`: 4/4 pass, including `modify_during_snapshot_clears_live_modified_flags` and `modify_different_category_during_snapshot` which exercise the snapshot lifecycle this PR touches. - Compile-fail probes confirmed `StorageWriteGuard`, `TaskGuardImpl`, `OccupiedEntry`, and `VacantEntry` are all `!Send` post-change. <!-- NEXT_JS_LLM_PR -->
…d collectibles (#93114) ### What? Replaces the silent "make root node" promotion in the turbo-tasks backend with a panic that enforces tasks to already have the `root` attribute before performing strongly consistent reads, reading task collectibles, or removing collectibles. ### Why? Previously, the backend would silently promote any non-root task to a root node (aggregation number `u32::MAX`) when these operations were requested. This masked incorrect task configuration — tasks that needed root-level aggregation weren't explicitly declared as such, making it harder to reason about the aggregation graph and hiding potential performance issues from implicit promotions. ### How? **Backend enforcement (3 locations):** - **Strongly consistent read** (`backend/mod.rs`): Checks `NativeFunction.is_root` on the target task. If it's a persistent task without `root`, panics with both the target and reader task descriptions. - **Read task collectibles** (`backend/mod.rs`): Same check when reading collectibles from a task. - **Remove collectibles** (`operation/update_collectible.rs`): Same check when removing collectibles (count < 0). All three locations drop the task guard before panicking to avoid deadlocking with `debug_get_task_description`. **Macro support for `root` on methods:** - `value_impl_macro.rs`: Now reads the `root` attribute from `#[turbo_tasks::function(root)]` on inherent impl and trait impl methods (previously hardcoded to `false`). - `value_trait_macro.rs`: Same for trait default methods. **`root` attribute additions:** - All `#[turbo_tasks::function(operation)]` in test files, benchmarks, and fuzz code - Production operations in `crates/next-api/` that are read with strong consistency - Regular functions and methods in tests that use `.strongly_consistent()` - Trait methods in `value_impl` and `value_trait` blocks used with strong consistency **New test:** - `non_root_task_panic.rs`: Verifies the panic fires when attempting a strongly consistent read on a non-root operation task. Captures the panic from the worker thread via a panic hook since the panic propagates as a channel error to the test thread. <!-- NEXT_JS_LLM_PR --> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
### What?
Adds a new internal CLI subcommand:
```
next internal static-routes-info [directory] [options]
```
It runs against an already-built Next.js app (`.next/` from `next
build`), reads the manifests under `distDir`, and reports per-route
bundle sizes split into six file categories. Supports markdown (default)
or JSON output, sorting, limiting, and per-category file listings.
### Why?
We want a **static, build-output-based** way to compare different
chunking strategies for JS and CSS without running the user's app.
Existing tooling either requires running the app
(`@next/bundle-analyzer` is a webpack plugin), is bundler-specific, or
aggregates per-bundle/per-asset rather than per-route. This tool answers
the concrete question "how much JS / CSS does each route ship today?"
purely by reading static manifests, so it can be diffed across builds,
branches, or bundlers (Turbopack vs webpack) and used to evaluate
chunking changes.
It is namespaced under `next internal` because the output format and
category boundaries are tied to internal manifest shapes; we don't
intend to make it a stable public surface yet.
### How?
#### Command surface
```
Usage: next internal static-routes-info [directory] [options]
Options:
--json Output as JSON instead of markdown.
--limit <n> Only show the first N routes after sorting (totals always reflect all routes).
--sort <key> Sort routes by: name (default, ascending), or one of
client, client-js, client-css, client-map,
server, server-bundled-js, server-unbundled, server-map,
total (descending).
--files Include the list of files (relative to the output directory)
per category in the JSON output. Requires --json.
-h, --help Displays this message.
```
`--limit` and `--sort` always consider every route for the totals; only
the displayed table is trimmed/reordered. `--files` is only meaningful
in JSON output and errors otherwise. Invalid `--sort` keys error with
the valid set listed.
#### Six categories per route
Each file the tool sees is placed into **exactly one** of these six
buckets, so totals are not double-counted:
| Category | Description |
| ------------------ |
--------------------------------------------------------------------------------------------------
|
| Client JS | `.js` chunks loaded by the browser |
| Client CSS | `.css` files loaded by the browser |
| Client Source Maps | `.map` files for client JS / CSS |
| Server Bundled JS | `.js` chunks executed on the server (App Router,
Pages SSR, route handlers, middleware) |
| Server Unbundled | Files traced via `*.nft.json` outside `distDir`
(typically `node_modules` deps for `serverExternalPackages` / Pages SSR)
|
| Server Source Maps | `.map` files for server JS, including `.map`s
referenced from nft.json |
Source maps are discovered three ways: `.map` extension matches, `//#
sourceMappingURL=...` trailers in JS, and `/*# sourceMappingURL=...*/`
trailers in CSS. Maps always go into the `Maps` category even when the
manifest puts them next to their bundle, so they never inflate Bundled
or CSS counts. `sourceMappingURL` reads are memoized per chunk so a
chunk shared by N routes is opened once.
#### Two-step measurement
1. **Capture per-route file sets** by reading manifests:
- `pages-manifest.json` and `app-path-routes-manifest.json` for the
route list.
- `<entry>.nft.json` for server-bundled chunks plus traced node_modules
deps.
- `<entry>_client-reference-manifest.js` for App Router client JS/CSS
(both `entryJSFiles` / `entryCSSFiles` on Turbopack and
`clientModules.chunks` on webpack — the parser handles both layouts).
- `build-manifest.json` for shared App Router root chunks
(`rootMainFiles`) and Pages Router client chunks.
- `middleware-manifest.json` for middleware and edge route handlers.
2. **Deduplicate inside each per-route category, then measure.** A
global `lstat` cache stats every unique path once across the whole run;
per-category sets dedupe via string equality. Files are routed by
extension (`.map` → Maps, `.css` → CSS, `.js` → bundled / unbundled
depending on whether the path stays inside `distDir`) at the point they
enter a set, so e.g. an `.nft.json` referencing both bundle and `.map`
paths places each in the right bucket.
#### Shared metric
For each route, every category also carries a `sharedAvg`: the average
size of the *intersection* between this route and each peer route of the
same type. Computed as
```
sharedAvg = (Σ over peers p: |files(this) ∩ files(p)|) / number_of_peers
```
with both file count and bytes reported. The metric is also expressed as
a percentage of the route's own count and bytes (`percentCount`,
`percentBytes`) to make sharing easy to interpret at a glance — e.g.
`5.3 files (88%) / 424.12 KB (100%)` means "88% of this route's files,
and effectively all of its bytes, are also shipped by an average peer".
Routes with no peers (only one route of their type) get `null`. Note
that percentages are NOT commutative across peers (they're divided by
each route's own count/bytes) while raw intersection numbers are.
#### Output
Markdown (default), with three sections — `## Routes`, `## Shared (avg
per other route of same type)`, `## Totals` — each rendered as a
fixed-width aligned table. Empty cells render as `-` (and routes with no
peers in the Shared section as `n/a`) so meaningful values stand out:
```
## Routes
| Route | Type | Client JS | Client CSS | Client Source Maps | Server Bundled JS | Server Unbundled | Server Source Maps |
| ------------ | ---------- | ------------------- | --------------- | ------------------ | -------------------- | ------------------- | ------------------- |
| / | app-page | 6 files / 424.40 KB | 2 files / 153 B | - | 16 files / 384.47 KB | 140 files / 1.37 MB | 16 files / 2.31 MB |
| /api/edge | app-route | - | - | - | 9 files / 296.82 KB | - | 4 files / 1.53 MB |
…
## Shared (avg per other route of same type)
| Route | Type | Client JS | Client CSS | Client Source Maps | Server Bundled JS | Server Unbundled | Server Source Maps |
| ------------ | -------- | ---------------------------------- | ---------------------------- | ------------------ | -------------------------------- | --------------------------------- | ------------------------------ |
| / | app-page | 5.3 files (88%) / 424.12 KB (100%) | 1.3 files (63%) / 52 B (34%) | - | 11 files (69%) / 357.52 KB (93%) | 140 files (100%) / 1.37 MB (100%) | 11 files (69%) / 2.19 MB (95%) |
…
```
JSON has the same per-category structure (count + bytes + sharedAvg +
optional files list when `--files` is used) with identical category
ordering: `clientJs`, `clientCss`, `clientMaps`, `serverBundled`,
`serverUnbundled`, `serverMaps`. JSON values are exact (e.g. `0/0` is
preserved as `{count:0, bytes:0}` rather than `-`) so machine consumers
aren't affected by the markdown placeholder. Totals also expose dedup'd
`files` arrays under `--files`.
#### Route types
Reported types: `app-page`, `app-route`, `pages`, `pages-static`,
`pages-api`, `middleware`. App Router route handlers with `runtime:
'edge'` report as `app-route` (not a separate `edge-function`) so
they're directly comparable with their Node-runtime peers. Middleware is
a first-class type rather than being lumped under edge-function.
#### Robust manifest parsing
`_client-reference-manifest.js` is a JS module, not JSON. Both bundlers
emit it but with different layouts:
- Turbopack: multi-line, with a `for (const key in
MANIFEST[entry].clientModules) MANIFEST[entry].clientModules[k] = val`
suffix when a deployment ID is set.
- Webpack: single-line, no whitespace around `=`.
We extract the JSON body without evaluating the file. The implementation
locates the `globalThis.__RSC_MANIFEST[` anchor, walks the JS string
literal that holds the entry name (honoring `\\` escapes), then
balance-walks the `{...}` body. This handles entry names that contain
`]` characters, e.g.
```js
globalThis.__RSC_MANIFEST["/(dashboard)/[teamSlug]/(team)/~/stores/(store-details)/blob/[storeId]/page"] = {...}
```
Any structural surprise (anchor missing, unterminated string/object,
JSON parse failure) throws with the file path and offset — we never
silently undercount client JS/CSS for a route. Only file-not-found stays
as a `null` return — that's a normal case for server entries with no
client-reference manifest (middleware, route handlers, etc.).
### Tests
`test/production/static-routes-info/` is a real fixture covering every
route type:
- App Router: `/`, `/about`, `/no-client`, `/items/[itemId]` (a dynamic
segment inside a `(group)` route group, which forces `]` to appear
unescaped in the manifest entry name and exercises the parser), plus the
auto-generated `/_not-found`.
- App Router route handlers: `/api/node` (default Node runtime),
`/api/edge` (`runtime: 'edge'`).
- Pages Router: `/pages-ssr`, `/pages-ssr-2` (siblings sharing chunks),
`/pages-static`, `/api/hello`.
- Middleware: `middleware.ts`.
- A shared lib (`lib/shared.ts`) imported by both pages-router siblings
and `/`-`/about` to give the shared-avg metric something non-trivial to
measure.
- A `'use client'` `Counter` component imported by `/` and `/about` (but
not `/no-client`), which itself imports `counter.module.css`. Routes
that import Counter must ship strictly more client JS (and on Turbopack,
more client CSS) than `/no-client` — this is asserted, and it's the
cross-bundler regression check for the App Router client-JS collection
on webpack via `clientModules.chunks` (without it, every webpack
app-page reports `clientJs.count = 0`).
The test file (`static-routes-info.test.ts`) covers all the above plus
output formats, sort options, limit semantics, file-list integrity,
totals dedup, shared-avg correctness against a hand-computed reference,
markdown/JSON consistency, and the empty-cell `-` placeholder. The
shared-avg metric is verified three independent ways: against a
from-scratch reimplementation that walks the `--files` lists and
recomputes every (route, category) cell; against a "sharedAvg.count ==
own.count IFF every peer is a strict superset" invariant that makes 100%
values load-bearing; and against a hand-known case where one route ships
a chunk no peer does, forcing strictly-below-100% sharing. 31 tests,
passing on both Turbopack and webpack.
The tool was also exercised against `bench/basic-app`,
`bench/heavy-npm-deps`, `bench/nested-deps`, `bench/app-router-server`,
and `bench/nested-deps-app-router` while developing.
### Notes for reviewers
- New error codes added to `errors.json` for the manifest-parser throws
and other invariant violations.
- The command is registered under `next internal`; not advertised in
user-facing docs by design.
- Webpack quirk documented in the test: `flight-manifest-plugin.ts`'s
`mergeManifest` merges every app-page's `entryCSSFiles` into every other
route's CRM, so per-route CSS attribution on webpack is inherently fuzzy
— the test asserts CSS attribution on Turbopack only, and the comment
explains why.
<!-- NEXT_JS_LLM_PR -->
---------
Co-authored-by: v-work-app[bot] <262237222+v-work-app[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )