[pull] main from tldraw:main#596

Merged
pull[bot] merged 9 commits into
code:mainfrom
tldraw:main
Jun 16, 2026

Conversation

@pull pull Bot commented Jun 16, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

frolic and others added 9 commits June 16, 2026 09:33
#9100)

`@tldraw/mermaid` depends on `mermaid`, which is ESM-only. The static
`import mermaid from 'mermaid'` at the top of `createMermaidDiagram.ts`
compiles to `require("mermaid")` in the package's `dist-cjs` build, and
`require()` of an ES module throws `ERR_REQUIRE_ESM` for CommonJS
consumers on Node 20.0–20.18 (the repo floor is `^20.0.0`) and trips up
Jest and ts-node — the same failure class as
[#7905](#7905).

`createMermaidDiagram()` is already `async` and `mermaid` is only
referenced inside it (every other `mermaid` import is `import type`,
erased at build), so it can be loaded with a dynamic `import()`:

```ts
// before — top-level; compiles to require("mermaid") in dist-cjs
import mermaid from 'mermaid'

// after — loaded the first time a diagram is created
const mermaid = (await import('mermaid')).default
```

esbuild keeps `import()` as a real dynamic import (it doesn't downlevel
to `require`), so no `require(<esm>)` ends up in the CJS build —
`@tldraw/mermaid` becomes requirable from CommonJS regardless of how a
consumer imports it.
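
The lazy-loading pattern described above can be sketched in a generic form. This is only an illustration: `lazyModule`, `loadRenderer`, and `createDiagram` are hypothetical names, and a stand-in object replaces the real `mermaid` module so the sketch runs without the dependency. The PR itself uses a plain `await import('mermaid')` and relies on the runtime's module cache; the explicit memoization here is an assumption added to make the "load once, on first use" behavior visible.

```ts
// Memoize an async loader so the underlying module is loaded at most once.
// `lazyModule` is an illustrative helper, not part of @tldraw/mermaid.
function lazyModule<T>(load: () => Promise<T>): () => Promise<T> {
	let pending: Promise<T> | undefined
	return () => {
		if (!pending) {
			pending = load().catch((err) => {
				// Drop the cached promise so a later call can retry the load.
				pending = undefined
				throw err
			})
		}
		return pending
	}
}

// In a real package the loader would be something like
// `() => import('mermaid').then((m) => m.default)`.
// A stand-in is used here so the sketch has no external dependency.
let loadCount = 0
const loadRenderer = lazyModule(async () => {
	loadCount++
	return { render: (src: string) => `<svg>${src}</svg>` }
})

async function createDiagram(src: string): Promise<string> {
	// Nothing is loaded until the first diagram is requested.
	const renderer = await loadRenderer()
	return renderer.render(src)
}
```

Because the load happens inside an already-async function, callers see no API change; only the timing of the dependency load moves.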

It also helps bundling slightly, but only for consumers who import
`@tldraw/mermaid` **statically** at their module top level: for them
mermaid (and its large d3/dagre tree) moves out of the main bundle into
a chunk loaded on the first `createMermaidDiagram()` call. Consumers who
already `await import('@tldraw/mermaid')` — the recommended pattern, and
what `SneakyMermaidHandler` and the examples do — already have mermaid
in a lazy chunk, so their bundle is unchanged.

Self-contained and independent of the broader ESM-only-in-CJS efforts in
#9050 (build-level inlining) and #9098 (Node engine bump, which makes
`require('mermaid')` work natively on Node ≥20.19 / 22.12).

### Change type

- [x] `bugfix`

### Test plan

1. `cd packages/mermaid && yarn build`.
2. `packages/mermaid/dist-cjs/createMermaidDiagram.js` contains `await
import("mermaid")` and no `require("mermaid")` (mermaid stays external
in both builds).
3. The 53 mermaid tests pass (`cd packages/mermaid && yarn test run`);
typecheck passes.

- [ ] Unit tests
- [ ] End to end tests

### Release notes

- `@tldraw/mermaid` now loads its ESM-only `mermaid` dependency lazily,
so the package is requirable from CommonJS (and mermaid only loads when
a diagram is created).

### Code changes

| Section   | LOC change |
| --------- | ---------- |
| Core code | +7 / -1    |

…emplate (#9203)

In order to give new workspaces a welcoming first file instead of a
blank canvas — and to make that content editable without a deploy — this
PR seeds a new workspace's first file by forking a "welcome template".
When a workspace is created (behind the `groups_frontend` flag), its
first file is seeded by passing a fixed `createSource: 'welcome'`, which
the sync worker resolves: it forks the published snapshot of the file
currently marked as the welcome template, falling back to a committed
default snapshot when none is set or it can't be read (fresh env, deploy
skew, unpublished). An admin marks any published file as the template.
The file is ordinary — renameable, editable, deletable — and "My files"
keeps its blank first file.

Relates to #9143.

> [!NOTE]
> Draft + alternative approach. This is the fork/publish alternative to
#9142, which seeds the same welcome content from a `RoomSnapshot` baked
into the sync worker. The trade-off: #9142 ships the content in the
worker bundle (identical everywhere, no data setup, but editing means
re-exporting and re-deploying); this PR makes the welcome content a real
published file an admin can edit and re-publish live, with a committed
default so it still works everywhere out of the box. Opening both for
comparison — only one should land.

How it works:

- **`createSource: 'welcome'`** (`WELCOME_CREATE_SOURCE` in
`dotcom-shared`) is a fixed, slug-less marker — the client never knows
which file is the template, so an admin can retarget it without a client
change. `TLFileDurableObject.handleFileCreateFromSource` special-cases
it before the prefix/id split.
- **Resolution** (`resolveWelcomeSnapshot`): read the `welcome_template`
pointer, fork that file's published snapshot via the existing
`getPublishedRoomSnapshot`, else the committed default. A template that
is *set but unreadable* is reported (Sentry) rather than silently
serving the default; a genuinely-absent template is silent.
- **`welcome_template`** (migration `035`) is a singleton Postgres
table, deliberately outside the Zero publication — worker-side config,
never replicated to clients.
- **Admin**: `/app/admin/welcome-template` GET/POST/clear (admin-gated)
+ a "Welcome template" section in the admin panel. Setting requires the
file be published; the panel warns if the current template is no longer
live.
- **Seeding** is single-flighted per workspace, so a creation flow and a
workspace switch can't create duplicates and all callers get the result.
The committed default reuses a self-rebaking snapshot test that keeps it
migrated to the current schema.
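
The resolution order in the list above (pointer, then published snapshot, then committed default, reporting only when a template is set but unreadable) can be sketched as follows. All names and shapes here (`WelcomeDeps`, `resolveWelcome`, `WelcomeSnapshot`) are hypothetical; the real `resolveWelcomeSnapshot` signature and the snapshot type are not shown in this description.

```ts
// Hypothetical result shape; the real worker uses its own snapshot type.
type WelcomeSnapshot = { source: 'template' | 'default'; documents: unknown[] }

// Hypothetical dependencies, injected so the sketch is runnable in isolation.
interface WelcomeDeps {
	// File id stored in the `welcome_template` pointer, or null when unset.
	getTemplateFileId(): Promise<string | null>
	// Published snapshot of a file, or null when it is not published.
	getPublishedSnapshot(fileId: string): Promise<unknown[] | null>
	// Stand-in for error reporting (Sentry in the PR).
	report(message: string): void
	// The committed default snapshot.
	defaultDocuments: unknown[]
}

async function resolveWelcome(deps: WelcomeDeps): Promise<WelcomeSnapshot> {
	const fallback: WelcomeSnapshot = { source: 'default', documents: deps.defaultDocuments }

	const fileId = await deps.getTemplateFileId()
	// A genuinely absent template is not an error: serve the default silently.
	if (fileId === null) return fallback

	try {
		const documents = await deps.getPublishedSnapshot(fileId)
		if (documents) return { source: 'template', documents }
		// Set but not published: report it, then fall back.
		deps.report(`welcome template ${fileId} is set but has no published snapshot`)
	} catch (err) {
		// Set but unreadable: report it, then fall back.
		deps.report(`welcome template ${fileId} could not be read: ${String(err)}`)
	}
	return fallback
}
```

Separating "unset" from "set but unreadable" keeps fresh environments quiet while still surfacing a misconfigured or unpublished template.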

### Change type

- [x] `feature`

### Test plan

Preview deploy: https://pr-9203-preview-deploy.tldraw.com/

**Setup**

1. Open the preview and sign in; enrol yourself in groups via `/admin`
(the `groups_frontend` flag).

**Default (no template set)**

2. Create a workspace and open it → its first file is "Welcome to your
workspace" with the default welcome canvas; no inline-rename prompt.
3. Empty "My files" and open it → still a blank, date-named file with
the rename prompt.

**Admin template + fork**

4. Create a file, give it distinct content, and **publish** it.
5. In `/admin` → Welcome template, set it by file ID.
6. Create another workspace → its welcome file is a fork of that
published file's content.
7. Unpublish/delete that file → `/admin` shows the template is no longer
live; new workspaces fall back to the default.

- [x] Unit tests
- [ ] End to end tests

### Code changes

| Section         | LOC change  |
| --------------- | ----------- |
| Core code       | +22 / -0    |
| Tests           | +209 / -0   |
| Automated files | +9 / -0     |
| Apps            | +2784 / -44 |

### Release notes

- Add a welcome file to new workspaces: creating a workspace opens a
"Welcome to your workspace" canvas (forked from an admin-set template,
or a built-in default). Deleting it, or opening an already-empty
workspace, gives a blank file as before.

This PR updates the i18n strings.

### Change type
- [x] `other`

Co-authored-by: huppy-bot[bot] <128400622+huppy-bot[bot]@users.noreply.github.com>
Co-authored-by: Mime Čuvalo <mimecuvalo@gmail.com>

In order to stop the live-sharing scenario e2e suite
(`sharing-live.scenario.spec.ts`, added in #9187) flaking on CI, this PR
replaces its fixed-timeout cross-client sync waits with a deterministic
readiness signal and makes the workspace switcher open reliably. Closes
#9214.

The flake (e.g. "member removal revokes the active member window without
reload", but also workspace/file deletion, role changes, invite
acceptance, and `ui` scenarios) timed out in setup.
`expectWorkspaceVisible` opened the switcher dropdown once and waited
for the workspace link to render — but that list is driven reactively by
Zero sync via `getWorkspaceMemberships`, so the single wait was really
gating on variable cross-client sync latency *through* a dropdown whose
open state churns on re-render. A previous 5s→15s timeout bump didn't
fix it; 2 of 11 surveyed CI runs exhausted all retries and went red.

The fix separates the two tangled concerns, then hardens the UI step:

- `waitForWorkspaceMembershipSync` polls
`window.app.getWorkspaceMemberships()` until the workspace is
present/absent — the genuine sync gate, immune to dropdown churn —
bounded at 30s so a real sync regression still fails with a clear signal
rather than hanging (this is the deterministic readiness signal from the
issue's open question 1).
- `expectWorkspaceVisible` / `expectWorkspaceNotVisible` gate on that
data first, then assert the UI.
- `openWorkspaceSwitcher` now ensures the dropdown is actually open: it
clicks only when not already open and retries until the menu content is
mounted. A first CI run on this branch confirmed the residual race was
here — the data gate passed but a lost open-click left the link
assertion waiting on a closed dropdown. This hardens every caller
(`switchToWorkspace`, create-workspace, settings/invite flows) and the
`ui` scenario suite, which share this page object.
- `expectFileVisible` / `expectFileNotVisible` and
`expectActiveWorkspace` move off the bare 5s default onto the suite's
established 10s cross-client propagation budget.
- The suite's per-test timeout is raised to 60s so the data-layer wait
has headroom past the heavy two-context setup.
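
The data-layer gate described above is essentially a bounded poll. A minimal sketch, assuming a synchronous membership reader, is shown here; `pollUntil` and `waitForMembership` are hypothetical names, not the suite's helpers. In the actual Playwright suite the check would run against `window.app.getWorkspaceMemberships()` in the page, most likely through something like `page.waitForFunction` or `expect.poll`.

```ts
// Poll `check` until it returns true, failing with a labeled error once
// `timeoutMs` has elapsed. Illustrative helper only.
async function pollUntil(
	check: () => boolean | Promise<boolean>,
	opts: { timeoutMs: number; intervalMs: number; label: string }
): Promise<void> {
	const deadline = Date.now() + opts.timeoutMs
	while (true) {
		if (await check()) return
		if (Date.now() >= deadline) {
			throw new Error(`Timed out after ${opts.timeoutMs}ms waiting for ${opts.label}`)
		}
		await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs))
	}
}

// Wait until `workspaceId` is present (or absent) in the membership list.
// `getIds` stands in for reading the synced memberships; 30s mirrors the
// bound mentioned in the PR description.
async function waitForMembership(
	getIds: () => string[],
	workspaceId: string,
	present: boolean,
	timeoutMs: number = 30000,
	intervalMs: number = 100
): Promise<void> {
	await pollUntil(() => (getIds().indexOf(workspaceId) !== -1) === present, {
		timeoutMs,
		intervalMs,
		label: `workspace ${workspaceId} to be ${present ? 'present' : 'absent'}`,
	})
}
```

Gating on the data first means a later UI assertion only has to cover rendering, not cross-client sync latency.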

### Change type

- [x] `other`

### Test plan

This is test-infrastructure hardening; correctness is verified by the
suite running green and non-flaky in CI. The bar is a clean pass on the
**first** attempt (the dotcom e2e job retries twice), not a
retry-rescued pass.

- [x] End to end tests

### Code changes

| Section | LOC change |
| ------- | ---------- |
| Tests   | +65 / -11  |

Fixes #8821. The following-indicator border in
`packages/tldraw/src/lib/ui/TldrawUi.tsx` is moved out of
`TldrawUiInFrontOfTheCanvas()` and rendered alongside the `Toasts` and
`Dialogs` components, so it is no longer hidden beneath UI panels. This
also removes dead code relating to the `--tl-layer-following-indicator`
CSS variable.

<img width="271" height="122" alt="image"
src="https://github.com/user-attachments/assets/ae75eb0c-f70b-416e-b993-4afbd415670c"
/>


### Change type

- [x] `bugfix`
- [ ] `improvement`
- [ ] `feature`
- [ ] `api`
- [ ] `other`

### Test plan

I am unsure how this change should be tested and would appreciate advice
from others.

- [ ] Unit tests
- [ ] End to end tests

### Release notes

- The following indicator border is no longer hidden beneath UI panels.

---------

Co-authored-by: Andromeda Stroev <andromeda@MacBook-Pro-5.local>

…9231)

In order to fix #9228, this PR stops the sidebar workspace switcher from
dismissing itself when reopened during a workspace switch.

Selecting a workspace navigates immediately, but the target file's
canvas loads asynchronously and, because deep links are on for the file
route, rewrites the URL with a `?d=` param once its editor mounts. The switcher
stored its open/closed state via `useMenuIsOpen`, which scopes the menu
id to the active file editor's `contextId`. The sidebar receives that
editor through `globalEditor`/`EditorContext`, and the editor instance
is replaced on every workspace/file switch (`TlaEditor` remounts on
`fileSlug`). So when the outgoing editor was disposed as the incoming
canvas loaded, its `clearOpenMenus()` cleared the menu state for its
context — the same key the just-reopened switcher was using — and,
separately, the `globalEditor` flip changed the menu id out from under
the open dropdown. Either way the dropdown closed on its own.

The switcher is a sidebar-level control whose lifetime spans editor
remounts, so its open state should not be tied to any per-file editor.
This switches both the switcher and the layout that reads it to
`useGlobalMenuIsOpen` with a stable, editor-independent id.
Outside-click dismissal is unchanged: the dropdown's Radix `modal`
dismissable layer still closes it on any outside pointer-down, including
canvas clicks.
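
The failure mode and the fix can be illustrated with a toy registry. This is a simplified model under stated assumptions, not tldraw's implementation: `MenuRegistry`, `scopedMenuId`, and `globalMenuId` are hypothetical, and the id formats are invented for the sketch.

```ts
// A toy open-menu registry. Disposing an editor clears every menu whose id
// is scoped to that editor's context.
class MenuRegistry {
	private openIds = new Set<string>()

	setOpen(id: string, isOpen: boolean): void {
		if (isOpen) this.openIds.add(id)
		else this.openIds.delete(id)
	}

	isOpen(id: string): boolean {
		return this.openIds.has(id)
	}

	// Models what happens when the outgoing editor is disposed.
	clearContext(contextId: string): void {
		const prefix = `${contextId}:`
		this.openIds.forEach((id) => {
			if (id.indexOf(prefix) === 0) this.openIds.delete(id)
		})
	}
}

// Editor-scoped id: changes whenever the active editor changes.
const scopedMenuId = (contextId: string, menu: string) => `${contextId}:${menu}`
// Stable id: independent of any editor instance.
const globalMenuId = (menu: string) => `global:${menu}`
```

With an editor-scoped id, disposing the outgoing editor closes a switcher the user has just reopened; with a stable id, the same disposal leaves it untouched.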

This also addresses the root cause behind several retry-loop workarounds
in the dotcom e2e `Sidebar` page object (added to tolerate "a settling
re-render dismissed the menu"); those are left in place here and can be
simplified separately.

### Change type

- [x] `bugfix`

### Test plan

1. On tldraw.com, be in a workspace with at least one other workspace
available in the switcher.
2. Open the workspace switcher and select a different workspace.
3. While the new canvas is still loading, immediately reopen the
workspace switcher.
4. Confirm it stays open (previously it dismissed itself once the canvas
finished loading).

- [ ] Unit tests
- [x] End to end tests

### Release notes

- Fix the tldraw.com sidebar workspace switcher dismissing itself when
reopened right after switching workspaces.

### Code changes

| Section | LOC change |
| ------- | ---------- |
| Tests   | +35 / -0   |
| Apps    | +10 / -4   |

…8283)

For [#8203](#8203).

Bound arrow terminals break during undo/redo of shape deletions. This
draft PR documents the issue and the two approaches for fixing it.

### Problem

When a bound shape is deleted, `onBeforeIsolateFromShape` converts the
arrow terminal from a binding to an absolute page position before the
binding is removed. That is correct during a live delete.

During undo/redo, though, history replay restores a diff that already
contains the correct terminal values. The store's `beforeDelete` side
effect fires again during replay, `onBeforeIsolateFromShape` runs again,
and the arrow terminal is recomputed from intermediate replay state
instead of preserving the restored value. The result is that the arrow
ends up in the wrong place.

The core issue is that `onBeforeIsolateFromShape` cannot currently
distinguish a live edit from a history replay.

### Approaches explored

**1. Explicit provenance on isolation callbacks**

Pass replay context to `onBeforeIsolateFromShape` explicitly, e.g. a
`source` on the isolation callback options. This gives the hook the
information it actually cares about and keeps the decision at the
callback boundary.

The downside here is that it requires API and plumbing changes outside
the arrow binding util.

**2. `HistoryManager.isReplaying()` (implemented in this branch)**

Add a dedicated `isReplaying()` signal to HistoryManager, set only while
replaying diffs in `_undo()` / `redo()`, and have ArrowBindingUtil bail
out of `onBeforeIsolateFromShape` when that signal is true.

This is the smallest practical fix and avoids changing semantics
elsewhere. It is also more precise than reusing `isPaused()`, because
replay and `{ history: 'ignore' }` are different cases.

The downside is that it is more implicit than explicit callback
provenance, and ArrowBindingUtil has to reach across an API boundary to
ask history whether replay is in progress.
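
Approach 2 can be sketched in miniature. The classes and types below (`MiniHistory`, `ArrowTerminal`, `isolateTerminal`) are hypothetical stand-ins, not the real `HistoryManager` or `ArrowBindingUtil`; the sketch only shows the shape of the guard: a flag set for the duration of a replay, and a hook that leaves restored values alone while that flag is true.

```ts
// Minimal stand-in for a history manager with a replay signal.
class MiniHistory {
	private replaying = false

	isReplaying(): boolean {
		return this.replaying
	}

	// Apply a stored diff; the flag is true only while `apply` runs.
	replay(apply: () => void): void {
		this.replaying = true
		try {
			apply()
		} finally {
			this.replaying = false
		}
	}
}

// Hypothetical terminal shape: either bound to a shape or at a page point.
type ArrowTerminal =
	| { kind: 'bound'; shapeId: string }
	| { kind: 'point'; x: number; y: number }

// Stand-in for the isolation hook: convert a bound terminal to a page
// point on a live delete, but keep the restored value during replay.
function isolateTerminal(
	hist: MiniHistory,
	terminal: ArrowTerminal,
	computePagePoint: () => { x: number; y: number }
): ArrowTerminal {
	if (hist.isReplaying()) return terminal
	const point = computePagePoint()
	return { kind: 'point', x: point.x, y: point.y }
}
```

The `try`/`finally` matters: if applying a diff throws, the flag must still reset, otherwise later live deletes would skip isolation.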

### Change type

- [x] `bugfix`

### Test plan

1. Create two shapes and connect them with an arrow
2. Delete the target shape — arrow terminal should snap to an absolute
position
3. Undo — arrow should return to its original bound position
4. Redo — arrow terminal should match the position from step 2

- [x] Unit tests
- [ ] End to end tests

### Release notes

- Fix arrows jumping to incorrect positions after undoing or redoing
shape deletions involving bindings.

### API changes

- Added `HistoryManager.isReplaying()` (internal)

### Code changes

| Section         | LOC change |
| --------------- | ---------- |
| Core code       | +25 / -10  |
| Tests           | +32 / -0   |
| Automated files | +1 / -1    |

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

In order to stop the dotcom dev stack from leaving orphaned processes
and Docker containers behind when it exits, this PR adds teardown
handling to the dotcom dev scripts and the dotcom e2e suite. Previously,
ending a `yarn dev-app` session (or a CI e2e run) could leave processes
holding the fixed dev ports (3000, 8787, 4848, 9339, 7654) and Docker
containers holding 6432/6543, which then made the next `yarn dev-app`
fail on a port conflict.

Root cause: `yarn dev-app` runs under lazyrepo, which has no signal
handling of its own, so on exit each script is responsible for tearing
down what it spawned — and several did not. The zero-cache workers are
the worst case: they run in their own process groups, so a
terminal/group signal never reaches them and the zero CLI keeps holding
port 4848.

What changed:

- **`dev-app.ts`** — run vite as a managed child and kill it on
SIGINT/SIGTERM/SIGHUP/exit (was `exec`-ed with no handlers, so vite
orphaned and held port 3000).
- **`internal/scripts/workers/dev.ts`** — add
`MiniflareMonitor.dispose()` plus signal/exit handlers so wrangler and
its `workerd` child are stopped (previously only killed on a
segfault-triggered restart).
- **`sync-worker/dev.ts`** — add SIGHUP and exit handlers alongside the
existing SIGINT/SIGTERM forwarding.
- **`zero-cache/dev.ts`** — on shutdown, reap the whole descendant
process tree by PID (crossing the separate process groups the zero-cache
workers live in) and run a detached `docker compose down
--remove-orphans`.
- **dotcom e2e** — add a CI-only Playwright `globalTeardown` that runs
`yarn dev-app:clean`. Playwright reaps the Node processes it spawned,
but the Docker stack is daemon-managed and survives, so it leaked
between runs. Gated to CI so a locally reused dev server is never torn
down.
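
Reaping a descendant tree by PID, as the `zero-cache/dev.ts` item describes, comes down to walking parent links rather than relying on process groups. The sketch below covers only that walk, as a pure function over `{ pid, ppid }` rows; `ProcRow` and `collectDescendants` are hypothetical names, and how the rows are gathered (for example by parsing `ps -A -o pid=,ppid=`) is an assumption, not something this description specifies.

```ts
// One row per running process: its pid and its parent's pid.
type ProcRow = { pid: number; ppid: number }

// Return every descendant of `rootPid`, deepest first, so that children can
// be signalled before their parents. The root itself is not included.
function collectDescendants(rows: ProcRow[], rootPid: number): number[] {
	// Index children by parent pid.
	const childrenOf: Record<number, number[]> = {}
	for (const row of rows) {
		const siblings = childrenOf[row.ppid] || (childrenOf[row.ppid] = [])
		siblings.push(row.pid)
	}

	// Guard against cycles or duplicate rows in the input.
	const seen: Record<number, boolean> = {}
	seen[rootPid] = true

	const ordered: number[] = []
	const visit = (pid: number): void => {
		for (const child of childrenOf[pid] || []) {
			if (seen[child]) continue
			seen[child] = true
			visit(child) // descend first so grandchildren come before children
			ordered.push(child)
		}
	}
	visit(rootPid)
	return ordered
}
```

Each returned PID could then be passed to Node's `process.kill(pid, 'SIGTERM')` inside a `try`/`catch`, since a process may already have exited by the time it is signalled. A walk like this only sees processes that are still linked to the root, which is why reparented children can escape it.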

Known limitation: for interactive Ctrl+C the teardown is now much more
reliable but still racy, because lazyrepo's wrapper layers (yarn → tsx →
nodemon) can die and reparent their children to `launchd` before the
orchestrator walks the tree. The deterministic guarantees remain the
existing port guards (clear error on next start) and `yarn
dev-app:clean` / `:clean:all`, plus the new CI e2e teardown.

### Change type

- [x] `other`

### Test plan

1. Run `yarn dev-app` and confirm the stack starts (client on 3000,
sync-worker 8787, zero 4848, workers 9339, migrations 7654, Docker on
6432/6543).
2. Stop it with Ctrl+C and confirm the Node dev ports are released and
the next `yarn dev-app` starts without a port-in-use error.
3. Confirm the dotcom client unit tests still pass: `yarn workspace
dotcom test run scripts/dev-app.test.ts`.

- [x] Unit tests
- [ ] End to end tests

### Code changes

| Section         | LOC change |
| --------------- | ---------- |
| Tests           | +24 / -0   |
| Apps            | +102 / -11 |
| Config/tooling  | +25 / -2   |

In order to give developers an accurate error when their tldraw client
and sync server are on mismatched versions, this PR makes the server
report `SERVER_TOO_OLD` (instead of always `CLIENT_TOO_OLD`) when the
client is actually running a newer schema than the server. Closes #6169.

Previously, when the server couldn't reconcile a client's schema
(`getMigrationsSince` returned an error), it always rejected the session
with `CLIENT_TOO_OLD` — even when the server was the one running the
older SDK. The connect handshake now inspects the schemas: if the
client's `schemaVersion`, or any of its sequence versions, is ahead of
the server's, the session is rejected with `SERVER_TOO_OLD`; otherwise
it stays `CLIENT_TOO_OLD`.

The pre-existing "older client we can't migrate down to" case (a
migration isn't record-scoped or has no `down` migration) still reports
`CLIENT_TOO_OLD`.
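
The decision rule described above can be sketched as a pure function. `SchemaInfo` and `classifyMismatch` are hypothetical names and shapes, not the `@tldraw/sync` API. The sketch assumes it is called only after reconciliation has already failed, and treating a sequence the server does not know as "ahead" is an additional assumption not stated in this description.

```ts
// Hypothetical, simplified view of a serialized schema.
type SchemaInfo = { schemaVersion: number; sequences: Record<string, number> }
type MismatchReason = 'CLIENT_TOO_OLD' | 'SERVER_TOO_OLD'

// Decide which side is behind once the schemas are known to be irreconcilable.
function classifyMismatch(client: SchemaInfo, server: SchemaInfo): MismatchReason {
	if (client.schemaVersion > server.schemaVersion) return 'SERVER_TOO_OLD'

	for (const id of Object.keys(client.sequences)) {
		const clientVersion = client.sequences[id]
		const serverVersion = server.sequences[id]
		if (clientVersion === undefined) continue
		// Assumption: a sequence the server has never seen counts as the
		// client being ahead.
		if (serverVersion === undefined || clientVersion > serverVersion) {
			return 'SERVER_TOO_OLD'
		}
	}

	// Otherwise the client is the older side, including the pre-existing
	// "cannot migrate down" case.
	return 'CLIENT_TOO_OLD'
}
```

For example, a client at sequence version 5 against a server at 4 for the same sequence would be classified as `SERVER_TOO_OLD`, while the reverse stays `CLIENT_TOO_OLD`.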

### Change type

- [x] `bugfix`

### Test plan

1. Run a newer `tldraw` client (e.g. 3.12) against an older
`@tldraw/sync` server (e.g. 3.7).
2. Observe the client now receives `SERVER_TOO_OLD` instead of
`CLIENT_TOO_OLD`.

- [x] Unit tests

### Release notes

- Fix sync version-mismatch handshake reporting `CLIENT_TOO_OLD` when
the server is actually the one running an older version; it now
correctly reports `SERVER_TOO_OLD`.

### Code changes

| Section   | LOC change |
| --------- | ---------- |
| Core code | +29 / -2   |
| Tests     | +42 / -0   |

@pull pull Bot locked and limited conversation to collaborators Jun 16, 2026
@pull pull Bot added the ⤵️ pull label Jun 16, 2026
@pull pull Bot merged commit 93acb3a into code:main Jun 16, 2026
@pull pull Bot had a problem deploying to bemo-canary June 16, 2026 15:13 Failure
@pull pull Bot had a problem deploying to deploy-production June 16, 2026 15:13 Failure
@pull pull Bot had a problem deploying to vsce publish June 16, 2026 15:13 Failure
@pull pull Bot had a problem deploying to bemo-canary June 16, 2026 15:13 Failure
@pull pull Bot had a problem deploying to deploy-staging June 16, 2026 15:13 Error
@pull pull Bot had a problem deploying to deploy-staging June 16, 2026 15:13 Failure
6 participants