chore(mcp): capture detached dashboard daemon stdio for diagnostics#40518
Closed
Skn0tt wants to merge 13 commits intomicrosoft:mainfrom
Closed
chore(mcp): capture detached dashboard daemon stdio for diagnostics#40518Skn0tt wants to merge 13 commits intomicrosoft:mainfrom
Skn0tt wants to merge 13 commits intomicrosoft:mainfrom
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
dfa91ad to
ad8d024
Compare
This comment has been minimized.
This comment has been minimized.
Investigating recurring `Cannot read properties of undefined (reading 'endpoint')` failures of `tests/mcp/dashboard.spec.ts` on windows-latest-firefox runs. The dashboard daemon is currently spawned detached with `stdio: 'ignore'`, so when its chromium launch fails or hangs we have zero visibility — only the downstream `connectToDashboard` timeout shows up. This change adds an opt-in log file: - `cli show` (and `--kill`/`--annotate`) now redirect detached daemon stdout+stderr to `PWTEST_DASHBOARD_DAEMON_LOG` when set; otherwise behavior is unchanged. - The MCP `cli` test fixture sets that env var to a per-test file under the test outputDir, bumps `DEBUG=pw:browser*` so chromium launch errors surface, and on test failure attaches the log to the test result so it appears in CI artifacts. Diagnostic only; no production behavior change.
Surfaces .err logs from cli-session browser daemons and the http MCP server's pw:browser* output on test failure so we can diagnose the windows-firefox cli-session/http timeouts.
…t race browser_navigate resolves on the page's load event, which can fire before fire-and-forget fetches in an inline script complete. Wrap the assertions in expect(...).toPass() so the snapshot is retried until all expected requests are visible. Observed flake on windows-msedge.
Daemon .err files live at daemon/${workspaceDirHash}/${session}.err, but the
flat readdir loop only saw the top-level entries, so they were never attached
on failure.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
867a789 to
388d119
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Annotate path was spawning the dashboard daemon with stdio:'ignore', bypassing the existing log-capture. Use detachedStdio() there too, and add lifecycle logs around openDashboardApp / acquireSingleton / innerOpenDashboardApp / bind() so failures on Windows reveal whether the daemon bailed out via the singleton "already running" path.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
The previous diagnostic relied on a daemon log file existing on disk, but on firefox-windows the file never appeared, leaving us blind to whether the cli even reached detachedStdio. Attach a debug summary that includes path/stat of the log file plus stdout/stderr from every cli invocation, and write a stderr marker before opening the log file so we can tell whether detachedStdio ran and whether openSync threw.
The dashboard cli's --annotate path runs openDashboardApp in-process and uses stdout for annotation output (parsed as JSON in some tests). The previous diagnostic console.log calls in dashboardApp.ts polluted that stdout, causing many of the dashboard.spec.ts CI failures (e.g. "SyntaxError: Unexpected token 'd', \"[dashboardA\"... is not valid JSON" and "expect(received).toMatch(expected)" with received string "[dashboardApp pid=... openDashboardApp start..."). Switch all diagnostic logs to console.error. Original console.log calls for annotate output and the "Listening on <url>" port-mode message are preserved.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
The dashboard daemon uses PLAYWRIGHT_SOCKETS_DIR to acquire a singleton "already running" lock. The MCP test fixture was setting this dir per worker, while every other piece of state (registry, daemon dir, settings file) was per-test. As a result, when the daemon spawned by test A had not fully exited by the time test B started, test B's daemon would fail acquireSingleton with "already running" and exit silently without binding its bindTitle - causing tests to time out waiting for the bindTitle to appear in the registry. Diagnostic logging on Windows confirmed: [dashboardApp pid=7828] openDashboardApp start, bindTitle=...fe6c87... [dashboardApp pid=7828] acquireSingleton failed: Error: already running Suffixing the sockets dir with testId gives each test its own singleton namespace, so previous tests' daemons cannot block new ones.
This comment has been minimized.
This comment has been minimized.
When the kill RPC is sent, the daemon ACKs by ending its socket and then runs gracefullyProcessExitDoNotHang(0), which can take up to 30s (it has to close any open browsers first). Meanwhile the named pipe or unix socket stays bound to the daemon process - so a caller that relied on cli show --kill returning to mean the singleton is gone could race against the next acquireSingleton and get Error: already running. Poll the socket after the kill ACK until net.connect errors (nothing is listening), with a 30s ceiling. On timeout, throw rather than returning silently - a fast, clear failure surfaces in the right test and prevents the previous-test-corrupting-the-next-test pattern. Also revert the per-test PLAYWRIGHT_SOCKETS_DIR workaround so we can verify whether this fix alone is sufficient on Windows CI.
…quisition
The daemon kill handler did not call server.close(), so the server kept
listening on the socket throughout gracefullyProcessExitDoNotHang (which
can take up to 30s while browsers shut down). A racing acquireSingleton()
in another process would connect successfully, treat the dying daemon as
alive, and bail out with already running.
Fix:
- Call server?.close() in the kill handler so the binding stops
accepting new connections immediately, even though the existing kill
RPC connection drains normally and graceful exit continues.
- Rewrite acquireSingleton() as a bounded retry loop. Treat any listen
error (not just EADDRINUSE) as in use and probe the holder via
connect(); only connect-success counts as alive. This is robust to
libuv mapping Windows pipe errors to EACCES/EBUSY/ENOENT instead of
EADDRINUSE.
- Drop runKillClient socket-release polling: it could not work because
it observed connect-success while graceful shutdown was in flight.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Contributor
Test results for "MCP"1 failed 6875 passed, 1015 skipped Merge workflow run. |
Contributor
Test results for "tests 1"3 flaky41646 passed, 784 skipped Merge workflow run. |
Member
Author
|
fix is in #40626 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Diagnostic-only change to surface why
tests/mcp/dashboard.spec.tsfails withCannot read properties of undefined (reading 'endpoint')on windows-latest-firefox runs.The dashboard daemon is spawned detached with
stdio: 'ignore', so chromium launch failures inside it are completely invisible — we only see the downstreamconnectToDashboardtimeout.This PR opts the daemon into writing stdout+stderr to a per-test log file (via
PWTEST_DASHBOARD_DAEMON_LOG), bumpsDEBUG=pw:browser*for the cli child env, and attaches the log on test failure.Not for merge — open as draft so we can read the daemon log from the failing windows-firefox MCP runs.