
validate variants: parallel mutation testing on sidecars #319

Draft
schurchleycci wants to merge 10 commits into main from assess-parallel-mutation-sidecars


schurchleycci commented May 8, 2026

Experimental branch: explores sidecars as the execution layer for /chunk-testing-gaps.

What this does

Adds chunk validate variants <variants-file>: reads a JSON array of {id, description, patch} objects, spins up one sidecar per variant in parallel, applies the patch, runs the configured remote validate commands, and returns a JSON results array with kill/survive scoring.

The idea is that /chunk-testing-gaps generates the variants file and this command does the execution.
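
A rough sketch of the fan-out and scoring described above, for readers who want the shape without opening the diff. This is not the code in this branch: `Result`, `runFunc`, `score`, and the "killed"/"survived"/"error" status strings are hypothetical names, and the runner is a stub standing in for "boot a sidecar, apply the patch, run the validate commands". It assumes a mutant counts as killed when the validate commands fail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Variant mirrors the {id, description, patch} objects in the variants file.
type Variant struct {
	ID          string `json:"id"`
	Description string `json:"description"`
	Patch       string `json:"patch"`
}

// Result is a hypothetical per-variant outcome; the real output schema may differ.
type Result struct {
	ID     string `json:"id"`
	Status string `json:"status"` // "killed", "survived", or "error"
	Error  string `json:"error,omitempty"`
}

// runFunc stands in for the sidecar work. It reports whether the validate
// commands passed; a pass means the mutant went undetected.
type runFunc func(v Variant) (passed bool, err error)

// parseVariants decodes the JSON array read from the variants file.
func parseVariants(data []byte) ([]Variant, error) {
	var vs []Variant
	if err := json.Unmarshal(data, &vs); err != nil {
		return nil, fmt.Errorf("parse variants: %w", err)
	}
	return vs, nil
}

// score runs every variant with at most parallel runs in flight and
// returns results in the same order as the input.
func score(variants []Variant, parallel int, run runFunc) []Result {
	if parallel < 1 {
		parallel = 1
	}
	results := make([]Result, len(variants))
	sem := make(chan struct{}, parallel) // counting semaphore
	var wg sync.WaitGroup
	for i, v := range variants {
		wg.Add(1)
		go func(i int, v Variant) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()
			passed, err := run(v)
			r := Result{ID: v.ID}
			switch {
			case err != nil: // infrastructure failure, not a verdict on the mutant
				r.Status, r.Error = "error", err.Error()
			case passed:
				r.Status = "survived"
			default:
				r.Status = "killed"
			}
			results[i] = r // each goroutine writes only its own slot
		}(i, v)
	}
	wg.Wait()
	return results
}

func main() {
	input := []byte(`[{"id":"m1","description":"negate condition","patch":"..."},
	                  {"id":"m2","description":"drop return","patch":"..."}]`)
	vs, err := parseVariants(input)
	if err != nil {
		panic(err)
	}
	// Stub runner: pretend only m2 slips past the existing tests.
	out, _ := json.Marshal(score(vs, 3, func(v Variant) (bool, error) {
		return v.ID == "m2", nil
	}))
	fmt.Println(string(out))
}
```

Keeping infrastructure errors separate from "killed" matters here: a sidecar that fails to boot should not be counted as a test catching the mutant.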

Depends on

Rebased onto #315 — includes a fix for connection drops when multiple sidecars boot simultaneously (WebSocket 502, SSH "unexpected packet", EOF during handshake).

Validated

Ran against 3 real mutations of internal/variants/variants.go with --parallel 3. All three sidecars synced, the patches applied, and task test ran to completion. Two mutations were correctly killed by existing tests; one survived because no test covers the kill path with a real command run, which is a useful gap to know about.

🤖 Generated with Claude Code

schurchleycci and others added 10 commits May 7, 2026 15:05
Retry opening the SSH session up to 12 times (5s apart) so a freshly
created sidecar has time for its SSH service to become ready. Run
git fetch origin on the sidecar before git reset --hard so the merge
base commit is always available, even when the sidecar was booted from
an older snapshot. Add a status message for the fetch step so users
see progress rather than an unexplained pause.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace blocklist isTransientSSHError with net.Error allowlist (safer default)
- Drop git fetch before sync; use rev-parse origin/HEAD on sidecar instead of local MergeBase
- Emit status message on first SSH retry so users know why the CLI is waiting
- Add tests for isTransientSSHError

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a sidecarImage is configured in validation config but no active sidecar
exists, validate now creates one automatically rather than falling back to local
execution. Extracted resolvePerCommandSidecarID to consolidate the routing logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…context

The previous design keyed sidecar state files to CLAUDE_SESSION_ID from
the environment, but Claude Code never sets that variable, so concurrent
sessions always shared sidecar.json and the isolation never worked.

The fix introduces internal/session to own the context key, parses the
session ID from the Stop hook payload in cmd/validate (where it was
already available), and threads it through context to the sidecar state
functions. Interactive CLI commands pass context without a session ID
and continue using the shared sidecar.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
You know I love a good "context deadline exceeded" but this call is
idempotent and, in my case, had successfully stored the key such that
retry succeeded instantly.

There's likely a better approach here, but I don't think this makes the
situation worse...
Adds `chunk validate variants <variants-file>` which reads a JSON array
of {id, description, patch} objects, fans out to parallel sidecars (one
per variant), applies the patch, runs configured remote commands, and
returns a JSON array of kill/survive results.

Intended as the execution layer for /chunk-testing-gaps: the skill
generates the variants file, this command scores each mutant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When multiple sidecars boot simultaneously, ExecOverSSH can fail with
connection-level errors even after OpenSession succeeds — the SSH daemon
accepts the session but drops the channel before the first command runs.
Three observed patterns:
- io.EOF during SSH handshake (crypto/ssh)
- WebSocket 502 (sidecar proxy not yet ready)
- "unexpected packet in response to channel open" (SSH service partially booted)

Fix: introduce connError, a typed wrapper for all dial-level failures from
dialSSH and NewSession. Extend isTransientSSHError to detect connError
alongside net.Error and io.EOF. Add a retry in Sync that re-opens the
session and retries syncWorkspace when a transient connection error occurs
after the initial sync attempt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>