validate variants: parallel mutation testing on sidecars #319
Draft
schurchleycci wants to merge 10 commits into
Retry opening the SSH session up to 12 times (5s apart) so a freshly created sidecar has time for its SSH service to become ready. Run git fetch origin on the sidecar before git reset --hard so the merge base commit is always available, even when the sidecar was booted from an older snapshot. Add a status message for the fetch step so users see progress rather than an unexplained pause. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace blocklist isTransientSSHError with net.Error allowlist (safer default)
- Drop git fetch before sync; use rev-parse origin/HEAD on sidecar instead of local MergeBase
- Emit status message on first SSH retry so users know why the CLI is waiting
- Add tests for isTransientSSHError
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a sidecarImage is configured in validation config but no active sidecar exists, validate now creates one automatically rather than falling back to local execution. Extracted resolvePerCommandSidecarID to consolidate the routing logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…context The previous design keyed sidecar state files to CLAUDE_SESSION_ID from the environment, but Claude Code never sets that variable, so concurrent sessions always shared sidecar.json and the isolation never worked. The fix introduces internal/session to own the context key, parses the session ID from the Stop hook payload in cmd/validate (where it was already available), and threads it through context to the sidecar state functions. Interactive CLI commands pass context without a session ID and continue using the shared sidecar.json. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
You know I love a good "context deadline exceeded" but this call is idempotent and, in my case, had successfully stored the key such that retry succeeded instantly. There's likely a better approach here, but I don't think this makes the situation worse...
Adds `chunk validate variants <variants-file>` which reads a JSON array
of {id, description, patch} objects, fans out to parallel sidecars (one
per variant), applies the patch, runs configured remote commands, and
returns a JSON array of kill/survive results.
Intended as the execution layer for /chunk-testing-gaps: the skill
generates the variants file, this command scores each mutant.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
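As a rough illustration of the fan-out described in this commit, the sketch below parses a variants array and scores each entry with bounded parallelism. The `{id, description, patch}` input shape comes from the commit message; the `Result` fields, `scoreVariants`, and the `run` callback (standing in for "apply the patch on a sidecar and run the remote commands") are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Variant mirrors the {id, description, patch} objects in the variants file.
type Variant struct {
	ID          string `json:"id"`
	Description string `json:"description"`
	Patch       string `json:"patch"`
}

// Result is an assumed output shape; the real result fields may differ.
type Result struct {
	ID     string `json:"id"`
	Killed bool   `json:"killed"`
}

// scoreVariants calls run once per variant with at most parallel calls in
// flight. run returns true when the tests failed, i.e. the mutant was killed.
func scoreVariants(variants []Variant, parallel int, run func(Variant) bool) []Result {
	if parallel < 1 {
		parallel = 1
	}
	results := make([]Result, len(variants))
	sem := make(chan struct{}, parallel) // counting semaphore
	var wg sync.WaitGroup
	for i, v := range variants {
		wg.Add(1)
		sem <- struct{}{} // blocks while parallel workers are already running
		go func(i int, v Variant) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own slot, so no mutex is needed.
			results[i] = Result{ID: v.ID, Killed: run(v)}
		}(i, v)
	}
	wg.Wait()
	return results
}

func main() {
	input := []byte(`[
		{"id": "m1", "description": "negate condition", "patch": "diff text"},
		{"id": "m2", "description": "drop error check", "patch": "diff text"}
	]`)
	var variants []Variant
	if err := json.Unmarshal(input, &variants); err != nil {
		panic(err)
	}
	// Stand-in runner: pretend the existing tests only catch m1.
	results := scoreVariants(variants, 3, func(v Variant) bool { return v.ID == "m1" })
	out, err := json.Marshal(results)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Results are written by index, so the output order matches the input order regardless of which sidecar finishes first.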
When multiple sidecars boot simultaneously, ExecOverSSH can fail with connection-level errors even after OpenSession succeeds: the SSH daemon accepts the session but drops the channel before the first command runs. Three observed patterns:
- io.EOF during SSH handshake (crypto/ssh)
- WebSocket 502 (sidecar proxy not yet ready)
- "unexpected packet in response to channel open" (SSH service partially booted)
Fix: introduce connError, a typed wrapper for all dial-level failures from dialSSH and NewSession. Extend isTransientSSHError to detect connError alongside net.Error and io.EOF. Add a retry in Sync that re-opens the session and retries syncWorkspace when a transient connection error occurs after the initial sync attempt.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Experimental branch: exploring using sidecars as the execution layer for /chunk-testing-gaps.
What this does
Adds `chunk validate variants <variants-file>`: reads a JSON array of {id, description, patch} objects, spins up one sidecar per variant in parallel, applies the patch, runs the configured remote validate commands, and returns a JSON results array with kill/survive scoring.
The idea is that /chunk-testing-gaps generates the variants file and this command does the execution.
Depends on
Rebased onto #315; includes a fix for connection drops when multiple sidecars boot simultaneously (WebSocket 502, SSH "unexpected packet", EOF during handshake).
Validated
Ran against 3 real mutations of internal/variants/variants.go with --parallel 3. All three sidecars synced, patches applied, and task test ran to completion. Two mutations were correctly killed by existing tests; one survived (no test covers the kill path with a real command run, which is a useful gap to know about).
🤖 Generated with Claude Code