fix(processes): fix multiple clawx processes running concurrently #589
Merged
Co-authored-by: Haze <hazeone@users.noreply.github.com>
Address residual process-leak and port-conflict risks without changing normal gateway restart behavior.

Changes:
- Use Windows process-tree termination (taskkill /T) in known-PID path of agent-deletion gateway restart.
  - Prevents child-process residue when a stale gateway is force-restarted.
- Add GatewayManager.forceTerminateOwnedProcessForQuit() for quit-time timeout fallback.
  - before-quit now attempts an owned-process force termination if graceful stop times out.
  - Scope is intentionally narrow: only owned process is eligible.
- Harden process instance lock release semantics.
  - release() now verifies current lock owner before deleting lock file.
  - Avoids deleting a lock that was re-acquired by another process during handover races.
- Add regression tests:
  - agents route test verifies Windows known-PID restart path uses taskkill /T.
  - process-instance-lock test verifies release does not remove lock after ownership change.

Why this is minimal and safe:
- No public API changes.
- No behavioral changes to normal reload/restart orchestration.
- Focused only on leak-prone timeout and Windows kill semantics.

Validation:
- Ran vitest for:
  - tests/unit/agents-routes.test.ts
  - tests/unit/process-instance-lock.test.ts
  - tests/unit/gateway-supervisor.test.ts
  - tests/unit/main-quit-lifecycle.test.ts
  - tests/unit/signal-quit.test.ts
- Result: 5 test files passed, 14 tests passed.
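For readers who want a concrete picture of the lock-release hardening described above, here is a minimal sketch of an owner-checked release. It is not the code from this PR: the function names (`writeLock`, `readLockOwner`, `releaseLock`) and the lock format (a file holding only the owner PID as text) are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical lock format: the file holds only the owner's PID as text.
function writeLock(lockPath: string, ownerPid: number): void {
  fs.writeFileSync(lockPath, String(ownerPid), "utf8");
}

// Returns the PID recorded in the lock file, or null if the file is
// missing or its content is not a positive integer.
function readLockOwner(lockPath: string): number | null {
  try {
    const pid = Number.parseInt(fs.readFileSync(lockPath, "utf8").trim(), 10);
    return Number.isInteger(pid) && pid > 0 ? pid : null;
  } catch {
    return null;
  }
}

// Deletes the lock only if it is still owned by `ownerPid`.
// Returns true when the file was removed, false when it was left alone.
function releaseLock(lockPath: string, ownerPid: number): boolean {
  if (readLockOwner(lockPath) !== ownerPid) {
    return false;
  }
  fs.unlinkSync(lockPath);
  return true;
}

// Simulated handover: PID 1111 held the lock, then PID 2222 re-acquired it.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const demoLock = path.join(demoDir, "demo.instance.lock");
writeLock(demoLock, 1111);
writeLock(demoLock, 2222);

const staleRelease = releaseLock(demoLock, 1111); // old owner must not delete
const lockSurvived = fs.existsSync(demoLock);
const ownerRelease = releaseLock(demoLock, 2222); // current owner may delete
const lockGone = !fs.existsSync(demoLock);
```

The check and the delete in this sketch are two separate file-system calls, so a small race window remains between them. It narrows the handover race rather than eliminating it.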
Summary
This PR addresses the issue of multiple ClawX processes running concurrently and causing port conflicts, particularly after abnormal application termination. It hardens the application's process lifecycle management to ensure reliable single-instance operation and proper cleanup of child processes.
Key changes include:
- `requestSingleInstanceLock()` to prevent multiple instances from starting, even if the lock file is malformed or stale.
- `supervisor` to use process-tree termination on Windows (`taskkill /F /T`) for owned gateway processes and adds a post-cleanup port release wait.

Related Issue(s)
Addresses the root cause of multiple ClawX processes and port conflicts described in the initial task.
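To illustrate the Windows process-tree termination mentioned under the key changes, the sketch below builds the `taskkill` invocation for a known PID. The helper name `buildTreeKillCommand` and the `KillCommand` type are hypothetical and do not come from this PR. The sketch returns `null` on non-Windows platforms because the description only specifies the Windows path.

```typescript
// Hypothetical shape for a command that a caller could pass to
// child_process.spawn or child_process.execFile.
interface KillCommand {
  command: string;
  args: string[];
}

// Builds a process-tree kill command for `pid`.
// On Windows this is `taskkill /PID <pid> /T /F`:
//   /T terminates the process together with its child processes,
//   /F forces termination.
// Other platforms return null (an assumption of this sketch; the PR
// description only covers the Windows path).
function buildTreeKillCommand(
  pid: number,
  platform: string = process.platform
): KillCommand | null {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new RangeError(`invalid pid: ${pid}`);
  }
  if (platform === "win32") {
    return { command: "taskkill", args: ["/PID", String(pid), "/T", "/F"] };
  }
  return null;
}

const windowsKill = buildTreeKillCommand(4321, "win32");
const linuxKill = buildTreeKillCommand(4321, "linux");
```

Keeping the command construction in a pure function makes the Windows path easy to cover in a unit test without spawning a real process.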
Type of Change
Validation
The following checks were performed:
- `process-instance-lock`, `main-quit-lifecycle`, and `gateway-supervisor` were run and passed.
- `pnpm run typecheck` passed.
- `127.0.0.1:18789` after startup.
- `clawx.instance.lock` file.
- `SIGINT`/`SIGTERM` handling to ensure proper lock release and gateway shutdown.

Checklist