fix(core): eliminate TOCTOU in MachineManager::create by holding write lock#151

Open
AprilNEA wants to merge 1 commit into master from fix/machine-create-toctou


@AprilNEA
Member

Summary

  • Take the write lock at the start of create() instead of checking under a read lock and inserting under a separate write lock
  • Existence check and insert are now atomic with respect to other writers
  • Prevents concurrent creates from duplicating VMs and orphaning filesystem directories

Test plan

  • Running arcbox machine create concurrently from the CLI and the desktop does not create a duplicate machine

Closes ABX-230

Copilot AI review requested due to automatic review settings March 31, 2026 12:30
@linear
linear bot commented Mar 31, 2026

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a race in MachineManager::create() where concurrent creates using the same machine name could both pass an existence check and then overwrite each other on insert, potentially leaving orphaned VM resources/directories.

Changes:

  • Acquire the machines write lock at the start of create() to make the existence check + insert atomic.
  • Reuse the same write guard for the final insert() to avoid the read-then-write TOCTOU window.
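The difference between the two shapes can be sketched as follows. This is a minimal illustration, not the project's code: Registry, CreateError, and the u32 machine id are hypothetical stand-ins for MachineManager, CoreError, and the real machine entry, which are not shown on this page.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-in for CoreError.
#[derive(Debug, PartialEq)]
pub enum CreateError {
    AlreadyExists(String),
    LockPoisoned,
}

// Hypothetical stand-in for MachineManager; the u32 plays the role of a machine entry.
#[derive(Default)]
pub struct Registry {
    machines: RwLock<HashMap<String, u32>>,
}

impl Registry {
    /// Racy shape: the check runs under a read lock, the insert under a
    /// separate write lock. Two callers can both pass the check.
    pub fn create_racy(&self, name: &str, id: u32) -> Result<(), CreateError> {
        {
            let machines = self.machines.read().map_err(|_| CreateError::LockPoisoned)?;
            if machines.contains_key(name) {
                return Err(CreateError::AlreadyExists(name.to_string()));
            }
        } // Read guard dropped here; another writer may insert `name` now.
        let mut machines = self.machines.write().map_err(|_| CreateError::LockPoisoned)?;
        machines.insert(name.to_string(), id); // Silently replaces an existing entry.
        Ok(())
    }

    /// Atomic shape: one write guard covers both the check and the insert.
    pub fn create_atomic(&self, name: &str, id: u32) -> Result<(), CreateError> {
        let mut machines = self.machines.write().map_err(|_| CreateError::LockPoisoned)?;
        if machines.contains_key(name) {
            return Err(CreateError::AlreadyExists(name.to_string()));
        }
        machines.insert(name.to_string(), id);
        Ok(())
    }
}
```

With create_atomic, a second call for the same name returns AlreadyExists regardless of how the two calls interleave, because no other writer can run between the check and the insert.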

Comment on lines +317 to +321
// Hold the write lock for the entire create operation to prevent TOCTOU
// races: without this, two concurrent creates with the same name could
// both pass the existence check before either inserts.
let mut machines = self.machines.write().map_err(|_| CoreError::LockPoisoned)?;

Copilot AI Mar 31, 2026

create() now holds a std::sync::RwLock write guard while doing filesystem I/O (create_dir_all), VM creation, and persistence writes. That can block all readers/writers of machines (e.g., list/get/start) for the entire duration of a potentially slow operation, impacting daemon responsiveness. Consider reserving the name under the lock (e.g., track an in-progress set or insert a temporary "creating" entry) and then releasing the lock while performing I/O, or using an atomic filesystem reservation (e.g., create_dir/lockfile with create_new) to avoid the TOCTOU without a long-lived global write lock.
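One way the "reserve the name under the lock, then release it during I/O" suggestion might look is sketched below. Everything here is an assumption for illustration: Registry, Slot, ReserveError, and create_with are hypothetical names, and the slow_work closure stands in for the directory creation, VM creation, and persistence steps mentioned above.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical error type; the real CoreError is not shown on this page.
#[derive(Debug, PartialEq)]
pub enum ReserveError {
    AlreadyExists(String),
    LockPoisoned,
    Backend(String),
}

// A map entry is either a placeholder for an in-progress create or a finished machine.
#[derive(Debug, Clone, PartialEq)]
pub enum Slot {
    Creating,
    Ready(u32),
}

#[derive(Default)]
pub struct Registry {
    machines: RwLock<HashMap<String, Slot>>,
}

impl Registry {
    /// `slow_work` stands in for filesystem and VM setup (an assumption).
    pub fn create_with<F>(&self, name: &str, slow_work: F) -> Result<u32, ReserveError>
    where
        F: FnOnce() -> Result<u32, String>,
    {
        // Step 1: check and reserve atomically under a short-lived write lock.
        {
            let mut machines = self.machines.write().map_err(|_| ReserveError::LockPoisoned)?;
            if machines.contains_key(name) {
                return Err(ReserveError::AlreadyExists(name.to_string()));
            }
            machines.insert(name.to_string(), Slot::Creating);
        }

        // Step 2: run the slow part with no lock held, so readers are not blocked.
        let outcome = slow_work();

        // Step 3: re-take the lock briefly to commit or roll back the reservation.
        let mut machines = self.machines.write().map_err(|_| ReserveError::LockPoisoned)?;
        match outcome {
            Ok(id) => {
                machines.insert(name.to_string(), Slot::Ready(id));
                Ok(id)
            }
            Err(msg) => {
                machines.remove(name);
                Err(ReserveError::Backend(msg))
            }
        }
    }

    /// Returns the current slot for `name`, if any.
    pub fn state(&self, name: &str) -> Option<Slot> {
        let machines = self.machines.read().ok()?;
        machines.get(name).cloned()
    }
}
```

The trade-off is that readers such as list or get must now decide how to treat a Creating entry, and a panic inside slow_work would leave the placeholder behind unless a drop guard cleans it up. Neither concern arises with the single long-lived write lock.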

Comment on lines +317 to 323
// Hold the write lock for the entire create operation to prevent TOCTOU
// races: without this, two concurrent creates with the same name could
// both pass the existence check before either inserts.
let mut machines = self.machines.write().map_err(|_| CoreError::LockPoisoned)?;

if machines.contains_key(&config.name) {
    return Err(CoreError::already_exists(config.name));

Copilot AI Mar 31, 2026

This change is specifically about preventing concurrent creates from duplicating/overwriting machines, but there’s no test asserting the new behavior. Since this file already has async tests, please add a concurrency test that runs two create() calls with the same name in parallel and asserts exactly one succeeds (the other returns already-exists) and that only one machine is registered/persisted.
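A rough shape for such a test is sketched below, using plain threads and a Barrier rather than the file's async test setup, which is not shown here. Registry, create, count, and race_creates are hypothetical stand-ins; a real test would call MachineManager::create and also inspect the persisted state.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

// Hypothetical stand-in for MachineManager with the write-lock-first create().
#[derive(Default)]
pub struct Registry {
    machines: RwLock<HashMap<String, usize>>,
}

impl Registry {
    pub fn create(&self, name: &str, id: usize) -> Result<(), String> {
        // Single write guard covers both the existence check and the insert.
        let mut machines = self.machines.write().map_err(|_| "lock poisoned".to_string())?;
        if machines.contains_key(name) {
            return Err(format!("{name} already exists"));
        }
        machines.insert(name.to_string(), id);
        Ok(())
    }

    /// Number of registered machines (0 if the lock is poisoned).
    pub fn count(&self) -> usize {
        self.machines.read().map(|m| m.len()).unwrap_or(0)
    }
}

/// Starts `racers` threads that all try to create the same name at once.
/// Returns (number of successful creates, number of registered machines).
pub fn race_creates(racers: usize) -> (usize, usize) {
    let registry = Arc::new(Registry::default());
    // The barrier releases every thread together to maximise contention.
    let barrier = Arc::new(Barrier::new(racers));

    let handles: Vec<_> = (0..racers)
        .map(|id| {
            let registry = Arc::clone(&registry);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                barrier.wait();
                registry.create("dev", id).is_ok()
            })
        })
        .collect();

    let successes = handles
        .into_iter()
        .map(|h| h.join().expect("racer thread panicked"))
        .filter(|ok| *ok)
        .count();

    (successes, registry.count())
}
```

The assertion of interest is that race_creates(2) yields exactly one success and one registered entry. A passing run does not prove the absence of a race, so repeating the check in a loop or with more racers can make a regression more likely to surface.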


2 participants