Add AddressSpaceProvider trait for multi-process address space management#758
Add AddressSpaceProvider trait for multi-process address space management#758wdcui wants to merge 1 commit intowdcui/pr3f-raw-messagefrom
Conversation
…ment Introduce address_space.rs with AddressSpaceProvider trait, ForkedAddressSpace enum, and AddressSpaceError. The trait defines create/destroy/fork/activate/with_address_space operations with NotSupported defaults for platforms that lack multi-process support. Add Provider supertrait bound. Implement (with defaults) for all platform crates and mock.
jaybosamiya-ms
left a comment
There was a problem hiding this comment.
I've added a few comments. There were other comments in #743 that I'd added to this bit of the code before it was split, and some might be handled, others don't seem to have been handled. I am going to ignore the comments that I'd spent time writing in #743 and have only looked at the design of the current one.
Top level comment is that there is a lot of coupling between specific shim implementations and specific platform implementations happening here. We should be able to clean this up through a nicer design, imho, such that all the bits are actually sitting primarily at the platform side, and the shim should mostly be unaware of needing to handle CoW or whether there are "shared" or "independent".
More detailed comments inline.
| /// A new VA-range partition is assigned to the child. Parent memory is | ||
| /// shared; the shim is responsible for copying pages as needed. | ||
| SharedWithParent(Id), |
There was a problem hiding this comment.
When are pages expected to be copied by the shim? Why not have the platform itself copy the pages over if so? Alternatively, even for "SharedWithParent", it should be possible to often set up some copy-on-first-access behavior by the platform, right?
| /// This is an **optional** trait — platforms that do not yet support | ||
| /// multi-process may leave all methods at the default (which returns | ||
| /// [`AddressSpaceError::NotSupported`]). |
There was a problem hiding this comment.
I don't like the design of a full trait filled with methods with NotSupported. In one of the other PRs, I commented a better design, which is to have most of the methods be required methods, but also provide a trivial impl that can be reused
| /// # Associated Type | ||
| /// | ||
| /// `AddressSpaceId` is an opaque, lightweight handle that identifies one | ||
| /// address space. It must be `Copy + Eq + Send + Sync` so it can be stored | ||
| /// inside process contexts and passed across threads. |
There was a problem hiding this comment.
Doc strings for associated types should be at the associated type, not at the top level of the trait
| /// Opaque identifier for an address space. | ||
| type AddressSpaceId: Copy + Eq + Send + Sync + core::fmt::Debug; |
There was a problem hiding this comment.
This is in conflict with #759 which needs a concrete u64 for its opaque identifier, whereas this one defines it at being truly opaque. Not sure what the intention is.
I am also not convinced that + core::fmt::Debug is good to have in the associated type, but this is not a strong belief of mine; it is more of a question about "do we need to do this in more places"?
| /// | ||
| /// After this call, `id` is invalid and must not be reused. | ||
| fn destroy_address_space(&self, id: Self::AddressSpaceId) -> Result<(), AddressSpaceError> { | ||
| let _ = id; | ||
| Err(AddressSpaceError::NotSupported) | ||
| } |
There was a problem hiding this comment.
If AddreessSpaceId is not maintained as a Copy but instead is a true opaque linear token, then we wouldn't need to do this through documentation, and can instead rely on the Rust compiler and type system to guarantee that we are using it correctly. That would be a more long-term easier to maintain abstraction.
| /// Whether the platform requires eager copy-on-write snapshots during | ||
| /// fork instead of lazy page-fault-driven CoW. | ||
| /// | ||
| /// When `true`, the shim eagerly copies all writable guest pages before | ||
| /// spawning the forked child and restores them after the child execs or | ||
| /// exits. When `false` (the default), the shim marks writable pages | ||
| /// read-only and lazily snapshots individual pages on first write fault. | ||
| /// | ||
| /// Platforms where the exception/fault handler shares the guest address | ||
| /// space must set this to `true` because a CoW fault inside the handler | ||
| /// itself would be fatal. | ||
| const EAGER_COW_ON_FORK: bool = false; |
There was a problem hiding this comment.
I'm not sure this should be a shim-level task, right?
Also, generally speaking, bool-based config flags are often a smell of bad abstraction.
In this case, as far as I can tell, the issue is wrt exception/fault handler: why can't the platform just do an eager copy of the exception/fault handler page, and then everything can just work through normal CoW faults, right? That would also lead to better performance than forcing literally everything to become eager just to handle one page? The platform already should know at least a little bit about the exception handling behavior, because of our userpointer design (which we have not finished finalizing the fully nice version, but nonetheless, should already iirc have the necessary functionality exposable directly there).
| impl AddressSpaceProvider for MockPlatform { | ||
| // All methods default to `Err(NotSupported)`, which is correct for the | ||
| // mock platform (single-process only). | ||
| type AddressSpaceId = u32; | ||
| } |
There was a problem hiding this comment.
If it is all unsupported, using core::convert::Infallible or similar is the better move, no?
| use thiserror::Error; | ||
| use zerocopy::{FromBytes, IntoBytes}; | ||
|
|
||
| pub use address_space::*; |
There was a problem hiding this comment.
Prefer not using wildcard imports
| impl<Host: HostInterface> litebox::platform::AddressSpaceProvider for LinuxKernel<Host> { | ||
| // All methods default to `Err(NotSupported)` — real implementation comes | ||
| // when LVBS multi-process (separate page tables) is added. | ||
| type AddressSpaceId = u32; | ||
| } |
There was a problem hiding this comment.
If the real implementation comes later, then those bits should still be marked as unimplemented!() or similar, not be silently marked as NotSupported.
| impl litebox::platform::AddressSpaceProvider for LinuxUserland { | ||
| type AddressSpaceId = u32; | ||
| } |
There was a problem hiding this comment.
Do we not plan to support multi process on Linux userland? Why is there a comment on LVBS but not here, for example? Similarly for other platforms.
Summary
AddressSpaceProvidertrait (new fileplatform/address_space.rs) for per-process address space management:create_address_space,switch_address_space,destroy_address_space,EAGER_COW_ON_FORKconstant.Providersupertrait bound+ AddressSpaceProvider.Stacked on #756 (RawMessageProvider) due to adjacent impl blocks in platform crates.
Split from #743.