Unify page table manipulation code between the guest and the host #1093
base: main
Conversation
Force-pushed from c91747d to 0ff5937
Currently, the guest and the host both have code that manipulates architecture-specific page table structures: the guest has a general map operation, and the host has a much more specific routine that builds an identity map. As we move to more complex virtual memory configurations in the guest, the host will need the ability to build more complex mappings in the guest, so this commit removes the simple implementation in the host, and replaces it with calls to the implementation originally written for the guest (now moved to `hyperlight_common` and factored into an architecture-independent interface and architecture-specific code parts). Signed-off-by: Simon Davies <[email protected]>
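As a rough illustration of the split the commit message describes (an architecture-independent interface over implementation-specific table storage), here is a hypothetical sketch. The names `TableOps` and `Mapping` come from the PR summary, but the method set and the toy buffer-backed host implementation below are assumptions for illustration, not the actual `hyperlight_common` API.

```rust
type PhysAddr = u64;

/// Hypothetical architecture-independent interface: the mapping code
/// only needs a handle type for "where a table lives" plus entry
/// read/write, so the same algorithm can drive either a host-side
/// buffer or live guest page tables.
trait TableOps {
    /// Implementation-specific handle for a page table's location.
    type TableAddr: Copy;
    /// Handle for the root (top-level) table.
    fn root_table(&self) -> Self::TableAddr;
    /// Read a 64-bit entry at a byte offset into a table.
    fn read_entry(&self, table: Self::TableAddr, byte_off: u64) -> u64;
    /// Write a 64-bit entry at a byte offset into a table.
    fn write_entry(&mut self, table: Self::TableAddr, byte_off: u64, e: u64);
}

/// A mapping request, as sketched from the PR description.
#[allow(dead_code)]
struct Mapping {
    virt_base: u64,
    phys_base: PhysAddr,
    len: u64,
}

/// Toy host-side implementation: tables are built in a growable buffer
/// and addressed by index, roughly the way a host constructing guest
/// page tables in shared memory might work.
struct BufferTables {
    tables: Vec<[u64; 512]>,
}

impl TableOps for BufferTables {
    type TableAddr = usize;
    fn root_table(&self) -> usize {
        0
    }
    fn read_entry(&self, table: usize, byte_off: u64) -> u64 {
        self.tables[table][(byte_off / 8) as usize]
    }
    fn write_entry(&mut self, table: usize, byte_off: u64, e: u64) {
        self.tables[table][(byte_off / 8) as usize] = e;
    }
}
```

A guest-side implementation would instead treat `TableAddr` as a physical address and dereference it through the live page tables; the mapping algorithm itself stays the same.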
Force-pushed from ef2c3ea to 09cf019
```rust
// next stage parameters
let mut next_vmin = self.request.vmin + (self.n << LOW_BIT);
let lower_bits_mask = (1 << LOW_BIT) - 1;
if self.n > 0 {
    next_vmin &= !lower_bits_mask;
}
let entry_ptr = Op::entry_addr(
    self.request.table_base,
    bits::<HIGH_BIT, LOW_BIT>(next_vmin) << 3,
);
let len_from_here = self.request.len - (next_vmin - self.request.vmin);
let max_len = (1 << LOW_BIT) - (next_vmin & lower_bits_mask);
let next_len = core::cmp::min(len_from_here, max_len);
```
I'm not too sure what's going on here.
This is somewhat different from the previous implementation, but I'm not too sure what was going on there either, so... 🤷
Maybe some comments would help
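For what it's worth, the stage arithmetic can be re-derived as a standalone function (assuming `LOW_BIT = 12`, the 4 KiB page level): the first chunk (`n == 0`) runs from the possibly-unaligned start up to the next boundary, and every later chunk is aligned down to a boundary and covers at most one full entry's span. This is a re-derivation for illustration, not code from the PR.

```rust
const LOW_BIT: u64 = 12; // assumed: the 4 KiB page-table level

/// Compute the (next_vmin, next_len) pair for the n-th chunk of a
/// request starting at `vmin` with length `len`, mirroring the
/// arithmetic in the quoted block.
fn next_stage(vmin: u64, len: u64, n: u64) -> (u64, u64) {
    let mut next_vmin = vmin + (n << LOW_BIT);
    let lower_bits_mask = (1u64 << LOW_BIT) - 1;
    if n > 0 {
        // Later chunks start exactly on a boundary.
        next_vmin &= !lower_bits_mask;
    }
    let len_from_here = len - (next_vmin - vmin);
    // Distance from next_vmin to the next boundary (a full span when
    // next_vmin is aligned).
    let max_len = (1u64 << LOW_BIT) - (next_vmin & lower_bits_mask);
    let next_len = core::cmp::min(len_from_here, max_len);
    (next_vmin, next_len)
}
```

For a request at 0x1800 of length 0x3000, this yields a short first chunk up to 0x2000, then full 0x1000-byte chunks, which is the behavior the code above appears to implement.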
```rust
let present = pte & 0x1;
if present != 0 {
```
```rust
// suggested:
let present = pte & 0x1 != 0;
if present {
```
or even better
```rust
// suggested:
let present = bits::<0, 0>(pte) != 0;
if present {
```
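A hedged sketch of the kind of `bits` helper the suggestion assumes (the actual signature in the PR may differ): extract bits `HIGH..=LOW` of a value, shifted down to bit 0.

```rust
/// Extract the bit field v[HIGH..=LOW], right-aligned.
/// Illustrative helper; the real `bits` in hyperlight_common may differ.
fn bits<const HIGH: u32, const LOW: u32>(v: u64) -> u64 {
    let width = HIGH - LOW + 1;
    // Guard against shifting by 64, which would overflow.
    let mask = if width >= 64 { u64::MAX } else { (1u64 << width) - 1 };
    (v >> LOW) & mask
}
```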
```rust
#[allow(clippy::identity_op)]
#[allow(clippy::precedence)]
let pte = Op::to_phys(page_addr)
```
Why do we do `Op::to_phys(page_addr)`? The previous code used `page_addr` directly.
```rust
// architecture-independent re-export in vm.rs
#[allow(clippy::missing_safety_doc)]
pub unsafe fn map<Op: TableOps>(op: &Op, mapping: Mapping) {
    modify_ptes::<47, 39, Op>(MapRequest {
```
I know this is code that was just moved around, but it would be nice if there were some comments explaining what's going on here.
```rust
let present = pte & 0x1;
if present == 0 {
```
```rust
// suggested:
let present = bits::<0, 0>(pte) != 0;
if !present {
```
```rust
    .collect::<alloc::vec::Vec<u64>>();
if addrs.len() != 1 {
    panic!("impossible: 1 page map request resolved to multiple PTEs");
vm::map::<GuestMappingOperations>(
```
```rust
// suggested:
vm::map(
```
```rust
// here, and the general conditions are documented in the
// architecture-independent re-export in vm.rs
#[allow(clippy::missing_safety_doc)]
pub unsafe fn vtop<Op: TableOps>(op: &Op, address: u64) -> Option<u64> {
```
I kind of liked the previous name `dbg_print_address_pte` better; what is `vtop`?
```rust
/// Paging Flags
///
/// See the following links explaining paging, also see paging-development-notes.md in docs:
///
/// * Very basic description: https://stackoverflow.com/a/26945892
/// * More in-depth descriptions: https://wiki.osdev.org/Paging
pub(crate) const PAGE_PRESENT: u64 = 1; // Page is Present
pub(crate) const PAGE_RW: u64 = 1 << 1; // Page is Read/Write (if not set page is read only so long as the WP bit in CR0 is set to 1 - which it is in Hyperlight)
pub(crate) const PAGE_USER: u64 = 1 << 2; // User/Supervisor (if this bit is set then the page is accessible by user mode code)
pub(crate) const PAGE_NX: u64 = 1 << 63; // Execute Disable (if this bit is set then data in the page cannot be executed)
```
I find this clearer than the new approach, can we bring back these constants in the hl-common crate?
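A sketch of what bringing the constants back might look like, together with how a PTE could be composed from them. The constant values match the x86-64 paging format quoted above; `make_pte` is a hypothetical helper for illustration, not code from the PR, and the physical-address mask assumes 4-level paging.

```rust
pub const PAGE_PRESENT: u64 = 1;   // bit 0: present
pub const PAGE_RW: u64 = 1 << 1;   // bit 1: writable
pub const PAGE_USER: u64 = 1 << 2; // bit 2: user-accessible
pub const PAGE_NX: u64 = 1 << 63;  // bit 63: execute-disable

/// Hypothetical helper: compose a 4 KiB PTE from a physical frame
/// address and permission flags, named constants making each bit
/// self-documenting.
fn make_pte(phys: u64, writable: bool, user: bool, executable: bool) -> u64 {
    let mut pte = phys | PAGE_PRESENT;
    if writable {
        pte |= PAGE_RW;
    }
    if user {
        pte |= PAGE_USER;
    }
    if !executable {
        pte |= PAGE_NX;
    }
    pte
}
```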
```rust
// TODO: This over-counts on small sandboxes (because not all 512
// PTs may be required), under-counts on sandboxes with more than
// 1GiB memory, and would get unreasonably complicated if we
// needed to support hugepages.
```
Is the old comment no longer accurate? If it still is, can we keep it?
```rust
    (page_addr, 0)
}
fn entry_addr(addr: (usize, usize), offset: u64) -> (usize, usize) {
    (addr.0, offset as usize >> 3)
```
I'm not sure I understand what's going on here. I thought it would be something like:

```rust
let resolved_addr = PAGE_TABLE_ENTRIES_PER_TABLE * addr.0 + addr.1 + offset;
(resolved_addr / PAGE_TABLE_ENTRIES_PER_TABLE, resolved_addr % PAGE_TABLE_ENTRIES_PER_TABLE)
```

or in other words

```rust
let phys = Self::to_phys(addr) + offset;
Self::from_phys(phys)
```

but that's not the case, so I'm definitely missing something.
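The expectation described in the comment can be written out as a runnable sketch, assuming 512 eight-byte entries per 4096-byte table (`ENTRIES_PER_TABLE`, `ENTRY_SIZE`, and `entry_addr`'s byte-offset parameter are illustrative assumptions here):

```rust
const ENTRIES_PER_TABLE: usize = 512;
const ENTRY_SIZE: usize = 8;
const TABLE_SIZE: usize = ENTRIES_PER_TABLE * ENTRY_SIZE; // 4096

/// (table_index, entry_index) -> flat byte address
fn to_phys(addr: (usize, usize)) -> usize {
    addr.0 * TABLE_SIZE + addr.1 * ENTRY_SIZE
}

/// flat byte address -> (table_index, entry_index)
fn from_phys(phys: usize) -> (usize, usize) {
    (phys / TABLE_SIZE, (phys % TABLE_SIZE) / ENTRY_SIZE)
}

/// The "to_phys, add, from_phys" resolution the comment expects,
/// with the offset given in bytes.
fn entry_addr(addr: (usize, usize), byte_offset: usize) -> (usize, usize) {
    from_phys(to_phys(addr) + byte_offset)
}
```

Under this reading, an offset large enough to cross a table boundary carries into the table index, which the `(addr.0, offset as usize >> 3)` implementation above never does.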
Pull request overview
This PR unifies page table manipulation code between the guest and host by extracting common functionality into hyperlight_common with an architecture-independent TableOps trait and x86-64-specific implementation. The host previously had a simple identity-mapping routine while the guest had a more general mapping function. Now both use the same underlying code, with different trait implementations to handle their specific contexts (the host builds tables in a buffer, the guest modifies live page tables).
Key Changes
- Introduced a new `vm` module in `hyperlight_common` with a `TableOps` trait that abstracts page table operations
- Replaced the host's hardcoded page table initialization with calls to the unified mapping code
- Simplified the guest's paging code by refactoring it to use the shared implementation
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/hyperlight_common/src/vm.rs | New architecture-independent interface defining the TableOps trait and Mapping structures |
| src/hyperlight_common/src/arch/amd64/vm.rs | New x86-64-specific page table manipulation implementation |
| src/hyperlight_common/src/lib.rs | Adds vm module export under init-paging feature |
| src/hyperlight_common/Cargo.toml | Adds init-paging feature flag |
| src/hyperlight_host/src/mem/mgr.rs | Replaces hardcoded page table setup with GuestPageTableBuffer implementing TableOps; removes old get_page_flags helper |
| src/hyperlight_host/src/sandbox/uninitialized_evolve.rs | Removes mem_size parameter from set_up_shared_memory call |
| src/hyperlight_host/src/mem/memory_region.rs | Removes translate_flags method and page flag imports |
| src/hyperlight_host/src/mem/layout.rs | Removes obsolete PDPT, PD, and PT offset constants |
| src/hyperlight_host/Cargo.toml | Adds init-paging feature to hyperlight-common dependency |
| src/hyperlight_guest_bin/src/paging.rs | Refactors map_region to use common vm::map; removes duplicate helper structures and functions |
| src/hyperlight_guest_bin/Cargo.toml | Adds init-paging feature to hyperlight-common dependency |
```rust
/// - Memory allocation fails
unsafe fn alloc_table(&self) -> Self::TableAddr;

/// Offset the table address by the u64 entry offset
```
Copilot AI · Dec 10, 2025
The documentation for entry_addr is unclear about whether the entry_offset parameter is a byte offset or an entry index. Looking at the implementation in arch/amd64/vm.rs line 62, the offset is passed as a byte offset (multiplied by 8), but the trait documentation doesn't specify this. The documentation should clarify that entry_offset is expected to be a byte offset within the page table, not an entry index.
```rust
// suggested:
/// Offset the table address by the given offset in bytes.
///
/// # Parameters
/// - `addr`: The base address of the table.
/// - `entry_offset`: The offset in **bytes** within the page table. This is
///   not an entry index; callers must multiply the entry index by the size
///   of a page table entry (typically 8 bytes) to obtain the correct byte offset.
///
/// # Returns
/// The address of the entry at the given byte offset from the base address.
```
```rust
unsafe fn map_page<Op: TableOps>(op: &Op, mapping: &Mapping, r: MapResponse<Op::TableAddr>) {
    let pte = match &mapping.kind {
        MappingKind::BasicMapping(bm) =>
            // TODO: Support not readable
```
Copilot AI · Dec 10, 2025
The TODO comment should be more specific about the architectural limitation. On x86-64, pages cannot be made write-only or execute-only without being readable (there's no separate "readable" bit in the page table entry). The comment should clarify whether this limitation is acceptable or if there are plans to work around it.
```rust
// suggested:
// NOTE: On x86-64, there is no separate "readable" bit in the page table entry.
// This means that pages cannot be made write-only or execute-only without also being readable.
// All pages that are mapped as writable or executable are also implicitly readable.
// If support for "not readable" mappings is required in the future, it would need to be
// implemented using additional mechanisms (e.g., page-fault handling or memory protection keys),
// but for now, this architectural limitation is accepted.
```
```rust
{
    (mapping.phys_base + (r.vmin - mapping.virt_base)) |
    (!bm.executable as u64) << 63 | // NX - no execute unless allowed
    1 << 7 | // 1 - RES1 according to manual
```
Copilot AI · Dec 10, 2025
Setting bit 7 (PAT bit) to 1 is incorrect. According to the Intel and AMD manuals, bit 7 in a page table entry is the PAT (Page Attribute Table) bit, not a reserved bit that must be 1. Setting this bit unconditionally changes the memory type of all mapped pages to use PAT entry 1 instead of the default entry 0, which could cause unexpected caching behavior.
Unless there's a specific reason to use a non-default PAT entry, this bit should be set to 0 to match the PCD=0, PWT=0 settings (which together select PAT entry 0 for normal write-back cached memory).
```rust
// suggested:
0 << 7 | // PAT=0 (default write-back caching)
```
```rust
    b[addr.0][addr.1] = x;
}
fn to_phys(addr: (usize, usize)) -> PhysAddr {
    (addr.0 as u64 * PAGE_TABLE_SIZE as u64) + addr.1 as u64
```
Copilot AI · Dec 10, 2025
The `to_phys` implementation incorrectly adds the byte offset within the page table entry. The second component of the address tuple (`addr.1`) represents the entry index, not a byte offset. This should multiply `addr.1` by the size of a `PageTableEntry` (8 bytes) instead of just adding it directly.

The correct calculation should be:

```rust
(addr.0 as u64 * PAGE_TABLE_SIZE as u64) + (addr.1 as u64 * 8)
```

This bug would cause incorrect physical addresses to be written into page table entries, potentially leading to memory corruption or crashes when the guest attempts to use the page tables.
```rust
fn from_phys(addr: PhysAddr) -> (usize, usize) {
    (
        addr as usize / PAGE_TABLE_SIZE,
        addr as usize % PAGE_TABLE_SIZE,
    )
```
Copilot AI · Dec 10, 2025
The `from_phys` implementation incorrectly uses modulo `PAGE_TABLE_SIZE` for the entry index calculation. Since `addr` is a byte offset and entries are 8 bytes each, the entry index should be calculated as:

```rust
(addr as usize % PAGE_TABLE_SIZE) / 8
```

The current implementation would cause the wrong entry index to be calculated when reading back physical addresses from page table entries, leading to corruption of page table data structures.
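With both proposed fixes applied (entry index × 8 in `to_phys`, ÷ 8 in `from_phys`), the two functions become inverses of each other. A quick standalone check, assuming `PAGE_TABLE_SIZE` is 4096 bytes:

```rust
const PAGE_TABLE_SIZE: usize = 4096; // assumed: 512 entries * 8 bytes

/// Fixed to_phys: entry index scaled to a byte offset.
fn to_phys(addr: (usize, usize)) -> u64 {
    (addr.0 as u64 * PAGE_TABLE_SIZE as u64) + (addr.1 as u64 * 8)
}

/// Fixed from_phys: byte offset within the table divided back down
/// to an entry index.
fn from_phys(addr: u64) -> (usize, usize) {
    (
        addr as usize / PAGE_TABLE_SIZE,
        (addr as usize % PAGE_TABLE_SIZE) / 8,
    )
}
```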
```rust
pub unsafe fn map<Op: TableOps>(op: &Op, mapping: Mapping) {
    modify_ptes::<47, 39, Op>(MapRequest {
        table_base: op.root_table(),
        vmin: mapping.virt_base,
        len: mapping.len,
    })
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<38, 30, Op>)
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<29, 21, Op>)
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<20, 12, Op>)
    .map(|r| unsafe { map_page(op, &mapping, r) })
    .for_each(drop);
}
```
Copilot AI · Dec 10, 2025
The new page table manipulation code in hyperlight_common lacks unit tests. This is critical code that handles memory mapping with complex logic including bit manipulation, iterator state management, and unsafe operations. Given that similar modules in the codebase have comprehensive test coverage, tests should be added to verify:
- Correct handling of page-aligned and unaligned virtual addresses
- Proper calculation of entry indices at each page table level
- Correct PTE flag generation for different mapping types
- Edge cases like zero-length mappings or boundaries crossing page table entries
This is especially important given the bugs found in the host's TableOps implementation, which would have been caught by tests.
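One of the suggested checks, per-level index extraction, can be sketched without the full `TableOps` machinery. The level bounds below mirror the `<47,39>`, `<38,30>`, `<29,21>`, `<20,12>` ranges used in `map`; `level_index` itself is a hypothetical standalone helper, not a function from the PR.

```rust
/// Extract the page-table index a virtual address selects at the
/// level covering bits high..=low (9 bits per level in 4-level paging).
fn level_index(vaddr: u64, high: u32, low: u32) -> u64 {
    (vaddr >> low) & ((1u64 << (high - low + 1)) - 1)
}
```

A test along these lines pins down the per-level slicing, which is exactly the arithmetic that `entry_addr` and `bits` must agree on.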