Unify page table manipulation code between the guest and the host #1093
base: main
Conversation
Force-pushed from c91747d to 0ff5937
Currently, the guest and the host both have code that manipulates architecture-specific page table structures: the guest has a general map operation, and the host has a much more specific routine that builds an identity map. As we move to more complex virtual memory configurations in the guest, the host will need the ability to build more complex mappings in the guest, so this commit removes the simple implementation in the host, and replaces it with calls to the implementation originally written for the guest (now moved to `hyperlight_common` and factored into an architecture-independent interface and architecture-specific code parts). Signed-off-by: Simon Davies <[email protected]>
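As a rough illustration of the split the commit message describes (an architecture-independent interface over implementation-specific table storage), here is a hypothetical sketch. The names `TableOps` and `Mapping` come from the PR summary, but the method set and the toy buffer-backed host implementation below are assumptions for illustration, not the actual `hyperlight_common` API.

```rust
type PhysAddr = u64;

/// Hypothetical architecture-independent interface: the mapping code
/// only needs a handle type for "where a table lives" plus entry
/// read/write, so the same algorithm can drive either a host-side
/// buffer or live guest page tables.
trait TableOps {
    /// Implementation-specific handle for a page table's location.
    type TableAddr: Copy;
    /// Handle for the root (top-level) table.
    fn root_table(&self) -> Self::TableAddr;
    /// Read a 64-bit entry at a byte offset into a table.
    fn read_entry(&self, table: Self::TableAddr, byte_off: u64) -> u64;
    /// Write a 64-bit entry at a byte offset into a table.
    fn write_entry(&mut self, table: Self::TableAddr, byte_off: u64, e: u64);
}

/// A mapping request, as sketched from the PR description.
#[allow(dead_code)]
struct Mapping {
    virt_base: u64,
    phys_base: PhysAddr,
    len: u64,
}

/// Toy host-side implementation: tables are built in a growable buffer
/// and addressed by index, roughly the way a host constructing guest
/// page tables in shared memory might work.
struct BufferTables {
    tables: Vec<[u64; 512]>,
}

impl TableOps for BufferTables {
    type TableAddr = usize;
    fn root_table(&self) -> usize {
        0
    }
    fn read_entry(&self, table: usize, byte_off: u64) -> u64 {
        self.tables[table][(byte_off / 8) as usize]
    }
    fn write_entry(&mut self, table: usize, byte_off: u64, e: u64) {
        self.tables[table][(byte_off / 8) as usize] = e;
    }
}
```

A guest-side implementation would instead treat `TableAddr` as a physical address and dereference it through the live page tables; the mapping algorithm itself stays the same.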
Force-pushed from ef2c3ea to 09cf019
```rust
// next stage parameters
let mut next_vmin = self.request.vmin + (self.n << LOW_BIT);
let lower_bits_mask = (1 << LOW_BIT) - 1;
if self.n > 0 {
    next_vmin &= !lower_bits_mask;
}
let entry_ptr = Op::entry_addr(
    self.request.table_base,
    bits::<HIGH_BIT, LOW_BIT>(next_vmin) << 3,
);
let len_from_here = self.request.len - (next_vmin - self.request.vmin);
let max_len = (1 << LOW_BIT) - (next_vmin & lower_bits_mask);
let next_len = core::cmp::min(len_from_here, max_len);
```
I'm not too sure what's going on here.
This is somewhat different from the previous implementation, but I'm not too sure what was going on there either, so... 🤷
Maybe some comments would help
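For what it's worth, the stage arithmetic can be re-derived as a standalone function (assuming `LOW_BIT = 12`, the 4 KiB page level): the first chunk (`n == 0`) runs from the possibly-unaligned start up to the next boundary, and every later chunk is aligned down to a boundary and covers at most one full entry's span. This is a re-derivation for illustration, not code from the PR.

```rust
const LOW_BIT: u64 = 12; // assumed: the 4 KiB page-table level

/// Compute the (next_vmin, next_len) pair for the n-th chunk of a
/// request starting at `vmin` with length `len`, mirroring the
/// arithmetic in the quoted block.
fn next_stage(vmin: u64, len: u64, n: u64) -> (u64, u64) {
    let mut next_vmin = vmin + (n << LOW_BIT);
    let lower_bits_mask = (1u64 << LOW_BIT) - 1;
    if n > 0 {
        // Later chunks start exactly on a boundary.
        next_vmin &= !lower_bits_mask;
    }
    let len_from_here = len - (next_vmin - vmin);
    // Distance from next_vmin to the next boundary (a full span when
    // next_vmin is aligned).
    let max_len = (1u64 << LOW_BIT) - (next_vmin & lower_bits_mask);
    let next_len = core::cmp::min(len_from_here, max_len);
    (next_vmin, next_len)
}
```

For a request at 0x1800 of length 0x3000, this yields a short first chunk up to 0x2000, then full 0x1000-byte chunks, which is the behavior the code above appears to implement.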
```rust
let present = pte & 0x1;
if present != 0 {
```
```rust
// suggested:
let present = pte & 0x1 != 0;
if present {
```
or even better
```rust
// suggested:
let present = bits::<0, 0>(pte) != 0;
if present {
```
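A hedged sketch of the kind of `bits` helper the suggestion assumes (the actual signature in the PR may differ): extract bits `HIGH..=LOW` of a value, shifted down to bit 0.

```rust
/// Extract the bit field v[HIGH..=LOW], right-aligned.
/// Illustrative helper; the real `bits` in hyperlight_common may differ.
fn bits<const HIGH: u32, const LOW: u32>(v: u64) -> u64 {
    let width = HIGH - LOW + 1;
    // Guard against shifting by 64, which would overflow.
    let mask = if width >= 64 { u64::MAX } else { (1u64 << width) - 1 };
    (v >> LOW) & mask
}
```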
```rust
#[allow(clippy::identity_op)]
#[allow(clippy::precedence)]
let pte = Op::to_phys(page_addr)
```
Why do we do `Op::to_phys(page_addr)`? The previous code used `page_addr` directly.
```rust
// architecture-independent re-export in vm.rs
#[allow(clippy::missing_safety_doc)]
pub unsafe fn map<Op: TableOps>(op: &Op, mapping: Mapping) {
    modify_ptes::<47, 39, Op>(MapRequest {
```
I know this is code that was just moved around, but it would be nice if there were some comments explaining what's going on here.
```rust
let present = pte & 0x1;
if present == 0 {
```
```rust
// suggested:
let present = bits::<0, 0>(pte) != 0;
if !present {
```
```rust
    .collect::<alloc::vec::Vec<u64>>();
if addrs.len() != 1 {
    panic!("impossible: 1 page map request resolved to multiple PTEs");
vm::map::<GuestMappingOperations>(
```
```rust
// suggested:
vm::map(
```
```rust
// here, and the general conditions are documented in the
// architecture-independent re-export in vm.rs
#[allow(clippy::missing_safety_doc)]
pub unsafe fn vtop<Op: TableOps>(op: &Op, address: u64) -> Option<u64> {
```
I kind of liked the previous name `dbg_print_address_pte` better; what is `vtop`?
```rust
/// Paging Flags
///
/// See the following links explaining paging, also see paging-development-notes.md in docs:
///
/// * Very basic description: https://stackoverflow.com/a/26945892
/// * More in-depth descriptions: https://wiki.osdev.org/Paging
pub(crate) const PAGE_PRESENT: u64 = 1; // Page is Present
pub(crate) const PAGE_RW: u64 = 1 << 1; // Page is Read/Write (if not set page is read only so long as the WP bit in CR0 is set to 1 - which it is in Hyperlight)
pub(crate) const PAGE_USER: u64 = 1 << 2; // User/Supervisor (if this bit is set then the page is accessible by user mode code)
pub(crate) const PAGE_NX: u64 = 1 << 63; // Execute Disable (if this bit is set then data in the page cannot be executed)
```
I find this clearer than the new approach, can we bring back these constants in the hl-common crate?
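A sketch of what bringing the constants back might look like, together with how a PTE could be composed from them. The constant values match the x86-64 paging format quoted above; `make_pte` is a hypothetical helper for illustration, not code from the PR, and the physical-address mask assumes 4-level paging.

```rust
pub const PAGE_PRESENT: u64 = 1;   // bit 0: present
pub const PAGE_RW: u64 = 1 << 1;   // bit 1: writable
pub const PAGE_USER: u64 = 1 << 2; // bit 2: user-accessible
pub const PAGE_NX: u64 = 1 << 63;  // bit 63: execute-disable

/// Hypothetical helper: compose a 4 KiB PTE from a physical frame
/// address and permission flags, named constants making each bit
/// self-documenting.
fn make_pte(phys: u64, writable: bool, user: bool, executable: bool) -> u64 {
    let mut pte = phys | PAGE_PRESENT;
    if writable {
        pte |= PAGE_RW;
    }
    if user {
        pte |= PAGE_USER;
    }
    if !executable {
        pte |= PAGE_NX;
    }
    pte
}
```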
```rust
// TODO: This over-counts on small sandboxes (because not all 512
// PTs may be required), under-counts on sandboxes with more than
// 1GiB memory, and would get unreasonably complicated if we
// needed to support hugepages.
```
Is the old comment no longer accurate? If it still is, can we keep it?
```rust
    (page_addr, 0)
}
fn entry_addr(addr: (usize, usize), offset: u64) -> (usize, usize) {
    (addr.0, offset as usize >> 3)
```
I'm not sure I understand what's going on here. I thought it would be something like:

```rust
let resolved_addr = PAGE_TABLE_ENTRIES_PER_TABLE * addr.0 + addr.1 + offset;
(resolved_addr / PAGE_TABLE_ENTRIES_PER_TABLE, resolved_addr % PAGE_TABLE_ENTRIES_PER_TABLE)
```

or in other words

```rust
let phys = Self::to_phys(addr) + offset;
Self::from_phys(phys)
```

but that's not the case, so I'm definitely missing something.
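The expectation described in the comment can be written out as a runnable sketch, assuming 512 eight-byte entries per 4096-byte table (`ENTRIES_PER_TABLE`, `ENTRY_SIZE`, and `entry_addr`'s byte-offset parameter are illustrative assumptions here):

```rust
const ENTRIES_PER_TABLE: usize = 512;
const ENTRY_SIZE: usize = 8;
const TABLE_SIZE: usize = ENTRIES_PER_TABLE * ENTRY_SIZE; // 4096

/// (table_index, entry_index) -> flat byte address
fn to_phys(addr: (usize, usize)) -> usize {
    addr.0 * TABLE_SIZE + addr.1 * ENTRY_SIZE
}

/// flat byte address -> (table_index, entry_index)
fn from_phys(phys: usize) -> (usize, usize) {
    (phys / TABLE_SIZE, (phys % TABLE_SIZE) / ENTRY_SIZE)
}

/// The "to_phys, add, from_phys" resolution the comment expects,
/// with the offset given in bytes.
fn entry_addr(addr: (usize, usize), byte_offset: usize) -> (usize, usize) {
    from_phys(to_phys(addr) + byte_offset)
}
```

Under this reading, an offset large enough to cross a table boundary carries into the table index, which the `(addr.0, offset as usize >> 3)` implementation above never does.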
Pull request overview
This PR unifies page table manipulation code between the guest and host by extracting common functionality into hyperlight_common with an architecture-independent TableOps trait and x86-64-specific implementation. The host previously had a simple identity-mapping routine while the guest had a more general mapping function. Now both use the same underlying code, with different trait implementations to handle their specific contexts (the host builds tables in a buffer, the guest modifies live page tables).
Key Changes
- Introduced a new `vm` module in `hyperlight_common` with a `TableOps` trait that abstracts page table operations
- Replaced the host's hardcoded page table initialization with calls to the unified mapping code
- Simplified the guest's paging code by refactoring it to use the shared implementation
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/hyperlight_common/src/vm.rs | New architecture-independent interface defining the TableOps trait and Mapping structures |
| src/hyperlight_common/src/arch/amd64/vm.rs | New x86-64-specific page table manipulation implementation |
| src/hyperlight_common/src/lib.rs | Adds vm module export under init-paging feature |
| src/hyperlight_common/Cargo.toml | Adds init-paging feature flag |
| src/hyperlight_host/src/mem/mgr.rs | Replaces hardcoded page table setup with GuestPageTableBuffer implementing TableOps; removes old get_page_flags helper |
| src/hyperlight_host/src/sandbox/uninitialized_evolve.rs | Removes mem_size parameter from set_up_shared_memory call |
| src/hyperlight_host/src/mem/memory_region.rs | Removes translate_flags method and page flag imports |
| src/hyperlight_host/src/mem/layout.rs | Removes obsolete PDPT, PD, and PT offset constants |
| src/hyperlight_host/Cargo.toml | Adds init-paging feature to hyperlight-common dependency |
| src/hyperlight_guest_bin/src/paging.rs | Refactors map_region to use common vm::map; removes duplicate helper structures and functions |
| src/hyperlight_guest_bin/Cargo.toml | Adds init-paging feature to hyperlight-common dependency |
```rust
/// - Memory allocation fails
unsafe fn alloc_table(&self) -> Self::TableAddr;

/// Offset the table address by the u64 entry offset
```
Copilot AI · Dec 10, 2025
The documentation for entry_addr is unclear about whether the entry_offset parameter is a byte offset or an entry index. Looking at the implementation in arch/amd64/vm.rs line 62, the offset is passed as a byte offset (multiplied by 8), but the trait documentation doesn't specify this. The documentation should clarify that entry_offset is expected to be a byte offset within the page table, not an entry index.
```rust
// suggested:
/// Offset the table address by the given offset in bytes.
///
/// # Parameters
/// - `addr`: The base address of the table.
/// - `entry_offset`: The offset in **bytes** within the page table. This is
///   not an entry index; callers must multiply the entry index by the size
///   of a page table entry (typically 8 bytes) to obtain the correct byte offset.
///
/// # Returns
/// The address of the entry at the given byte offset from the base address.
```
```rust
unsafe fn map_page<Op: TableOps>(op: &Op, mapping: &Mapping, r: MapResponse<Op::TableAddr>) {
    let pte = match &mapping.kind {
        MappingKind::BasicMapping(bm) =>
            // TODO: Support not readable
```
Copilot AI · Dec 10, 2025
The TODO comment should be more specific about the architectural limitation. On x86-64, pages cannot be made write-only or execute-only without being readable (there's no separate "readable" bit in the page table entry). The comment should clarify whether this limitation is acceptable or if there are plans to work around it.
```rust
// suggested:
// NOTE: On x86-64, there is no separate "readable" bit in the page table entry.
// This means that pages cannot be made write-only or execute-only without also being readable.
// All pages that are mapped as writable or executable are also implicitly readable.
// If support for "not readable" mappings is required in the future, it would need to be
// implemented using additional mechanisms (e.g., page-fault handling or memory protection keys),
// but for now, this architectural limitation is accepted.
```
```rust
{
    (mapping.phys_base + (r.vmin - mapping.virt_base)) |
    (!bm.executable as u64) << 63 | // NX - no execute unless allowed
    1 << 7 | // 1 - RES1 according to manual
```
Copilot AI · Dec 10, 2025
Setting bit 7 (PAT bit) to 1 is incorrect. According to the Intel and AMD manuals, bit 7 in a page table entry is the PAT (Page Attribute Table) bit, not a reserved bit that must be 1. Setting this bit unconditionally changes the memory type of all mapped pages to use PAT entry 1 instead of the default entry 0, which could cause unexpected caching behavior.
Unless there's a specific reason to use a non-default PAT entry, this bit should be set to 0 to match the PCD=0, PWT=0 settings (which together select PAT entry 0 for normal write-back cached memory).
```rust
// suggested:
0 << 7 | // PAT=0 (default write-back caching)
```
```rust
    b[addr.0][addr.1] = x;
}
fn to_phys(addr: (usize, usize)) -> PhysAddr {
    (addr.0 as u64 * PAGE_TABLE_SIZE as u64) + addr.1 as u64
```
Copilot AI · Dec 10, 2025
The `to_phys` implementation incorrectly adds the byte offset within the page table entry. The second component of the address tuple (`addr.1`) represents the entry index, not a byte offset. This should multiply `addr.1` by the size of a `PageTableEntry` (8 bytes) instead of just adding it directly.

The correct calculation should be:

```rust
(addr.0 as u64 * PAGE_TABLE_SIZE as u64) + (addr.1 as u64 * 8)
```

This bug would cause incorrect physical addresses to be written into page table entries, potentially leading to memory corruption or crashes when the guest attempts to use the page tables.
```rust
fn from_phys(addr: PhysAddr) -> (usize, usize) {
    (
        addr as usize / PAGE_TABLE_SIZE,
        addr as usize % PAGE_TABLE_SIZE,
    )
```
Copilot AI · Dec 10, 2025
The `from_phys` implementation incorrectly uses modulo `PAGE_TABLE_SIZE` for the entry index calculation. Since `addr` is a byte offset and entries are 8 bytes each, the entry index should be calculated as:

```rust
(addr as usize % PAGE_TABLE_SIZE) / 8
```

The current implementation would cause the wrong entry index to be calculated when reading back physical addresses from page table entries, leading to corruption of page table data structures.
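With both proposed fixes applied (entry index × 8 in `to_phys`, ÷ 8 in `from_phys`), the two functions become inverses of each other. A quick standalone check, assuming `PAGE_TABLE_SIZE` is 4096 bytes:

```rust
const PAGE_TABLE_SIZE: usize = 4096; // assumed: 512 entries * 8 bytes

/// Fixed to_phys: entry index scaled to a byte offset.
fn to_phys(addr: (usize, usize)) -> u64 {
    (addr.0 as u64 * PAGE_TABLE_SIZE as u64) + (addr.1 as u64 * 8)
}

/// Fixed from_phys: byte offset within the table divided back down
/// to an entry index.
fn from_phys(addr: u64) -> (usize, usize) {
    (
        addr as usize / PAGE_TABLE_SIZE,
        (addr as usize % PAGE_TABLE_SIZE) / 8,
    )
}
```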
```rust
pub unsafe fn map<Op: TableOps>(op: &Op, mapping: Mapping) {
    modify_ptes::<47, 39, Op>(MapRequest {
        table_base: op.root_table(),
        vmin: mapping.virt_base,
        len: mapping.len,
    })
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<38, 30, Op>)
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<29, 21, Op>)
    .map(|r| unsafe { alloc_pte_if_needed(op, r) })
    .flat_map(modify_ptes::<20, 12, Op>)
    .map(|r| unsafe { map_page(op, &mapping, r) })
    .for_each(drop);
}
```
Copilot AI · Dec 10, 2025
The new page table manipulation code in hyperlight_common lacks unit tests. This is critical code that handles memory mapping with complex logic including bit manipulation, iterator state management, and unsafe operations. Given that similar modules in the codebase have comprehensive test coverage, tests should be added to verify:
- Correct handling of page-aligned and unaligned virtual addresses
- Proper calculation of entry indices at each page table level
- Correct PTE flag generation for different mapping types
- Edge cases like zero-length mappings or boundaries crossing page table entries
This is especially important given the bugs found in the host's TableOps implementation, which would have been caught by tests.
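One of the suggested checks, per-level index extraction, can be sketched without the full `TableOps` machinery. The level bounds below mirror the `<47,39>`, `<38,30>`, `<29,21>`, `<20,12>` ranges used in `map`; `level_index` itself is a hypothetical standalone helper, not a function from the PR.

```rust
/// Extract the page-table index a virtual address selects at the
/// level covering bits high..=low (9 bits per level in 4-level paging).
fn level_index(vaddr: u64, high: u32, low: u32) -> u64 {
    (vaddr >> low) & ((1u64 << (high - low + 1)) - 1)
}
```

A test along these lines pins down the per-level slicing, which is exactly the arithmetic that `entry_addr` and `bits` must agree on.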