diff --git a/config-linux.md b/config-linux.md
index ddc30ac4d..504f6c203 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -189,6 +189,107 @@ In addition to any devices configured with this setting, the runtime MUST also s
* [`/dev/ptmx`][pts.4].
A [bind-mount or symlink of the container's `/dev/pts/ptmx`][devpts].
+## Network Devices
+
+Linux network devices are entities that send and receive data packets. They are
+not represented as files in the `/dev` directory. Instead, they are represented
+by the [`net_device`][net_device] data structure in the Linux kernel. Network
+devices can belong to only one network namespace and use a set of operations
+distinct from regular file operations. Network devices can be categorized as
+**physical** or **virtual**:
+
+* **Physical network devices** correspond to hardware interfaces, such as
+ Ethernet cards (e.g., `eth0`, `enp0s3`). They are directly associated with
+ physical network hardware.
+* **Virtual network devices** are software-defined interfaces, such as loopback
+ devices (`lo`), virtual Ethernet pairs (`veth`), bridges (`br0`), VLANs, and
+ MACVLANs. They are created and managed by the kernel and do not correspond
+ to physical hardware.
+
+This schema focuses solely on moving existing network devices identified by name
+from the host network namespace into the container network namespace. It does
+not cover the complexities of network device creation or network configuration,
+such as IP address assignment, routing, and DNS setup.
+
+**`netDevices`** (object, OPTIONAL) - A set of network devices that MUST be made
+available in the container. The runtime is responsible for moving these devices;
+the underlying mechanism is implementation-defined.
+
+Each entry key is the name of the network device in the host network namespace.
+Entry values are objects with the following properties:
+
+* **`name`** *(string, OPTIONAL)* - the name of the network device inside the
+ container's network namespace. If not specified, the device keeps the name it
+ has on the host.
+
+The runtime MUST check whether the network interface can be moved to the
+container's network namespace. If a network device with the specified name
+already exists in that namespace, the runtime MUST [generate an error](runtime.md#errors),
+unless the user has provided a name template by appending
+`%d` to the new name. In that case, the runtime MUST allow the move, and the
+kernel will generate a unique name for the interface within the container's
+network namespace.
+
+The runtime MUST preserve existing network interface attributes, including all
+permanent IP addresses (`IFA_F_PERMANENT` flag) of any family with global scope
+(`RT_SCOPE_UNIVERSE` value) as defined in [`RFC 3549 Section 2.3.3.2`][rfc3549].
+This ensures that only addresses intended for persistent, external communication
+are transferred.
+
+The runtime MUST set the network device state to "up" after moving it to the
+network namespace to allow the container to send and receive network traffic
+through that device.
+
+### Namespace Lifecycle and Container Termination
+
+The runtime MUST NOT actively manage the interface's lifecycle and configuration
+*within* the container's network namespace. This is because network interfaces
+are inherently tied to the network namespace itself, and their lifecycle is
+therefore managed by the owner of the network namespace. Typically, this
+ownership and management are handled by higher-level container runtime
+orchestrators, rather than the processes running directly within the container.
+
+The runtime **MUST NOT** attempt to move the interface out of the namespace
+before the namespace is deleted. This design decision is based on the following:
+
+* **Namespace Ownership:** Network interfaces are tied to the network namespace,
+ which may not always be directly managed by the runtime.
+* **Abrupt Termination:** Even when the runtime manages the namespace, it cannot
+ reliably participate in its deletion if the container's processes terminate
+ abruptly (e.g., due to a crash) or run until completion.
+
+When the network namespace is deleted, the kernel's built-in namespace cleanup
+mechanisms take over, as described in [network_namespaces(7)][net_namespaces.7]:
+"When a network namespace is freed (i.e., when the last process in the namespace
+terminates), its physical network devices are moved back to the initial network
+namespace." All physical network devices that can migrate between network
+namespaces are moved to the initial network namespace, while virtual devices
+(veth, macvlan, ...) are destroyed.
+
+If users require custom handling of interface lifecycle during namespace
+deletion, they can utilize existing features within the namespace orchestrator
+or employ post-stop hooks.
+
+**Physical Interface Renaming and Systemd**
+
+When a physical interface is renamed within a container and the container's
+network namespace is later deleted, the kernel moves the interface back to
+the root namespace under its new name. If that name conflicts with an existing
+interface in the root namespace, the kernel renames it using the `dev%d`
+template. To ensure predictable interface
+names in the root namespace, users can utilize systemd's `udevd` and `networkd`
+rules. Refer to [systemd Predictable Network Interface Names][predictable-network-interfaces-names]
+for more information on configuring predictable names.
+
+### Example
+
+#### Moving a device and renaming it inside the container
+
+```json
+"netDevices": {
+    "eth0": {
+        "name": "container_eth0"
+    }
+}
+```
## Control groups
Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
@@ -975,6 +1076,10 @@ subset of the available options.
[mknod.1]: https://man7.org/linux/man-pages/man1/mknod.1.html
[mknod.2]: https://man7.org/linux/man-pages/man2/mknod.2.html
[namespaces.7_2]: https://man7.org/linux/man-pages/man7/namespaces.7.html
+[net_device]: https://docs.kernel.org/networking/netdevices.html
+[net_namespaces.7]: https://man7.org/linux/man-pages/man7/network_namespaces.7.html
+[predictable-network-interfaces-names]: https://systemd.io/PREDICTABLE_INTERFACE_NAMES
+[rfc3549]: https://www.ietf.org/rfc/rfc3549.txt
[null.4]: https://man7.org/linux/man-pages/man4/null.4.html
[personality.2]: https://man7.org/linux/man-pages/man2/personality.2.html
[pts.4]: https://man7.org/linux/man-pages/man4/pts.4.html
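
The name handling described in the `config-linux.md` changes above can be sketched in Go as follows. This is a minimal illustration, not part of the patch: `NetDevice` is redefined locally so the block is self-contained, `containerName`, `isTemplate`, and `checkNames` are hypothetical helpers, and the `existing` set stands in for whatever mechanism a runtime would use to list interfaces already present in the container's network namespace.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// NetDevice mirrors the entry value described in the spec text.
// It is redefined here only to keep the sketch self-contained.
type NetDevice struct {
	Name string `json:"name,omitempty"`
}

// containerName returns the name the device would have inside the container:
// the configured name if present, otherwise the host name (the entry key).
func containerName(hostName string, dev NetDevice) string {
	if dev.Name != "" {
		return dev.Name
	}
	return hostName
}

// isTemplate reports whether the name ends in "%d", in which case the
// kernel is expected to pick a unique name.
func isTemplate(name string) bool {
	return strings.HasSuffix(name, "%d")
}

// checkNames (hypothetical) returns an error if a non-template target name
// is already taken in the container's network namespace.
func checkNames(devs map[string]NetDevice, existing map[string]bool) error {
	hosts := make([]string, 0, len(devs))
	for h := range devs {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts) // deterministic order, so the same conflict is reported each run

	for _, h := range hosts {
		target := containerName(h, devs[h])
		if isTemplate(target) {
			continue // the kernel resolves the final name
		}
		if existing[target] {
			return fmt.Errorf("network device %q (host device %q) already exists in the container namespace", target, h)
		}
	}
	return nil
}

func main() {
	devs := map[string]NetDevice{
		"eth0": {Name: "container_eth0"},
		"ens4": {},
		"ens5": {Name: "net%d"},
	}
	existing := map[string]bool{"lo": true, "net0": true}
	fmt.Println(checkNames(devs, existing)) // no conflict: prints <nil>

	existing["ens4"] = true
	fmt.Println(checkNames(devs, existing)) // conflict on ens4: prints an error
}
```

The sketch only covers the name check; actually moving the device, preserving addresses, and bringing the link up are left to the runtime's implementation-defined mechanism.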
diff --git a/features-linux.md b/features-linux.md
index 66d5c7996..a3488e5a7 100644
--- a/features-linux.md
+++ b/features-linux.md
@@ -228,3 +228,17 @@ Irrelevant to the availability of Intel RDT on the host operating system.
}
}
```
+
+## NetDevices
+
+**`netDevices`** (object, OPTIONAL) represents the runtime's implementation status of Linux network devices.
+
+* **`enabled`** (bool, OPTIONAL) represents whether the runtime supports the capability to move Linux network devices into the container's network namespace.
+
+### Example
+
+```json
+"netDevices": {
+    "enabled": true
+}
+```
diff --git a/schema/config-linux.json b/schema/config-linux.json
index 942679964..add4cf0e4 100644
--- a/schema/config-linux.json
+++ b/schema/config-linux.json
@@ -9,6 +9,12 @@
"$ref": "defs-linux.json#/definitions/Device"
}
},
+ "netDevices": {
+ "type": "object",
+ "additionalProperties": {
+ "$ref": "defs-linux.json#/definitions/NetDevice"
+ }
+ },
"uidMappings": {
"type": "array",
"items": {
diff --git a/schema/defs-linux.json b/schema/defs-linux.json
index 4bef06cdc..4bf73d0fb 100644
--- a/schema/defs-linux.json
+++ b/schema/defs-linux.json
@@ -189,6 +189,14 @@
}
}
},
+ "NetDevice": {
+ "type": "object",
+ "properties": {
+ "name": {
+ "type": "string"
+ }
+ }
+ },
"weight": {
"$ref": "defs.json#/definitions/uint16"
},
diff --git a/schema/features-linux.json b/schema/features-linux.json
index 0f4d21db3..fcf3df7d6 100644
--- a/schema/features-linux.json
+++ b/schema/features-linux.json
@@ -110,6 +110,14 @@
}
}
}
+ },
+ "netDevices": {
+ "type": "object",
+ "properties": {
+ "enabled": {
+ "type": "boolean"
+ }
+ }
}
}
}
diff --git a/schema/test/config/bad/linux-netdevice.json b/schema/test/config/bad/linux-netdevice.json
new file mode 100644
index 000000000..618d88432
--- /dev/null
+++ b/schema/test/config/bad/linux-netdevice.json
@@ -0,0 +1,13 @@
+{
+    "ociVersion": "1.0.0",
+    "root": {
+        "path": "rootfs"
+    },
+    "linux": {
+        "netDevices": {
+            "eth0": {
+                "name": 23
+            }
+        }
+    }
+}
diff --git a/schema/test/config/good/linux-netdevice.json b/schema/test/config/good/linux-netdevice.json
new file mode 100644
index 000000000..cec4d09aa
--- /dev/null
+++ b/schema/test/config/good/linux-netdevice.json
@@ -0,0 +1,15 @@
+{
+    "ociVersion": "1.0.0",
+    "root": {
+        "path": "rootfs"
+    },
+    "linux": {
+        "netDevices": {
+            "eth0": {
+                "name": "container_eth0"
+            },
+            "ens4": {},
+            "ens5": {}
+        }
+    }
+}
diff --git a/schema/test/features/good/runc.json b/schema/test/features/good/runc.json
index 8f5196243..fa6de7f97 100644
--- a/schema/test/features/good/runc.json
+++ b/schema/test/features/good/runc.json
@@ -182,6 +182,9 @@
},
"selinux": {
"enabled": true
+ },
+ "netDevices": {
+ "enabled": true
}
},
"annotations": {
diff --git a/specs-go/config.go b/specs-go/config.go
index 1aa0693b5..854290da2 100644
--- a/specs-go/config.go
+++ b/specs-go/config.go
@@ -236,6 +236,8 @@ type Linux struct {
Namespaces []LinuxNamespace `json:"namespaces,omitempty"`
// Devices are a list of device nodes that are created for the container
Devices []LinuxDevice `json:"devices,omitempty"`
+ // NetDevices are key-value pairs, keyed by network device name on the host, moved to the container's network namespace.
+ NetDevices map[string]LinuxNetDevice `json:"netDevices,omitempty"`
// Seccomp specifies the seccomp security settings for the container.
Seccomp *LinuxSeccomp `json:"seccomp,omitempty"`
// RootfsPropagation is the rootfs mount propagation mode for the container.
@@ -491,6 +493,12 @@ type LinuxDevice struct {
GID *uint32 `json:"gid,omitempty"`
}
+// LinuxNetDevice represents a single network device to be moved into the container's network namespace.
+type LinuxNetDevice struct {
+	// Name of the device in the container's network namespace.
+	Name string `json:"name,omitempty"`
+}
+
// LinuxDeviceCgroup represents a device rule for the devices specified to
// the device controller
type LinuxDeviceCgroup struct {
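
As a rough check of how the new `NetDevices` field decodes, the following sketch unmarshals the `netDevices` payloads from the good and bad test fixtures. It is an illustration under assumptions: `LinuxNetDevice` is copied locally, `linuxSection` is a hypothetical cut-down stand-in for the full `Linux` struct, and `parseNetDevices` is a hypothetical helper, not an API of `specs-go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LinuxNetDevice is a local copy of the type added in specs-go/config.go.
type LinuxNetDevice struct {
	Name string `json:"name,omitempty"`
}

// linuxSection is a hypothetical, cut-down stand-in for the Linux struct,
// keeping only the field relevant to this sketch.
type linuxSection struct {
	NetDevices map[string]LinuxNetDevice `json:"netDevices,omitempty"`
}

// parseNetDevices (hypothetical) decodes the "netDevices" map from raw JSON.
func parseNetDevices(raw []byte) (map[string]LinuxNetDevice, error) {
	var l linuxSection
	if err := json.Unmarshal(raw, &l); err != nil {
		return nil, err
	}
	return l.NetDevices, nil
}

func main() {
	// Same shape as the "linux" object in the good test fixture.
	good := []byte(`{"netDevices": {"eth0": {"name": "container_eth0"}, "ens4": {}, "ens5": {}}}`)
	devs, err := parseNetDevices(good)
	fmt.Println(len(devs), devs["eth0"].Name, err) // prints: 3 container_eth0 <nil>

	// Same shape as the bad fixture: "name" is a number, not a string.
	bad := []byte(`{"netDevices": {"eth0": {"name": 23}}}`)
	_, err = parseNetDevices(bad)
	fmt.Println(err != nil) // prints: true
}
```

An empty object such as `"ens4": {}` decodes to a zero-value `LinuxNetDevice`, which matches the spec text's rule that an unset `name` means the host name is kept.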
diff --git a/specs-go/features/features.go b/specs-go/features/features.go
index 949f532b6..d8eb169dc 100644
--- a/specs-go/features/features.go
+++ b/specs-go/features/features.go
@@ -48,6 +48,7 @@ type Linux struct {
Selinux *Selinux `json:"selinux,omitempty"`
IntelRdt *IntelRdt `json:"intelRdt,omitempty"`
MountExtensions *MountExtensions `json:"mountExtensions,omitempty"`
+ NetDevices *NetDevices `json:"netDevices,omitempty"`
}
// Cgroup represents the "cgroup" field.
@@ -143,3 +144,10 @@ type IDMap struct {
// Nil value means "unknown", not "false".
Enabled *bool `json:"enabled,omitempty"`
}
+
+// NetDevices represents the "netDevices" field.
+type NetDevices struct {
+	// Enabled is true if network device support is compiled in.
+	// Nil value means "unknown", not "false".
+	Enabled *bool `json:"enabled,omitempty"`
+}
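
Because `Enabled` is a pointer, a consumer of the features document has three cases to distinguish rather than two. The sketch below shows one way to read it; `NetDevices` is copied locally, while `linuxFeatures` and `netDevicesSupport` are hypothetical names introduced only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NetDevices is a local copy of the type added in specs-go/features/features.go.
type NetDevices struct {
	Enabled *bool `json:"enabled,omitempty"`
}

// linuxFeatures is a hypothetical, cut-down stand-in for the features Linux struct.
type linuxFeatures struct {
	NetDevices *NetDevices `json:"netDevices,omitempty"`
}

// netDevicesSupport (hypothetical) maps the tri-state field to a label.
// A missing object or a nil Enabled means "unknown", not "unsupported".
func netDevicesSupport(raw []byte) (string, error) {
	var f linuxFeatures
	if err := json.Unmarshal(raw, &f); err != nil {
		return "", err
	}
	switch {
	case f.NetDevices == nil || f.NetDevices.Enabled == nil:
		return "unknown", nil
	case *f.NetDevices.Enabled:
		return "supported", nil
	default:
		return "unsupported", nil
	}
}

func main() {
	for _, doc := range []string{
		`{"netDevices": {"enabled": true}}`,
		`{"netDevices": {"enabled": false}}`,
		`{"netDevices": {}}`,
		`{}`,
	} {
		s, err := netDevicesSupport([]byte(doc))
		fmt.Println(s, err)
	}
}
```

Treating a missing value as "unknown" follows the comment on the `Enabled` field; callers that need a hard yes/no would have to decide for themselves how to handle that case.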