diff --git a/docs/advanced/settings.md b/docs/advanced/settings.md index 5a37e5fa61..e0a18de678 100644 --- a/docs/advanced/settings.md +++ b/docs/advanced/settings.md @@ -78,7 +78,7 @@ For more information, see the **Certificate Rotation** section of the [Rancher]( ### `backup-target` -**Definition**: Custom backup target used to store VM backups. +**Definition**: Custom backup target used to store VM backups. For more information, see the [Longhorn documentation](https://longhorn.io/docs/1.6.0/snapshots-and-backups/backup-and-restore/set-backup-target/#set-up-aws-s3-backupstore). @@ -122,7 +122,7 @@ https://172.16.0.1/v3/import/w6tp7dgwjj549l88pr7xmxb4x6m54v5kcplvhbp9vv2wzqrrjhr ### `containerd-registry` -**Definition**: Configuration of a private registry created for the Harvester cluster. +**Definition**: Configuration of a private registry created for the Harvester cluster. The value is stored in the `registries.yaml` file of each node (path: `/etc/rancher/rke2/registries.yaml`). For more information, see [Containerd Registry Configuration](https://docs.rke2.io/install/private_registry) in the RKE2 documentation. @@ -205,7 +205,7 @@ Changing this setting might cause single-node clusters to temporarily become una - Proxy URL for HTTPS requests: `"httpsProxy": "https://:@:"` - Comma-separated list of hostnames and/or CIDRs: `"noProxy": ""` -You must specify key information in the `noProxy` field if you configured the following options or settings: +You must specify key information in the `noProxy` field if you configured the following options or settings: | Configured option/setting | Required value in `noProxy` | Reason | | --- | --- | --- | @@ -252,7 +252,7 @@ debug **Definition**: Setting that enables and disables the Longhorn V2 Data Engine. -When set to `true`, Harvester automatically loads the kernel modules required by the Longhorn V2 Data Engine, and attempts to allocate 1024 × 2 MiB-sized huge pages (for example, 2 GiB of RAM) on all nodes. +When set to `true`, Harvester automatically loads the kernel modules required by the Longhorn V2 Data Engine, and attempts to allocate 1024 × 2 MiB-sized huge pages (for example, 2 GiB of RAM) on all nodes. Changing this setting automatically restarts RKE2 on all nodes but does not affect running virtual machine workloads. @@ -261,7 +261,7 @@ Changing this setting automatically restarts RKE2 on all nodes but does not affe If you encounter error messages that include the phrase "not enough hugepages-2Mi capacity", allow some time for the error to be resolved. If the error persists, reboot the affected nodes. To disable the Longhorn V2 Data Engine on specific nodes (for example, nodes with less processing and memory resources), go to the **Hosts** screen and add the following label to the target nodes: - + - label: `node.longhorn.io/disable-v2-data-engine` - value: `true` @@ -306,7 +306,7 @@ Changes to the server address list are applied to all nodes. **Definition**: Percentage of physical compute, memory, and storage resources that can be allocated for VM use. -Overcommitting is used to optimize physical resource allocation, particularly when VMs are not expected to fully consume the allocated resources most of the time. Setting values greater than 100% allows scheduling of multiple VMs even when physical resources are notionally fully allocated. +Overcommitting is used to optimize physical resource allocation, particularly when VMs are not expected to fully consume the allocated resources most of the time. 
Setting values greater than 100% allows scheduling of multiple VMs even when physical resources are notionally fully allocated. **Default values**: `{ "cpu":1600, "memory":150, "storage":200 }` @@ -515,7 +515,7 @@ If you misconfigure this setting and are unable to access the Harvester UI and A **Supported options and values**: -- `protocols`: Enabled protocols. +- `protocols`: Enabled protocols. - `ciphers`: Enabled ciphers. For more information about the supported options, see [`ssl-protocols`](https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#ssl-protocols) and [`ssl-ciphers`](https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#ssl-ciphers) in the Ingress-Nginx Controller documentation. @@ -686,7 +686,7 @@ When the cluster is upgraded in the future, the contents of the `value` field ma **Versions**: v1.2.0 and later -**Definition**: Additional namespaces that you can use when [generating a support bundle](../troubleshooting/harvester.md#generate-a-support-bundle). +**Definition**: Additional namespaces that you can use when [generating a support bundle](../troubleshooting/harvester.md#generate-a-support-bundle). By default, the support bundle only collects resources from the following predefined namespaces: @@ -729,7 +729,7 @@ You can specify a value greater than or equal to 0. When the value is 0, Harvest **Versions**: v1.3.1 and later -**Definition**: Number of minutes Harvester allows for collection of logs and configurations (Harvester) on the nodes for the support bundle. +**Definition**: Number of minutes Harvester allows for collection of logs and configurations (Harvester) on the nodes for the support bundle. If the collection process is not completed within the allotted time, Harvester still allows you to download the support bundle (without the uncollected data). You can specify a value greater than or equal to 0. When the value is 0, Harvester uses the default value. @@ -770,7 +770,7 @@ https://your.upgrade.checker-url/v99/checkupgrade **Supported options and fields**: - `imagePreloadOption`: Options for the image preloading phase. - + The full ISO contains the core operating system components and all required container images. Harvester can preload these container images to each node during installation and upgrades. When workloads are scheduled to management and worker nodes, the container images are ready to use. - `strategy`: Image preload strategy. @@ -786,10 +786,10 @@ https://your.upgrade.checker-url/v99/checkupgrade If you decide to use `skip`, ensure that the following requirements are met: - You have a private container registry that contains all required images. - - Your cluster has high-speed internet access and is able to pull all images from Docker Hub when necessary. - + - Your cluster has high-speed internet access and is able to pull all images from Docker Hub when necessary. + Note any potential internet service interruptions and how close you are to reaching your [Docker Hub rate limit](https://www.docker.com/increase-rate-limits/). Failure to download any of the required images may cause the upgrade to fail and may leave the cluster in a middle state. - + ::: - `parallel` (**experimental**): Nodes preload images in batches. You can adjust this using the `concurrency` option. 
@@ -839,7 +839,7 @@ https://your.upgrade.checker-url/v99/checkupgrade
 
 ### `vm-force-reset-policy`
 
-**Definition**: Setting that allows you to force rescheduling of a VM when the node that it is running on becomes unavailable. 
+**Definition**: Setting that allows you to force rescheduling of a VM when the node that it is running on becomes unavailable.
 
 When the state of the node changes to `Not Ready`, the VM is force deleted and rescheduled to an available node after the configured number of seconds.
 
@@ -856,6 +856,30 @@ When the node becomes unavailable or is powered off, the VM only restarts and do
 }
 ```
 
+### `vm-migration-network`
+
+**Definition**: Segregated network for VM migration traffic.
+
+By default, VM migration uses the management network, which is limited to a single interface and shared with cluster-wide workloads. If your implementation requires network segregation, you can use a [VM migration network](./vm-migration-network.md) to isolate in-cluster VM migration traffic.
+
+:::info important
+
+Specify an IP range in the IPv4 CIDR format. The number of IPs must be greater than or equal to the number of your cluster nodes.
+
+:::
+
+**Default value**: ""
+
+**Example**:
+
+```
+{
+  "vlan": 100,
+  "clusterNetwork": "vm-migration",
+  "range": "192.168.1.0/24"
+}
+```
+
 ### `volume-snapshot-class`
 
 **Definition**: VolumeSnapshotClassName for the VolumeSnapshot and VolumeSnapshotContent when restoring a VM to a namespace that does not contain the source VM.
diff --git a/docs/advanced/vm-migration-network.md b/docs/advanced/vm-migration-network.md
new file mode 100644
index 0000000000..86faf91e36
--- /dev/null
+++ b/docs/advanced/vm-migration-network.md
@@ -0,0 +1,230 @@
+---
+sidebar_position: 12
+sidebar_label: VM Migration Network
+title: "VM Migration Network"
+---
+
+If you want to isolate VM migration traffic from the Kubernetes cluster network (that is, the management network) and other cluster-wide workloads, you can allocate a dedicated VM migration network for better network bandwidth and performance.
+
+:::note
+
+- Avoid modifying the KubeVirt configuration directly, as this can result in unexpected or unwanted system behavior.
+
+:::
+
+## Prerequisites
+
+Ensure that the following prerequisites are met before configuring the Harvester VM Migration Network setting:
+
+- Well-configured Cluster Network and VLAN Config.
+  - Make sure the Cluster Network is configured, the VLAN Config covers all nodes, and network connectivity is working as expected on every node.
+- No VM migration is in progress before you configure the VM Migration Network setting.
+
+:::caution
+
+If the Harvester cluster was upgraded from v1.0.3, check that the Whereabouts CNI is installed correctly before you proceed to the next step. [Issue 3168](https://github.com/harvester/harvester/issues/3168) describes cases in which the Harvester cluster does not install the Whereabouts CNI properly.
+
+- Verify that the `ippools.whereabouts.cni.cncf.io` CRD exists with the following command:
+  - `kubectl get crd ippools.whereabouts.cni.cncf.io`
+
+:::
+
+## Configuration Example
+
+- VLAN ID
+  - Check your network switch settings and provide a dedicated VLAN ID for the VM migration network.
+- Well-configured Cluster Network and VLAN Config
+  - For more information, see the Networking pages on [Cluster Network](../networking/clusternetwork.md) and [VLAN Config](../networking/harvester-network.md).
+- IP range for VM Migration Network
+  - The IP range must not conflict or overlap with Kubernetes cluster networks (`10.42.0.0/16`, `10.43.0.0/16`, `10.52.0.0/16` and `10.53.0.0/16` are reserved).
+  - The IP range must be in IPv4 CIDR format.
+  - Exclude IP addresses that KubeVirt pods and the VM migration network must not use.
+
+The following configuration is used as an example to explain the details of the VM migration network:
+
+- VLAN ID for VM Migration Network: `100`
+- Cluster Network: `vm-migration`
+- IP range: `192.168.1.0/24`
+- Exclude Address: `192.168.1.100/32`
+
+### Harvester VM Migration Network Setting
+
+The [`vm-migration-network` setting](./settings.md#vm-migration-network) allows you to configure the network used to isolate in-cluster VM migration traffic when segregation is required.
+
+You can [enable](#enable-the-vm-migration-network) and [disable](#disable-the-vm-migration-network) the VM migration network using either the UI or the CLI. When the setting is enabled, you must construct a Multus `NetworkAttachmentDefinition` CRD by configuring certain fields.
+
+#### Web UI
+
+:::tip
+
+Using the Harvester UI to configure the `vm-migration-network` setting is strongly recommended.
+
+:::
+
+##### Enable the VM Migration Network
+
+1. Go to **Advanced > Settings > vm-migration-network**.
+
+1. Select **Enabled**.
+
+1. Configure the **VLAN ID**, **Cluster Network**, **IP Range**, and **Exclude** fields to construct a Multus `NetworkAttachmentDefinition` CRD.
+
+1. Click **Save**.
+
+![storage-network-enabled.png](/img/v1.4/storagenetwork/storage-network-enabled.png)
+
+##### Disable the VM Migration Network
+
+1. Go to **Advanced > Settings > vm-migration-network**.
+
+1. Select **Disabled**.
+
+1. Click **Save**.
+
+Once the VM migration network is disabled, KubeVirt uses the `mgmt` network for VM migration operations.
+
+![storage-network-disabled.png](/img/v1.4/storagenetwork/storage-network-disabled.png)
+
+#### CLI
+
+You can use the following command to configure the [`vm-migration-network` setting](./settings.md#vm-migration-network):
+
+```bash
+kubectl edit settings.harvesterhci.io vm-migration-network
+```
+
+The value is either a JSON string or an empty string, as shown below:
+
+```json
+{
+  "vlan": 100,
+  "clusterNetwork": "vm-migration",
+  "range": "192.168.1.0/24",
+  "exclude":[
+    "192.168.1.100/32"
+  ]
+}
+```
+
+The full configuration looks like the following example:
+
+```yaml
+apiVersion: harvesterhci.io/v1beta1
+kind: Setting
+metadata:
+  name: vm-migration-network
+value: '{"vlan":100,"clusterNetwork":"vm-migration","range":"192.168.1.0/24", "exclude":["192.168.1.100/32"]}'
+```
+
+When the VM migration network is disabled, the full configuration is as follows:
+
+```yaml
+apiVersion: harvesterhci.io/v1beta1
+kind: Setting
+metadata:
+  name: vm-migration-network
+```
+
+:::caution
+
+Harvester considers extra insignificant characters in a JSON string as a different configuration.
+
+Specifying a valid value in the `value` field enables the VM migration network. Deleting the `value` field disables the VM migration network.
+
+:::
+
+### After Applying Harvester VM Migration Network Setting
+
+Harvester creates a new NetworkAttachmentDefinition and updates the KubeVirt configuration.
+
+Once the KubeVirt configuration is updated, KubeVirt restarts all `virt-handler` pods to apply the new network configuration.
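+
+As a quick sanity check (a sketch only; the generated NetworkAttachmentDefinition name varies, and the default `harvester-system` namespace is assumed), you can list the generated NetworkAttachmentDefinition and wait for the `virt-handler` pods to finish restarting:
+
+```bash
+# List the NetworkAttachmentDefinition that Harvester generated for the VM migration network.
+kubectl get network-attachment-definitions.k8s.cni.cncf.io -n harvester-system
+
+# Wait for the virt-handler DaemonSet to finish rolling out with the new network configuration.
+kubectl -n harvester-system rollout status daemonset/virt-handler
+```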
+
+### Verify Configuration is Completed
+
+#### Step 1
+
+Check that the `vm-migration-network` setting's status is `True` and the type is `configured`:
+
+```bash
+kubectl get settings.harvesterhci.io vm-migration-network -o yaml
+```
+
+Completed Setting Example:
+
+```yaml
+apiVersion: harvesterhci.io/v1beta1
+kind: Setting
+metadata:
+  annotations:
+    vm-migration-network.settings.harvesterhci.io/hash: ec8322fb6b741f94739cbb904fc73c3fda864d6d
+    vm-migration-network.settings.harvesterhci.io/net-attach-def: harvester-system/vm-migration-network-6flk7
+  creationTimestamp: "2022-10-13T06:36:39Z"
+  generation: 51
+  name: vm-migration-network
+  resourceVersion: "154638"
+  uid: 2233ad63-ee52-45f6-a79c-147e48fc88db
+status:
+  conditions:
+  - lastUpdateTime: "2022-10-13T13:05:17Z"
+    reason: Completed
+    status: "True"
+    type: configured
+```
+
+#### Step 2
+
+Verify the readiness of all KubeVirt `virt-handler` pods, and confirm that their networks are correctly configured.
+
+Execute the following command to inspect a pod's details:
+
+```bash
+kubectl -n harvester-system describe pod <pod-name>
+```
+
+#### Step 3
+
+Check the `k8s.v1.cni.cncf.io/network-status` annotations and ensure that an interface named `migration0` exists, with an IP address within the designated IP range.
+
+You can use the following command to show all `virt-handler` pods for verification:
+
+```bash
+kubectl get pods -n harvester-system -l kubevirt.io=virt-handler -o yaml
+```
+
+Correct Network Example:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  annotations:
+    cni.projectcalico.org/containerID: 004522bc8468ea707038b43813cce2fba144f0e97551d2d358808d57caf7b543
+    cni.projectcalico.org/podIP: 10.52.2.122/32
+    cni.projectcalico.org/podIPs: 10.52.2.122/32
+    k8s.v1.cni.cncf.io/network-status: |-
+      [{
+        "name": "k8s-pod-network",
+        "ips": [
+            "10.52.2.122"
+        ],
+        "default": true,
+        "dns": {}
+      },{
+        "name": "harvester-system/vm-migration-network-6flk7",
+        "interface": "migration0",
+        "ips": [
+            "10.1.2.1"
+        ],
+        "mac": "c6:30:6f:02:52:3e",
+        "dns": {}
+      }]
+    k8s.v1.cni.cncf.io/networks: vm-migration-network-6flk7@migration0
+
+Omitted...
+```
+
+## Best Practices
+
+- When configuring an [IP range](#configuration-example) for the VM migration network, ensure that the allocated IP addresses can meet the future needs of the cluster. This is important because KubeVirt pods (`virt-handler`) stop running when new nodes are added to the cluster after the VM migration network is configured, and when the required number of IPs exceeds the allocated IPs. Resolving the issue involves reconfiguring the VM migration network with the correct IP range.
+
+- Configure the VM migration network on a non-`mgmt` cluster network to ensure complete separation of the VM migration traffic from the Kubernetes control plane traffic. Using `mgmt` is possible but not recommended because of the negative impact (resource and bandwidth contention) on the control plane network performance. Use `mgmt` only if your cluster has NIC-related constraints and if you can completely segregate the traffic.
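+
+For example, to estimate how many addresses the VM migration network currently consumes (one per `virt-handler` pod, which normally means one per node), you can count the pods. This is only a rough sketch; adjust it to your environment:
+
+```bash
+# Each virt-handler pod is assigned one IP from the VM migration network range.
+kubectl get pods -n harvester-system -l kubevirt.io=virt-handler --no-headers | wc -l
+```
+
+Compare the result (plus any nodes you plan to add) with the number of usable addresses in the configured CIDR, excluding the addresses listed in `exclude`.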