|
| 1 | +# Troubleshooting Guide |
| 2 | + |
| 3 | +## Table of Contents |
| 4 | +- [Troubleshooting Windows](#troubleshooting-windows) |
| 5 | + - [Verify if your EKS Cluster is on the required Platform Version](#verify-if-your-eks-cluster-is-on-the-required-platform-version) |
| 6 | + - [Verify Windows IPAM is enabled in the ConfigMap](#verify-windows-ipam-is-enabled-in-the-configmap) |
| 7 | + - [Verify Node has the Resource Capacity](#verify-node-has-the-resource-capacity) |
| 8 | + - [Verify Pod has the resource limits](#verify-pod-has-the-resource-limit) |
| 9 | + - [Verify Pod has the IPv4 Address Annotation](#verify-pod-has-the-ipv4-address-annotation) |
| 10 | + - [Look for Issues on the Windows Host](#look-for-issues-on-the-windows-host) |
| 11 | +- [Troubleshooting Security Group for Pods](#troubleshooting-security-group-for-pods) |
| 12 | + - [Verify ENI Trunking is Enabled](#verify-eni-trunking-is-enabled) |
| 13 | + - [Verify Trunk ENI is created](#verify-trunk-eni-is-created) |
| 14 | + - [Verify Pod has the resource limit](#verify-pod-has-the-resource-limit) |
| 15 | + - [Verify Pod has the pod-eni annotation](#verify-pod-has-the-pod-eni-annotation) |
| 16 | + - [Check Issues with VPC CNI](#check-issues-with-vpc-cni) |
| 17 | +- [Troubleshooting Prefix Delegation for Windows](#troubleshooting-prefix-delegation-for-windows) |
| 18 | + - [Verify Windows prefix delegation is enabled in the ConfigMap](#verify-windows-prefix-delegation-is-enabled-in-the-configmap) |
| 19 | + - [Check both pod events and node events for any specific error](#check-both-pod-events-and-node-events-for-any-specific-error) |
| 20 | + - [Verify Node has the required Resource Capacity](#verify-node-has-the-required-resource-capacity) |
| 21 | + - [Verify Pod has the required resource limits](#verify-pod-has-the-required-resource-limits) |
| 22 | + - [Verify Pod has the required IPv4 Address Annotation](#verify-pod-has-the-required-ipv4-address-annotation) |
| 23 | + - [Verify the configuration options set for windows prefix delegation](#verify-the-configuration-options-set-for-windows-prefix-delegation) |
| 24 | + - [Look for networking issues on the Windows Host](#look-for-networking-issues-on-the-windows-host) |
| 25 | +- [List of Common Issues](#list-of-common-issues) |
| 26 | + - [PSP Blocking Controller Annotations](#psp-blocking-controller-annotations) |
| 27 | + - [Missing IAM Permissions on the Cluster Role](#missing-iam-permissions-on-the-cluster-role) |
| 28 | + - [ENI/IP Exhaustion](#eniip-exhaustion) |
| 29 | + - [Disable prefix delegation feature for Windows](#disable-prefix-delegation-feature-for-windows) |
| 30 | + |
1 | 31 | ## Troubleshooting Windows |
2 | 32 |
|
3 | 33 | Follow the steps in this guide in the order listed to debug issues with Windows Nodes and Pods. |
@@ -227,6 +257,80 @@ If the Pod is still stuck in `ContainerCreating` you can, |
227 | 257 | - Check the CNI Logs from the collected logs. |
228 | 258 | - Open an [Issue](https://github.com/aws/amazon-vpc-resource-controller-k8s/issues/new/choose) in this repository if the problem still persists. |
229 | 259 |
|
| 260 | +## Troubleshooting Prefix Delegation for Windows |
| 261 | +Follow these steps to troubleshoot issues with Windows Nodes and Pods when using `prefix delegation` mode. |
| 262 | + |
| 263 | +Work through the steps below in the order listed to locate the failure in the workflow. |
| 264 | +### Verify Windows prefix delegation is enabled in the ConfigMap |
| 265 | + |
| 266 | +To get the ConfigMap and its data field, run: |
| 267 | + |
| 268 | +```bash |
| 269 | +kubectl get configmaps -n kube-system amazon-vpc-cni -o custom-columns=":data" |
| 270 | +``` |
| 271 | + |
| 272 | +The output should contain the following data: |
| 273 | +``` |
| 274 | +enable-windows-ipam:true enable-windows-prefix-delegation:true |
| 275 | +``` |
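The check above can also be scripted. The following is a minimal sketch and not part of the controller: `flag_enabled` is a hypothetical helper that looks for `<key>:true` in the string printed by the `kubectl` command above. It assumes the output is a whitespace-separated list of `key:value` pairs, optionally wrapped in `map[...]`.

```bash
# Hypothetical helper: succeeds when the ConfigMap data string contains "<key>:true".
# $1: data string as printed by the kubectl command above
# $2: key to look for, e.g. enable-windows-prefix-delegation
flag_enabled() {
  # Turn newlines and the surrounding brackets into spaces so that
  # every key:value pair is delimited by spaces on both sides.
  normalized=" $(printf '%s' "$1" | tr '\n' ' ' | sed 's/[][]/ /g') "
  case "$normalized" in
    *" $2:true "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `flag_enabled "$(kubectl get configmaps -n kube-system amazon-vpc-cni -o custom-columns=":data")" enable-windows-prefix-delegation` succeeds only when the key is present with the value `true`.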
| 276 | + |
| 277 | +**Resolution** |
| 278 | + |
| 279 | +If the ConfigMap is missing or doesn't have the above fields, create or update the `amazon-vpc-cni` ConfigMap with the required fields: |
| 280 | +``` |
| 281 | +enable-windows-ipam: "true" |
| 282 | +enable-windows-prefix-delegation: "true" |
| 283 | +``` |
| 284 | + |
| 285 | +**Note**: Windows IPAM must be enabled in order to use the Windows prefix delegation feature. |
| 286 | + |
| 287 | +### Check both pod events and node events for any specific error |
| 288 | +If the controller encounters an error during its prefix delegation workflow that requires action from the customer, it emits the error as a pod event, a node event, or both. Checking these events is therefore a good starting point for finding the root cause. |
| 289 | + |
| 290 | +You can list the events from all namespaces using the following command. |
| 291 | +```bash |
| 292 | +kubectl get events --all-namespaces |
| 293 | +``` |
| 294 | + |
| 295 | +If the events report an explicit error, investigate that error first. |
| 296 | + |
| 297 | +For example, if the error states that there is insufficient space in the subnet to carve out a /28 prefix, check that the subnet still has free /28 ranges (blocks of 16 contiguous addresses) that can be allocated as prefixes. |
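On a busy cluster the event list can be long. As a minimal sketch, the hypothetical `warning_events` filter below keeps only the header row and the rows whose `TYPE` column is `Warning`. It assumes the default table layout of `kubectl get events --all-namespaces` (`NAMESPACE`, `LAST SEEN`, `TYPE`, ...), and it assumes the controller reports its errors with the `Warning` type, which this guide does not confirm.

```bash
# Hypothetical filter: reads the event table on stdin and keeps the header
# plus every row whose TYPE column is "Warning".
# Assumption: columns are NAMESPACE, LAST SEEN, TYPE, REASON, OBJECT, MESSAGE,
# so TYPE is the third whitespace-separated field of each data row.
warning_events() {
  awk 'NR == 1 || $3 == "Warning"'
}
```

Usage: `kubectl get events --all-namespaces | warning_events`. If nothing relevant shows up, fall back to the unfiltered list, since the type assumption may not hold.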
| 298 | + |
| 299 | +### Verify Node has the required Resource Capacity |
| 300 | +Same as [Verify Node has the Resource Capacity](#verify-node-has-the-resource-capacity) |
| 301 | + |
| 302 | +### Verify Pod has the required resource limits |
| 303 | +Same as [Verify Pod has the resource limits](#verify-pod-has-the-resource-limit) |
| 304 | + |
| 305 | +### Verify Pod has the required IPv4 Address Annotation |
| 306 | +Same as [Verify Pod has the IPv4 Address Annotation](#verify-pod-has-the-ipv4-address-annotation) |
| 307 | + |
| 308 | +### Verify the configuration options set for windows prefix delegation |
| 309 | +Configuration options can be used to fine-tune the behaviour of prefix delegation on Windows. The details about the options are available [here](windows/prefix_delegation_config_options.md). |
| 310 | + |
| 311 | +To get the ConfigMap and its data field, run: |
| 312 | + |
| 313 | +```bash |
| 314 | +kubectl get configmaps -n kube-system amazon-vpc-cni -o custom-columns=":data" |
| 315 | +``` |
| 316 | + |
| 317 | +If the data contains any of the following keys: |
| 318 | +``` |
| 319 | +minimum-ip-target |
| 320 | +warm-ip-target |
| 321 | +warm-prefix-target |
| 322 | +``` |
| 323 | +then the configuration options have been set. |
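To see at a glance which of these keys are set, the lookup can be scripted. This is a minimal sketch: `tuning_keys_present` is a hypothetical helper, and it assumes the data string uses the same `key:value` form as the output shown earlier in this guide.

```bash
# Hypothetical helper: prints each tuning key that appears in the data string,
# one per line. Prints nothing when none of the keys are set.
# $1: data string as printed by the kubectl command above
tuning_keys_present() {
  for key in minimum-ip-target warm-ip-target warm-prefix-target; do
    case "$1" in
      *"$key:"*) printf '%s\n' "$key" ;;
    esac
  done
}
```

Usage: `tuning_keys_present "$(kubectl get configmaps -n kube-system amazon-vpc-cni -o custom-columns=":data")"`.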
| 324 | + |
| 325 | +**Resolution** |
| 326 | + |
| 327 | +Verify that the configuration is set as described in the [documentation](windows/prefix_delegation_config_options.md). |
| 328 | + |
| 329 | +Alternatively, to isolate the issue, try removing the above keys from the ConfigMap. |
| 330 | + |
| 331 | +### Look for networking issues on the Windows Host |
| 332 | +Same as [Look for Issues on the Windows Host](#look-for-issues-on-the-windows-host) |
| 333 | + |
230 | 334 | ## List of Common Issues |
231 | 335 |
|
232 | 336 | ### PSP Blocking Controller Annotations |
@@ -265,3 +369,21 @@ aws ec2 describe-subnets --subnet-id subnet-id-here |
265 | 369 | ``` |
266 | 370 |
|
267 | 371 | From the response, you can see how many IPv4 addresses are available in the Subnet in the `AvailableIpAddressCount` field. |
| 372 | + |
| 373 | +### Disable prefix delegation feature for Windows |
| 374 | + |
| 375 | +First check whether the feature is enabled in the ConfigMap. To get the ConfigMap and its data field, run: |
| 376 | + |
| 377 | +```bash |
| 378 | +kubectl get configmaps -n kube-system amazon-vpc-cni -o custom-columns=":data" |
| 379 | +``` |
| 380 | + |
| 381 | +If the output contains the following data, |
| 382 | +``` |
| 383 | +enable-windows-prefix-delegation:true |
| 384 | +``` |
| 385 | +then the feature is enabled. |
| 386 | + |
| 387 | +**Resolution** |
| 388 | + |
| 389 | +You can disable the feature by editing the `amazon-vpc-cni` ConfigMap and setting `enable-windows-prefix-delegation` to `"false"`. |
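Instead of editing the ConfigMap by hand, the change can be applied as a JSON merge patch. The following is a minimal sketch: `prefix_delegation_patch` is a hypothetical helper that only builds the patch body, and applying it with `kubectl patch` is one possible approach rather than the documented procedure.

```bash
# Hypothetical helper: prints a JSON merge patch that sets the
# enable-windows-prefix-delegation key in the ConfigMap data.
# $1: "true" or "false" (stored as a string, as ConfigMap values must be)
prefix_delegation_patch() {
  printf '{"data":{"enable-windows-prefix-delegation":"%s"}}' "$1"
}
```

Usage: `kubectl patch configmap amazon-vpc-cni -n kube-system --type merge -p "$(prefix_delegation_patch false)"`. A merge patch changes only the named key and leaves the other entries in the ConfigMap untouched.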