Commit 3a17365
committed
🐛 Do not overwrite global http.DefaultClient TLSConfig
The current TLS configuration was overriding the TLSConfig for the global
`http.DefaultClient`. This call is being used by controllers such as the
`ExtensionConfig` controller which calls this function from multiple
concurrent workers. This leads to a race where the TLS `ServerName` is
configured differently to that of the URL it is trying to call and X509
validation fails. An example can be seen from the CAPI logs below:
```
E1126 12:43:22.449064 1 controller.go:347] "Reconciler error" err="failed to discover ExtensionConfig extension-config-a: failed to discover extension \"extension-config-a\": http call failed: Post \"https://extension-config-a-runtimehooks.extension-config-a-system.svc:443/hooks.runtime.cluster.x-k8s.io/v1alpha1/discovery?timeout=10s\": tls: failed to verify certificate: x509: certificate is valid for extension-config-a-runtimehooks.extension-config-a-system.svc, extension-config-a-runtimehooks.extension-config-a-system.svc.cluster.local, not extension-config-b.extension-config-b-system.svc" controller="extensionconfig" controllerGroup="runtime.cluster.x-k8s.io" controllerKind="ExtensionConfig" ExtensionConfig="extension-config-a" namespace="" name="extension-config-a" reconcileID="dfd00b69-3666-4818-b4a0-52eb1c391848"
E1126 12:53:42.919995 1 controller.go:347] "Reconciler error" err="failed to discover ExtensionConfig extension-config-b: failed to discover extension \"extension-config-b\": http call failed: Post \"https://extension-config-b.extension-config-b-system.svc:443/hooks.runtime.cluster.x-k8s.io/v1alpha1/discovery?timeout=10s\": tls: failed to verify certificate: x509: certificate is valid for extension-config-b.extension-config-b-system.svc, extension-config-b.extension-config-b-system.svc.cluster.local, not extension-config-a-runtimehooks.extension-config-a-system.svc" controller="extensionconfig" controllerGroup="runtime.cluster.x-k8s.io" controllerKind="ExtensionConfig" ExtensionConfig="extension-config-b" namespace="" name="extension-config-b" reconcileID="4cc93a96-cfcf-49f8-8276-b3725fc8e1b8"
```
Notice how the URL and the expected hostname are swapped in each log
indicating a race (TLSConfig being reconfigured in the middle of the
call by different worker threads.1 parent 32414fc commit 3a17365
1 file changed
+3
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
518 | 518 | | |
519 | 519 | | |
520 | 520 | | |
521 | | - | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
522 | 524 | | |
523 | 525 | | |
524 | 526 | | |
| |||
0 commit comments