docs/user-guide/cluster-configuration.md
+50 lines changed: 50 additions & 0 deletions
@@ -22,6 +22,56 @@ This guide provides instructions for Day‑2 configuration change use cases, to
 
 ## Use Cases
 
+Day 2 configuration changes are supported for both hardware configuration updates and policy parameter changes. The system supports retry scenarios even after previous configuration attempts have timed out or failed.
+
+### Hardware Configuration Timeouts and Retry
+
+Hardware configuration timeout detection is handled by the Metal3 hardware plugin. When a configuration operation times out or fails, the system supports retry through spec changes.
+
+#### Retry Mechanism
+
+* **Configuration timeouts/failures**: Can be retried by updating the ProvisioningRequest spec
+* **Provisioning timeouts/failures**: Cannot be retried; the ProvisioningRequest must be deleted and recreated
+* **Retry mechanism**: Uses `ConfigTransactionId` (set to ProvisioningRequest generation) to track
+  configuration changes. When the ProvisioningRequest spec changes, the generation increments, creating
+  a new `ConfigTransactionId`. The system compares this with `ObservedConfigTransactionId` to detect
+  spec changes and trigger new configuration attempts.
+* **Terminal state override**: The system allows clearing terminal states (timeout/failed) when the ProvisioningRequest is in pending state due to spec changes, **except for hardware provisioning timeouts/failures which require deleting and recreating the ProvisioningRequest**.
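Note on the retry rules added above: the decision they describe can be sketched in a few lines of Python. This is only an illustration of the comparison between `ConfigTransactionId` and `ObservedConfigTransactionId`, not the controller's actual code. The `RequestState` class, its field names, and the `failed_phase` values are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestState:
    """Hypothetical snapshot of a ProvisioningRequest; field names are illustrative."""
    generation: int                      # bumped whenever the spec changes
    observed_config_transaction_id: int  # last transaction the hardware plugin reported on
    failed_phase: Optional[str] = None   # None, "configuration", or "provisioning"


def config_transaction_id(state: RequestState) -> int:
    # Per the added text, the transaction ID is set to the generation.
    return state.generation


def should_start_config_attempt(state: RequestState) -> bool:
    # Provisioning timeouts/failures are never retried in place:
    # the request has to be deleted and recreated.
    if state.failed_phase == "provisioning":
        return False
    # A spec change shows up as a transaction ID that has not been observed yet.
    return config_transaction_id(state) != state.observed_config_transaction_id
```

For example, a request whose configuration timed out at generation 2 and whose spec was then edited (generation 3) would qualify for a new attempt, while the same edit after a provisioning timeout would not.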
+
+#### Troubleshooting Configuration Timeouts
+
+When hardware configuration times out, the timeout is detected by the Metal3 plugin and communicated back to the O-Cloud Manager via callbacks. To troubleshoot:
+
+1. **Check configuration status**:
+
+   ```console
+   oc get provisioningrequest <UUID> -o yaml
+   ```
+
+   Look for `HardwareConfigured` condition with `reason: TimedOut`
+
+2. **Check NodeAllocationRequest status**:
+
+   ```console
+   oc get nodeallocationrequest -A
+   ```
+
+   Look for timeout conditions in the Metal3 plugin namespace
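Note on troubleshooting step 1: the same check can be scripted once the resource is exported as JSON (for instance with `-o json` instead of `-o yaml`). The sketch below assumes the conditions live in a standard Kubernetes-style `status.conditions` list with `type` and `reason` keys. That layout is an assumption, and the sample document is made up for illustration.

```python
import json
from typing import Optional


def find_condition(resource: dict, cond_type: str) -> Optional[dict]:
    # Assumes a Kubernetes-style status.conditions list (not confirmed by the diff).
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def hardware_config_timed_out(resource: dict) -> bool:
    # True when the HardwareConfigured condition carries reason TimedOut.
    cond = find_condition(resource, "HardwareConfigured")
    return cond is not None and cond.get("reason") == "TimedOut"


# Made-up sample standing in for the JSON form of a ProvisioningRequest.
SAMPLE = json.loads("""
{
  "status": {
    "conditions": [
      {"type": "HardwareProvisioned", "status": "True", "reason": "Completed"},
      {"type": "HardwareConfigured", "status": "False", "reason": "TimedOut"}
    ]
  }
}
""")

if __name__ == "__main__":
    print(hardware_config_timed_out(SAMPLE))
```

A resource without a `HardwareConfigured` condition is reported as not timed out, so the helper is safe to run against requests that have not reached the configuration stage.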
docs/user-guide/cluster-provisioning.md
+11 -3 lines changed: 11 additions & 3 deletions
@@ -337,24 +337,32 @@ Default timeouts:
 
 - Hardware provisioning: 90m
 - Cluster installation: 90m
-- Cluster configuration 30m
+- Cluster configuration: 30m
 
-These timeouts can be configured in their respective ConfigMaps or resource spec fields. The timeout value should be a duration string. For example:
+#### Hardware Provisioning Timeout
 
-For hardware provisioning, set in the `spec.templates.hwTemplate` hardware template resource:
+Hardware provisioning timeout detection is handled by the Metal3 hardware plugin. The timeout is configured in the `HardwareTemplate` resource and passed to the `NodeAllocationRequest`. When a timeout occurs, the plugin sends a callback to the O-Cloud Manager with the timeout status.
+
+Configure hardware provisioning timeout in the `spec.templates.hwTemplate` hardware template resource:
 
 ``` yaml
 spec:
   hardwareProvisioningTimeout: "100m"
 ```
 
+If not specified, the default timeout value (90m) will be applied.
+
+#### Cluster Installation Timeout
+
 For cluster installation, set in the `spec.templates.clusterInstanceDefaults` ConfigMap:
 
 ```yaml
 data:
   clusterInstallationTimeout: "100m"
 ```
 
+#### Cluster Configuration Timeout
+
 For cluster configuration, set in the `spec.templates.policyTemplateDefaults` ConfigMap: