- Clarified the access address for the Primary Object Storage in the disaster recovery steps.
- Renamed the "Primary-Standby Switchover Procedure" section to "Failover" for better clarity.
- Expanded the "Disaster Recovery" section to include recovery steps for the original Primary Harbor.
- Added details on automatic start/stop mechanisms for the disaster recovery instance, including configuration and script examples for managing Harbor and PostgreSQL instances.
docs/en/solutions/How_to_perform_disaster_recovery_for_harbor.md (161 additions & 3 deletions)
@@ -108,7 +108,7 @@ You need to create a CephObjectStoreUser in advance to obtain the access credent
You only need to create the CephObjectStoreUser on the Primary Object Storage. The user information will be automatically synchronized to the Secondary Object Storage through the disaster recovery replication mechanism.
:::
-2. This `PRIMARY_OBJECT_STORAGE_ADDRESS` is the access address of the Object Storage, you can get it from the step [Configure External Access for Primary Zone](https://docs.alauda.io/container_platform/4.1/storage/storagesystem_ceph/how_to/disaster_recovery/dr_object.html#configure-external-access-for-primary-zone) of `Object Storage Disaster Recovery`.
+2. This `PRIMARY_OBJECT_STORAGE_ADDRESS` is the access address of the Object Storage; you can get it from the [Configure External Access for Primary Zone](https://docs.alauda.io/container_platform/4.1/storage/storagesystem_ceph/how_to/disaster_recovery/dr_object.html#address) step of `Object Storage Disaster Recovery`.
3. Create a Harbor registry bucket on Primary Object Storage using mc, in this example, the bucket name is `harbor-registry`.
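
A minimal sketch of this step with the MinIO client, assuming an alias named `primary` and placeholder credentials (substitute the address obtained above and your CephObjectStoreUser keys):

```bash
# Register the Primary Object Storage under the alias "primary"
mc alias set primary http://<PRIMARY_OBJECT_STORAGE_ADDRESS> <ACCESS_KEY> <SECRET_KEY>

# Create the Harbor registry bucket and verify it exists
mc mb primary/harbor-registry
mc ls primary
```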
@@ -279,7 +279,7 @@ spec:
replicas: 0
```
-### Primary-Standby Switchover Procedure in Disaster Scenarios
+### Failover
1. First confirm that all Primary Harbor components are not in working state, otherwise stop all Primary Harbor components first.
2. Promote Secondary PostgreSQL to Primary PostgreSQL. Refer to `PostgreSQL Hot Standby Cluster Configuration Guide`, the switchover procedure.
@@ -311,7 +311,27 @@ spec:
5. Test image push and pull to verify that Harbor is working properly.
6. Switch external access addresses to Secondary Harbor.
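
A hedged sketch of the push/pull test from step 5, assuming a Secondary Harbor reachable at `<secondary-harbor-address>` and the default `library` project (address, project, and credentials are placeholders):

```bash
# Log in to the Secondary Harbor registry
docker login <secondary-harbor-address>

# Push a small test image
docker pull busybox:latest
docker tag busybox:latest <secondary-harbor-address>/library/busybox:dr-test
docker push <secondary-harbor-address>/library/busybox:dr-test

# Pull it back to confirm reads work as well
docker rmi <secondary-harbor-address>/library/busybox:dr-test
docker pull <secondary-harbor-address>/library/busybox:dr-test
```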
-### Disaster Recovery Data Check
+### Disaster Recovery
+
+When the primary cluster recovers from a disaster, you can restore the original Primary Harbor to operate as a Secondary Harbor. Follow these steps to perform the recovery:
+
+1. Set the replica count of all Harbor components to 0.
+2. Configure the original Primary PostgreSQL to operate as Secondary PostgreSQL according to the `PostgreSQL Hot Standby Cluster Configuration Guide`.
+3. Convert the original Primary Object Storage to Secondary Object Storage.
+
+```bash
+# From within the recovered zone, pull the latest realm configuration from the current master zone
+# (the endpoint and credentials below are placeholders for your master zone's values):
+radosgw-admin realm pull --rgw-realm=<realm-name> --url=<master-zone-endpoint> --access-key=<access-key> --secret=<secret-key>
+
+# Make the recovered zone the master and default zone:
+radosgw-admin zone modify --rgw-realm=<realm-name> --rgw-zonegroup=<zone-group-name> --rgw-zone=<primary-zone-name> --master --default
+```
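
Depending on the Ceph multisite configuration, the period may also need to be committed after changing the master zone; a hedged follow-up using the standard radosgw-admin workflow:

```bash
# Commit the updated period so the zone/zonegroup changes propagate across the realm
radosgw-admin period update --commit
```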
+
+After completing these steps, the original Primary Harbor will operate as a Secondary Harbor.
+
+If you need the original Primary Harbor to operate as the Primary Harbor again, follow the Failover procedure to promote the current Secondary Harbor (the recovered original Primary) back to Primary Harbor, and then configure the Harbor that was promoted during the disaster to operate as a Secondary Harbor.
+
### Data sync check
Check the synchronization status of Object Storage and PostgreSQL to ensure that the disaster recovery is successful.
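
A hedged sketch of such a check, assuming radosgw-admin is reachable (for example via the rook-ceph toolbox pod) and psql can connect to the Primary PostgreSQL (connection details are placeholders):

```bash
# Object Storage: multisite replication status between the two zones
radosgw-admin sync status

# PostgreSQL: confirm the standby is streaming WAL from the primary
psql -h <primary-postgresql-host> -U postgres -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
```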
@@ -353,3 +373,141 @@ The RTO represents the maximum acceptable downtime during disaster recovery. Thi
The operational steps are similar to building a Harbor disaster recovery solution with `Alauda Build of Rook-Ceph` and `Alauda support for PostgreSQL`. Simply replace Object Storage and PostgreSQL with other object storage and PostgreSQL solutions.
Ensure that the Object Storage and PostgreSQL solutions support disaster recovery capabilities.
+
+## Automatic Start/Stop of Disaster Recovery Instance
+
+This mechanism enables automatic activation of the Secondary Harbor instance when a disaster occurs. It supports custom check mechanisms through user-defined scripts and provides control over Harbor dependency configurations.
+
+```mermaid
+flowchart TD
+    Start[Monitoring Program] --> CheckScript[Check if Instance Should Start]
+```
+
+### How to Configure and Run the Auto Start/Stop Program
+
+1. Prepare the configuration file `config.yaml`:
+
+```yaml
+check_script: /path/to/check.sh   # Path to the script that checks if the instance should start
+start_script: /path/to/start.sh   # Path to the script that starts the Harbor instance
+stop_script: /path/to/stop.sh     # Path to the script that stops the Harbor instance
+check_interval: 30s               # Interval between consecutive check runs
+failure_threshold: 3              # Consecutive failed checks tolerated before the state is changed
+script_timeout: 10s               # Timeout applied to each script execution
+```
+
+2. Create the corresponding script files:
+
+- **check.sh**: This script must be customized based on your internal implementation. It should return exit code 0 when the current cluster instance should be started, and a non-zero exit code otherwise. The following is a simple DNS IP check example (do not use directly in production):
+
+```bash
+HARBOR_DOMAIN="${HARBOR_DOMAIN:-}"
+HARBOR_IP="${HARBOR_IP:-}"
+
+# Resolve the Harbor domain and compare the result with this cluster's expected IP
+RESOLVED_IP=$(nslookup "$HARBOR_DOMAIN" 2>/dev/null | grep -A 1 "Name:" | grep "Address:" | awk '{print $2}' | head -n 1)
+if [ "$RESOLVED_IP" = "$HARBOR_IP" ]; then
+  exit 0
+else
+  exit 1
+fi
+```
+
+- **start.sh**: The start script should include checks for Harbor dependencies and the startup of the Harbor instance.
+
+```bash
+# Check and control dependencies, such as verifying if the database is the primary instance
+```
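
A minimal sketch of such a start script, assuming Harbor components run as Deployments labeled `app=harbor` in the `harbor` namespace and that PostgreSQL primary status can be queried with `pg_is_in_recovery()` (all names and labels are placeholders):

```bash
#!/usr/bin/env bash
set -euo pipefail

HARBOR_NAMESPACE="${HARBOR_NAMESPACE:-harbor}"   # placeholder namespace
PG_HOST="${PG_HOST:-postgres.harbor.svc}"        # placeholder PostgreSQL service

# Dependency check: the local PostgreSQL must already be promoted to primary
IN_RECOVERY=$(psql -h "$PG_HOST" -U postgres -tAc "SELECT pg_is_in_recovery();")
if [ "$IN_RECOVERY" != "f" ]; then
  echo "PostgreSQL is still a replica; refusing to start Harbor" >&2
  exit 1
fi

# Start the Harbor instance by scaling its components back up
kubectl -n "$HARBOR_NAMESPACE" scale deployment -l app=harbor --replicas=1
```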
+
+### `Alauda support for PostgreSQL` Start/Stop Script Examples
+
+When using the `Alauda support for PostgreSQL` solution with the `PostgreSQL Hot Standby Cluster Configuration Guide` to configure a disaster recovery cluster, you need to configure replication information in both Primary and Secondary PostgreSQL clusters. This ensures that during automatic failover, you only need to modify `clusterReplication.isReplica` and `numberOfInstances` to complete the switchover:
+
+**Primary Configuration:**
+
+```yaml
+clusterReplication:
+  enabled: true
+  isReplica: false
+  peerHost: 192.168.130.206 # Secondary cluster node IP
+  peerPort: 31661 # Secondary cluster NodePort
+  replSvcType: NodePort
+  bootstrapSecret: standby-bootstrap-secret
+```
+
+The `standby-bootstrap-secret` should be configured according to the `Standby Cluster Configuration` section in the `PostgreSQL Hot Standby Cluster Configuration Guide`, using the same value as the Secondary cluster.
+
+**Secondary Configuration:**
+
+```yaml
+clusterReplication:
+  enabled: true
+  isReplica: true
+  peerHost: 192.168.12.108 # Primary cluster node IP
+```
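
As a hedged illustration of the switchover described above, assuming the cluster is exposed as a custom resource carrying these fields under `spec` (the resource kind `postgresql`, name `harbor-pg`, and instance count are placeholders), promoting the Secondary could look like:

```bash
# Stop treating the Secondary as a replica and scale it to the desired size
kubectl patch postgresql harbor-pg --type merge -p '{
  "spec": {
    "clusterReplication": { "isReplica": false },
    "numberOfInstances": 2
  }
}'
```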
+
+### Alauda Build of Rook-Ceph Start/Stop Script Examples
+
+- **Start Script Example**: For more details, refer to [Object Storage Disaster Recovery](https://docs.alauda.io/container_platform/4.1/storage/storagesystem_ceph/how_to/disaster_recovery/dr_object.html)
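
A minimal sketch of what such a start script could do, reusing the zone-promotion commands shown earlier in this document (realm, zone group, and zone names are placeholders):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Promote the local (secondary) zone to master and default so Harbor can write to it
radosgw-admin zone modify --rgw-realm=<realm-name> --rgw-zonegroup=<zone-group-name> --rgw-zone=<secondary-zone-name> --master --default

# Commit the period so the change takes effect
radosgw-admin period update --commit
```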