Add restore validation feature: restore to a special keyspace, allowing validation of backup/restore in a single cluster (space willing) #12573
Open
saintstack wants to merge 15 commits into apple:main from saintstack:restore-validation-simple
Conversation
Implements restore validation using audit_storage to verify backup/restore correctness. Includes a minimal fix for the backup gap bug.

Key components:
- ValidateRestore audit type: compares source keys against restored keys at the \xff\x02/rlog/ prefix in the storage server
- DD audit fixes: propagate validation errors, handle DD failover correctly
- RestoreValidation and BackupAndRestoreValidation workloads for testing
- Simplified backup gap fix: prevent the snapshot from finishing in the same iteration it dispatches the last tasks (single flag + one check)

Backup gap bug fix (FileBackupAgent.actor.cpp): the original dispatcher marks ranges as DONE when selecting them for dispatch, then immediately checks whether all ranges are done. This causes snapshots to finish before the dispatched tasks complete, creating gaps in backup coverage. The fix adds a dispatchedInThisIteration flag: if tasks were dispatched in this iteration, the completion check is skipped, ensuring at least one full loop between dispatch and completion (see the sketch below). This minimal change prevents premature snapshot completion without complex state tracking.
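A minimal standalone sketch of the flag pattern described above (plain C++ with hypothetical names, not the actual FileBackupAgent.actor.cpp code): if any range was dispatched during the current pass, the completion check is skipped so the snapshot cannot finish in the same iteration that dispatched its last tasks.

```cpp
#include <vector>

struct Range {
    bool done = false;        // range has been handed to a backup task
    bool dispatched = false;  // hypothetical per-range bookkeeping for this sketch
};

// One pass of a simplified snapshot dispatcher loop.
// Returns true only when the snapshot may be marked complete.
bool dispatchPass(std::vector<Range>& ranges) {
    bool dispatchedInThisIteration = false;

    for (auto& r : ranges) {
        if (!r.done) {
            r.dispatched = true;              // dispatch a backup task for this range
            r.done = true;                    // the original code marks it DONE right here...
            dispatchedInThisIteration = true;
        }
    }

    // ...so without this flag the completion check below could succeed in the
    // very iteration that dispatched the last tasks, finishing the snapshot
    // before those tasks have run and leaving gaps in coverage.
    if (dispatchedInThisIteration)
        return false;                         // force at least one more full loop

    for (const auto& r : ranges)
        if (!r.done)
            return false;
    return true;
}
```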
When too many wrong_shard_server errors occur (stale shard location data), throw audit_storage_failed instead of audit_storage_cancelled. This ensures the audit is properly marked as Failed in the database rather than staying stuck in Running state. Also add a delay before retrying to let data distribution stabilize.
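A hedged sketch of the error-handling change (plain C++, hypothetical names standing in for FDB's error codes): once the retry budget for wrong_shard_server is exhausted, a terminal audit_storage_failed-style error is thrown so the audit is recorded as Failed, and each retry is preceded by a short delay.

```cpp
#include <chrono>
#include <stdexcept>
#include <thread>

// Hypothetical exception types standing in for FDB error codes in this sketch.
struct WrongShardServer : std::runtime_error {
    WrongShardServer() : std::runtime_error("wrong_shard_server") {}
};
struct AuditStorageFailed : std::runtime_error {
    AuditStorageFailed() : std::runtime_error("audit_storage_failed") {}
};

// Retry the audit a bounded number of times on stale shard locations; when the
// budget is exhausted, surface a terminal failure instead of a "cancelled"
// error so the audit does not stay stuck in the Running state.
template <typename AuditFn>
void runAuditWithRetry(AuditFn audit, int maxRetries) {
    for (int attempt = 0;; ++attempt) {
        try {
            audit();
            return;
        } catch (const WrongShardServer&) {
            if (attempt >= maxRetries)
                throw AuditStorageFailed();  // terminal: gets marked Failed in the database
            // brief pause so data distribution can stabilize before the retry
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    }
}
```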
The restore API can return success before all restored data is fully committed and visible to readers. Add a 5-second delay after restore completes before setting the completion marker. This prevents the validation audit from running too early and finding false mismatches due to in-flight commits.
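A small sketch of the ordering fix (standalone C++, hypothetical stand-ins for the workload steps): the completion marker that triggers the validation audit is only written after a pause, so in-flight commits from the restore become visible first.

```cpp
#include <chrono>
#include <iostream>
#include <thread>

// Hypothetical stand-ins for the workload steps around restore completion.
void waitForRestoreToReturn() { std::cout << "restore API reported success\n"; }
void setCompletionMarker()    { std::cout << "completion marker set; validation may start\n"; }

int main() {
    waitForRestoreToReturn();
    // The restore API can return before all restored data is visible, so wait
    // before signaling completion (the 5-second delay described above).
    std::this_thread::sleep_for(std::chrono::seconds(5));
    setCompletionMarker();
}
```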
The delay in the error path could interfere with actor cleanup or cause issues in other audit types. The retry itself should be sufficient to allow data distribution to stabilize.
When rangeLocations[].servers is empty, we were breaking out of the inner loop but continuing execution, which led to using the uninitialized targetServer variable at line 4538. This caused crashes/undefined behavior. Fix: set taskRangeBegin to skip the entire range and continue the loop, avoiding use of the uninitialized targetServer.
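A sketch of the control-flow fix (standalone C++, hypothetical names; the real code lives in the DD audit path): a location with no servers advances taskRangeBegin past the whole range and continues, rather than falling through and reading a targetServer that was never initialized.

```cpp
#include <string>
#include <vector>

struct RangeLocation {
    std::string rangeEnd;              // hypothetical: end key of this location's range
    std::vector<std::string> servers;  // storage servers holding the range
};

void dispatchAuditTasks(const std::vector<RangeLocation>& rangeLocations) {
    std::string taskRangeBegin;
    for (const auto& loc : rangeLocations) {
        if (loc.servers.empty()) {
            taskRangeBegin = loc.rangeEnd;  // skip the entire range...
            continue;                       // ...instead of breaking out and falling through
        }
        const std::string& targetServer = loc.servers.front();
        // ... issue the audit task for [taskRangeBegin, loc.rangeEnd) against targetServer ...
        taskRangeBegin = loc.rangeEnd;
        (void)targetServer;
    }
}
```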
The actor compiler was confused by using 'state' as a loop variable name since 'state' is a keyword in actor code. Renamed to 'auditState' to avoid the conflict.
Instead of adding recursive retry actors that can multiply and cause hangs, let wrong_shard_server errors propagate up to be handled by the higher-level error handlers. This prevents concurrent actors from all incrementing retryCount simultaneously and creating retry storms.
Even if the servers map is non-empty, individual DC server vectors could be empty. This would cause randomInt(0, 0) and out-of-bounds access. Skip empty DC server vectors to prevent crashes.
After skipping empty dcServers vectors, if storageServersToCheck is still empty, it means all DC server lists were empty. In this case, targetServer would never be initialized. Skip the entire shard to prevent using uninitialized targetServer.
When all audit states are Running or Failed and skipped, totalCount remains 0. The CompleteRatio calculation then divides by zero, causing a floating point exception (SIGFPE) and a process crash with exit code -2. This was the root cause of the -2 crashes in general test runs: the crashes occurred when ValidateHA or ValidateReplica audits (used in general tests) hit DD failovers and temporarily had all states in Running/Failed status.
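A sketch of the guard (standalone C++, hypothetical names): the ratio is only computed when totalCount is non-zero, so the division that previously raised SIGFPE can no longer execute.

```cpp
#include <cstdio>

// Returns 0 when nothing was counted (all audit states Running/Failed and skipped),
// instead of dividing by a zero totalCount.
double completeRatio(long completeCount, long totalCount) {
    if (totalCount == 0)
        return 0.0;
    return static_cast<double>(completeCount) / static_cast<double>(totalCount);
}

int main() {
    std::printf("CompleteRatio = %.2f\n", completeRatio(0, 0));  // previously crashed the process
    std::printf("CompleteRatio = %.2f\n", completeRatio(3, 4));
}
```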
The targetServer was only set when dcid == 0, but dcid gets incremented even for empty DC server lists (via continue). So if the first DC had an empty server list, dcid would be 1 when we encounter the first non-empty DC, and targetServer would never be set, causing a crash when accessed. Fixed by using a targetServerSet flag instead of checking dcid == 0. Now targetServer is set on the FIRST non-empty DC, regardless of index. This was the root cause of -2 crashes in general test runs.
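A sketch of the flag-based selection (standalone C++, hypothetical names): targetServer is taken from the first non-empty DC server list via an explicit targetServerSet flag, empty DC lists are skipped, and the shard is skipped entirely when every list is empty.

```cpp
#include <string>
#include <vector>

// Returns the servers to check for one shard; an empty result means the shard
// should be skipped because no DC had any servers listed.
std::vector<std::string> pickServersToCheck(const std::vector<std::vector<std::string>>& serversByDc) {
    std::vector<std::string> storageServersToCheck;
    std::string targetServer;
    bool targetServerSet = false;  // replaces the fragile dcid == 0 check

    for (const auto& dcServers : serversByDc) {
        if (dcServers.empty())
            continue;  // empty DC list: skip it rather than index out of bounds

        if (!targetServerSet) {
            targetServer = dcServers.front();  // first NON-EMPTY DC wins, whatever its index
            targetServerSet = true;
        }
        storageServersToCheck.insert(storageServersToCheck.end(), dcServers.begin(), dcServers.end());
    }

    if (!targetServerSet)
        return {};  // all DC lists were empty: skip the shard, never touch targetServer
    // ... targetServer would drive the comparison against the other servers ...
    return storageServersToCheck;
}
```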
Force-pushed: dcca4b7 to 77618b6
saintstack (Author) commented:
Looks like current origin/main is crashing in joshua. Seems unrelated to this PR (at least, I've tried main three times now -- gcc and clang -- and I get the below). Will come back here after we figure out what's going on in main. (Need joshua to log seed, test name, and whether buggify, at a very minimum... even on crash.)
This is an implementation of a Neethu design (design is included in the PR).
Here is the result of running the new simulation included here 100k times:

20251121-184138-stack-1458b890ad727389 compressed=True data_size=55635346 duration=3717292 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:44:53 sanity=False started=100000 stopped=20251121-192631 submitted=20251121-184138 timeout=5400 username=stack

I ran all tests 100k times and it looks like it hangs at the end, at 99975 or so. Looking to see if that is related.
Also verified the feature manually.