Skip to content

fix(snapshot): allow /sandbox/.openclaw-data symlinks in safeTarExtract#2488

Open
BenediktSchackenberg wants to merge 2 commits intoNVIDIA:mainfrom
BenediktSchackenberg:fix/safeTarExtract-symlink-audit-2317
Open

fix(snapshot): allow /sandbox/.openclaw-data symlinks in safeTarExtract#2488
BenediktSchackenberg wants to merge 2 commits intoNVIDIA:mainfrom
BenediktSchackenberg:fix/safeTarExtract-symlink-audit-2317

Conversation

@BenediktSchackenberg
Copy link
Copy Markdown
Contributor

@BenediktSchackenberg BenediktSchackenberg commented Apr 26, 2026

Problem

nemoclaw snapshot create, rebuild, and backup-all all fail on 0.0.22 with:

SECURITY: tar extraction blocked: post-extraction symlink audit failed:
symlink escape: workspace/media -> /sandbox/.openclaw-data/media

PR #2163 hardened safeTarExtract against tar-slip path traversal by rejecting any symlink whose target resolves outside the extraction root. This correctly blocks /etc/passwd etc., but over-rejects legitimate intra-sandbox symlinks created by Dockerfile.base for the .openclaw / .openclaw-data split. When extracted on the host, /sandbox/.openclaw-data/media doesn't exist on the host and isn't within the extraction root — so the audit falsely rejects it.

Fix

When the symlink target is an absolute path starting with /sandbox/.openclaw-data/ or /sandbox/.hermes-data/, also resolve it relative to the extraction root (strip the leading /sandbox/ prefix, resolve under targetDir). This mirrors how the symlink would resolve inside the sandbox container where these paths exist.

Absolute paths outside these known prefixes are still rejected as before. The security boundary is unchanged.

Tests

Added two regression tests:

23/23 tests pass.

Fixes #2317

Signed-off-by: Benedikt Schackenberg 6381261+BenediktSchackenberg@users.noreply.github.com

Summary by CodeRabbit

  • Bug Fixes

    • Improved symlink validation during archive extraction to correctly allow symlinks that map into authorized sandbox data directories while still blocking attempts to escape containment; error messages now report the resolved path shown to users.
  • Tests

    • Added regression tests verifying allowed symlinks to sandbox data and continued rejection of symlinks pointing outside or using traversal.

nemoclaw snapshot create / rebuild / backup-all fail on 0.0.22 because
safeTarExtract's post-extraction symlink audit rejects absolute symlinks
whose target (/sandbox/.openclaw-data/media, etc.) does not exist on the
host. These are legitimate intra-sandbox symlinks created by Dockerfile.base
for the .openclaw / .openclaw-data state split.

Fix: when resolving an absolute symlink target, if the path starts with
/sandbox/.openclaw-data/ or /sandbox/.hermes-data/, also check whether
it falls within the extraction root when the /sandbox/ prefix is stripped
and resolved relative to that root. This mirrors how the symlink resolves
inside the sandbox container (where /sandbox/.openclaw-data/* exists).

Absolute paths outside these known prefixes (e.g. /etc/passwd) are still
rejected as before. The security boundary is preserved.

Added two regression tests:
- Allows /sandbox/.openclaw-data/media symlinks (the NVIDIA#2317 repro case)
- Still blocks /etc/passwd and other non-/sandbox/.openclaw-data targets

Fixes NVIDIA#2317

Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 26, 2026 19:46
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 26, 2026

📝 Walkthrough

Walkthrough

Refines symlink escape detection to allow absolute symlinks that target canonical sandbox data prefixes by mapping /sandbox/ to the extraction root; otherwise rejects external targets. Tests add one known-safe assertion and two rejecting assertions covering traversal and external absolute targets.

Changes

Cohort / File(s) Summary
Symlink Escape Audit Logic
src/lib/sandbox-state.ts
Keep standard resolution as resolvedRelative for containment checks; handle absolute symlink targets starting with /sandbox/.openclaw-data/ or /sandbox/.hermes-data/ by normalizing and mapping /sandbox/ to the extraction root (dirPath); permit when mapped path is inside dirPath; use resolvedRelative in violation messages.
Symlink Security Tests
test/security-sandbox-tar-traversal.test.ts
Extend regression coverage for #2317: add one assertion that accepts absolute symlinks into /sandbox/.openclaw-data/* under the mapped extraction-root logic, and two assertions that still reject symlinks whose absolute targets are outside the allowed sandbox-data prefixes (including .. traversal).

Sequence Diagram(s)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • #2308: Modifies symlink-audit logic in src/lib/sandbox-state.ts to permit intra-sandbox absolute symlink targets during safeTarExtract.

Suggested reviewers

  • ericksoa

Poem

🐰 I hopped through paths both short and long,
Mapped sandbox roots where symlinks belong.
If targets stay home, I cheer and sing,
Else I thump my foot—no outside thing!
Safe digs only — hop, secure, and strong.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(snapshot): allow /sandbox/.openclaw-data symlinks in safeTarExtract' directly and specifically describes the main change: enabling support for /sandbox/.openclaw-data symlinks in the safeTarExtract function, which is the core objective of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/security-sandbox-tar-traversal.test.ts (1)

417-439: Security guardrail test looks correct.

This ensures absolute symlinks outside the allowed prefixes remain blocked.

Consider adding a test for path traversal within the allowed prefix to ensure the isWithinRoot check catches it:

🧪 Optional: Add path traversal edge case test
it("regression `#2317`: blocks path traversal via allowed prefix (/sandbox/.openclaw-data/../../../etc/passwd)", async () => {
  const { safeTarExtract } = await loadSandboxState();
  const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-2317-traversal-"));
  try {
    const targetDir = path.join(workDir, "backup");
    fs.mkdirSync(targetDir, { recursive: true });

    // Attempt escape via traversal within the allowed prefix
    const tar = buildTar([
      {
        path: "workspace/escape",
        type: "2",
        linkTarget: "/sandbox/.openclaw-data/../../../etc/passwd",
      },
    ]);

    const result = safeTarExtract(tar, targetDir);
    expect(result.success).toBe(false);
    expect(result.error).toContain("symlink");
  } finally {
    fs.rmSync(workDir, { recursive: true, force: true });
  }
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/security-sandbox-tar-traversal.test.ts` around lines 417 - 439, Add a
new test case in security-sandbox-tar-traversal.test.ts that mirrors the
absolute-symlink test but targets a symlink whose linkTarget is an
allowed-prefix path that attempts traversal (e.g.,
"/sandbox/.openclaw-data/../../../etc/passwd"); use the same helpers buildTar
and safeTarExtract and the same workDir/targetDir setup used in the existing
test, call safeTarExtract and assert result.success is false and result.error
contains "symlink" to verify the isWithinRoot check rejects
traversal-from-within-allowed-prefix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/security-sandbox-tar-traversal.test.ts`:
- Around line 417-439: Add a new test case in
security-sandbox-tar-traversal.test.ts that mirrors the absolute-symlink test
but targets a symlink whose linkTarget is an allowed-prefix path that attempts
traversal (e.g., "/sandbox/.openclaw-data/../../../etc/passwd"); use the same
helpers buildTar and safeTarExtract and the same workDir/targetDir setup used in
the existing test, call safeTarExtract and assert result.success is false and
result.error contains "symlink" to verify the isWithinRoot check rejects
traversal-from-within-allowed-prefix.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f8d238aa-e0bd-4219-8bf6-1556bf14836a

📥 Commits

Reviewing files that changed from the base of the PR and between f439189 and 21c0c17.

📒 Files selected for processing (2)
  • src/lib/sandbox-state.ts
  • test/security-sandbox-tar-traversal.test.ts

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts the host-side safeTarExtract post-extraction symlink audit to avoid rejecting sandbox-internal absolute symlinks (notably /sandbox/.openclaw-data/** and /sandbox/.hermes-data/**) that are legitimate in restored container context, and adds regression tests for the failure reported in #2317.

Changes:

  • Update auditExtractedSymlinks() to treat certain /sandbox/... absolute symlink targets as safe by additionally mapping them under the extraction root.
  • Add regression tests to ensure /sandbox/.openclaw-data/** symlinks are allowed while clearly unsafe absolute targets (e.g. /etc/passwd) remain blocked.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/lib/sandbox-state.ts Extends symlink-audit logic with special handling for known sandbox data-dir absolute symlink targets.
test/security-sandbox-tar-traversal.test.ts Adds regression coverage for allowing /sandbox/.openclaw-data/** symlinks while still rejecting unsafe absolute symlinks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/lib/sandbox-state.ts
Comment on lines +296 to +306
const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
const resolvedInArchive =
path.isAbsolute(linkTarget) &&
SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p))
? path.resolve(dirPath, linkTarget.replace(/^\//, ""))
: null;

const inAnyAllowedRoot =
allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) ||
(resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath));

Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolvedInArchive is derived from dirPath, so isWithinRoot(resolvedInArchive, dirPath) will always be true for any non-null value. That means any absolute symlink starting with one of the allowlisted prefixes is unconditionally accepted, even if the target contains traversal segments like /sandbox/.openclaw-data/../../etc/passwd (which actually resolves to /etc/passwd). Consider normalizing the target (e.g. with path.posix.normalize), stripping the intended /sandbox/ (or data-dir) prefix correctly, then resolving and validating containment so .. segments can’t bypass the audit.

Copilot uses AI. Check for mistakes.
Comment thread src/lib/sandbox-state.ts Outdated
Comment on lines +296 to +300
const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
const resolvedInArchive =
path.isAbsolute(linkTarget) &&
SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p))
? path.resolve(dirPath, linkTarget.replace(/^\//, ""))
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the code “strip[s] the leading /sandbox/ prefix”, but linkTarget.replace(/^\//, "") only removes the first slash (producing sandbox/...). If this mapping is intended to mirror sandbox-internal resolution under the extraction root, it should remove the /sandbox/ (and likely the full /sandbox/.openclaw-data/ or /sandbox/.hermes-data/ prefix) so paths map to the expected locations inside targetDir.

Suggested change
const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
const resolvedInArchive =
path.isAbsolute(linkTarget) &&
SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p))
? path.resolve(dirPath, linkTarget.replace(/^\//, ""))
const SANDBOX_PREFIX = "/sandbox/";
const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
const resolvedInArchive =
path.isAbsolute(linkTarget) &&
SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p))
? path.resolve(dirPath, linkTarget.startsWith(SANDBOX_PREFIX) ? linkTarget.slice(SANDBOX_PREFIX.length) : linkTarget)

Copilot uses AI. Check for mistakes.
Comment on lines +417 to +439
it("regression #2317: still blocks absolute symlinks outside /sandbox/.openclaw-data", async () => {
const { safeTarExtract } = await loadSandboxState();
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-2317-block-"));
try {
const targetDir = path.join(workDir, "backup");
fs.mkdirSync(targetDir, { recursive: true });

// /etc/passwd should still be rejected — not in /sandbox/.openclaw-data/
const tar = buildTar([
{
path: "evil-link",
type: "2",
linkTarget: "/etc/passwd",
},
]);

const result = safeTarExtract(tar, targetDir);
expect(result.success).toBe(false);
expect(result.error).toContain("symlink");
} finally {
fs.rmSync(workDir, { recursive: true, force: true });
}
});
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These regression tests don’t cover the important edge case where a symlink target is under an allowlisted prefix but normalizes outside it (e.g. /sandbox/.openclaw-data/../../etc/passwd). Given the new prefix-based handling in auditExtractedSymlinks, adding a test for this case would help prevent a traversal bypass from being inadvertently allowed.

Copilot uses AI. Check for mistakes.
…traversal bypass

Copilot review on NVIDIA#2488 correctly identified that the previous fix could
be bypassed: a target like /sandbox/.openclaw-data/../../etc/passwd starts
with the allowed prefix but after normalization resolves to /etc/passwd.

Fix: apply path.posix.normalize() to the symlink target before the prefix
check, so traversal sequences are collapsed and cannot be used to sneak
past the allowlist.

Added regression test: /sandbox/.openclaw-data/../../etc/passwd is blocked.

Per Copilot + CodeRabbit review on NVIDIA#2488.

Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
@BenediktSchackenberg
Copy link
Copy Markdown
Contributor Author

Fixed both Copilot review points:

  1. Path normalization — now applies path.posix.normalize() to the symlink target before the prefix check, so /sandbox/.openclaw-data/../../etc/passwd is normalized to /etc/passwd and correctly rejected
  2. Traversal edge case test — added blocks path traversal within allowed prefix test that asserts the traversal bypass is blocked

24/24 tests pass including the new case.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/sandbox-state.ts`:
- Around line 296-310: The allowlist is bypassed because the generic
allowedRoots check can match absolute /sandbox/* paths before the
archive-specific mapping is applied; update the logic in the
SANDBOX_DATA_PREFIXES block so you first canonicalize and check for the archive
prefix (using normalizedTarget and SANDBOX_DATA_PREFIXES), then map only the
leading "/sandbox/" segment off normalizedTarget (e.g. replace(/^\/*sandbox\//,
"")) to produce resolvedInArchive, and ensure inAnyAllowedRoot uses isWithinRoot
against resolvedInArchive (when non-null) rather than relying on the
allowedRoots branch to accept raw /sandbox paths; also tighten the allowedRoots
check to operate on resolvedRelative only so generic roots cannot accidentally
accept sandbox-absolute targets.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fc9fd8b1-2ba8-4dfb-bed3-4d317d7028f5

📥 Commits

Reviewing files that changed from the base of the PR and between 21c0c17 and 9f614c5.

📒 Files selected for processing (2)
  • src/lib/sandbox-state.ts
  • test/security-sandbox-tar-traversal.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/security-sandbox-tar-traversal.test.ts

Comment thread src/lib/sandbox-state.ts
Comment on lines +296 to +310
const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
// Normalize the target first to collapse any .. traversal segments
// (e.g. /sandbox/.openclaw-data/../../etc/passwd → /etc/passwd).
// Only then check the prefix — this prevents a traversal bypass
// where a crafted target starts with an allowed prefix but escapes it.
const normalizedTarget = path.posix.normalize(linkTarget);
const resolvedInArchive =
path.isAbsolute(normalizedTarget) &&
SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p))
? path.resolve(dirPath, normalizedTarget.replace(/^\//, ""))
: null;

const inAnyAllowedRoot =
allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) ||
(resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prefix allowlist is effectively bypassed by generic /sandbox root allowance.

The new allowlist check can still be bypassed because inAnyAllowedRoot accepts absolute /sandbox/* via the generic allowedRoots branch. Also, the archive remap strips only /, not /sandbox/, which does not match the intended mapping semantics.

🔧 Proposed fix
@@
-          const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
+          const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
@@
-          const normalizedTarget = path.posix.normalize(linkTarget);
-          const resolvedInArchive =
-            path.isAbsolute(normalizedTarget) &&
-            SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p))
-              ? path.resolve(dirPath, normalizedTarget.replace(/^\//, ""))
-              : null;
+          const normalizedTarget = path.posix.normalize(linkTarget);
+          const isAbsoluteTarget = path.isAbsolute(normalizedTarget);
+          const resolvedInArchive =
+            isAbsoluteTarget && SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p))
+              ? path.resolve(dirPath, normalizedTarget.replace(/^\/sandbox\//, ""))
+              : null;
@@
-          const inAnyAllowedRoot =
-            allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) ||
+          const inAnyAllowedRoot =
+            (!isAbsoluteTarget && allowedRoots.some((root) => isWithinRoot(resolvedRelative, root))) ||
             (resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath));
@@
-  const symlinkViolations = auditExtractedSymlinks(targetDir, [targetDir, "/sandbox"]);
+  const symlinkViolations = auditExtractedSymlinks(targetDir, [targetDir]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/sandbox-state.ts` around lines 296 - 310, The allowlist is bypassed
because the generic allowedRoots check can match absolute /sandbox/* paths
before the archive-specific mapping is applied; update the logic in the
SANDBOX_DATA_PREFIXES block so you first canonicalize and check for the archive
prefix (using normalizedTarget and SANDBOX_DATA_PREFIXES), then map only the
leading "/sandbox/" segment off normalizedTarget (e.g. replace(/^\/*sandbox\//,
"")) to produce resolvedInArchive, and ensure inAnyAllowedRoot uses isWithinRoot
against resolvedInArchive (when non-null) rather than relying on the
allowedRoots branch to accept raw /sandbox paths; also tighten the allowedRoots
check to operate on resolvedRelative only so generic roots cannot accidentally
accept sandbox-absolute targets.

@wscurran wscurran added bug Something isn't working NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI). labels Apr 27, 2026
@wscurran
Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this pull request that proposes a way to fix a bug in the safeTarExtract function, allowing symlinks in /sandbox/.openclaw-data to be extracted correctly.


Related open PRs:


Related open issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[NemoClaw 0.0.22] snapshot create/rebuild/backup-all blocked by safeTarExtract symlink audit on internal .openclaw-data/* symlinks (PR #2163)

3 participants