fix(snapshot): allow /sandbox/.openclaw-data symlinks in safeTarExtract#2488
fix(snapshot): allow /sandbox/.openclaw-data symlinks in safeTarExtract#2488BenediktSchackenberg wants to merge 2 commits intoNVIDIA:mainfrom
Conversation
nemoclaw snapshot create / rebuild / backup-all fail on 0.0.22 because safeTarExtract's post-extraction symlink audit rejects absolute symlinks whose target (/sandbox/.openclaw-data/media, etc.) does not exist on the host. These are legitimate intra-sandbox symlinks created by Dockerfile.base for the .openclaw / .openclaw-data state split. Fix: when resolving an absolute symlink target, if the path starts with /sandbox/.openclaw-data/ or /sandbox/.hermes-data/, also check whether it falls within the extraction root when the /sandbox/ prefix is stripped and resolved relative to that root. This mirrors how the symlink resolves inside the sandbox container (where /sandbox/.openclaw-data/* exists). Absolute paths outside these known prefixes (e.g. /etc/passwd) are still rejected as before. The security boundary is preserved. Added two regression tests: - Allows /sandbox/.openclaw-data/media symlinks (the NVIDIA#2317 repro case) - Still blocks /etc/passwd and other non-/sandbox/.openclaw-data targets Fixes NVIDIA#2317 Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
📝 WalkthroughWalkthroughRefines symlink escape detection to allow absolute symlinks that target canonical sandbox data prefixes by mapping Changes
Sequence Diagram(s)Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/security-sandbox-tar-traversal.test.ts (1)
417-439: Security guardrail test looks correct.This ensures absolute symlinks outside the allowed prefixes remain blocked.
Consider adding a test for path traversal within the allowed prefix to ensure the
isWithinRootcheck catches it:🧪 Optional: Add path traversal edge case test
it("regression `#2317`: blocks path traversal via allowed prefix (/sandbox/.openclaw-data/../../../etc/passwd)", async () => { const { safeTarExtract } = await loadSandboxState(); const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-2317-traversal-")); try { const targetDir = path.join(workDir, "backup"); fs.mkdirSync(targetDir, { recursive: true }); // Attempt escape via traversal within the allowed prefix const tar = buildTar([ { path: "workspace/escape", type: "2", linkTarget: "/sandbox/.openclaw-data/../../../etc/passwd", }, ]); const result = safeTarExtract(tar, targetDir); expect(result.success).toBe(false); expect(result.error).toContain("symlink"); } finally { fs.rmSync(workDir, { recursive: true, force: true }); } });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/security-sandbox-tar-traversal.test.ts` around lines 417 - 439, Add a new test case in security-sandbox-tar-traversal.test.ts that mirrors the absolute-symlink test but targets a symlink whose linkTarget is an allowed-prefix path that attempts traversal (e.g., "/sandbox/.openclaw-data/../../../etc/passwd"); use the same helpers buildTar and safeTarExtract and the same workDir/targetDir setup used in the existing test, call safeTarExtract and assert result.success is false and result.error contains "symlink" to verify the isWithinRoot check rejects traversal-from-within-allowed-prefix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/security-sandbox-tar-traversal.test.ts`:
- Around line 417-439: Add a new test case in
security-sandbox-tar-traversal.test.ts that mirrors the absolute-symlink test
but targets a symlink whose linkTarget is an allowed-prefix path that attempts
traversal (e.g., "/sandbox/.openclaw-data/../../../etc/passwd"); use the same
helpers buildTar and safeTarExtract and the same workDir/targetDir setup used in
the existing test, call safeTarExtract and assert result.success is false and
result.error contains "symlink" to verify the isWithinRoot check rejects
traversal-from-within-allowed-prefix.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f8d238aa-e0bd-4219-8bf6-1556bf14836a
📒 Files selected for processing (2)
src/lib/sandbox-state.tstest/security-sandbox-tar-traversal.test.ts
There was a problem hiding this comment.
Pull request overview
This PR adjusts the host-side safeTarExtract post-extraction symlink audit to avoid rejecting sandbox-internal absolute symlinks (notably /sandbox/.openclaw-data/** and /sandbox/.hermes-data/**) that are legitimate in restored container context, and adds regression tests for the failure reported in #2317.
Changes:
- Update
auditExtractedSymlinks()to treat certain/sandbox/...absolute symlink targets as safe by additionally mapping them under the extraction root. - Add regression tests to ensure
/sandbox/.openclaw-data/**symlinks are allowed while clearly unsafe absolute targets (e.g./etc/passwd) remain blocked.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/lib/sandbox-state.ts |
Extends symlink-audit logic with special handling for known sandbox data-dir absolute symlink targets. |
test/security-sandbox-tar-traversal.test.ts |
Adds regression coverage for allowing /sandbox/.openclaw-data/** symlinks while still rejecting unsafe absolute symlinks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"]; | ||
| const resolvedInArchive = | ||
| path.isAbsolute(linkTarget) && | ||
| SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p)) | ||
| ? path.resolve(dirPath, linkTarget.replace(/^\//, "")) | ||
| : null; | ||
|
|
||
| const inAnyAllowedRoot = | ||
| allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) || | ||
| (resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath)); | ||
|
|
There was a problem hiding this comment.
resolvedInArchive is derived from dirPath, so isWithinRoot(resolvedInArchive, dirPath) will always be true for any non-null value. That means any absolute symlink starting with one of the allowlisted prefixes is unconditionally accepted, even if the target contains traversal segments like /sandbox/.openclaw-data/../../etc/passwd (which actually resolves to /etc/passwd). Consider normalizing the target (e.g. with path.posix.normalize), stripping the intended /sandbox/ (or data-dir) prefix correctly, then resolving and validating containment so .. segments can’t bypass the audit.
| const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"]; | ||
| const resolvedInArchive = | ||
| path.isAbsolute(linkTarget) && | ||
| SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p)) | ||
| ? path.resolve(dirPath, linkTarget.replace(/^\//, "")) |
There was a problem hiding this comment.
The comment says the code “strip[s] the leading /sandbox/ prefix”, but linkTarget.replace(/^\//, "") only removes the first slash (producing sandbox/...). If this mapping is intended to mirror sandbox-internal resolution under the extraction root, it should remove the /sandbox/ (and likely the full /sandbox/.openclaw-data/ or /sandbox/.hermes-data/ prefix) so paths map to the expected locations inside targetDir.
| const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"]; | |
| const resolvedInArchive = | |
| path.isAbsolute(linkTarget) && | |
| SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p)) | |
| ? path.resolve(dirPath, linkTarget.replace(/^\//, "")) | |
| const SANDBOX_PREFIX = "/sandbox/"; | |
| const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"]; | |
| const resolvedInArchive = | |
| path.isAbsolute(linkTarget) && | |
| SANDBOX_DATA_PREFIXES.some((p) => linkTarget.startsWith(p)) | |
| ? path.resolve(dirPath, linkTarget.startsWith(SANDBOX_PREFIX) ? linkTarget.slice(SANDBOX_PREFIX.length) : linkTarget) |
| it("regression #2317: still blocks absolute symlinks outside /sandbox/.openclaw-data", async () => { | ||
| const { safeTarExtract } = await loadSandboxState(); | ||
| const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-2317-block-")); | ||
| try { | ||
| const targetDir = path.join(workDir, "backup"); | ||
| fs.mkdirSync(targetDir, { recursive: true }); | ||
|
|
||
| // /etc/passwd should still be rejected — not in /sandbox/.openclaw-data/ | ||
| const tar = buildTar([ | ||
| { | ||
| path: "evil-link", | ||
| type: "2", | ||
| linkTarget: "/etc/passwd", | ||
| }, | ||
| ]); | ||
|
|
||
| const result = safeTarExtract(tar, targetDir); | ||
| expect(result.success).toBe(false); | ||
| expect(result.error).toContain("symlink"); | ||
| } finally { | ||
| fs.rmSync(workDir, { recursive: true, force: true }); | ||
| } | ||
| }); |
There was a problem hiding this comment.
These regression tests don’t cover the important edge case where a symlink target is under an allowlisted prefix but normalizes outside it (e.g. /sandbox/.openclaw-data/../../etc/passwd). Given the new prefix-based handling in auditExtractedSymlinks, adding a test for this case would help prevent a traversal bypass from being inadvertently allowed.
…traversal bypass Copilot review on NVIDIA#2488 correctly identified that the previous fix could be bypassed: a target like /sandbox/.openclaw-data/../../etc/passwd starts with the allowed prefix but after normalization resolves to /etc/passwd. Fix: apply path.posix.normalize() to the symlink target before the prefix check, so traversal sequences are collapsed and cannot be used to sneak past the allowlist. Added regression test: /sandbox/.openclaw-data/../../etc/passwd is blocked. Per Copilot + CodeRabbit review on NVIDIA#2488. Signed-off-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
|
Fixed both Copilot review points:
24/24 tests pass including the new case. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/sandbox-state.ts`:
- Around line 296-310: The allowlist is bypassed because the generic
allowedRoots check can match absolute /sandbox/* paths before the
archive-specific mapping is applied; update the logic in the
SANDBOX_DATA_PREFIXES block so you first canonicalize and check for the archive
prefix (using normalizedTarget and SANDBOX_DATA_PREFIXES), then map only the
leading "/sandbox/" segment off normalizedTarget (e.g. replace(/^\/*sandbox\//,
"")) to produce resolvedInArchive, and ensure inAnyAllowedRoot uses isWithinRoot
against resolvedInArchive (when non-null) rather than relying on the
allowedRoots branch to accept raw /sandbox paths; also tighten the allowedRoots
check to operate on resolvedRelative only so generic roots cannot accidentally
accept sandbox-absolute targets.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: fc9fd8b1-2ba8-4dfb-bed3-4d317d7028f5
📒 Files selected for processing (2)
src/lib/sandbox-state.tstest/security-sandbox-tar-traversal.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/security-sandbox-tar-traversal.test.ts
| const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"]; | ||
| // Normalize the target first to collapse any .. traversal segments | ||
| // (e.g. /sandbox/.openclaw-data/../../etc/passwd → /etc/passwd). | ||
| // Only then check the prefix — this prevents a traversal bypass | ||
| // where a crafted target starts with an allowed prefix but escapes it. | ||
| const normalizedTarget = path.posix.normalize(linkTarget); | ||
| const resolvedInArchive = | ||
| path.isAbsolute(normalizedTarget) && | ||
| SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p)) | ||
| ? path.resolve(dirPath, normalizedTarget.replace(/^\//, "")) | ||
| : null; | ||
|
|
||
| const inAnyAllowedRoot = | ||
| allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) || | ||
| (resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath)); |
There was a problem hiding this comment.
Prefix allowlist is effectively bypassed by generic /sandbox root allowance.
The new allowlist check can still be bypassed because inAnyAllowedRoot accepts absolute /sandbox/* via the generic allowedRoots branch. Also, the archive remap strips only /, not /sandbox/, which does not match the intended mapping semantics.
🔧 Proposed fix
@@
- const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
+ const SANDBOX_DATA_PREFIXES = ["/sandbox/.openclaw-data/", "/sandbox/.hermes-data/"];
@@
- const normalizedTarget = path.posix.normalize(linkTarget);
- const resolvedInArchive =
- path.isAbsolute(normalizedTarget) &&
- SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p))
- ? path.resolve(dirPath, normalizedTarget.replace(/^\//, ""))
- : null;
+ const normalizedTarget = path.posix.normalize(linkTarget);
+ const isAbsoluteTarget = path.isAbsolute(normalizedTarget);
+ const resolvedInArchive =
+ isAbsoluteTarget && SANDBOX_DATA_PREFIXES.some((p) => normalizedTarget.startsWith(p))
+ ? path.resolve(dirPath, normalizedTarget.replace(/^\/sandbox\//, ""))
+ : null;
@@
- const inAnyAllowedRoot =
- allowedRoots.some((root) => isWithinRoot(resolvedRelative, root)) ||
+ const inAnyAllowedRoot =
+ (!isAbsoluteTarget && allowedRoots.some((root) => isWithinRoot(resolvedRelative, root))) ||
(resolvedInArchive !== null && isWithinRoot(resolvedInArchive, dirPath));@@
- const symlinkViolations = auditExtractedSymlinks(targetDir, [targetDir, "/sandbox"]);
+ const symlinkViolations = auditExtractedSymlinks(targetDir, [targetDir]);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/sandbox-state.ts` around lines 296 - 310, The allowlist is bypassed
because the generic allowedRoots check can match absolute /sandbox/* paths
before the archive-specific mapping is applied; update the logic in the
SANDBOX_DATA_PREFIXES block so you first canonicalize and check for the archive
prefix (using normalizedTarget and SANDBOX_DATA_PREFIXES), then map only the
leading "/sandbox/" segment off normalizedTarget (e.g. replace(/^\/*sandbox\//,
"")) to produce resolvedInArchive, and ensure inAnyAllowedRoot uses isWithinRoot
against resolvedInArchive (when non-null) rather than relying on the
allowedRoots branch to accept raw /sandbox paths; also tighten the allowedRoots
check to operate on resolvedRelative only so generic roots cannot accidentally
accept sandbox-absolute targets.
|
✨ Thanks for submitting this pull request that proposes a way to fix a bug in the safeTarExtract function, allowing symlinks in /sandbox/.openclaw-data to be extracted correctly. Related open PRs: Related open issues: |
Problem
nemoclaw snapshot create,rebuild, andbackup-allall fail on 0.0.22 with:PR #2163 hardened
safeTarExtractagainst tar-slip path traversal by rejecting any symlink whose target resolves outside the extraction root. This correctly blocks/etc/passwdetc., but over-rejects legitimate intra-sandbox symlinks created byDockerfile.basefor the.openclaw/.openclaw-datasplit. When extracted on the host,/sandbox/.openclaw-data/mediadoesn't exist on the host and isn't within the extraction root — so the audit falsely rejects it.Fix
When the symlink target is an absolute path starting with
/sandbox/.openclaw-data/or/sandbox/.hermes-data/, also resolve it relative to the extraction root (strip the leading/sandbox/prefix, resolve undertargetDir). This mirrors how the symlink would resolve inside the sandbox container where these paths exist.Absolute paths outside these known prefixes are still rejected as before. The security boundary is unchanged.
Tests
Added two regression tests:
allows known-safe /sandbox/.openclaw-data symlinks in backup archives— the direct [NemoClaw 0.0.22]snapshot create/rebuild/backup-allblocked by safeTarExtract symlink audit on internal.openclaw-data/*symlinks (PR #2163) #2317 reprostill blocks absolute symlinks outside /sandbox/.openclaw-data—/etc/passwdmust stay blocked23/23 tests pass.
Fixes #2317
Signed-off-by: Benedikt Schackenberg 6381261+BenediktSchackenberg@users.noreply.github.com
Summary by CodeRabbit
Bug Fixes
Tests