
fix(fs): reject zero-width and invisible Unicode chars in path components #1552

Merged
chaliy merged 1 commit into main from claude/threat-model-issue-IfuVb
May 6, 2026

@chaliy chaliy commented May 6, 2026

Summary

Closes the four UNMITIGATED gaps in the threat model's Unicode section by extending find_unsafe_path_char() to reject the documented invisible / confusable code-point ranges. Without this, an attacker can stash content under a filename that looks visually identical to a benign one in any UI that renders the bytes (terminal, file picker, tool output).

Mitigates:

  • TM-UNI-003 — zero-width chars (U+200B-U+200D, U+2060, U+FEFF, U+180E)
  • TM-UNI-013 — deprecated format chars (U+206A-U+206F)
  • TM-UNI-012 — interlinear annotation markers (U+FFF9-U+FFFB)
  • TM-UNI-011 — tag block (U+E0000-U+E007F)

Variable names, script source, and command output are intentionally untouched — pass-through there matches Bash and is already covered by TM-UNI-004 / TM-UNI-005 (accepted risk).

Why

specs/threat-model.md already documented the gap and the fix shape ("extend find_unsafe_path_char()"). All four threats share a single mitigation point, so they collapse into one small, focused change.

How

  • crates/bashkit/src/fs/limits.rs: find_unsafe_path_char() gains four match/range checks, one per new code-point range. Each emits a Display-only error label: (zero-width), (deprecated format), (interlinear annotation), or (tag char).
  • specs/threat-model.md: status flipped to MITIGATED in §11.2, §11.6, the Unicode summary table, and the Open (Medium) table.
  • Module-level threat list updated to point at the new mitigations.
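
As a rough illustration of the four range checks described above, here is a minimal standalone sketch. The function names, signature, and return type are assumptions made for this example; the actual find_unsafe_path_char() in limits.rs is not reproduced here and may be shaped differently. Only the code-point ranges and label strings are taken from this description.

```rust
/// Hypothetical classifier: maps a char to the Display-only label named in
/// this PR, or None if the char is outside the four newly rejected ranges.
/// (The name and signature are assumptions, not the crate's API.)
fn classify_invisible_char(c: char) -> Option<&'static str> {
    match c {
        // TM-UNI-003: ZWSP, ZWNJ, ZWJ, Word Joiner, BOM, Mongolian Vowel Separator
        '\u{200B}'..='\u{200D}' | '\u{2060}' | '\u{FEFF}' | '\u{180E}' => Some("(zero-width)"),
        // TM-UNI-013: deprecated format chars
        '\u{206A}'..='\u{206F}' => Some("(deprecated format)"),
        // TM-UNI-012: interlinear annotation markers
        '\u{FFF9}'..='\u{FFFB}' => Some("(interlinear annotation)"),
        // TM-UNI-011: tag block
        '\u{E0000}'..='\u{E007F}' => Some("(tag char)"),
        _ => None,
    }
}

/// Hypothetical scanner: returns the first offending char in a path
/// component together with its label.
fn find_invisible_path_char(component: &str) -> Option<(char, &'static str)> {
    component
        .chars()
        .find_map(|c| classify_invisible_char(c).map(|label| (c, label)))
}
```

Under these assumptions, a component such as "re\u{200B}adme.txt" yields the ZWSP together with the (zero-width) label, while plain "readme.txt" yields None.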

Tests

Unit tests in limits.rs (11 new):

  • One _rejected test per code-point family (ZWSP, ZWNJ, ZWJ, Word Joiner, BOM, Mongolian Vowel Separator, deprecated format range, interlinear annotation range, tag block boundary samples).
  • test_validate_path_adjacent_chars_allowed — guards against over-blocking by asserting U+200A (HAIR SPACE), U+200E/F (LRM/RLM), and U+2070 (SUPERSCRIPT ZERO) still pass.
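
The over-blocking guard can be pictured as a boundary check around the rejected ranges. The sketch below is not the test from limits.rs; the predicate and helper names are hypothetical, and only the code points come from this description.

```rust
/// Assumed predicate covering the ranges listed in this PR (hypothetical
/// name; the crate's own check may be structured differently).
fn is_rejected_invisible(c: char) -> bool {
    matches!(
        c,
        '\u{200B}'..='\u{200D}'
            | '\u{2060}'
            | '\u{FEFF}'
            | '\u{180E}'
            | '\u{206A}'..='\u{206F}'
            | '\u{FFF9}'..='\u{FFFB}'
            | '\u{E0000}'..='\u{E007F}'
    )
}

/// Neighbouring code points that must stay allowed:
/// HAIR SPACE, LRM, RLM, SUPERSCRIPT ZERO.
const ADJACENT_ALLOWED: [char; 4] = ['\u{200A}', '\u{200E}', '\u{200F}', '\u{2070}'];

/// Returns any neighbour that the predicate wrongly rejects; an empty
/// result means the ranges are not off by one at these boundaries.
fn over_blocked() -> Vec<char> {
    ADJACENT_ALLOWED
        .iter()
        .copied()
        .filter(|c| is_rejected_invisible(*c))
        .collect()
}
```

Each allowed neighbour sits directly next to a rejected range (U+200A before U+200B, U+200E after U+200D, U+2070 after U+206F), so an off-by-one in a range bound would show up as a non-empty result.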

Integration tests in crates/bashkit/tests/unicode_security_tests.rs:

  • The 6 existing *_current_behavior tests in zero_width_chars and invisible_char_tests (which previously documented the gap with let _ = result;) are rewritten to assert expect_err, so they would fail if the mitigation regressed.
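
To show why the rewrite matters, the sketch below contrasts the two test styles. The validator, its error wording, and the helper names are hypothetical stand-ins; the real tests in unicode_security_tests.rs run against bashkit itself and are not reproduced here.

```rust
/// Hypothetical validator that rejects only the zero-width family, standing
/// in for the real path validation (an assumption for this example).
fn validate_component(name: &str) -> Result<(), String> {
    let hit = name.chars().find(|&c| {
        matches!(c, '\u{200B}'..='\u{200D}' | '\u{2060}' | '\u{FEFF}' | '\u{180E}')
    });
    match hit {
        // The message wording is illustrative, not the crate's actual text.
        Some(c) => Err(format!("path component contains U+{:04X} (zero-width)", c as u32)),
        None => Ok(()),
    }
}

/// Gap-documenting style: the result is discarded, so this passes whether
/// or not the character is rejected.
fn documents_gap() {
    let result = validate_component("pass\u{200B}wd");
    let _ = result;
}

/// Mitigation-asserting style: expect_err panics if validation succeeds,
/// so a regression turns into a test failure.
fn asserts_mitigation() -> String {
    validate_component("pass\u{200B}wd")
        .expect_err("ZWSP in a path component must be rejected")
}
```

The first style only records behavior; the second turns the mitigation into an invariant that the test suite enforces.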

Verified locally:

  • cargo test -p bashkit --lib: 2207 pass.
  • cargo test -p bashkit --test unicode_security_tests: 71 pass.
  • cargo test -p bashkit --test threat_model_tests: 170 pass (1 pre-existing failure on main unrelated to this PR — builtin_parser_depth::threat_jq_moderate_nesting_works).
  • cargo test -p bashkit --test overlay_path_validation_tests --test custom_fs_tests --test symlink_overlay_security_tests: all pass.
  • cargo fmt --check, cargo clippy --all-targets -- -D warnings: clean.

Test plan

  • Unit tests for each new rejected code-point family
  • Adjacent-character allow-list test guards against over-blocking
  • Existing "current_behavior" gap tests rewritten to assert mitigation
  • No regression in path-validation integration tests
  • fmt + clippy clean

Generated by Claude Code

…ents

Closes the four UNMITIGATED gaps in the threat model's Unicode section by
extending find_unsafe_path_char() to reject the documented invisible /
confusable ranges. These chars produce visually-identical filenames that
let attackers stash content under names that look benign in any UI.

Mitigates:
- TM-UNI-003: zero-width chars (U+200B-U+200D, U+2060, U+FEFF, U+180E)
- TM-UNI-013: deprecated format chars (U+206A-U+206F)
- TM-UNI-012: interlinear annotation markers (U+FFF9-U+FFFB)
- TM-UNI-011: tag block (U+E0000-U+E007F)

Variable names, script source, and command output stay unaffected --
pass-through there matches Bash. Existing tests that documented the
"current behavior" gap are rewritten to assert rejection.
@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

Status: ✅ Deployment successful
Name: bashkit
Latest commit: 1dbc1e2
Updated (UTC): May 06 2026, 04:10 AM

@chaliy chaliy merged commit dee8e4a into main May 6, 2026
34 checks passed
@chaliy chaliy deleted the claude/threat-model-issue-IfuVb branch May 6, 2026 04:42