feat(builtins): port uutils env-default surface via virtual-env shim (TM-INF-024) #1592

Merged
chaliy merged 2 commits into main from claude/add-ls-env-support-yJR37
May 7, 2026

@chaliy chaliy commented May 7, 2026

Summary

Closes the host-env side-channel that uutils' use of clap's Arg::env(...)
opened on ls (TM-INF-024) and replaces it with a sandbox-correct virtual-env
shim, so the env-default UX (TIME_STYLE, TABSIZE, future
BLOCK_SIZE/LS_COLORS/…) works against bashkit's ctx.env and never std::env.

What's the bug

bashkit-coreutils-port enabled clap's env cargo feature and emitted
Arg::new(..).env("TIME_STYLE") / .env("TABSIZE") verbatim. clap
resolved those defaults from the host process environment, which opened
two leaks:

  1. Presence-probe: scripts inside the sandbox could detect whether
    the host had TIME_STYLE/TABSIZE set by observing ls behaviour.
  2. Availability: a host- or container-wide TIME_STYLE=long-iso
    tunneled into clap as a value source for an option ls hadn't
    ported yet, tripping the unsupported-option gate on every plain ls
    for unrelated tenants.
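
The availability leak can be illustrated with a deliberately simplified model. This is a sketch, not bashkit or clap code: `resolve_time_style` and `ls_gate` are hypothetical names, the error text is invented, and the environment is modelled as a plain `HashMap`. It only shows why the map that feeds the env fallback decides whether a plain `ls` succeeds.

```rust
use std::collections::HashMap;

/// Hypothetical resolver: argv wins, otherwise fall back to an environment map.
/// Whichever map is passed here decides what the sandbox can observe.
pub fn resolve_time_style<'a>(
    argv: &'a [&'a str],
    env: &'a HashMap<String, String>,
) -> Option<&'a str> {
    argv.iter()
        .find_map(|arg| arg.strip_prefix("--time-style="))
        .or_else(|| env.get("TIME_STYLE").map(String::as_str))
}

/// Hypothetical stand-in for the unsupported-option gate: any resolved value
/// for an option that has not been ported yet is rejected.
pub fn ls_gate(time_style: Option<&str>) -> Result<(), String> {
    match time_style {
        Some(value) => Err(format!("time-style '{value}' is not yet implemented")),
        None => Ok(()),
    }
}
```

With a map that mirrors the host environment and contains TIME_STYLE, `ls_gate(resolve_time_style(&[], &host))` is an `Err` even though the script passed no flags. With an empty sandbox map the same call is `Ok(())`.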

How — four-layer fix

  1. Codegen strips runtime .env(...) AND harvests it. The
    rewriter elides .env("FOO") chained calls from every Arg builder
    it ports and emits a sidecar
    pub static <UTIL>_ENV_DEFAULTS: &[clap_env::EnvDefault] table
    recording each stripped annotation as
    (arg_id, long, env_var, kind ∈ {Single, Bool, Multi}). Always
    emitted (possibly empty) so every util has a uniform surface.

  2. Virtual-env shim (crates/bashkit/src/builtins/clap_env.rs).
    apply_env_defaults(argv, defaults, ctx.env) -> argv' injects
    --<long> <value> for Single/Multi and --<long> for Bool
    when argv doesn't already specify the long flag. Matches clap's
    own "argv > env > default" precedence. Reads ctx.env only — no
    std::env access.

  3. Static guards.

    • no_clap_env_in_generated_parsers forbids runtime .env( calls
      in crates/bashkit/src/builtins/generated/*.rs.
    • every_generated_parser_emits_env_defaults_table enforces the
      uniform sidecar surface.
    • ls_env_defaults_surface_matches_uutils pins ls's expected
      env-default rows so a regen drift surfaces.
  4. Defence-in-depth. Workspace clap dep drops the env cargo
    feature, so a re-introduced .env(...) fails to compile rather
    than silently re-opening the channel.
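
Layers 1 and 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in clap_env.rs: the field types, the `HashMap<String, String>` stand-in for ctx.env, the two table rows, the choice to skip empty values, and injecting right after the program name are all guesses. Only the names `EnvDefault`, `apply_env_defaults`, `LS_ENV_DEFAULTS` and the Single/Bool/Multi kinds come from the description above.

```rust
use std::collections::HashMap;

/// How an env-backed option consumes its value (kinds named in the PR text).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Single,
    Bool,
    Multi,
}

/// One harvested `.env("FOO")` annotation. Field types are assumed.
#[derive(Clone, Copy, Debug)]
pub struct EnvDefault {
    pub arg_id: &'static str,
    pub long: &'static str,
    pub env_var: &'static str,
    pub kind: Kind,
}

/// Illustrative rows only; the real table is emitted by codegen.
pub static LS_ENV_DEFAULTS: &[EnvDefault] = &[
    EnvDefault { arg_id: "time-style", long: "time-style", env_var: "TIME_STYLE", kind: Kind::Single },
    EnvDefault { arg_id: "tabsize", long: "tabsize", env_var: "TABSIZE", kind: Kind::Single },
];

/// Returns argv with `--<long> [value]` injected for every default whose
/// variable is set in `env` and whose long flag is absent from argv.
/// Reads only the map it is given, never `std::env`.
pub fn apply_env_defaults(
    argv: &[String],
    defaults: &[EnvDefault],
    env: &HashMap<String, String>,
) -> Vec<String> {
    let Some((program, rest)) = argv.split_first() else {
        return Vec::new();
    };
    // Only arguments before a bare `--` can be options.
    let explicit: Vec<&str> = rest
        .iter()
        .map(String::as_str)
        .take_while(|arg| *arg != "--")
        .collect();

    let mut out = vec![program.clone()];
    for default in defaults {
        let flag = format!("--{}", default.long);
        let with_eq = format!("{flag}=");
        // argv > env: skip injection when the long flag is already present.
        let in_argv = explicit
            .iter()
            .any(|arg| *arg == flag.as_str() || arg.starts_with(with_eq.as_str()));
        if in_argv {
            continue;
        }
        // Assumption: an empty value is treated the same as an unset variable.
        if let Some(value) = env.get(default.env_var).filter(|v| !v.is_empty()) {
            out.push(flag);
            if default.kind != Kind::Bool {
                // Multi passes the raw string through for the parser to split.
                out.push(value.clone());
            }
        }
    }
    // Injected flags sit right after the program name, so they stay before any `--`.
    out.extend(rest.iter().cloned());
    out
}
```

Under these assumptions, `["ls", "-l"]` with TIME_STYLE=long-iso in the map becomes `["ls", "--time-style", "long-iso", "-l"]`, while `["ls", "--time-style=iso"]` is returned unchanged. The sketch does not handle short flags or abbreviated long flags.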

Why a shim, not just stripping

Stripping alone (the first commit on this branch) closes the channel
but throws away the env-default UX uutils depends on. The shim
restores it with the sandbox boundary intact: every uutils
env-default lights up automatically as bashkit grows support for the
underlying option, no per-port wiring required.

Tests

  • 9 unit tests on clap_env::apply_env_defaults — precedence,
    argv-overrides-env, missing/empty env value, kind handling
    (Single/Bool/Multi), program-name slot, std::env isolation.
  • ls_honors_virtual_env_time_style: ctx.env["TIME_STYLE"]
    reaches clap.
  • ls_argv_time_style_overrides_virtual_env — argv beats env.
  • ls_ignores_host_time_style_and_tabsize (#[serial]) — host
    std::env does NOT leak.
  • Static: no_clap_env_in_generated_parsers,
    every_generated_parser_emits_env_defaults_table,
    ls_env_defaults_surface_matches_uutils.
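
A static guard in the spirit of no_clap_env_in_generated_parsers could look roughly like this. It is a sketch that assumes a plain substring scan; `find_env_calls` and `scan_generated` are hypothetical helper names, and the real test may match more precisely.

```rust
use std::{fs, io, path::Path};

/// Returns the 1-based line numbers in `source` that contain `.env(`.
/// A plain substring match is conservative: it can also flag unrelated
/// method calls named `env`, which a guard can afford to do.
pub fn find_env_calls(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains(".env("))
        .map(|(index, _)| index + 1)
        .collect()
}

/// Scans every `*.rs` file directly inside `dir` and collects (path, line) hits.
pub fn scan_generated(dir: &Path) -> io::Result<Vec<(String, usize)>> {
    let mut hits = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|ext| ext.to_str()) != Some("rs") {
            continue;
        }
        let text = fs::read_to_string(&path)?;
        for line in find_env_calls(&text) {
            hits.push((path.display().to_string(), line));
        }
    }
    Ok(hits)
}
```

A test would call `scan_generated` on the generated-parsers directory and assert that the returned list is empty, printing the offending paths and line numbers on failure.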

Test plan

  • cargo fmt --check clean
  • cargo clippy --all-targets -- -D warnings clean
  • cargo test -p bashkit --features http_client --lib (2291 pass)
  • cargo test --features http_client,jq --test threat_model_tests
    (174 pass)
  • cargo test -p bashkit --features http_client --test coreutils_differential_tests --test spec_tests (all pass)
  • CI green

Generated by Claude Code

chaliy added 2 commits May 7, 2026 14:45
uutils' uu_app() attaches `.env("TABSIZE")` / `.env("TIME_STYLE")` to ls
options so they default from std::env. With clap's `env` feature on,
that bypasses bashkit's ctx.env sandbox: scripts can probe host env
presence, and a host-set TIME_STYLE materializes as a clap value source
for an option ls hasn't implemented yet, breaking plain `ls` for
unrelated tenants.

Three-layer fix:

1. Codegen strips `.env("FOO")` chained calls in
   `bashkit-coreutils-port`, so generated `<util>_args.rs` only sees
   argv. Already-generated `ls_args.rs` is updated in place to match.
2. Static guard `no_clap_env_in_generated_parsers` greps every
   `builtins/generated/*.rs` and fails the build if `.env(` reappears.
3. Workspace `clap` dep drops the `env` cargo feature, so a
   re-introduced `.env(...)` fails to compile.

Adds a runtime regression test that exports TIME_STYLE/TABSIZE on the
host process and asserts plain `ls` still succeeds.
uutils' uu_app() ships .env("FOO") on options like TIME_STYLE/TABSIZE
so they default from std::env. Earlier fix stripped those calls
outright (TM-INF-023) which closed the host-env channel but threw
away the env-default UX. This restores the UX through bashkit's
virtual env without re-opening the host channel.

How it composes:

- Codegen now harvests every stripped .env("FOO") into a sidecar
  `pub static <UTIL>_ENV_DEFAULTS: &[clap_env::EnvDefault]` next to
  `<util>_command()`. Each row records (arg_id, long, env_var, kind).
  Always emitted (possibly empty) so every util has a uniform surface.

- New `builtins::clap_env` shim provides `apply_env_defaults(argv,
  defaults, ctx.env) -> argv'`. Reads only `ctx.env`, never std::env.
  Synthesises `--<long> <value>` for Single, `--<long>` for Bool,
  `--<long> <raw>` for Multi (clap re-splits via its own delimiter).
  Implements clap's documented "argv > env > default" precedence by
  skipping injection when argv already specifies the long flag.

- Ls::execute calls apply_env_defaults with LS_ENV_DEFAULTS before
  try_get_matches_from. Today's "not yet implemented" gate still
  fires for time-style/tabsize, but now the rejection is caused by
  bashkit-virtual env (or explicit argv), never the host process.

Tests:

- 9 unit tests on the shim (precedence, kind handling, std::env
  isolation, program-name slot).
- ls_honors_virtual_env_time_style asserts ctx.env reaches clap.
- ls_argv_time_style_overrides_virtual_env asserts precedence.
- ls_ignores_host_time_style_and_tabsize (existing) still pins the
  std::env half.
- Static tests: every_generated_parser_emits_env_defaults_table,
  ls_env_defaults_surface_matches_uutils, plus the existing
  no_clap_env_in_generated_parsers regression.

Per-builtin opt-in: cat/tac/shuf/readlink/truncate emit empty
defaults today; the shim wiring lights up automatically when their
upstream uu_app() grows .env(...) annotations and codegen reruns.

@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

Status: ✅ Deployment successful!
Name: bashkit
Latest Commit: 1bcec0a
Updated (UTC): May 07 2026, 03:08 PM

@chaliy chaliy merged commit 8a68189 into main May 7, 2026
34 checks passed
@chaliy chaliy deleted the claude/add-ls-env-support-yJR37 branch May 7, 2026 15:21