Skip to content

fix(tui): default interactive Agent shell to approval-gated on#3756

Open
Hmbown wants to merge 2 commits into
mainfrom
codex/agent-shell-approval-default
Open

fix(tui): default interactive Agent shell to approval-gated on#3756
Hmbown wants to merge 2 commits into
mainfrom
codex/agent-shell-approval-default

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 29, 2026

Copy link
Copy Markdown
Owner

Summary

  • expose shell tools by default for interactive Agent-mode TUI sessions so approvals can gate shell commands
  • keep explicit allow_shell=false respected for Agent sessions
  • keep GHSA-safe durable task defaults: omitted task fields still do not grant shell or auto-approve
  • stamp approved tool results with model-visible approval text plus structured metadata, so agents do not infer that shell ran ungated
  • update docs/MODES.md to describe the interactive default versus durable-task behavior

Verification

  • cargo fmt --all --check
  • git diff --check
  • ./scripts/release/check-versions.sh
  • cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp
  • cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on
  • cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset
  • cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
  • cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode
  • cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

Interactive TUI sessions should expose shell tools in Agent mode so normal approval prompts can gate command execution. Keep the GHSA-safe Config::allow_shell() default for durable tasks and explicit allow_shell=false overrides, and document the distinction in the mode guide.

Tests: cargo fmt --all --check

Tests: git diff --check

Tests: ./scripts/release/check-versions.sh

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 10s —— View job


PR Review — fix(tui): default interactive Agent shell to approval-gated on

  • Read diff and gather context
  • Inspect Config::allow_shell() implementation and existing tests
  • Check durable-task / add_task path for shell policy
  • Review docs changes
  • Write review findings

Summary

Clean, focused change. The logic is correct and the critical GHSA-72w5-pf8h-xfp4 invariants are preserved: Config::allow_shell() still defaults to false, durable task paths still use that method, and only the interactive TUI layer adopts the new opt-out default.


Correctness

Good: The bifurcation is right.

Call site Method used Default
run_interactive (line 6520) interactive_tui_allow_shellunwrap_or(true) true (exposed, approval-gated)
durable task engine (line 7057) execution_config.allow_shell()unwrap_or(false) false (hidden)
add_task worker (line 7203) execution_config.allow_shell()unwrap_or(false) false (hidden)

One subtle edge case worth documenting (not a bug, just a potential surprise for upgrading users): DEEPSEEK_ALLOW_SHELL in the environment restores config.allow_shell to its pre-workspace-merge value — which for a bare config is None. In the old code Nonefalse for interactive; in the new code Nonetrue. So setting DEEPSEEK_ALLOW_SHELL used to incidentally keep shell hidden for interactive sessions when no explicit opt-in was present; now it doesn't. The env var's documented purpose (prevent workspace overlays from changing the policy) is still fully satisfied, but users relying on its incidental behavior would need an explicit allow_shell = false instead. Worth a brief changelog mention.


Tests

Coverage is solid for the new behavior:

  • interactive_tui_defaults_agent_shell_to_approval_gated_on — tests default Nonetrue, explicit Some(false)false, YOLO + Some(false)true. All three branches of interactive_tui_allow_shell covered. ✓
  • Existing allow_shell_defaults_to_false_when_unset (config/tests.rs) guards Config::allow_shell() method — unmodified and still passing. ✓
  • Existing add_task_without_optional_fields_does_not_grant_shell_or_auto_approve (task_manager.rs) guards durable task path — unmodified and still correct. ✓
  • Existing build_tool_context_uses_typed_shell_policy_per_mode (engine/tests.rs) — unmodified, still exercises EngineConfig.allow_shell = true/false. ✓

Missing but not blocking: No test verifies the full integrated path in run_interactive — i.e., that a workspace config containing allow_shell = false propagates through merge_project_configinteractive_tui_allow_shell → suppresses shell. The unit test on interactive_tui_allow_shell itself covers the logic; the integration gap is pre-existing.


Security

No new attack surface. Shell tools remain approval-gated in interactive mode (this doesn't add auto-approval). The change only moves shell from hidden to visible-but-gated in the interactive TUI, which is the intended UX improvement.


Docs (docs/MODES.md)

Accurate. The new paragraph clearly distinguishes interactive vs. durable-task defaults, and the table update correctly reflects allow_shell = false as the opt-out rather than the absence of allow_shell = true as the default.


Minor nit (non-blocking)

The interactive_tui_allow_shell helper is a 1-liner placed ~100 lines before its only call site. Could be an inline closure at the call site, but the named function makes the test simpler to write and the intent explicit — this is fine.


Verdict: Approve. Correctness holds, security posture is maintained, tests cover the new behavior, docs are accurate. Only thing worth adding to the changelog: the DEEPSEEK_ALLOW_SHELL edge case noted above for upgrading users who had no explicit allow_shell in their config.

Stamp approved tool results with model-visible approval text and structured metadata so Agent mode can report that shell execution was user-approved instead of inferring from successful output alone.

Covers both normal model tool calls and direct composer shell execution.

Tests: cargo fmt --all --check

Tests: git diff --check

Tests: ./scripts/release/check-versions.sh

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 59s —— View job


PR Review — fix(tui): default interactive Agent shell to approval-gated on

  • Read diff and gather context
  • Review changed files in detail
  • Check correctness, security, and test coverage
  • Post findings

This sync added a second commit (50c6bcc fix(tui): surface approval decisions in tool results) on top of the previously-reviewed baa4b4b. Review covers both.


baa4b4b — default interactive shell to approval-gated on

No new concerns. The bifurcation is correct:

Path Function Default
run_interactive interactive_tui_allow_shellunwrap_or(true) exposed, gated
durable task engine Config::allow_shell()unwrap_or(false) hidden
add_task worker Config::allow_shell()unwrap_or(false) hidden

The three branches of interactive_tui_allow_shell are all tested. GHSA invariant preserved.


50c6bcc — surface approval decisions in tool results

Three call sites correctly stamp results:

  1. Composer-bang fast path, ApprovedApprovedByUser (engine.rs:1170)
  2. Composer-bang fast path, RetryWithPolicyApprovedWithPolicy (engine.rs:1216)
  3. Normal turn loop after execution (turn_loop.rs:2378)

These are separate execution paths; no double-stamp risk.

Issue — metadata update is not idempotent but content guard is (dispatch.rs:107-120)

// metadata written unconditionally
let metadata = result.metadata.get_or_insert_with(|| json!({}));
if let Some(object) = metadata.as_object_mut() {
    object.insert("approval".to_string(), approval_metadata); // always runs
}
// ...
if result.content.starts_with("[approval] ") {
    return; // only content update is skipped
}

If stamp_tool_result_approval were called twice (different stamps), the metadata would be overwritten by the second call but the content note would stay from the first. Not a real bug today since no double-stamp path exists, but the inconsistency makes this fragile if call sites are ever added. Either move the idempotency check before the metadata update, or check metadata["approval"].is_object() before inserting.

Minor — _prior fallback inner if let is always true (dispatch.rs:111-115)

After std::mem::replace(metadata, json!({})), *metadata is Value::Object, so metadata.as_object_mut() can't return None. The inner conditional is a no-op defensive check. The real defensive code is the outer else branch — that's fine. The inner guard just adds noise.

Missing test coverage:

  • stamp_tool_result_approval called with result.content.is_empty() — the empty-content branch at line 122 is untested
  • The _prior path (non-object existing metadata) is untested
  • No idempotency test (calling stamp twice should ideally be consistent)

model_visible: true in metadata (dispatch.rs:105): this field is set but there's no consumer of it visible in the diff. If it's reserved for future use, a brief comment would prevent a future reader from deleting it as dead data.


Security

No new attack surface. The change moves shell from hidden to visible-but-gated in interactive TUI (baa4b4b) and adds model-visible approval attribution to results (50c6bcc). Neither widens the actual execution surface.


Verdict

Approve. Both commits are functionally correct and the critical GHSA invariants are preserved. The issues above are non-blocking:

  • The idempotency ordering inconsistency is theoretical with no reachable double-stamp path today — worth a follow-up cleanup
  • Missing test cases are minor given the function is a leaf with simple branches

@Hmbown

Hmbown commented Jun 29, 2026

Copy link
Copy Markdown
Owner Author

Heads-up from 0.8.66 release prep: this PR is bundled into the local codex/local-0866-test-build rebuild candidate so the new default can be tested end-to-end.

While verifying it I found one internal inconsistency worth folding in before merge:

The durable Agent permission baseline doesn't follow the new interactive default. In crates/tui/src/tui/app.rs, App::new sets the #3386 baseline as:

agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()          // hard-defaults false (GHSA-72w5-pf8h-xfp4)
} else {
    allow_shell                    // resolved interactive default (now true)
},

config.allow_shell() defaults false, so a session that starts in Auto/YOLO (e.g. default_mode = "yolo", or --yolo) and then downshifts to Agent restores to shell hidden — contradicting the documented "interactive Agent exposes shell by default." Agent-start sessions get the new default; Auto/YOLO-start sessions don't.

Suggested fix (keeps the security floor: explicit allow_shell = false still hides shell): add one shared resolver and route both startup and the baseline through it so they can't drift:

// crates/tui/src/config.rs
/// Interactive TUI Agent default: shell on, approval-gated. Honors an
/// explicit `allow_shell = false`. Single source of truth for startup
/// (run_interactive) and the durable Agent baseline (app.rs).
pub fn interactive_allow_shell(&self) -> bool { self.allow_shell.unwrap_or(true) }
  • main.rs: yolo || config.interactive_allow_shell()
  • app.rs baseline (Auto/YOLO arm): config.interactive_allow_shell() instead of config.allow_shell()

All other config.allow_shell() call sites (run_exec_agent, app-server runtime_api, task_manager, runtime_threads) are non-interactive and correctly keep the conservative default.

I've implemented and tested exactly this in the 0.8.66 test build (added interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out; cargo test -p codewhale-tui --bins green hermetically). Happy to push it onto codex/agent-shell-approval-default or open it as a stacked PR — your call, since this is your PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant