fix(tui): default interactive Agent shell to approval-gated on by Hmbown · Pull Request #3756 · Hmbown/CodeWhale

Hmbown · 2026-06-29T01:36:43Z

Summary

expose shell tools by default for interactive Agent-mode TUI sessions so approvals can gate shell commands
keep explicit allow_shell=false respected for Agent sessions
keep GHSA-safe durable task defaults: omitted task fields still do not grant shell or auto-approve
stamp approved tool results with model-visible approval text plus structured metadata, so agents do not infer that shell ran ungated
update docs/MODES.md to describe the interactive default versus durable-task behavior

Verification

cargo fmt --all --check
git diff --check
./scripts/release/check-versions.sh
cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp
cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on
cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset
cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode
cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

Interactive TUI sessions should expose shell tools in Agent mode so normal approval prompts can gate command execution. Keep the GHSA-safe Config::allow_shell() default for durable tasks and explicit allow_shell=false overrides, and document the distinction in the mode guide. Tests: cargo fmt --all --check Tests: git diff --check Tests: ./scripts/release/check-versions.sh Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

claude · 2026-06-29T01:36:57Z

Claude finished @Hmbown's task in 2m 10s —— View job

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Read diff and gather context
Inspect Config::allow_shell() implementation and existing tests
Check durable-task / add_task path for shell policy
Review docs changes
Write review findings

Summary

Clean, focused change. The logic is correct and the critical GHSA-72w5-pf8h-xfp4 invariants are preserved: Config::allow_shell() still defaults to false, durable task paths still use that method, and only the interactive TUI layer adopts the new opt-out default.

Correctness

Good: The bifurcation is right.

Call site	Method used	Default
`run_interactive` (line 6520)	`interactive_tui_allow_shell` → `unwrap_or(true)`	true (exposed, approval-gated)
durable task engine (line 7057)	`execution_config.allow_shell()` → `unwrap_or(false)`	false (hidden)
add_task worker (line 7203)	`execution_config.allow_shell()` → `unwrap_or(false)`	false (hidden)

One subtle edge case worth documenting (not a bug, just a potential surprise for upgrading users): DEEPSEEK_ALLOW_SHELL in the environment restores config.allow_shell to its pre-workspace-merge value — which for a bare config is None. In the old code None → false for interactive; in the new code None → true. So setting DEEPSEEK_ALLOW_SHELL used to incidentally keep shell hidden for interactive sessions when no explicit opt-in was present; now it doesn't. The env var's documented purpose (prevent workspace overlays from changing the policy) is still fully satisfied, but users relying on its incidental behavior would need an explicit allow_shell = false instead. Worth a brief changelog mention.

Tests

Coverage is solid for the new behavior:

interactive_tui_defaults_agent_shell_to_approval_gated_on — tests default None → true, explicit Some(false) → false, YOLO + Some(false) → true. All three branches of interactive_tui_allow_shell covered. ✓
Existing allow_shell_defaults_to_false_when_unset (config/tests.rs) guards Config::allow_shell() method — unmodified and still passing. ✓
Existing add_task_without_optional_fields_does_not_grant_shell_or_auto_approve (task_manager.rs) guards durable task path — unmodified and still correct. ✓
Existing build_tool_context_uses_typed_shell_policy_per_mode (engine/tests.rs) — unmodified, still exercises EngineConfig.allow_shell = true/false. ✓

Missing but not blocking: No test verifies the full integrated path in run_interactive — i.e., that a workspace config containing allow_shell = false propagates through merge_project_config → interactive_tui_allow_shell → suppresses shell. The unit test on interactive_tui_allow_shell itself covers the logic; the integration gap is pre-existing.

Security

No new attack surface. Shell tools remain approval-gated in interactive mode (this doesn't add auto-approval). The change only moves shell from hidden to visible-but-gated in the interactive TUI, which is the intended UX improvement.

Docs (`docs/MODES.md`)

Accurate. The new paragraph clearly distinguishes interactive vs. durable-task defaults, and the table update correctly reflects allow_shell = false as the opt-out rather than the absence of allow_shell = true as the default.

Minor nit (non-blocking)

The interactive_tui_allow_shell helper is a 1-liner placed ~100 lines before its only call site. Could be an inline closure at the call site, but the named function makes the test simpler to write and the intent explicit — this is fine.

Verdict: Approve. Correctness holds, security posture is maintained, tests cover the new behavior, docs are accurate. Only thing worth adding to the changelog: the DEEPSEEK_ALLOW_SHELL edge case noted above for upgrading users who had no explicit allow_shell in their config.

Stamp approved tool results with model-visible approval text and structured metadata so Agent mode can report that shell execution was user-approved instead of inferring from successful output alone. Covers both normal model tool calls and direct composer shell execution. Tests: cargo fmt --all --check Tests: git diff --check Tests: ./scripts/release/check-versions.sh Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve

claude · 2026-06-29T01:56:14Z

Claude finished @Hmbown's task in 2m 59s —— View job

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Read diff and gather context
Review changed files in detail
Check correctness, security, and test coverage
Post findings

This sync added a second commit (50c6bcc fix(tui): surface approval decisions in tool results) on top of the previously-reviewed baa4b4b. Review covers both.

`baa4b4b` — default interactive shell to approval-gated on

No new concerns. The bifurcation is correct:

Path	Function	Default
`run_interactive`	`interactive_tui_allow_shell` → `unwrap_or(true)`	exposed, gated
durable task engine	`Config::allow_shell()` → `unwrap_or(false)`	hidden
add_task worker	`Config::allow_shell()` → `unwrap_or(false)`	hidden

The three branches of interactive_tui_allow_shell are all tested. GHSA invariant preserved.

`50c6bcc` — surface approval decisions in tool results

Three call sites correctly stamp results:

Composer-bang fast path, Approved → ApprovedByUser (engine.rs:1170)
Composer-bang fast path, RetryWithPolicy → ApprovedWithPolicy (engine.rs:1216)
Normal turn loop after execution (turn_loop.rs:2378)

These are separate execution paths; no double-stamp risk.

Issue — metadata update is not idempotent but content guard is (dispatch.rs:107-120)

// metadata written unconditionally
let metadata = result.metadata.get_or_insert_with(|| json!({}));
if let Some(object) = metadata.as_object_mut() {
    object.insert("approval".to_string(), approval_metadata); // always runs
}
// ...
if result.content.starts_with("[approval] ") {
    return; // only content update is skipped
}

If stamp_tool_result_approval were called twice (different stamps), the metadata would be overwritten by the second call but the content note would stay from the first. Not a real bug today since no double-stamp path exists, but the inconsistency makes this fragile if call sites are ever added. Either move the idempotency check before the metadata update, or check metadata["approval"].is_object() before inserting.

Minor — _prior fallback inner if let is always true (dispatch.rs:111-115)

After std::mem::replace(metadata, json!({})), *metadata is Value::Object, so metadata.as_object_mut() can't return None. The inner conditional is a no-op defensive check. The real defensive code is the outer else branch — that's fine. The inner guard just adds noise.

Missing test coverage:

stamp_tool_result_approval called with result.content.is_empty() — the empty-content branch at line 122 is untested
The _prior path (non-object existing metadata) is untested
No idempotency test (calling stamp twice should ideally be consistent)

model_visible: true in metadata (dispatch.rs:105): this field is set but there's no consumer of it visible in the diff. If it's reserved for future use, a brief comment would prevent a future reader from deleting it as dead data.

Security

No new attack surface. The change moves shell from hidden to visible-but-gated in interactive TUI (baa4b4b) and adds model-visible approval attribution to results (50c6bcc). Neither widens the actual execution surface.

Verdict

Approve. Both commits are functionally correct and the critical GHSA invariants are preserved. The issues above are non-blocking:

The idempotency ordering inconsistency is theoretical with no reachable double-stamp path today — worth a follow-up cleanup
Missing test cases are minor given the function is a leaf with simple branches

Hmbown · 2026-06-29T06:18:18Z

Heads-up from 0.8.66 release prep: this PR is bundled into the local codex/local-0866-test-build rebuild candidate so the new default can be tested end-to-end.

While verifying it I found one internal inconsistency worth folding in before merge:

The durable Agent permission baseline doesn't follow the new interactive default. In crates/tui/src/tui/app.rs, App::new sets the #3386 baseline as:

agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()          // hard-defaults false (GHSA-72w5-pf8h-xfp4)
} else {
    allow_shell                    // resolved interactive default (now true)
},

config.allow_shell() defaults false, so a session that starts in Auto/YOLO (e.g. default_mode = "yolo", or --yolo) and then downshifts to Agent restores to shell hidden — contradicting the documented "interactive Agent exposes shell by default." Agent-start sessions get the new default; Auto/YOLO-start sessions don't.

Suggested fix (keeps the security floor: explicit allow_shell = false still hides shell): add one shared resolver and route both startup and the baseline through it so they can't drift:

// crates/tui/src/config.rs
/// Interactive TUI Agent default: shell on, approval-gated. Honors an
/// explicit `allow_shell = false`. Single source of truth for startup
/// (run_interactive) and the durable Agent baseline (app.rs).
pub fn interactive_allow_shell(&self) -> bool { self.allow_shell.unwrap_or(true) }

main.rs: yolo || config.interactive_allow_shell()
app.rs baseline (Auto/YOLO arm): config.interactive_allow_shell() instead of config.allow_shell()

All other config.allow_shell() call sites (run_exec_agent, app-server runtime_api, task_manager, runtime_threads) are non-interactive and correctly keep the conservative default.

I've implemented and tested exactly this in the 0.8.66 test build (added interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out; cargo test -p codewhale-tui --bins green hermetically). Happy to push it onto codex/agent-shell-approval-default or open it as a stacked PR — your call, since this is your PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(tui): default interactive Agent shell to approval-gated on#3756

fix(tui): default interactive Agent shell to approval-gated on#3756
Hmbown wants to merge 2 commits into
mainfrom
codex/agent-shell-approval-default

Hmbown commented Jun 29, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

Hmbown commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Hmbown commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Uh oh!

claude Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Summary

Correctness

Tests

Security

Docs (docs/MODES.md)

Minor nit (non-blocking)

Uh oh!

claude Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — fix(tui): default interactive Agent shell to approval-gated on

baa4b4b — default interactive shell to approval-gated on

50c6bcc — surface approval decisions in tool results

Security

Verdict

Uh oh!

Hmbown commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Hmbown commented Jun 29, 2026 •

edited

Loading

claude Bot commented Jun 29, 2026 •

edited

Loading

Docs (`docs/MODES.md`)

claude Bot commented Jun 29, 2026 •

edited

Loading

`baa4b4b` — default interactive shell to approval-gated on

`50c6bcc` — surface approval decisions in tool results