fix(tui): default interactive Agent shell to approval-gated on by Hmbown · Pull Request #3756 · Hmbown/CodeWhale

Hmbown · 2026-06-29T01:36:43Z

Summary

expose shell tools by default for interactive Agent-mode TUI sessions so approvals can gate shell commands
keep explicit allow_shell=false respected for Agent sessions
keep GHSA-safe durable task defaults: omitted task fields still do not grant shell or auto-approve
stamp approved tool results with model-visible approval text plus structured metadata, so agents do not infer that shell ran ungated
update docs/MODES.md to describe the interactive default versus durable-task behavior

Verification

cargo fmt --all --check
git diff --check
./scripts/release/check-versions.sh
cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp
cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on
cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset
cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode
cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

Interactive TUI sessions should expose shell tools in Agent mode so normal approval prompts can gate command execution. Keep the GHSA-safe Config::allow_shell() default for durable tasks and explicit allow_shell=false overrides, and document the distinction in the mode guide. Tests: cargo fmt --all --check Tests: git diff --check Tests: ./scripts/release/check-versions.sh Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

claude · 2026-06-29T01:36:57Z

Claude finished @Hmbown's task in 2m 10s —— View job

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Read diff and gather context
Inspect Config::allow_shell() implementation and existing tests
Check durable-task / add_task path for shell policy
Review docs changes
Write review findings

Summary

Clean, focused change. The logic is correct and the critical GHSA-72w5-pf8h-xfp4 invariants are preserved: Config::allow_shell() still defaults to false, durable task paths still use that method, and only the interactive TUI layer adopts the new opt-out default.

Correctness

Good: The bifurcation is right.

Call site	Method used	Default
`run_interactive` (line 6520)	`interactive_tui_allow_shell` → `unwrap_or(true)`	true (exposed, approval-gated)
durable task engine (line 7057)	`execution_config.allow_shell()` → `unwrap_or(false)`	false (hidden)
add_task worker (line 7203)	`execution_config.allow_shell()` → `unwrap_or(false)`	false (hidden)

One subtle edge case worth documenting (not a bug, just a potential surprise for upgrading users): DEEPSEEK_ALLOW_SHELL in the environment restores config.allow_shell to its pre-workspace-merge value — which for a bare config is None. In the old code None → false for interactive; in the new code None → true. So setting DEEPSEEK_ALLOW_SHELL used to incidentally keep shell hidden for interactive sessions when no explicit opt-in was present; now it doesn't. The env var's documented purpose (prevent workspace overlays from changing the policy) is still fully satisfied, but users relying on its incidental behavior would need an explicit allow_shell = false instead. Worth a brief changelog mention.

Tests

Coverage is solid for the new behavior:

interactive_tui_defaults_agent_shell_to_approval_gated_on — tests default None → true, explicit Some(false) → false, YOLO + Some(false) → true. All three branches of interactive_tui_allow_shell covered. ✓
Existing allow_shell_defaults_to_false_when_unset (config/tests.rs) guards Config::allow_shell() method — unmodified and still passing. ✓
Existing add_task_without_optional_fields_does_not_grant_shell_or_auto_approve (task_manager.rs) guards durable task path — unmodified and still correct. ✓
Existing build_tool_context_uses_typed_shell_policy_per_mode (engine/tests.rs) — unmodified, still exercises EngineConfig.allow_shell = true/false. ✓

Missing but not blocking: No test verifies the full integrated path in run_interactive — i.e., that a workspace config containing allow_shell = false propagates through merge_project_config → interactive_tui_allow_shell → suppresses shell. The unit test on interactive_tui_allow_shell itself covers the logic; the integration gap is pre-existing.

Security

No new attack surface. Shell tools remain approval-gated in interactive mode (this doesn't add auto-approval). The change only moves shell from hidden to visible-but-gated in the interactive TUI, which is the intended UX improvement.

Docs (`docs/MODES.md`)

Accurate. The new paragraph clearly distinguishes interactive vs. durable-task defaults, and the table update correctly reflects allow_shell = false as the opt-out rather than the absence of allow_shell = true as the default.

Minor nit (non-blocking)

The interactive_tui_allow_shell helper is a 1-liner placed ~100 lines before its only call site. Could be an inline closure at the call site, but the named function makes the test simpler to write and the intent explicit — this is fine.

Verdict: Approve. Correctness holds, security posture is maintained, tests cover the new behavior, docs are accurate. Only thing worth adding to the changelog: the DEEPSEEK_ALLOW_SHELL edge case noted above for upgrading users who had no explicit allow_shell in their config.

Stamp approved tool results with model-visible approval text and structured metadata so Agent mode can report that shell execution was user-approved instead of inferring from successful output alone. Covers both normal model tool calls and direct composer shell execution. Tests: cargo fmt --all --check Tests: git diff --check Tests: ./scripts/release/check-versions.sh Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve

claude · 2026-06-29T01:56:14Z

Claude finished @Hmbown's task in 2m 59s —— View job

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Read diff and gather context
Review changed files in detail
Check correctness, security, and test coverage
Post findings

This sync added a second commit (50c6bcc fix(tui): surface approval decisions in tool results) on top of the previously-reviewed baa4b4b. Review covers both.

`baa4b4b` — default interactive shell to approval-gated on

No new concerns. The bifurcation is correct:

Path	Function	Default
`run_interactive`	`interactive_tui_allow_shell` → `unwrap_or(true)`	exposed, gated
durable task engine	`Config::allow_shell()` → `unwrap_or(false)`	hidden
add_task worker	`Config::allow_shell()` → `unwrap_or(false)`	hidden

The three branches of interactive_tui_allow_shell are all tested. GHSA invariant preserved.

`50c6bcc` — surface approval decisions in tool results

Three call sites correctly stamp results:

Composer-bang fast path, Approved → ApprovedByUser (engine.rs:1170)
Composer-bang fast path, RetryWithPolicy → ApprovedWithPolicy (engine.rs:1216)
Normal turn loop after execution (turn_loop.rs:2378)

These are separate execution paths; no double-stamp risk.

Issue — metadata update is not idempotent but content guard is (dispatch.rs:107-120)

// metadata written unconditionally
let metadata = result.metadata.get_or_insert_with(|| json!({}));
if let Some(object) = metadata.as_object_mut() {
    object.insert("approval".to_string(), approval_metadata); // always runs
}
// ...
if result.content.starts_with("[approval] ") {
    return; // only content update is skipped
}

If stamp_tool_result_approval were called twice (different stamps), the metadata would be overwritten by the second call but the content note would stay from the first. Not a real bug today since no double-stamp path exists, but the inconsistency makes this fragile if call sites are ever added. Either move the idempotency check before the metadata update, or check metadata["approval"].is_object() before inserting.

Minor — _prior fallback inner if let is always true (dispatch.rs:111-115)

After std::mem::replace(metadata, json!({})), *metadata is Value::Object, so metadata.as_object_mut() can't return None. The inner conditional is a no-op defensive check. The real defensive code is the outer else branch — that's fine. The inner guard just adds noise.

Missing test coverage:

stamp_tool_result_approval called with result.content.is_empty() — the empty-content branch at line 122 is untested
The _prior path (non-object existing metadata) is untested
No idempotency test (calling stamp twice should ideally be consistent)

model_visible: true in metadata (dispatch.rs:105): this field is set but there's no consumer of it visible in the diff. If it's reserved for future use, a brief comment would prevent a future reader from deleting it as dead data.

Security

No new attack surface. The change moves shell from hidden to visible-but-gated in interactive TUI (baa4b4b) and adds model-visible approval attribution to results (50c6bcc). Neither widens the actual execution surface.

Verdict

Approve. Both commits are functionally correct and the critical GHSA invariants are preserved. The issues above are non-blocking:

The idempotency ordering inconsistency is theoretical with no reachable double-stamp path today — worth a follow-up cleanup
Missing test cases are minor given the function is a leaf with simple branches

Hmbown · 2026-06-29T06:18:18Z

Heads-up from 0.8.66 release prep: this PR is bundled into the local codex/local-0866-test-build rebuild candidate so the new default can be tested end-to-end.

While verifying it I found one internal inconsistency worth folding in before merge:

The durable Agent permission baseline doesn't follow the new interactive default. In crates/tui/src/tui/app.rs, App::new sets the #3386 baseline as:

agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()          // hard-defaults false (GHSA-72w5-pf8h-xfp4)
} else {
    allow_shell                    // resolved interactive default (now true)
},

config.allow_shell() defaults false, so a session that starts in Auto/YOLO (e.g. default_mode = "yolo", or --yolo) and then downshifts to Agent restores to shell hidden — contradicting the documented "interactive Agent exposes shell by default." Agent-start sessions get the new default; Auto/YOLO-start sessions don't.

Suggested fix (keeps the security floor: explicit allow_shell = false still hides shell): add one shared resolver and route both startup and the baseline through it so they can't drift:

// crates/tui/src/config.rs
/// Interactive TUI Agent default: shell on, approval-gated. Honors an
/// explicit `allow_shell = false`. Single source of truth for startup
/// (run_interactive) and the durable Agent baseline (app.rs).
pub fn interactive_allow_shell(&self) -> bool { self.allow_shell.unwrap_or(true) }

main.rs: yolo || config.interactive_allow_shell()
app.rs baseline (Auto/YOLO arm): config.interactive_allow_shell() instead of config.allow_shell()

All other config.allow_shell() call sites (run_exec_agent, app-server runtime_api, task_manager, runtime_threads) are non-interactive and correctly keep the conservative default.

I've implemented and tested exactly this in the 0.8.66 test build (added interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out; cargo test -p codewhale-tui --bins green hermetically). Happy to push it onto codex/agent-shell-approval-default or open it as a stacked PR — your call, since this is your PR.

Interactive Agent sessions now expose shell tools by default so approval prompts can gate commands, but Auto/YOLO startup still captured the durable Agent baseline through Config::allow_shell(), which defaults false for noninteractive safety. That made a YOLO-start session downshift to Agent with shell hidden unless the user had explicitly opted in. Add Config::interactive_allow_shell() as the shared interactive resolver and use it from both run_interactive and App::new's Agent baseline. Durable/background call sites continue to use Config::allow_shell() and keep the conservative default. Tests: cargo fmt --all --check Tests: git diff --check Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked yolo_start_with_default_config_restores_interactive_agent_shell_baseline Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked leaving_yolo_after_startup_restores_baseline_policies

claude · 2026-06-29T16:54:47Z

Claude finished @Hmbown's task in 2m 48s —— View job

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Read diff and gather context
Review third commit (share interactive Agent shell baseline)
Check interactive_allow_shell implementation in config.rs
Verify app.rs baseline fix
Check test coverage for new helper
Post findings

This sync added df16415 fix(tui): share interactive Agent shell baseline on top of the two previously reviewed commits. Review below covers all three.

Summary of the full stack

Commit	Change
`baa4b4b`	`interactive_tui_allow_shell` helper in `main.rs`; shell on by default in interactive TUI
`50c6bcc`	`stamp_tool_result_approval` in `dispatch.rs`; three call sites stamping approval metadata
`df16415`	`Config::interactive_allow_shell()` as single source of truth; fixes YOLO→Agent baseline drift

`df16415` — correctness (the new commit)

Fix is correct. The app.rs baseline change addresses the drift Hmbown identified:

// Before (baa4b4b only):
agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()             // → false for default Config — wrong
} else {
    allow_shell                      // → true from interactive_tui_allow_shell
},

// After (df16415):
agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.interactive_allow_shell() // → true for default Config — correct
} else {
    allow_shell                      // unchanged
},

A YOLO/Auto-start session now restores the correct interactive default when downshifting to Agent. The else arm (allow_shell) was already computed via interactive_tui_allow_shell, so both arms are now consistent.

Config::interactive_allow_shell() is clean. Doc comment accurately explains the GHSA carve-out and the "single source of truth" intent. The conservative Config::allow_shell() (→ false) for all non-interactive paths is untouched and tested by the existing allow_shell_defaults_to_false_when_unset guard.

main.rs interactive_tui_allow_shell is now consistent. Updated to call config.interactive_allow_shell() instead of config.allow_shell() — the two code paths (startup and the app.rs baseline) now both flow from the same Config method and cannot drift again.

Tests

New tests added in df16415:

interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out (config/tests.rs): covers None→true, Some(false)→false, Some(true)→true. All three branches of the new method. ✓
yolo_start_with_default_config_restores_interactive_agent_shell_baseline (app/tests.rs): YOLO start → Agent downshift with Config::default(). Asserts allow_shell stays true. ✓

Minor gap (non-blocking): No test covers YOLO-start + allow_shell = Some(false) → Agent downshift. The logic handles it correctly (the Auto/YOLO arm calls config.interactive_allow_shell() which returns false for Some(false)), but the integration path is untested. Worth a follow-up.

Pre-existing issues (from `50c6bcc`, not worsened by `df16415`)

stamp_tool_result_approval metadata/content idempotency ordering (dispatch.rs:107-120):

The _prior recovery path always succeeds (after std::mem::replace(metadata, json!({})) the value is Value::Object, so the inner if let Some(object) = metadata.as_object_mut() is unconditionally Some). Not a bug — just unreachable defensive code that adds noise.

The metadata is written unconditionally while the content prepend is guarded by starts_with("[approval] "). If a double-stamp path were ever added, metadata would be overwritten but content would not update. No such path exists today. Still worth making idempotent for future-proofing. Fix this →

model_visible: true in metadata has no consumer (dispatch.rs:105). Reserved for future use — a one-line comment would prevent deletion as dead data. Non-blocking.

Missing stamp_tool_result_approval test cases: empty-content branch (line 122) and non-object prior metadata (the _prior path) remain untested. approval_stamp_preserves_existing_metadata covers the object-metadata case; the others are minor.

GHSA invariants

All three critical call-site classes are correctly bifurcated:

Path	Method	Default
`run_interactive`	`interactive_tui_allow_shell` → `config.interactive_allow_shell()`	true (approval-gated)
`App::new` Auto/YOLO baseline	`config.interactive_allow_shell()`	true (approval-gated)
durable task engine	`config.allow_shell()`	false (hidden)
add_task worker	`config.allow_shell()`	false (hidden)

Verdict

Approve. The fix for the YOLO→Agent baseline drift (df16415) is correct, the GHSA invariants are preserved across all three commits, test coverage is solid for new behavior, and the docs accurately reflect the new interactive default. The remaining open items are all non-blocking: the idempotency ordering in stamp_tool_result_approval, the missing edge-case tests, and the undocumented model_visible field.

docs(shell): align CONFIGURATION/TOOL_SURFACE/error message with shell default (#3756 follow-up)

…l default Hmbown#3756 flipped interactive Agent shell to on-by-default (approval-gated) and updated MODES.md, but CONFIGURATION.md, TOOL_SURFACE.md, and the exec_shell "unavailable" reason / allow_shell hint in tool_catalog.rs still described the old opt-in model. Post-merge `main` therefore had docs that contradicted both the runtime and MODES.md. Align all three with the shipped behavior: interactive TUI Agent exposes shell by default with approval prompts unless `allow_shell = false`; headless/durable/ noninteractive profiles keep the conservative omitted-field default and require `allow_shell = true`; Plan mode hides shell; YOLO enables shell + auto-approval. Behavior unchanged (docs + user-facing message only). Existing shell-message tests still pass (the improved message keeps the asserted substrings). Completes the documentation for Hmbown#3756.

Hmbown merged commit f6a19c2 into main Jun 29, 2026
18 checks passed

Hmbown deleted the codex/agent-shell-approval-default branch June 29, 2026 17:07

This was referenced Jun 29, 2026

feat(tui): hotbar Alt+1-8 discoverability + decision-card key disambiguation #3786

Closed

docs(shell): align CONFIGURATION/TOOL_SURFACE/error message with shell default (#3756 follow-up) #3787

Merged

Hmbown added a commit that referenced this pull request Jun 29, 2026

Merge pull request #3787 from Hmbown/codex/0866-shell-default-docs

72f6cb4

docs(shell): align CONFIGURATION/TOOL_SURFACE/error message with shell default (#3756 follow-up)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(tui): default interactive Agent shell to approval-gated on#3756

fix(tui): default interactive Agent shell to approval-gated on#3756
Hmbown merged 3 commits into
mainfrom
codex/agent-shell-approval-default

Hmbown commented Jun 29, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

Hmbown commented Jun 29, 2026

Uh oh!

claude Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Hmbown commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Uh oh!

claude Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Summary

Correctness

Tests

Security

Docs (docs/MODES.md)

Minor nit (non-blocking)

Uh oh!

claude Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — fix(tui): default interactive Agent shell to approval-gated on

baa4b4b — default interactive shell to approval-gated on

50c6bcc — surface approval decisions in tool results

Security

Verdict

Uh oh!

Hmbown commented Jun 29, 2026

Uh oh!

claude Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review — fix(tui): default interactive Agent shell to approval-gated on

Summary of the full stack

df16415 — correctness (the new commit)

Tests

Pre-existing issues (from 50c6bcc, not worsened by df16415)

GHSA invariants

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Hmbown commented Jun 29, 2026 •

edited

Loading

claude Bot commented Jun 29, 2026 •

edited

Loading

Docs (`docs/MODES.md`)

claude Bot commented Jun 29, 2026 •

edited

Loading

`baa4b4b` — default interactive shell to approval-gated on

`50c6bcc` — surface approval decisions in tool results

claude Bot commented Jun 29, 2026 •

edited

Loading

`df16415` — correctness (the new commit)

Pre-existing issues (from `50c6bcc`, not worsened by `df16415`)