Skip to content

fix(tui): default interactive Agent shell to approval-gated on#3756

Merged
Hmbown merged 3 commits into
mainfrom
codex/agent-shell-approval-default
Jun 29, 2026
Merged

fix(tui): default interactive Agent shell to approval-gated on#3756
Hmbown merged 3 commits into
mainfrom
codex/agent-shell-approval-default

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 29, 2026

Copy link
Copy Markdown
Owner

Summary

  • expose shell tools by default for interactive Agent-mode TUI sessions so approvals can gate shell commands
  • keep explicit allow_shell=false respected for Agent sessions
  • keep GHSA-safe durable task defaults: omitted task fields still do not grant shell or auto-approve
  • stamp approved tool results with model-visible approval text plus structured metadata, so agents do not infer that shell ran ungated
  • update docs/MODES.md to describe the interactive default versus durable-task behavior

Verification

  • cargo fmt --all --check
  • git diff --check
  • ./scripts/release/check-versions.sh
  • cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp
  • cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on
  • cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset
  • cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
  • cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode
  • cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell

Interactive TUI sessions should expose shell tools in Agent mode so normal approval prompts can gate command execution. Keep the GHSA-safe Config::allow_shell() default for durable tasks and explicit allow_shell=false overrides, and document the distinction in the mode guide.

Tests: cargo fmt --all --check

Tests: git diff --check

Tests: ./scripts/release/check-versions.sh

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked build_tool_context_uses_typed_shell_policy_per_mode

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked agent_tools_with_allow_shell
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 10s —— View job


PR Review — fix(tui): default interactive Agent shell to approval-gated on

  • Read diff and gather context
  • Inspect Config::allow_shell() implementation and existing tests
  • Check durable-task / add_task path for shell policy
  • Review docs changes
  • Write review findings

Summary

Clean, focused change. The logic is correct and the critical GHSA-72w5-pf8h-xfp4 invariants are preserved: Config::allow_shell() still defaults to false, durable task paths still use that method, and only the interactive TUI layer adopts the new opt-out default.


Correctness

Good: The bifurcation is right.

Call site Method used Default
run_interactive (line 6520) interactive_tui_allow_shellunwrap_or(true) true (exposed, approval-gated)
durable task engine (line 7057) execution_config.allow_shell()unwrap_or(false) false (hidden)
add_task worker (line 7203) execution_config.allow_shell()unwrap_or(false) false (hidden)

One subtle edge case worth documenting (not a bug, just a potential surprise for upgrading users): DEEPSEEK_ALLOW_SHELL in the environment restores config.allow_shell to its pre-workspace-merge value — which for a bare config is None. In the old code Nonefalse for interactive; in the new code Nonetrue. So setting DEEPSEEK_ALLOW_SHELL used to incidentally keep shell hidden for interactive sessions when no explicit opt-in was present; now it doesn't. The env var's documented purpose (prevent workspace overlays from changing the policy) is still fully satisfied, but users relying on its incidental behavior would need an explicit allow_shell = false instead. Worth a brief changelog mention.


Tests

Coverage is solid for the new behavior:

  • interactive_tui_defaults_agent_shell_to_approval_gated_on — tests default Nonetrue, explicit Some(false)false, YOLO + Some(false)true. All three branches of interactive_tui_allow_shell covered. ✓
  • Existing allow_shell_defaults_to_false_when_unset (config/tests.rs) guards Config::allow_shell() method — unmodified and still passing. ✓
  • Existing add_task_without_optional_fields_does_not_grant_shell_or_auto_approve (task_manager.rs) guards durable task path — unmodified and still correct. ✓
  • Existing build_tool_context_uses_typed_shell_policy_per_mode (engine/tests.rs) — unmodified, still exercises EngineConfig.allow_shell = true/false. ✓

Missing but not blocking: No test verifies the full integrated path in run_interactive — i.e., that a workspace config containing allow_shell = false propagates through merge_project_configinteractive_tui_allow_shell → suppresses shell. The unit test on interactive_tui_allow_shell itself covers the logic; the integration gap is pre-existing.


Security

No new attack surface. Shell tools remain approval-gated in interactive mode (this doesn't add auto-approval). The change only moves shell from hidden to visible-but-gated in the interactive TUI, which is the intended UX improvement.


Docs (docs/MODES.md)

Accurate. The new paragraph clearly distinguishes interactive vs. durable-task defaults, and the table update correctly reflects allow_shell = false as the opt-out rather than the absence of allow_shell = true as the default.


Minor nit (non-blocking)

The interactive_tui_allow_shell helper is a 1-liner placed ~100 lines before its only call site. Could be an inline closure at the call site, but the named function makes the test simpler to write and the intent explicit — this is fine.


Verdict: Approve. Correctness holds, security posture is maintained, tests cover the new behavior, docs are accurate. Only thing worth adding to the changelog: the DEEPSEEK_ALLOW_SHELL edge case noted above for upgrading users who had no explicit allow_shell in their config.

Stamp approved tool results with model-visible approval text and structured metadata so Agent mode can report that shell execution was user-approved instead of inferring from successful output alone.

Covers both normal model tool calls and direct composer shell execution.

Tests: cargo fmt --all --check

Tests: git diff --check

Tests: ./scripts/release/check-versions.sh

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked approval_stamp

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked allow_shell_defaults_to_false_when_unset

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked add_task_without_optional_fields_does_not_grant_shell_or_auto_approve
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 59s —— View job


PR Review — fix(tui): default interactive Agent shell to approval-gated on

  • Read diff and gather context
  • Review changed files in detail
  • Check correctness, security, and test coverage
  • Post findings

This sync added a second commit (50c6bcc fix(tui): surface approval decisions in tool results) on top of the previously-reviewed baa4b4b. Review covers both.


baa4b4b — default interactive shell to approval-gated on

No new concerns. The bifurcation is correct:

Path Function Default
run_interactive interactive_tui_allow_shellunwrap_or(true) exposed, gated
durable task engine Config::allow_shell()unwrap_or(false) hidden
add_task worker Config::allow_shell()unwrap_or(false) hidden

The three branches of interactive_tui_allow_shell are all tested. GHSA invariant preserved.


50c6bcc — surface approval decisions in tool results

Three call sites correctly stamp results:

  1. Composer-bang fast path, ApprovedApprovedByUser (engine.rs:1170)
  2. Composer-bang fast path, RetryWithPolicyApprovedWithPolicy (engine.rs:1216)
  3. Normal turn loop after execution (turn_loop.rs:2378)

These are separate execution paths; no double-stamp risk.

Issue — metadata update is not idempotent but content guard is (dispatch.rs:107-120)

// metadata written unconditionally
let metadata = result.metadata.get_or_insert_with(|| json!({}));
if let Some(object) = metadata.as_object_mut() {
    object.insert("approval".to_string(), approval_metadata); // always runs
}
// ...
if result.content.starts_with("[approval] ") {
    return; // only content update is skipped
}

If stamp_tool_result_approval were called twice (different stamps), the metadata would be overwritten by the second call but the content note would stay from the first. Not a real bug today since no double-stamp path exists, but the inconsistency makes this fragile if call sites are ever added. Either move the idempotency check before the metadata update, or check metadata["approval"].is_object() before inserting.

Minor — _prior fallback inner if let is always true (dispatch.rs:111-115)

After std::mem::replace(metadata, json!({})), *metadata is Value::Object, so metadata.as_object_mut() can't return None. The inner conditional is a no-op defensive check. The real defensive code is the outer else branch — that's fine. The inner guard just adds noise.

Missing test coverage:

  • stamp_tool_result_approval called with result.content.is_empty() — the empty-content branch at line 122 is untested
  • The _prior path (non-object existing metadata) is untested
  • No idempotency test (calling stamp twice should ideally be consistent)

model_visible: true in metadata (dispatch.rs:105): this field is set but there's no consumer of it visible in the diff. If it's reserved for future use, a brief comment would prevent a future reader from deleting it as dead data.


Security

No new attack surface. The change moves shell from hidden to visible-but-gated in interactive TUI (baa4b4b) and adds model-visible approval attribution to results (50c6bcc). Neither widens the actual execution surface.


Verdict

Approve. Both commits are functionally correct and the critical GHSA invariants are preserved. The issues above are non-blocking:

  • The idempotency ordering inconsistency is theoretical with no reachable double-stamp path today — worth a follow-up cleanup
  • Missing test cases are minor given the function is a leaf with simple branches

@Hmbown

Hmbown commented Jun 29, 2026

Copy link
Copy Markdown
Owner Author

Heads-up from 0.8.66 release prep: this PR is bundled into the local codex/local-0866-test-build rebuild candidate so the new default can be tested end-to-end.

While verifying it I found one internal inconsistency worth folding in before merge:

The durable Agent permission baseline doesn't follow the new interactive default. In crates/tui/src/tui/app.rs, App::new sets the #3386 baseline as:

agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()          // hard-defaults false (GHSA-72w5-pf8h-xfp4)
} else {
    allow_shell                    // resolved interactive default (now true)
},

config.allow_shell() defaults false, so a session that starts in Auto/YOLO (e.g. default_mode = "yolo", or --yolo) and then downshifts to Agent restores to shell hidden — contradicting the documented "interactive Agent exposes shell by default." Agent-start sessions get the new default; Auto/YOLO-start sessions don't.

Suggested fix (keeps the security floor: explicit allow_shell = false still hides shell): add one shared resolver and route both startup and the baseline through it so they can't drift:

// crates/tui/src/config.rs
/// Interactive TUI Agent default: shell on, approval-gated. Honors an
/// explicit `allow_shell = false`. Single source of truth for startup
/// (run_interactive) and the durable Agent baseline (app.rs).
pub fn interactive_allow_shell(&self) -> bool { self.allow_shell.unwrap_or(true) }
  • main.rs: yolo || config.interactive_allow_shell()
  • app.rs baseline (Auto/YOLO arm): config.interactive_allow_shell() instead of config.allow_shell()

All other config.allow_shell() call sites (run_exec_agent, app-server runtime_api, task_manager, runtime_threads) are non-interactive and correctly keep the conservative default.

I've implemented and tested exactly this in the 0.8.66 test build (added interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out; cargo test -p codewhale-tui --bins green hermetically). Happy to push it onto codex/agent-shell-approval-default or open it as a stacked PR — your call, since this is your PR.

Interactive Agent sessions now expose shell tools by default so approval prompts can gate commands, but Auto/YOLO startup still captured the durable Agent baseline through Config::allow_shell(), which defaults false for noninteractive safety. That made a YOLO-start session downshift to Agent with shell hidden unless the user had explicitly opted in.

Add Config::interactive_allow_shell() as the shared interactive resolver and use it from both run_interactive and App::new's Agent baseline. Durable/background call sites continue to use Config::allow_shell() and keep the conservative default.

Tests: cargo fmt --all --check

Tests: git diff --check

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked interactive_tui_defaults_agent_shell_to_approval_gated_on

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked yolo_start_with_default_config_restores_interactive_agent_shell_baseline

Tests: cargo test -p codewhale-tui --bin codewhale-tui --locked leaving_yolo_after_startup_restores_baseline_policies
@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 2m 48s —— View job


PR Review — fix(tui): default interactive Agent shell to approval-gated on

  • Read diff and gather context
  • Review third commit (share interactive Agent shell baseline)
  • Check interactive_allow_shell implementation in config.rs
  • Verify app.rs baseline fix
  • Check test coverage for new helper
  • Post findings

This sync added df16415 fix(tui): share interactive Agent shell baseline on top of the two previously reviewed commits. Review below covers all three.


Summary of the full stack

Commit Change
baa4b4b interactive_tui_allow_shell helper in main.rs; shell on by default in interactive TUI
50c6bcc stamp_tool_result_approval in dispatch.rs; three call sites stamping approval metadata
df16415 Config::interactive_allow_shell() as single source of truth; fixes YOLO→Agent baseline drift

df16415 — correctness (the new commit)

Fix is correct. The app.rs baseline change addresses the drift Hmbown identified:

// Before (baa4b4b only):
agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.allow_shell()             // → false for default Config — wrong
} else {
    allow_shell                      // → true from interactive_tui_allow_shell
},

// After (df16415):
agent_allow_shell: if matches!(initial_mode, AppMode::Auto | AppMode::Yolo) {
    config.interactive_allow_shell() // → true for default Config — correct
} else {
    allow_shell                      // unchanged
},

A YOLO/Auto-start session now restores the correct interactive default when downshifting to Agent. The else arm (allow_shell) was already computed via interactive_tui_allow_shell, so both arms are now consistent.

Config::interactive_allow_shell() is clean. Doc comment accurately explains the GHSA carve-out and the "single source of truth" intent. The conservative Config::allow_shell() (→ false) for all non-interactive paths is untouched and tested by the existing allow_shell_defaults_to_false_when_unset guard.

main.rs interactive_tui_allow_shell is now consistent. Updated to call config.interactive_allow_shell() instead of config.allow_shell() — the two code paths (startup and the app.rs baseline) now both flow from the same Config method and cannot drift again.


Tests

New tests added in df16415:

  • interactive_allow_shell_defaults_to_true_but_honors_explicit_opt_out (config/tests.rs): covers Nonetrue, Some(false)false, Some(true)true. All three branches of the new method. ✓
  • yolo_start_with_default_config_restores_interactive_agent_shell_baseline (app/tests.rs): YOLO start → Agent downshift with Config::default(). Asserts allow_shell stays true. ✓

Minor gap (non-blocking): No test covers YOLO-start + allow_shell = Some(false) → Agent downshift. The logic handles it correctly (the Auto/YOLO arm calls config.interactive_allow_shell() which returns false for Some(false)), but the integration path is untested. Worth a follow-up.


Pre-existing issues (from 50c6bcc, not worsened by df16415)

stamp_tool_result_approval metadata/content idempotency ordering (dispatch.rs:107-120):

The _prior recovery path always succeeds (after std::mem::replace(metadata, json!({})) the value is Value::Object, so the inner if let Some(object) = metadata.as_object_mut() is unconditionally Some). Not a bug — just unreachable defensive code that adds noise.

The metadata is written unconditionally while the content prepend is guarded by starts_with("[approval] "). If a double-stamp path were ever added, metadata would be overwritten but content would not update. No such path exists today. Still worth making idempotent for future-proofing. Fix this →

model_visible: true in metadata has no consumer (dispatch.rs:105). Reserved for future use — a one-line comment would prevent deletion as dead data. Non-blocking.

Missing stamp_tool_result_approval test cases: empty-content branch (line 122) and non-object prior metadata (the _prior path) remain untested. approval_stamp_preserves_existing_metadata covers the object-metadata case; the others are minor.


GHSA invariants

All three critical call-site classes are correctly bifurcated:

Path Method Default
run_interactive interactive_tui_allow_shellconfig.interactive_allow_shell() true (approval-gated)
App::new Auto/YOLO baseline config.interactive_allow_shell() true (approval-gated)
durable task engine config.allow_shell() false (hidden)
add_task worker config.allow_shell() false (hidden)

Verdict

Approve. The fix for the YOLO→Agent baseline drift (df16415) is correct, the GHSA invariants are preserved across all three commits, test coverage is solid for new behavior, and the docs accurately reflect the new interactive default. The remaining open items are all non-blocking: the idempotency ordering in stamp_tool_result_approval, the missing edge-case tests, and the undocumented model_visible field.

@Hmbown Hmbown merged commit f6a19c2 into main Jun 29, 2026
18 checks passed
@Hmbown Hmbown deleted the codex/agent-shell-approval-default branch June 29, 2026 17:07
Hmbown added a commit that referenced this pull request Jun 29, 2026
docs(shell): align CONFIGURATION/TOOL_SURFACE/error message with shell default (#3756 follow-up)
pull Bot pushed a commit to TheTechOddBug/DeepSeek-TUI that referenced this pull request Jun 29, 2026
…l default

Hmbown#3756 flipped interactive Agent shell to on-by-default (approval-gated) and
updated MODES.md, but CONFIGURATION.md, TOOL_SURFACE.md, and the exec_shell
"unavailable" reason / allow_shell hint in tool_catalog.rs still described the
old opt-in model. Post-merge `main` therefore had docs that contradicted both
the runtime and MODES.md.

Align all three with the shipped behavior: interactive TUI Agent exposes shell
by default with approval prompts unless `allow_shell = false`; headless/durable/
noninteractive profiles keep the conservative omitted-field default and require
`allow_shell = true`; Plan mode hides shell; YOLO enables shell + auto-approval.

Behavior unchanged (docs + user-facing message only). Existing shell-message
tests still pass (the improved message keeps the asserted substrings).
Completes the documentation for Hmbown#3756.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant