Skip to content

[pull] main from openai:main#58

Open
pull[bot] wants to merge 1138 commits intokontext-security:mainfrom
openai:main
Open

[pull] main from openai:main#58
pull[bot] wants to merge 1138 commits intokontext-security:mainfrom
openai:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull bot commented Mar 12, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot locked and limited conversation to collaborators Mar 12, 2026
@pull pull bot added ⤵️ pull merge-conflict Sync PR has merge conflicts labels Mar 12, 2026
mcgrew-oai and others added 27 commits April 12, 2026 16:27
## Description

Enable pnpm's reviewed build-script gate for this repo.

## What changed

- added `strictDepBuilds: true` to `pnpm-workspace.yaml`

## Why

The repo already uses pinned pnpm and frozen installs in CI. This adds
the remaining guard so dependency build scripts do not run unless they
are explicitly reviewed.

## Validation

- ran `pnpm install --frozen-lockfile`

Co-authored-by: Codex <noreply@openai.com>
## Summary

- detect WSL1 before Codex probes or invokes the Linux bubblewrap
sandbox
- fail early with a clear unsupported-operation message when a command
would require bubblewrap on WSL1
- document that WSL2 follows the normal Linux bubblewrap path while WSL1
is unsupported

## Why

Codex 0.115.0 made bubblewrap the default Linux sandbox. WSL1 cannot
create the user namespaces that bubblewrap needs, so shell commands
currently fail later with a raw bwrap namespace error. This makes the
unsupported environment explicit and keeps non-bubblewrap paths
unchanged.

The WSL detection reads /proc/version, lets an explicit WSL<version>
marker decide WSL1 vs WSL2+, and only treats a bare Microsoft marker as
WSL1 when no explicit WSL version is present.

addresses #16076

---------

Co-authored-by: Codex <noreply@openai.com>
- Let typed user messages submit while realtime is active and mirror
accepted text into the realtime text stream.
- Add integration coverage and snapshot for outbound realtime text.
## Problem

The TUI had shell-style Up/Down history recall, but `Ctrl+R` did not
provide the reverse incremental search workflow users expect from
shells. Users needed a way to search older prompts without immediately
replacing the current draft, and the interaction needed to handle async
persistent history, repeated navigation keys, duplicate prompt text,
footer hints, and preview highlighting without making the main composer
file even harder to review.


https://github.com/user-attachments/assets/5165affd-4c9a-46e9-adbd-89088f5f7b6b

<img width="1227" height="722" alt="image"
src="https://github.com/user-attachments/assets/8bc83289-eeca-47c7-b0c3-8975101901af"
/>

## Mental model

`Ctrl+R` opens a temporary search session owned by the composer. The
footer line becomes the search input, the composer body previews the
current match only after the query has text, and `Enter` accepts that
preview as an editable draft while `Esc` restores the draft that existed
before search started. The history layer provides a combined offset
space over persistent and local history, but search navigation exposes
unique prompt text rather than every physical history row.

## Non-goals

This change does not rewrite stored history, change normal Up/Down
browsing semantics, add fuzzy matching, or add persistent metadata for
attachments in cross-session history. Search deduplication is
deliberately scoped to the active Ctrl+R search session and uses exact
prompt text, so case, whitespace, punctuation, and attachment-only
differences are not normalized.

## Tradeoffs

The implementation keeps search state in the existing composer and
history state machines instead of adding a new cross-module controller.
That keeps ownership local and testable, but it means the composer still
coordinates visible search status, draft restoration, footer rendering,
cursor placement, and match highlighting while `ChatComposerHistory`
owns traversal, async fetch continuation, boundary clamping, and
unique-result caching. Unique-result caching stores cloned
`HistoryEntry` values so known matches can be revisited without cache
lookups; this is simple and robust for interactive search sizes, but it
is not a global history index.

## Architecture

`ChatComposer` detects `Ctrl+R`, snapshots the current draft, switches
the footer to `FooterMode::HistorySearch`, and routes search-mode keys
before normal editing. Query edits call `ChatComposerHistory::search`
with `restart = true`, which starts from the newest combined-history
offset. Repeated `Ctrl+R` or Up searches older; Down searches newer
through already discovered unique matches or continues the scan.
Persistent history entries still arrive asynchronously through
`on_entry_response`, where a pending search either accepts the response,
skips a duplicate, or requests the next offset.

The composer-facing pieces now live in
`codex-rs/tui/src/bottom_pane/chat_composer/history_search.rs`, leaving
`chat_composer.rs` responsible for routing and rendering integration
instead of owning every search helper inline.
`codex-rs/tui/src/bottom_pane/chat_composer_history.rs` remains the
owner of stored history, combined offsets, async fetch state, boundary
semantics, and duplicate suppression. Match highlighting is computed
from the current composer text while search is active and disappears
when the match is accepted.

## Observability

There are no new logs or telemetry. The practical debug path is state
inspection: `ChatComposer.history_search` tells whether the footer query
is idle, searching, matched, or unmatched; `ChatComposerHistory.search`
tracks selected raw offsets, pending persistent fetches, exhausted
directions, and unique match cache state. If a user reports skipped or
repeated results, first inspect the exact stored prompt text, the
selected offset, whether an async persistent response is still pending,
and whether a query edit restarted the search session.

## Tests

The change is covered by focused `codex-tui` unit tests for opening
search without previewing the latest entry, accepting and canceling
search, no-match restoration, boundary clamping, footer hints,
case-insensitive highlighting, local duplicate skipping, and persistent
duplicate skipping through async responses. Snapshot coverage captures
the footer-mode visual changes. Local verification used `just fmt`,
`cargo test -p codex-tui history_search`, `cargo test -p codex-tui`, and
`just fix -p codex-tui`.
Addresses #17313

Problem: The visual context meter in the status line was confusing and
continued to draw negative feedback, and context reporting should remain
an explicit opt-in rather than part of the default footer.

Solution: Remove the visual meter, restore opt-in context remaining/used
percentage items that explicitly say "Context", keep existing
context-usage configs working as a hidden alias, and update the setup
text and snapshots.
Addresses #17498

Problem: The TUI derived /status instruction source paths from the local
client environment, which could show stale <none> output or incorrect
paths when connected to a remote app server.

Solution: Add an app-server v2 instructionSources snapshot to thread
start/resume/fork responses, default it to an empty list when older
servers omit it, and render TUI /status from that server-provided
session data.

Additional context: The app-server field is intentionally named
instructionSources rather than AGENTS.md-specific terminology because
the loaded instruction sources can include global instructions, project
AGENTS.md files, AGENTS.override.md, user-defined instruction files, and
future dynamic sources.
## Summary
Stop counting elicitation time towards mcp tool call time. There are
some tradeoffs here, but in general I don't think time spent waiting for
elicitations should count towards tool call time, or at least not
directly towards timeouts.

Elicitations are not exactly like exec_command escalation requests, but
I would argue it's ~roughly equivalent.

## Testing
- [x] Added unit tests
- [x] Tested locally
Include MCP wall time in the output so the model is aware of how long
it's calls are taking.
## Summary
- run exec-server filesystem RPCs requiring sandboxing through a
`codex-fs` arg0 helper over stdin/stdout
- keep direct local filesystem execution for `DangerFullAccess` and
external sandbox policies
- remove the standalone exec-server binary path in favor of top-level
arg0 dispatch/runtime paths
- add sandbox escape regression coverage for local and remote filesystem
paths

## Validation
- `just fmt`
- `git diff --check`
- remote devbox: `cd codex-rs && bazel test --bes_backend=
--bes_results_url= //codex-rs/exec-server:all` (6/6 passed)

---------

Co-authored-by: Codex <noreply@openai.com>
Problem: After #17294 switched exec-server tests to launch the top-level
`codex exec-server` command, parallel remote exec-process cases can
flake while waiting for the child server's listen URL or transport
shutdown.

Solution: Serialize remote exec-server-backed process tests and harden
the harness so spawned servers are killed on drop and shutdown waits for
the child process to exit.
To prevent the spammy: 
<img width="424" height="172" alt="Screenshot 2026-04-09 at 13 36 16"
src="https://github.com/user-attachments/assets/b5ece9e3-c561-422f-87ec-041e7bd6813d"
/>
## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env

## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.

## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`

## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
## Summary

When a `spawn_agent` call does a full-history fork, keep the parent's
effective agent type and model configuration instead of applying child
role/model overrides.

This is the minimal config-inheritance slice of #16055. Prompt-cache key
inheritance and MCP tool-surface stability are split into follow-up PRs.

## Design

- Reject `agent_type`, `model`, and `reasoning_effort` for v1
`fork_context` spawns.
- Reject `agent_type`, `model`, and `reasoning_effort` for v2
`fork_turns = "all"` spawns.
- Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
requested model/reasoning overrides and role config still apply there.
- Keep non-forked spawn behavior unchanged.

## Tests

- `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
- `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
--lib`
- `cargo +1.93.1 test -p codex-core
multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
--lib`
Addresses #16255

Problem: Incomplete Responses streams could leave completed custom tool
outputs out of cleanup and retry prompts, making persisted history
inconsistent and retries stale.

Solution: Route stream and output-item errors through shared cleanup,
and rebuild retry prompts from fresh session history after the first
attempt.
Addresses #17252

Problem: Plan-mode clarification questionnaires used the generic
user-input notification type, so configs listening for plan-mode-prompt
did not fire when request_user_input waited for an answer.

Solution: Map request_user_input prompts to the plan-mode-prompt
notification and remove the obsolete user-input TUI notification
variant.
Addresses #17453

Problem: /status rate-limit reset timestamps can be truncated in narrow
layouts, leaving users with partial times or dates.

Solution: Let narrow rate-limit rows drop the fixed progress bar to
preserve the percent summary, and wrap reset timestamps onto
continuation lines instead of truncating them.
Addresses #17514

Problem: PR #16966 made the TUI render the deprecated context-compaction
notification, while v2 could also receive legacy unified-exec
interaction items alongside terminal-interaction notifications, causing
duplicate "Context compacted" and "Waited for background terminal"
messages.

Solution: Suppress deprecated context-compaction notifications and
legacy unified-exec interaction command items from the app-server v2
projection, and render canonical context-compaction items through the
existing TUI info-event path.
Problem: PR #17601 updated context-compaction replay to call a new
ChatWidget handler, but the handler was never implemented, breaking
codex-tui compilation on main.

Solution: Render context-compaction replay through the existing
info-message path, preserving the intended `Context compacted` UI marker
without adding a one-off handler.
Addresses #17593

Problem: A regression introduced in
#16492 made thread/start fail when
Codex could not persist trusted project state, which crashes startup for
users with read-only config.toml.

Solution: Treat trusted project persistence as best effort and keep the
current thread's config trusted in memory when writing config.toml
fails.
## Summary

This updates the Windows elevated sandbox setup/refresh path to include
the legacy `compute_allow_paths(...).deny` protected children in the
same deny-write payload pipe added for split filesystem carveouts.

Concretely, elevated setup and elevated refresh now both build
deny-write payload paths from:

- explicit split-policy deny-write paths, preserving missing paths so
setup can materialize them before applying ACLs
- legacy `compute_allow_paths(...).deny`, which includes existing
`.git`, `.codex`, and `.agents` children under writable roots

This lets the elevated backend protect `.git` consistently with the
unelevated/restricted-token path, and removes the old janky hard-coded
`.codex` / `.agents` elevated setup helpers in favor of the shared
payload path.

## Root Cause

The landed split-carveout PR threaded a `deny_write_paths` pipe through
elevated setup/refresh, but the legacy workspace-write deny set from
`compute_allow_paths(...).deny` was not included in that payload. As a
result, elevated workspace-write did not apply the intended deny-write
ACLs for existing protected children like `<cwd>/.git`.

## Notes

The legacy protected children still only enter the deny set if they
already exist, because `compute_allow_paths` filters `.git`, `.codex`,
and `.agents` with `exists()`. Missing explicit split-policy deny paths
are preserved separately because setup intentionally materializes those
before applying ACLs.

## Validation

- `cargo fmt --check -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox`
- `cargo build -p codex-cli -p codex-windows-sandbox --bins`
- Elevated `codex exec` smoke with `windows.sandbox='elevated'`: fresh
git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`
- Unelevated `codex exec` smoke with `windows.sandbox='unelevated'`:
fresh git repo, attempted append to `.git/config`, observed `Access is
denied`, marker not written, Deny ACE present on `.git`
#17638)

- stop `list_tool_suggest_discoverable_plugins()` from reloading the
curated marketplace for each discoverable plugin
- reuse a direct plugin-detail loader against the already-resolved
marketplace entry


The trigger was to stop those logs spamming:
```
d=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.402Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.405Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
2026-04-13T12:27:30.406Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json
2026-04-13T12:27:30.408Z WARN  [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json
```
Currently app-server may unload actively running threads once the last
connection disconnects, which is not expected.
Instead track when was the last active turn & when there were any
subscribers the last time, also add 30 minute idleness/no subscribers
timer to reduce the churn.
zbarsky-openai and others added 30 commits April 17, 2026 18:45
## Why
This branch brings the Bazel module pins for `rules_rs` and `llvm` up to
the latest BCR releases and aligns the root direct dependencies with the
versions the module graph already resolves to.

That gives us a few concrete wins:
- picks up newer upstream fixes in the `rules_rs` / `rules_rust` stack,
including work around repo-rule nondeterminism and default Cargo binary
target generation
- picks up test sharding support from the newer `rules_rust` stack
([hermeticbuild/rules_rust#13](hermeticbuild/rules_rust#13))
- picks up newer built-in knowledge for common system crates like
`gio-sys`, `glib-sys`, `gobject-sys`, `libgit2-sys`, and `libssh2-sys`,
which gives us a future path to reduce custom build-script handling
- reduces local patch maintenance by dropping fixes that are now
upstream and rebasing the remaining Windows patch stack onto a newer
upstream base
- removes the direct-dependency warnings from `bazel-lock-check` by
making the root pins match the resolved graph

## What Changed
- bump `rules_rs` from `0.0.43` to `0.0.58`
- bump `llvm` from `0.6.8` to `0.7.1`
- bump `bazel_skylib` from `1.8.2` to `1.9.0` so the root direct dep
matches the resolved graph
- regenerate `MODULE.bazel.lock` for the updated module graph
- refresh the remaining Windows-specific patch stack against the newer
upstream sources:
  - `patches/rules_rs_windows_gnullvm_exec.patch`
  - `patches/rules_rs_windows_exec_linker.patch`
  - `patches/rules_rust_windows_exec_std.patch`
  - `patches/rules_rust_windows_msvc_direct_link_args.patch`
- remove patches that are no longer needed because the underlying fixes
are upstream now:
  - `patches/rules_rs_delete_git_worktree_pointer.patch`
  - `patches/rules_rust_repository_set_exec_constraints.patch`

## Validation
- `just bazel-lock-update`
- `just bazel-lock-check`

---------

Co-authored-by: Codex <noreply@openai.com>
## Why

The large Rust test suites are slow and include some of our flakiest
tests, so we want to run them with Bazel native sharding while keeping
shard membership stable between runs.

This is the simpler follow-up to the explicit-label experiment in
#17998. Since #18397 upgraded Codex to `rules_rs` `0.0.58`, which
includes the stable test-name hashing support from
hermeticbuild/rules_rust#14, this PR only needs to wire Codex's Bazel
macros into that support.

Using native sharding preserves BuildBuddy's sharded-test UI and Bazel's
per-shard test action caching. Using stable name hashing avoids
reshuffling every test when one test is added or removed.

## What Changed

`codex_rust_crate` now accepts `test_shard_counts` and applies the right
Bazel/rules_rust attributes to generated unit and integration test
rules. Matched tests are also marked `flaky = True`, giving them Bazel's
default three attempts.

This PR shards these labels 8 ways:

```text
//codex-rs/core:core-all-test
//codex-rs/core:core-unit-tests
//codex-rs/app-server:app-server-all-test
//codex-rs/app-server:app-server-unit-tests
//codex-rs/tui:tui-unit-tests
```

## Verification

`bazel query --output=build` over the selected public labels and their
inner unit-test binaries confirmed the expected `shard_count = 8`,
`flaky = True`, and `experimental_enable_sharding = True` attributes.

Also verified that we see the shards as expected in BuildBuddy so they
can be analyzed independently.

Co-authored-by: Codex <noreply@openai.com>
We don't have to downsize to 768 height.
## Summary
Update the plugin API for the new remote plugin model.

The mental model is no longer “keep local plugin state in sync with
remote.” Instead, local and remote plugins are becoming separate
sources. Remote catalog entries can be shown directly from the remote
API before installation; after installation they are still downloaded
into the local cache for execution, but remote installed state will come
from the API and be held in memory rather than being read from config.

• ## API changes
- Remove `forceRemoteSync` from `plugin/list`, `plugin/install`, and
`plugin/uninstall`.
  - Remove `remoteSyncError` from `plugin/list`.
  - Add remote-capable metadata to `plugin/list` / `plugin/read`:
    - nullable `marketplaces[].path`
    - `source: { type: "remote", downloadUrl }`
    - URL asset fields alongside local path fields:
  `composerIconUrl`, `logoUrl`, `screenshotUrls`
  - Make `plugin/read` and `plugin/install` source-compatible:
    - `marketplacePath?: AbsolutePathBuf | null`
    - `remoteMarketplaceName?: string | null`
    - exactly one source is required at runtime
This PR adds inline enable/disable controls to the new /plugins browse
menu. Installed plugins can now be toggled directly from the list with
keyboard interaction, and the associated config-write plumbing is
included so the UI and persisted plugin state stay in sync. This also
includes the queued-write handling needed to avoid stale toggle
completions overwriting newer intent.

- Add toggleable plugin rows for installed plugins in /plugins
- Support Space to enable or disable without leaving the list
- Persist plugin enablement through the existing app/config write path
- Preserve the current selection while the list refreshes after a toggle
- Add tests and snapshot updates for toggling behavior

---------

Co-authored-by: Codex <noreply@openai.com>
## Summary
- trust-gate project `.codex` layers consistently, including repos that
have `.codex/hooks.json` or `.codex/execpolicy/*.rules` but no
`.codex/config.toml`
- keep disabled project layers in the config stack so nested trusted
project layers still resolve correctly, while preventing hooks and exec
policies from loading until the project is trusted
- update app-server/TUI onboarding copy to make the trust boundary
explicit and add regressions for loader, hooks, exec-policy, and
onboarding coverage

## Security
Before this change, an untrusted repo could auto-load project hooks or
exec policies from `.codex/` as long as `config.toml` was absent. This
makes trust the single gate for project-local config, hooks, and exec
policies.

## Stack
- Parent of #15936

## Test
- cargo test -p codex-core without_config_toml

---------

Co-authored-by: Codex <noreply@openai.com>
- add a TUI startup migration prompt for external agent config
- support migrating external configs including config, skills, AGENTS.md
and plugins
- gate the prompt behind features.external_migrate (default false)

<img width="1037" height="480" alt="Screenshot 2026-04-14 at 9 29 14 PM"
src="https://github.com/user-attachments/assets/6060849b-03cb-429a-9c13-c7bb46ad2e65"
/>
<img width="713" height="183" alt="Screenshot 2026-04-14 at 9 29 26 PM"
src="https://github.com/user-attachments/assets/d13f177e-d4c4-479c-8736-ef29636081e1"
/>

---------

Co-authored-by: Eric Traut <etraut@openai.com>
supporting guardian's rebrand to auto-review!
Cap the model-visible skills section to a small share of the context
window, with a fallback character budget, and keep only as many implicit
skills as fit within that budget.

Emit a non-fatal warning when enabled skills are omitted, and add a new
app-server warning notification

Record thread-start skill metrics for total enabled skills, kept skills,
and whether truncation happened

---------

Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Populate `PluginDetail.description` in core for uninstalled cross-repo
plugins when detailed fields are unavailable until install.
- Include the source Git URL plus optional path/ref/sha details in that
fallback description.
- Keep `details_unavailable_reason` as the structured signal while
app-server forwards the description normally.
- Add plugin-read coverage proving the response does not clone the
remote source just to show the message.

## Why
Uninstalled cross-repo plugins intentionally return sparse detail data
so listing/reading does not clone the plugin source. Without a
description, Desktop and TUI detail pages look like an ordinary empty
plugin. This gives users a concrete explanation and source pointer while
keeping the existing structured reason available for callers.

## Validation
- `just fmt`
- `cargo test -p codex-core
read_plugin_for_config_uninstalled_git_source_requires_install_without_cloning`
- `cargo test -p codex-app-server plugin_read --test all`
- `just fix -p codex-core`
- `just fix -p codex-app-server`

Note: `cargo test -p codex-app-server` was also attempted before the
latest refactor and failed broadly in unrelated v2
thread/realtime/review/skills suites; the new plugin-read test passed in
that run as well.
## Summary

Second PR in the split from #17956. Stacked on #18227.

- adds app-server v2 protocol/schema support for
`account/sendAddCreditsNudgeEmail`
- adds the backend-client `send_add_credits_nudge_email` request and
request body mapping
- handles the app-server request with auth checks, backend call, and
cooldown mapping
- adds the disabled `workspace_owner_usage_nudge` feature flag and
focused app-server/backend tests

## Validation

- `cargo test -p codex-backend-client`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server rate_limits`
- `cargo test -p codex-tui workspace_`
- `cargo test -p codex-tui status_`
- `just fmt`
- `just fix -p codex-backend-client`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
## Summary

Move the marketplace remove implementation into shared core logic so
both the CLI command and follow-up app-server RPC can reuse the same
behavior.

This change:
- adds a shared `codex_core::plugins::remove_marketplace(...)` flow
- moves validation, config removal, and installed-root deletion out of
the CLI
- keeps the CLI as a thin wrapper over the shared implementation
- adds focused core coverage for the shared remove path

## Validation

- `just fmt`
- focused local coverage for the shared remove path
- heavier follow-up validation deferred to stacked PR CI
Adds max_context_window to model metadata and routes core context-window
reads through resolved model info. Config model_context_window overrides
are clamped to max_context_window when present; without an override, the
model context_window is used.
## Summary
- Reverts PR #17749 so queued inter-agent mail can again preempt after
reasoning/commentary output item boundaries.
- Applies the revert to the current `codex/turn.rs` module layout and
restores the prior pending-input test expectations/snapshots.

## Testing
- `just fmt`
- `cargo test -p codex-core --test all pending_input`
- `cargo test -p codex-core` failed in unrelated
`tools::js_repl::tests::js_repl_imported_local_files_can_access_repl_globals`:
dotslash download hit `mktemp: mkdtemp failed ... Operation not
permitted` in the sandbox temp dir.

Co-authored-by: Codex <noreply@openai.com>
Do not assume the default `detail`.
## Summary

Fixes #16637. (I hit this bug after 11h of work on a long-running task.)

Plugin cache initialization could panic when an already-absolute cache
path was normalized through `AbsolutePathBuf::from_absolute_path`,
because that path still consulted `current_dir()`.

This changes absolute-path normalization so already-absolute paths do
not depend on cwd, and makes plugin cache root construction available as
a fallible path through `PluginStore::try_new()`. Plugin cache subpaths
now use `AbsolutePathBuf::join()` instead of re-absolutizing derived
absolute paths.
## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.

## Stack
```text
o  #18027 [6/6] Fail exec client operations after disconnect
│
@  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>
# Summary

When a user finishes planning, the TUI asks whether to implement in the
current conversation or start fresh with the approved plan. The
clear-context choice is easier to evaluate when the prompt shows how
much context has already been used, because the user can see when
carrying the full prior conversation is likely to be less useful than
preserving only the plan.

<img width="1612" height="1312" alt="image"
src="https://github.com/user-attachments/assets/694bcf87-8be5-4e88-a412-e562af62d5f7"
/>
    
This PR adds that context signal directly to the clear-context option
while keeping the copy compact enough for the Plan-mode selection popup.

# What Changed

- Compute an optional context-usage label when opening the plan
implementation prompt.
- Show the label only on `Yes, clear context and implement`, where it
informs the cleanup decision.
- Prefer a percentage-used label when context-window information is
available, with a compact token-used fallback when only token totals are
known.
- Preserve the original option description when usage is unknown or
effectively zero.
- Add rustdoc comments around the prompt-copy boundary so future changes
keep the context label formatting and selection rendering
responsibilities clear.

# Testing

- `cargo test -p codex-tui plan_implementation`

# Notes

The footer continues to show context remaining as ambient status. The
implementation prompt intentionally shows context used because the user
is choosing whether to clean up the current thread before
implementation.
## Summary

`codex app` should be a platform-aware entry point for opening Codex
Desktop or helping users install it. Before this change, the command
only existed on macOS and its default installer URL always pointed at
the Apple Silicon DMG, which sent Intel Mac users to the wrong build.

This updates the macOS path to choose the Apple Silicon or Intel DMG
based on the detected processor, while keeping `--download-url` as an
advanced override. It also enables `codex app` on Windows, where the CLI
opens an installed Codex Desktop app when available and otherwise opens
the Windows installer URL.

---------

Co-authored-by: Felipe Coury <felipe.coury@openai.com>
## Why

Users have asked to queue follow-up slash commands while a task is
running, including in #14081, #14588, #14286, and #13779. The previous
TUI behavior validated slash commands immediately, so commands that are
only meaningful once the current turn is idle could not be queued
consistently.

The queue should preserve what the user typed and defer command parsing
until the item is actually dispatched. This also gives `/fast`, `/review
...`, `/rename ...`, `/model`, `/permissions`, and similar slash
workflows the same FIFO behavior as plain queued prompts.

## What Changed

- Added a queued-input action enum so queued items can be dispatched as
plain prompts, slash commands, or user shell commands.
- Changed `Tab` queueing to accept slash-led prompts without validating
them up front, then parse and dispatch them when dequeued.
- Added `!` shell-command queueing for `Tab` while a task is running,
while preserving existing `Enter` behavior for immediate shell
execution.
- Moved queued slash dispatch through shared slash-command parsing so
inline commands, unavailable commands, unknown commands, and local
config commands report at dequeue time.
- Continued queue draining after local-only actions and after slash menu
cancellation or selection when no task is running.
- Preserved slash-popup completion behavior so `/mo<Tab>` completes to
`/model ` instead of queueing the prefix.
- Updated pending-input preview snapshots to show queued follow-up
inputs.

## Verification

I did a bunch of manual validation (and found and fixed a few bugs along
the way).
- Log the actual realtime session id when the session.updated event
arrives.
- Remove the stale core models catalog.
- Update the release workflow to refresh the active models-manager
catalog.
The TUI supports long-running turns and agent threads, but quick side
questions have required interrupting the main flow or manually
forking/navigating threads. This PR adds a guarded `/side` flow so users
can ask brief side-conversation questions in an ephemeral fork while
keeping the primary thread focused. This also helps address the feature
request in #18125.

The implementation creates one side conversation at a time, lets `/side`
open either an empty side thread or immediately submit `/side
<question>`, and returns to the parent with Esc or Ctrl+C. Side
conversations get hidden developer guardrails that treat inherited
history as reference-only and steer the model away from workspace
mutations unless explicitly requested in the side conversation.

The TUI hides most slash commands while side mode is active, leaving
only `/copy`, `/diff`, `/mention`, and `/status` available there.
## Summary

Fixes #18554.

The `/experimental` menu can submit the full experimental feature state
even when the user presses Enter without toggling anything. Previously,
Codex showed `Memories will be enabled in the next session.` whenever
the submitted updates included `Feature::MemoryTool = true`, so sessions
where Memories were already enabled could show a redundant warning on a
no-op save.

This change records whether `Feature::MemoryTool` was enabled before
applying feature updates and only emits the next-session notice when
Memories actually transitions from disabled to enabled.
## Stack

1. This PR: expand and filter `USERPROFILE` roots.
2. Follow-up: #18493 filters SSH config dependency roots on top of this
base.

## Bug

On Windows, Codex can grant the sandbox ACL access to the whole user
profile directory.

That means the sandbox ACL can be applied under paths like:

```text
C:\Users\me\.ssh
C:\Users\me\.tsh
```

This breaks SSH. Windows OpenSSH checks permissions on SSH config and
key material. If Codex adds a sandbox group ACL to those files, OpenSSH
can reject the config or keys.

The bad interaction is:

1. Codex asks the Windows sandbox to grant access to `USERPROFILE`.
2. The sandbox applies ACLs under that root.
3. SSH-owned files get an extra ACL entry.
4. OpenSSH rejects those files because their permissions are no longer
strict enough.

## Why this happens more now

Codex now has more flows that naturally start in the user profile:

- a new chat can start in the user directory
- a project can be rooted in the user directory
- a user can start the Codex CLI from the user directory

Those are valid user actions. The bug is that `USERPROFILE` is too broad
a sandbox root.

## Change

This PR keeps the useful behavior of starting from the user profile
without granting the profile root itself.

The new flow is:

1. collect the normal read and write roots
2. if a root is exactly `USERPROFILE`, replace it with the direct
children of `USERPROFILE`
3. remove `USERPROFILE` itself from the final root list
4. apply the existing user-profile read exclusions to both read and
write roots
5. add `.tsh` and `.brev` to that exclusion list

So this input:

```text
C:\Users\me
```

becomes roots like:

```text
C:\Users\me\Desktop
C:\Users\me\Documents
C:\Users\me\Downloads
```

and does not include:

```text
C:\Users\me
C:\Users\me\.ssh
C:\Users\me\.tsh
C:\Users\me\.brev
```

If `USERPROFILE` cannot be listed, expansion falls back to the profile
root and the later filter removes it. That keeps the failure mode closed
for this bug.

## Why this shape

The sandbox still gets access to ordinary profile folders when the user
starts from home.

The sandbox no longer grants access to the profile root itself.

All filtering happens after expansion, for both read and write roots.
That gives us one simple rule: expand broad profile grants first, then
remove roots the sandbox must not own.

## Tests

- `just fmt`
- `cargo test -p codex-windows-sandbox`
- `just fix -p codex-windows-sandbox`
- `git diff --check`
## Stack

1. Base PR: #18443 stops granting ACLs on `USERPROFILE`.
2. This PR: filters additional SSH-owned profile roots discovered from
SSH config.

## Bug

The base PR removes the broadest bad grant: `USERPROFILE` itself.

That still leaves one important case. A user profile child can be
SSH-owned even when its name is not one of our fixed exclusions.

For example:

```sshconfig
Host devbox
  IdentityFile ~/.keys/devbox
  CertificateFile ~/.certs/devbox-cert.pub
  UserKnownHostsFile ~/.known_hosts_custom
  Include ~/.ssh/conf.d/*.conf
```

After profile expansion, the sandbox might see these as normal profile
children:

```text
C:\Users\me\.keys
C:\Users\me\.certs
C:\Users\me\.known_hosts_custom
C:\Users\me\.ssh
```

Those paths have another owner: OpenSSH and the tools that manage SSH
identity and host-key state. Codex should not add sandbox ACLs to them.

OpenSSH describes this dependency tree in
[`ssh_config(5)`](https://man.openbsd.org/ssh_config.5), and the client
parser follows the same shape in `readconf.c`:

- `Include` recursively reads more config files and expands globs
- `IdentityFile` and `CertificateFile` name authentication files
- `UserKnownHostsFile`, `GlobalKnownHostsFile`, and `RevokedHostKeys`
name host-key files
- `ControlPath` and `IdentityAgent` can name profile-owned sockets or
control files
- these path directives can use forms such as `~`, `%d`, and `${HOME}`

## Change

This PR adds a small SSH config dependency scanner.

It starts at:

```text
~/.ssh/config
```

Then it returns concrete paths named by `Include` and by path-valued SSH
config directives:

```text
IdentityFile
CertificateFile
UserKnownHostsFile
GlobalKnownHostsFile
RevokedHostKeys
ControlPath
IdentityAgent
```

For example:

```sshconfig
IdentityFile ~/.keys/devbox
CertificateFile ~/.certs/devbox-cert.pub
Include ~/.ssh/conf.d/*.conf
```

returns paths like:

```text
C:\Users\me\.keys\devbox
C:\Users\me\.certs\devbox-cert.pub
C:\Users\me\.ssh\conf.d\devbox.conf
```

The setup code then maps those paths back to their top-level
`USERPROFILE` child and filters matching sandbox roots out of both the
writable and readable root lists.

## Why this shape

The parser reports what SSH config references. The sandbox setup code
decides which `USERPROFILE` roots are unsafe to grant.

That keeps the policy simple:

1. expand broad profile grants
2. remove the profile root
3. remove fixed sensitive profile folders
4. remove profile folders referenced by SSH config dependencies

If a path has two possible owners, the sandbox steps back. SSH keeps
control of SSH config, keys, certificates, known-hosts files, sockets,
and included config files.

## Tests

- `cargo test -p codex-windows-sandbox --lib`
- `just bazel-lock-check`
- `just fix -p codex-windows-sandbox`
- `git diff --check`
## Summary
- persist registered agent tasks in the session state update stream so
the thread can reuse them
- prewarm task registration once identity registration succeeds, while
keeping startup failures best-effort
- isolate the session-side task lifecycle into a dedicated module so
AgentIdentityManager and RegisteredAgentTask do not leak across as many
core layers

## Testing
- cargo test -p codex-core startup_agent_task_prewarm
- cargo test -p codex-core
cached_agent_task_for_current_identity_clears_stale_task
- cargo test -p codex-core record_initial_history_
Fast mode TUI copy currently names a specific plan-usage multiplier in
two lightweight promo/help surfaces. This swaps that exact multiplier
language for the broader increased plan usage wording we use elsewhere.

There are no behavior changes here; the slash command and startup tip
still point users at the same Fast mode flow.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

⤵️ pull merge-conflict Sync PR has merge conflicts

Projects

None yet

Development

Successfully merging this pull request may close these issues.