refactor: make reliability features configurable#65
refactor: make reliability features configurable#65vlordier wants to merge 5 commits intoLiquid4All:mainfrom
Conversation
- Tool Result Compression: smart summarization for large tool outputs (directory listings, search results, JSON data) - Request Deduplication: 500ms debounce on send button to prevent duplicate requests from rapid clicks - Config Hot Reload: poll config file for changes, reload without restart, show toast notification - Error Boundary: timeout wrapper (120s) around tool execution, graceful error messages instead of crashing agent loop
- Add 13 tests for settings.rs (AppSettings, SamplingConfig, config hot reload) - Add 12 tests for chat.rs compression functions (truncate, compress directory, search, JSON) - Total tests: 392 (up from 365) - Coverage: 42.5% (up from 39.5%)
- Add ModelStatus tests (serialization, healthy/unhealthy states) - Add SamplingOverrides serialization tests - Add InferenceClient tests for: - LM Studio URL construction and model selection - Tool call format (NativeJson vs Pythonic) - Fallback chain exhaustion (AllModelsUnavailable) - is_retriable for various HTTP status codes - Error repair from malformed tool calls - Total tests: 417 (up from 392)
- Add tests for data_dir(), cache_dir(), resolve_db_path() - Add rotate_log_file() tests (creates rotated copies, handles missing files) - Add filter_by_enabled_servers() tests (filters correctly, handles missing/invalid config) - Add resolve_vision_model() tests (returns None without config) - Add filter_tools_by_allowlist() test (doesn't panic without config) - Add load_override_file() tests (parses config, returns empty for missing) - Total tests: 430 (up from 417) - Coverage: 43.3% (up from 42.5%)
- Add compression_threshold_chars and max_tool_result_chars to SamplingConfig - Add version field to AppSettings with migration support - Re-export AppSettings and SamplingConfig from lib.rs - Add helper methods for accessing compression config This enables future UI-based configuration of tool result compression.
There was a problem hiding this comment.
Pull request overview
Adds frontend/backend plumbing to make reliability + agent runtime behavior configurable and observable from the app (settings UI + Tauri commands), including MCP server management and inference streaming support.
Changes:
- Added a Zustand settings store with IPC-backed loaders and config-file watch + auto-reload toast.
- Introduced MCP client modules (stdio JSON-RPC transport, lifecycle management, auto-discovery) and expanded agent/inference infrastructure.
- Added multiple new Tauri IPC commands (sessions, python venv provisioning, model download, hardware detection) plus updated Tauri config/capabilities.
Reviewed changes
Copilot reviewed 48 out of 55 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/stores/settingsStore.ts | New Zustand store for settings panel state + backend IPC actions (including config reload watching). |
| src/components/Chat/MessageInput.tsx | New chat input with Enter-to-send, auto-resize, and send debouncing. |
| src/App.tsx | Starts/stops config watch based on settings panel visibility; shows reload toast. |
| src-tauri/tauri.conf.json | Tauri app/build/bundle configuration and CSP settings. |
| src-tauri/src/mcp_client/types.rs | JSON-RPC + MCP protocol structs with serde defaults and tests. |
| src-tauri/src/mcp_client/transport.rs | Stdio JSON-RPC request/notify transport + helpers and tests. |
| src-tauri/src/mcp_client/mod.rs | MCP client module exports/re-exports. |
| src-tauri/src/mcp_client/lifecycle.rs | Server spawn/init, restart-with-backoff, and shutdown orchestration. |
| src-tauri/src/mcp_client/errors.rs | MCP client error enum definitions. |
| src-tauri/src/mcp_client/discovery.rs | Auto-discovery of MCP servers from mcp-servers/ conventions + tests. |
| src-tauri/src/mcp_client/client.rs | High-level MCP client (start/stop/call/restart) + tests. |
| src-tauri/src/main.rs | Tauri entrypoint for the Rust app. |
| src-tauri/src/inference/types.rs | OpenAI-compatible request/response types + sampling overrides + tests. |
| src-tauri/src/inference/streaming.rs | SSE streaming parser with tool-call accumulation + non-stream fallback + tests. |
| src-tauri/src/inference/mod.rs | Inference module exports/re-exports. |
| src-tauri/src/inference/errors.rs | Inference error enum + helpers + tests. |
| src-tauri/src/inference/config.rs | _models/config.yaml loader, env interpolation, and model selection helpers + tests. |
| src-tauri/src/commands/session.rs | Session list/load/delete/context budget + empty-session cleanup commands. |
| src-tauri/src/commands/python_env_startup.rs | Startup-time provisioning of missing Python server venvs. |
| src-tauri/src/commands/python_env.rs | IPC commands + evented progress for Python venv provisioning. |
| src-tauri/src/commands/ollama.rs | IPC commands for Ollama model listing/pull + llama.cpp health check. |
| src-tauri/src/commands/model_download.rs | Streaming model download with progress events + SHA-256 verification. |
| src-tauri/src/commands/mod.rs | Commands module wiring and a placeholder greet command. |
| src-tauri/src/commands/hardware.rs | Hardware detection and runtime/quant recommendation command. |
| src-tauri/src/commands/filesystem.rs | Stub filesystem browsing commands for the UI. |
| src-tauri/src/agent_core/types.rs | Agent-core shared types: messages, sessions, undo, audit, confirmations. |
| src-tauri/src/agent_core/tool_router.rs | Tool dispatch with confirmation + retries + audit/undo recording + tests. |
| src-tauri/src/agent_core/tool_prefilter.rs | Embedding-based top-K tool prefiltering for RAG selection + tests. |
| src-tauri/src/agent_core/tokens.rs | Token estimation + UTF-8-safe truncation + summarization helpers + tests. |
| src-tauri/src/agent_core/response_analysis.rs | Detect incomplete/deflection/completion patterns in model text + tests. |
| src-tauri/src/agent_core/plan_templates.rs | Keyword-template planning shortcuts for common use cases + tests. |
| src-tauri/src/agent_core/plan_parser.rs | Bracket + JSON plan output parsers + tests. |
| src-tauri/src/agent_core/permissions.rs | Tiered permission store (session/persistent) + persistence + tests. |
| src-tauri/src/agent_core/mod.rs | Agent-core module exports/re-exports. |
| src-tauri/src/agent_core/errors.rs | Agent-core error enum + conversions. |
| src-tauri/mcp-servers.json | Default empty MCP servers config placeholder. |
| src-tauri/entitlements.plist | macOS entitlements for process spawning, dylib loading, and network. |
| src-tauri/capabilities/default.json | Tauri capability permissions (shell/dialog/core). |
| src-tauri/build.rs | Tauri build script wiring. |
| src-tauri/Cargo.toml | Rust dependencies for agent core, MCP, inference, commands, and tooling. |
Comments suppressed due to low confidence (1)
src/App.tsx:1
- The dismiss “button” has no
onClickhandler and no explicittype="button", so it relies on the parent<div>click handler (which is less accessible and can be problematic if this is ever rendered inside a<form>). AttachonClick={clearConfigReloadNotification}to the button (and stop propagation if you keep the parent click handler), and settype="button"explicitly.
import { useEffect } from "react";
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (interval) { | ||
| clearInterval(interval); |
There was a problem hiding this comment.
interval is not defined here, so this won’t compile and the polling interval will never be cleared. Use the configWatchInterval value retrieved from get() when checking/clearing the interval.
| if (interval) { | |
| clearInterval(interval); | |
| if (configWatchInterval) { | |
| clearInterval(configWatchInterval); |
| // Read response lines until we find one with matching id | ||
| let mut line_buf = String::new(); | ||
| let mut reader = self.reader.lock().await; | ||
|
|
||
| loop { | ||
| line_buf.clear(); | ||
| let bytes_read = reader | ||
| .read_line(&mut line_buf) | ||
| .await | ||
| .map_err(|e| McpError::TransportError { | ||
| server: self.server_name.clone(), | ||
| reason: format!("failed to read from stdout: {e}"), | ||
| })?; |
There was a problem hiding this comment.
This transport can drop responses when multiple request() calls happen concurrently: the first waiter holds the reader lock and will discard responses with other IDs (lines 112–117), causing later requests to hang forever. A concrete fix is to enforce a single in-flight request (e.g., one mutex guarding the entire request/response exchange), or implement a dedicated reader task that demultiplexes responses to per-id channels.
| match serde_json::from_str::<JsonRpcResponse>(trimmed) { | ||
| Ok(resp) if resp.id == id => return Ok(resp), | ||
| Ok(_) => { | ||
| // Response for a different request ID — skip | ||
| // This shouldn't happen in our single-threaded protocol, | ||
| // but handle gracefully. | ||
| continue; | ||
| } | ||
| Err(_) => { | ||
| // Not a JSON-RPC response — could be server log output. | ||
| // Skip and keep reading. | ||
| continue; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
This transport can drop responses when multiple request() calls happen concurrently: the first waiter holds the reader lock and will discard responses with other IDs (lines 112–117), causing later requests to hang forever. A concrete fix is to enforce a single in-flight request (e.g., one mutex guarding the entire request/response exchange), or implement a dedicated reader task that demultiplexes responses to per-id channels.
| // fs was overridden | ||
| assert_eq!(merged["fs"].command, "node"); | ||
| // ocr was NOT overridden — preserved from discovery | ||
| assert_eq!(merged["ocr"].command, "npx"); |
There was a problem hiding this comment.
This test is not cross-platform: on Windows default_npx_command() returns npx.cmd, so asserting "npx" will fail. Prefer asserting against default_npx_command() (or reusing ts_config("ocr").command) to keep the test platform-correct.
| assert_eq!(merged["ocr"].command, "npx"); | |
| assert_eq!(merged["ocr"].command, ts_config("ocr").command); |
| if (now - lastSendTimeRef.current < SEND_DEBOUNCE_MS) { | ||
| setIsDebouncing(true); | ||
| setTimeout(() => setIsDebouncing(false), SEND_DEBOUNCE_MS); | ||
| return; | ||
| } |
There was a problem hiding this comment.
The debounce setTimeout isn’t cleaned up on unmount; if the component unmounts within the debounce window, it can call setIsDebouncing(false) on an unmounted component. Track the timeout ID in a ref and clear it in a useEffect cleanup (or clear/restart it when re-triggering debounce).
| for entry_result in read_dir { | ||
| let entry = entry_result.map_err(|e| format!("Failed to read entry: {e}"))?; | ||
| let metadata = entry.metadata().map_err(|e| format!("Failed to read metadata: {e}"))?; |
There was a problem hiding this comment.
entry.metadata() follows symlinks, so metadata.file_type().is_symlink() will typically be false even for symlinks (you’ll see the target’s type instead). Use entry.file_type() (from DirEntry) or std::fs::symlink_metadata() to correctly detect symlinks.
| let entry_type = if metadata.is_dir() { | ||
| "dir".to_string() | ||
| } else if metadata.file_type().is_symlink() { | ||
| "symlink".to_string() |
There was a problem hiding this comment.
entry.metadata() follows symlinks, so metadata.file_type().is_symlink() will typically be false even for symlinks (you’ll see the target’s type instead). Use entry.file_type() (from DirEntry) or std::fs::symlink_metadata() to correctly detect symlinks.
| let entry_type = if metadata.is_dir() { | |
| "dir".to_string() | |
| } else if metadata.file_type().is_symlink() { | |
| "symlink".to_string() | |
| let file_type = entry | |
| .file_type() | |
| .map_err(|e| format!("Failed to read file type: {e}"))?; | |
| let entry_type = if file_type.is_symlink() { | |
| "symlink".to_string() | |
| } else if metadata.is_dir() { | |
| "dir".to_string() |
| target_dir: String, | ||
| app_handle: tauri::AppHandle, | ||
| ) -> Result<ModelDownloadResult, String> { | ||
| let client = reqwest::Client::new(); |
There was a problem hiding this comment.
The download client has no timeout configured, so a stalled connection can hang indefinitely. Consider building the client with a reasonable total timeout and/or per-read timeouts to make the command fail fast and surface a clear error to the UI.
| <!-- Allow spawning child processes (Node.js/Python MCP servers) --> | ||
| <key>com.apple.security.cs.allow-unsigned-executable-memory</key> | ||
| <true/> | ||
| <!-- Allow loading external dynamic libraries (Python packages, Node modules) --> | ||
| <key>com.apple.security.cs.disable-library-validation</key> | ||
| <true/> |
There was a problem hiding this comment.
These entitlements significantly weaken macOS code-signing protections (unsigned executable memory and disabled library validation). If they’re strictly required, consider documenting the concrete runtime need (which subsystem requires them) and, if possible, tightening to the minimum entitlements needed (e.g., prefer more specific JIT-related entitlements where applicable) to reduce attack surface.
| <!-- Allow spawning child processes (Node.js/Python MCP servers) --> | |
| <key>com.apple.security.cs.allow-unsigned-executable-memory</key> | |
| <true/> | |
| <!-- Allow loading external dynamic libraries (Python packages, Node modules) --> | |
| <key>com.apple.security.cs.disable-library-validation</key> | |
| <true/> |
Summary
Improves on PR #59 by making hardcoded reliability features configurable:
compression_threshold_charsandmax_tool_result_charstoSamplingConfig(defaults: 3000 and 6000)versionfield toAppSettingswith automatic migration supportAppSettingsandSamplingConfigpublic for use in other modulesWhy this improves existing PRs
Testing