Skip to content

test: add proptest and quickcheck for property-based testing#63

Open
vlordier wants to merge 7 commits intoLiquid4All:mainfrom
vlordier:test/better-coverage
Open

test: add proptest and quickcheck for property-based testing#63
vlordier wants to merge 7 commits intoLiquid4All:mainfrom
vlordier:test/better-coverage

Conversation

@vlordier
Copy link
Copy Markdown

@vlordier vlordier commented Mar 6, 2026

Summary

  • Add proptest and quickcheck to dev-dependencies
  • Add proptest-based property tests for token estimation functions
  • Add quickcheck fuzz tests to ensure functions never panic

Changes

  • Add property-based tests in agent_core/tokens.rs
  • Test count: 434 (up from 426)

Testing

  • All 434 tests pass (including 7 new proptest/quickcheck tests)
  • cargo clippy -- -D warnings passes

vlordier added 6 commits March 6, 2026 21:46
- Tool Result Compression: smart summarization for large tool outputs
  (directory listings, search results, JSON data)
- Request Deduplication: 500ms debounce on send button to prevent
  duplicate requests from rapid clicks
- Config Hot Reload: poll config file for changes, reload without
  restart, show toast notification
- Error Boundary: timeout wrapper (120s) around tool execution,
  graceful error messages instead of crashing agent loop
- Add 13 tests for settings.rs (AppSettings, SamplingConfig, config hot reload)
- Add 12 tests for chat.rs compression functions (truncate, compress directory, search, JSON)
- Total tests: 392 (up from 365)
- Coverage: 42.5% (up from 39.5%)
- Add ModelStatus tests (serialization, healthy/unhealthy states)
- Add SamplingOverrides serialization tests
- Add InferenceClient tests for:
  - LM Studio URL construction and model selection
  - Tool call format (NativeJson vs Pythonic)
  - Fallback chain exhaustion (AllModelsUnavailable)
  - is_retriable for various HTTP status codes
  - Error repair from malformed tool calls
- Total tests: 417 (up from 392)
- Add tests for data_dir(), cache_dir(), resolve_db_path()
- Add rotate_log_file() tests (creates rotated copies, handles missing files)
- Add filter_by_enabled_servers() tests (filters correctly, handles missing/invalid config)
- Add resolve_vision_model() tests (returns None without config)
- Add filter_tools_by_allowlist() test (doesn't panic without config)
- Add load_override_file() tests (parses config, returns empty for missing)
- Total tests: 430 (up from 417)
- Coverage: 43.3% (up from 42.5%)
- Add src-tauri/rustfmt.toml with code formatting configuration
- Add Tauri-specific entries to .gitignore (target/, gen/)
- Add #[allow(dead_code)] for unused cache_dir() function
- Add proptest and quickcheck to dev-dependencies
- Add proptest-based property tests for token estimation functions
- Add quickcheck fuzz tests to ensure functions never panic
- Fix incorrect property test (tokens aren't additive)

Test count: 434 (up from 426)
Copilot AI review requested due to automatic review settings March 6, 2026 22:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces property-based testing for token estimation and significantly expands the Tauri app’s backend/frontend capabilities around settings, inference, MCP server management, and provisioning.

Changes:

  • Add proptest/quickcheck dev-dependencies and property/fuzz tests for token estimation utilities.
  • Introduce an MCP JSON-RPC client (stdio transport, lifecycle, discovery, registry integration) and inference types/stream parsing.
  • Add frontend settings store + chat input UI behaviors, plus multiple new Tauri IPC commands for onboarding/runtime features.

Reviewed changes

Copilot reviewed 49 out of 57 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/stores/settingsStore.ts Adds a Zustand settings store with IPC-backed loading and config file watch polling.
src/components/Chat/MessageInput.tsx Adds debounced message sending UX and input auto-resize behavior.
src/App.tsx Wires settings-driven config watch lifecycle and a toast for reload notifications.
src-tauri/tauri.conf.json Adds/updates Tauri app configuration (windowing, CSP, bundling).
src-tauri/src/mcp_client/types.rs Introduces shared JSON-RPC + MCP protocol types with unit tests.
src-tauri/src/mcp_client/transport.rs Implements JSON-RPC-over-stdio transport and result extraction helpers.
src-tauri/src/mcp_client/mod.rs Defines MCP client module surface and re-exports.
src-tauri/src/mcp_client/lifecycle.rs Adds server process spawn/init, restart backoff, and shutdown management.
src-tauri/src/mcp_client/errors.rs Defines MCP client error taxonomy via thiserror.
src-tauri/src/mcp_client/discovery.rs Adds filesystem-based MCP server auto-discovery + merge logic and tests.
src-tauri/src/mcp_client/client.rs Implements high-level MCP client API for server lifecycle and tool execution.
src-tauri/src/main.rs Adds the Tauri entry point delegating to localcowork::run().
src-tauri/src/inference/types.rs Adds OpenAI-compatible request/response types and associated tests.
src-tauri/src/inference/streaming.rs Adds SSE streaming parser and non-streaming parsing fallback with tests.
src-tauri/src/inference/mod.rs Defines inference module surface and re-exports.
src-tauri/src/inference/errors.rs Adds structured inference errors and convenience helpers with tests.
src-tauri/src/inference/config.rs Adds model config loading, env-var interpolation, and active model resolution.
src-tauri/src/commands/session.rs Adds session listing/loading/deletion/context budget IPC commands.
src-tauri/src/commands/python_env_startup.rs Adds startup-time Python venv provisioning for MCP servers.
src-tauri/src/commands/python_env.rs Adds IPC-driven Python venv provisioning with progress events and tests.
src-tauri/src/commands/ollama.rs Adds IPC for Ollama/llama.cpp status, model listing, and streaming pull progress.
src-tauri/src/commands/model_download.rs Adds streaming model download + SHA-256 verification IPC with progress events.
src-tauri/src/commands/mod.rs Registers IPC command modules and includes a greet placeholder command.
src-tauri/src/commands/hardware.rs Adds hardware detection IPC with runtime/quantization recommendations.
src-tauri/src/commands/filesystem.rs Adds stub filesystem browsing IPC for the file browser UI.
src-tauri/src/agent_core/types.rs Adds agent core shared types (messages, sessions, audit, confirmations, budgets).
src-tauri/src/agent_core/tool_router.rs Adds tool execution routing with confirmation/permissions, retries, audit + undo.
src-tauri/src/agent_core/tool_prefilter.rs Adds embedding-based RAG tool pre-filtering utilities and tests.
src-tauri/src/agent_core/tokens.rs Adds token estimation utilities plus proptest/quickcheck-based tests.
src-tauri/src/agent_core/response_analysis.rs Adds heuristics to detect incomplete/deflection responses with tests.
src-tauri/src/agent_core/plan_templates.rs Adds template-based planning for known use cases with tests.
src-tauri/src/agent_core/plan_parser.rs Adds bracket + JSON plan parsing utilities with tests.
src-tauri/src/agent_core/permissions.rs Adds tiered permission store with persistence and tests.
src-tauri/src/agent_core/mod.rs Defines agent core module surface and re-exports.
src-tauri/src/agent_core/errors.rs Adds agent-level error types and conversions.
src-tauri/rustfmt.toml Introduces Rust formatting configuration for consistent style.
src-tauri/mcp-servers.json Adds an (empty) MCP servers config file placeholder.
src-tauri/entitlements.plist Adds macOS entitlements needed for processes/network/file dialogs.
src-tauri/capabilities/default.json Adds Tauri capabilities (shell/dialog permissions).
src-tauri/build.rs Adds Tauri build script hook.
src-tauri/Cargo.toml Adds new dependencies/dev-dependencies (incl. proptest/quickcheck) and config.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +142 to +145
if (interval) {
clearInterval(interval);
}
set({ configWatchInterval: null });
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interval is not defined in this scope, so this will fail at compile time and config watching will never stop. Use the retrieved configWatchInterval (and clear it only when non-null) when calling clearInterval(...).

Suggested change
if (interval) {
clearInterval(interval);
}
set({ configWatchInterval: null });
if (configWatchInterval !== null) {
clearInterval(configWatchInterval);
set({ configWatchInterval: null });
}

Copilot uses AI. Check for mistakes.
// fs was overridden
assert_eq!(merged["fs"].command, "node");
// ocr was NOT overridden — preserved from discovery
assert_eq!(merged["ocr"].command, "npx");
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is platform-dependent: on Windows default_npx_command() returns "npx.cmd", so the discovered command will not be "npx". Update the test to assert against default_npx_command() (or otherwise branch on platform) to avoid failing CI on Windows.

Suggested change
assert_eq!(merged["ocr"].command, "npx");
assert_eq!(merged["ocr"].command, default_npx_command());

Copilot uses AI. Check for mistakes.
Comment on lines +50 to +56
pub async fn request(
&self,
method: &str,
params: Option<serde_json::Value>,
) -> Result<JsonRpcResponse, McpError> {
let id = next_request_id();
let req = JsonRpcRequest::new(id, method, params);
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request(&self, ...) allows multiple concurrent callers. If two requests overlap, the first reader loop can consume and discard a response for a different id, causing the other request to hang indefinitely waiting for a response that was already dropped. To make this safe, either serialize the entire request/response pipeline behind a single lock (write+read as one critical section), or implement response demultiplexing (store per-request waiters keyed by id and route responses accordingly) instead of discarding mismatched responses.

Copilot uses AI. Check for mistakes.
let mut line_buf = String::new();
let mut reader = self.reader.lock().await;

loop {
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request(&self, ...) allows multiple concurrent callers. If two requests overlap, the first reader loop can consume and discard a response for a different id, causing the other request to hang indefinitely waiting for a response that was already dropped. To make this safe, either serialize the entire request/response pipeline behind a single lock (write+read as one critical section), or implement response demultiplexing (store per-request waiters keyed by id and route responses accordingly) instead of discarding mismatched responses.

Copilot uses AI. Check for mistakes.
Comment on lines +110 to +117
match serde_json::from_str::<JsonRpcResponse>(trimmed) {
Ok(resp) if resp.id == id => return Ok(resp),
Ok(_) => {
// Response for a different request ID — skip
// This shouldn't happen in our single-threaded protocol,
// but handle gracefully.
continue;
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request(&self, ...) allows multiple concurrent callers. If two requests overlap, the first reader loop can consume and discard a response for a different id, causing the other request to hang indefinitely waiting for a response that was already dropped. To make this safe, either serialize the entire request/response pipeline behind a single lock (write+read as one critical section), or implement response demultiplexing (store per-request waiters keyed by id and route responses accordingly) instead of discarding mismatched responses.

Copilot uses AI. Check for mistakes.
Comment on lines +224 to +225
// 1D: already pooled — [f32, f32, ...]
serde_json::Value::Array(arr) if arr[0].is_f64() => {
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serde_json::Value::is_f64() will be false for integer JSON numbers even though as_f64() can still succeed, so a 1D embedding like [1, 2, 3] would incorrectly fall through and error. Prefer checking arr[0].is_number() (or just attempt as_f64() on the first element) to robustly accept both integer and float number arrays.

Suggested change
// 1D: already pooled — [f32, f32, ...]
serde_json::Value::Array(arr) if arr[0].is_f64() => {
// 1D: already pooled — [number, number, ...]
serde_json::Value::Array(arr) if arr[0].is_number() => {

Copilot uses AI. Check for mistakes.
Comment thread src/App.tsx
Comment on lines +45 to +56
{configReloadNotification && (
<div
className="config-reload-toast"
onClick={clearConfigReloadNotification}
>
<span className="toast-icon">🔄</span>
<span className="toast-message">{configReloadNotification}</span>
<button className="toast-close" aria-label="Dismiss">
×
</button>
</div>
)}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The toast container is clickable but not keyboard-accessible (no role/tabIndex/keyboard handlers). Also, the “Dismiss” button has no onClick handler (it only works via event bubbling to the parent), which is fragile and less explicit for assistive tech. Consider making dismissal a real button action (attach onClick={clearConfigReloadNotification} to the button, add type="button", and optionally stop propagation) and make the container non-interactive or add proper keyboard support if the whole toast is meant to be clickable.

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +41
const now = Date.now();
if (now - lastSendTimeRef.current < SEND_DEBOUNCE_MS) {
setIsDebouncing(true);
setTimeout(() => setIsDebouncing(false), SEND_DEBOUNCE_MS);
return;
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debounce “cooldown” timer isn’t tracked/cleared, so rapid repeated triggers can schedule multiple timeouts, and unmounting the component could leave a pending timer that later tries to set state. Store the timeout id in a ref and clear it before scheduling a new one (and in a cleanup effect) to keep state transitions predictable.

Copilot uses AI. Check for mistakes.
refactor: improve test module organization with doc comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants