Fix Gemini tool schema required fields #570

Merged
buger merged 1 commit into main from fix-gemini-tool-schema-required on May 29, 2026
buger (Collaborator) commented May 29, 2026

Summary

  • sanitize plain JSON tool input schemas before wrapping them for the AI SDK
  • remove required entries that are not present in the same schema object's properties map, including nested object and array item schemas
  • preserve Zod schemas and avoid mutating MCP-provided schema objects

Tests

  • npm test -- tests/unit/mcp-message-history.test.js --runInBand
  • git diff --check

Closes #569

probelabs Bot (Contributor) commented May 29, 2026

PR Overview

This PR fixes a critical bug in which the Gemini API rejects tool schemas whose required arrays name fields that are absent from the corresponding properties map. The issue appears when MCP tools or custom JSON schemas carry required entries that reference non-existent properties.

Problem Statement

Gemini's API strictly validates JSON Schema required arrays against available properties. When a schema specifies required: ['field1', 'missingField'] but only defines field1 in properties, Gemini rejects the entire tool registration. This commonly occurs with:

  1. MCP tools that may have stale or incomplete schema definitions
  2. Nested object schemas where child objects have mismatched required/properties
  3. Array item schemas where items define required fields not present in their properties
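
To make the three cases above concrete, here is a minimal sketch that lists required entries with no matching key in the same object's properties map. The helper name findDanglingRequired and the sample schema are hypothetical and are not taken from the PR; the sketch only walks properties and single-object items, which is enough to show the root, nested-object, and array-item cases.

```javascript
// Hypothetical helper (not from the PR): reports every `required` entry that
// has no matching key in the same schema object's `properties` map.
function findDanglingRequired(schema, path = '$') {
  const problems = [];
  if (schema === null || typeof schema !== 'object' || Array.isArray(schema)) {
    return problems;
  }

  const props =
    schema.properties && typeof schema.properties === 'object' ? schema.properties : {};

  if (Array.isArray(schema.required)) {
    for (const name of schema.required) {
      if (!Object.prototype.hasOwnProperty.call(props, name)) {
        problems.push(`${path}.required: "${name}"`);
      }
    }
  }

  // Nested object schemas.
  for (const [key, child] of Object.entries(props)) {
    problems.push(...findDanglingRequired(child, `${path}.properties.${key}`));
  }

  // Array item schemas (single-schema form only in this sketch).
  if (schema.items && !Array.isArray(schema.items)) {
    problems.push(...findDanglingRequired(schema.items, `${path}.items`));
  }
  return problems;
}

// Illustrative schema containing one mismatch of each kind.
const schema = {
  type: 'object',
  properties: {
    field1: { type: 'string' },
    options: {
      type: 'object',
      properties: { limit: { type: 'number' } },
      required: ['limit', 'missingNested'],
    },
    filters: {
      type: 'array',
      items: {
        type: 'object',
        properties: { field: { type: 'string' } },
        required: ['field', 'missingArrayItem'],
      },
    },
  },
  required: ['field1', 'missingField'],
};

const dangling = findDanglingRequired(schema);
console.log(dangling);
```

Each reported string names the path of the offending required array and the entry that has no definition, which is the kind of mismatch the PR description says Gemini rejects.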

Solution Architecture

The PR introduces a schema sanitization pipeline that:

  1. Detects plain JSON Schema objects vs Zod schemas (via _def property check)
  2. Recursively traverses schema objects including:
    • properties, patternProperties, definitions, $defs, dependentSchemas
    • items (both object and array forms)
    • Composition keywords: allOf, anyOf, oneOf
    • not, additionalProperties
  3. Filters required arrays to only include property names that exist in the corresponding properties map
  4. Clones schemas to avoid mutating original MCP-provided objects
  5. Preserves Zod schemas without modification
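
The steps above can be sketched roughly as follows. This is not the code from ProbeAgent.js: the function names (sanitizeSchemaCopy, sanitizeInPlace, looksLikeZodSchema) are hypothetical, the clone is a plain JSON round trip, and the sketch assumes that required is left untouched when an object has no properties map, which the PR text does not specify.

```javascript
// Keywords whose value is a map of name -> subschema.
const SCHEMA_MAP_KEYS = ['properties', 'patternProperties', 'definitions', '$defs', 'dependentSchemas'];
// Keywords whose value is a subschema or an array of subschemas.
const SCHEMA_NODE_KEYS = ['items', 'allOf', 'anyOf', 'oneOf', 'not', 'additionalProperties'];

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Per the PR description, an object carrying `_def` is treated as a Zod schema.
function looksLikeZodSchema(value) {
  return isPlainObject(value) && '_def' in value;
}

// Walks a schema that is already a private copy and trims `required` arrays.
function sanitizeInPlace(node) {
  if (Array.isArray(node)) {
    node.forEach((child) => sanitizeInPlace(child));
    return;
  }
  if (!isPlainObject(node)) return; // e.g. additionalProperties: false

  // Assumption of this sketch: only filter when a properties map exists.
  if (Array.isArray(node.required) && isPlainObject(node.properties)) {
    node.required = node.required.filter((name) =>
      Object.prototype.hasOwnProperty.call(node.properties, name)
    );
  }
  for (const key of SCHEMA_MAP_KEYS) {
    if (isPlainObject(node[key])) {
      Object.values(node[key]).forEach((child) => sanitizeInPlace(child));
    }
  }
  for (const key of SCHEMA_NODE_KEYS) {
    sanitizeInPlace(node[key]);
  }
}

// Returns Zod-like and non-object inputs as-is; otherwise sanitizes a deep copy.
function sanitizeSchemaCopy(schema) {
  if (!isPlainObject(schema) || looksLikeZodSchema(schema)) return schema;
  const copy = JSON.parse(JSON.stringify(schema)); // caller's object stays untouched
  sanitizeInPlace(copy);
  return copy;
}

// Illustrative input with mismatches at the root, in array items, and under anyOf.
const original = {
  type: 'object',
  properties: {
    query: { type: 'string' },
    filters: {
      type: 'array',
      items: {
        type: 'object',
        properties: { field: { type: 'string' } },
        required: ['field', 'missingArrayItem'],
      },
    },
  },
  required: ['query', 'missingField'],
  anyOf: [{ properties: { query: { type: 'string' } }, required: ['query', 'ghost'] }],
};

const cleaned = sanitizeSchemaCopy(original);
console.log(cleaned.required);                          // [ 'query' ]
console.log(cleaned.properties.filters.items.required); // [ 'field' ]
console.log(cleaned.anyOf[0].required);                 // [ 'query' ]
```

Splitting the work into a copying entry point and an in-place walker keeps the recursion simple while still guaranteeing that the caller's object is never modified.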

Technical Implementation

New Functions in ProbeAgent.js:

  • isPlainJsonSchemaObject(value): Distinguishes plain JSON from Zod schemas
  • sanitizeRequiredFieldsInJsonSchema(schema): Recursive sanitization logic
  • sanitizeToolInputSchema(schema): Public API with safe cloning

Modified Code Paths:

  1. Native tool wrapping (_buildNativeTools → wrapTool):

    const sanitizedSchema = sanitizeToolInputSchema(schema);
    const resolvedSchema = sanitizedSchema && sanitizedSchema._def 
      ? sanitizedSchema 
      : jsonSchema(sanitizedSchema);
  2. MCP tool processing (_buildNativeTools → MCP section):

    const mcpSchema = mcpTool.inputSchema || mcpTool.parameters || { type: 'object', properties: {} };
    const sanitizedSchema = sanitizeToolInputSchema(mcpSchema);
    const wrappedSchema = sanitizedSchema && sanitizedSchema._def 
      ? sanitizedSchema 
      : jsonSchema(sanitizedSchema);

Test Coverage

Added a test in mcp-message-history.test.js covering:

  • Root level: required: ['jql', 'cloudId', 'options'] → ['jql', 'options'] (removes 'cloudId')
  • Nested object: options.required: ['limit', 'missingNested'] → ['limit']
  • Array items: filters.items.required: ['field', 'missingArrayItem'] → ['field']
  • Immutability: Verifies original MCP schema objects remain unchanged
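
One way to check the immutability property is to snapshot the input before the call and compare afterwards. The sketch below illustrates that pattern only; wasMutatedBy and both filter variants are hypothetical stand-ins and are not the assertions used in mcp-message-history.test.js.

```javascript
// Hypothetical helper: runs fn(input) and reports whether input changed.
// A JSON snapshot is sufficient here because tool schemas are plain JSON data.
function wasMutatedBy(fn, input) {
  const before = JSON.stringify(input);
  fn(input);
  return JSON.stringify(input) !== before;
}

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Stand-in for an MCP-provided schema with one dangling required entry.
const mcpSchema = {
  type: 'object',
  properties: {
    jql: { type: 'string' },
    options: { type: 'object', properties: {} },
  },
  required: ['jql', 'cloudId', 'options'],
};

// Copying variant: filters `required` on a shallow copy of the schema.
const filterOnCopy = (s) => ({
  ...s,
  required: s.required.filter((name) => hasOwn(s.properties, name)),
});

// Mutating variant: rewrites `required` on the caller's object.
const filterInPlace = (s) => {
  s.required = s.required.filter((name) => hasOwn(s.properties, name));
  return s;
};

const copyMutated = wasMutatedBy(filterOnCopy, mcpSchema);
const inPlaceMutated = wasMutatedBy(filterInPlace, mcpSchema);
console.log(copyMutated);    // false
console.log(inPlaceMutated); // true
```

The in-place variant is included only to show that the check does detect a mutation; a sanitizer meeting the PR's stated goal should behave like the copying variant.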

Architecture Impact

flowchart TD
    A[Tool Schema Input] --> B{isPlainJsonSchemaObject?}
    B -->|Yes| C[Clone Schema]
    B -->|No Zod| D[Return Original]
    C --> E[Recursive Sanitization]
    E --> F[Filter Required Arrays]
    F --> G[Visit Properties Map]
    G --> H[Visit Items/Composition]
    H --> I[Return Sanitized Schema]
    I --> J{has _def property?}
    J -->|Yes| K[Use as Zod]
    J -->|No| L[Wrap with jsonSchema]
    K --> M[Tool Registration]
    L --> M
    D --> M

Affected Components

  • ProbeAgent.js: Core tool schema processing logic
  • MCP Integration: All MCP tools now pass through sanitization
  • Native Tools: Custom JSON schemas also sanitized
  • Gemini Provider: Primary beneficiary (strict validation)
  • Other Providers: No negative impact (cleaner schemas)

Files Changed

npm/src/agent/ProbeAgent.js (+76 lines, -3 lines):

  • Added 3 new functions for schema sanitization
  • Modified 2 locations in _buildNativeTools method
  • Total: 79 lines changed

npm/tests/unit/mcp-message-history.test.js (+48 lines):

References

  • Implementation: npm/src/agent/ProbeAgent.js:150-200 (new functions), :1890-1896 (native tools), :2091-2098 (MCP tools)
  • Test Coverage: npm/tests/unit/mcp-message-history.test.js:276-323
  • Related: Schema validation utilities in npm/src/agent/schemaUtils.js
Metadata
  • Review Effort: 2 / 5
  • Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2026-05-29T11:54:14.613Z | Triggered by: pr_opened | Commit: e790db9

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs Bot (Contributor) commented May 29, 2026

Security Issues (3)

Severity Location Issue
🟡 Warning npm/src/agent/ProbeAgent.js:217
The sanitizeToolInputSchema function uses JSON.parse(JSON.stringify(schema)) for deep cloning, which can be vulnerable to prototype pollution attacks if the schema contains malicious __proto__ or constructor properties.
💡 Suggestion: Replace JSON.parse(JSON.stringify()) with a safer deep cloning method that explicitly blocks prototype pollution, or add validation to reject schemas with __proto__, constructor, or prototype properties before cloning.
🟡 Warning npm/src/agent/ProbeAgent.js:217
The try-catch block around JSON.parse(JSON.stringify()) silently returns the original schema on any error, potentially masking malicious input that causes parse errors while still allowing the unsafe schema through.
💡 Suggestion: Log or track when JSON parsing fails to provide visibility into potential attack attempts, and consider whether schemas that fail to parse should be rejected entirely rather than passed through.
🟡 Warning npm/src/agent/ProbeAgent.js:154
The recursive traversal in sanitizeRequiredFieldsInJsonSchema doesn't have depth limiting for deeply nested schemas, which could be exploited for DoS attacks through extremely nested schema objects.
💡 Suggestion: Add recursion depth limits to prevent potential stack overflow or excessive processing time on malicious deeply nested schemas.
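
As a rough illustration of what the suggestions above might look like in code, the sketch below clones a schema while rejecting the three named keys and enforcing a nesting limit. The names cloneSchemaSafely, BLOCKED_KEYS, and MAX_DEPTH are hypothetical, and the limit of 64 is an arbitrary value chosen for the example; none of this is taken from the merged change.

```javascript
// Keys the review suggests rejecting before or during cloning.
const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);
// Arbitrary illustrative nesting limit, not a value from the PR.
const MAX_DEPTH = 64;

// Hypothetical helper: deep-copies plain JSON data, throwing on blocked keys
// or on nesting deeper than MAX_DEPTH.
function cloneSchemaSafely(value, depth = 0) {
  if (depth > MAX_DEPTH) {
    throw new RangeError(`schema nesting exceeds ${MAX_DEPTH} levels`);
  }
  if (Array.isArray(value)) {
    return value.map((item) => cloneSchemaSafely(item, depth + 1));
  }
  if (value === null || typeof value !== 'object') {
    return value; // primitives are returned unchanged
  }

  const copy = {};
  for (const key of Object.keys(value)) {
    if (BLOCKED_KEYS.has(key)) {
      throw new TypeError(`schema contains blocked key "${key}"`);
    }
    copy[key] = cloneSchemaSafely(value[key], depth + 1);
  }
  return copy;
}

// A well-formed schema is copied, not aliased.
const input = { type: 'object', properties: { a: { type: 'string' } }, required: ['a'] };
const output = cloneSchemaSafely(input);
console.log(output !== input && output.properties !== input.properties); // true

// JSON.parse stores "__proto__" as an ordinary own key, so the guard can see it.
const suspicious = JSON.parse('{"__proto__": {"polluted": true}}');
try {
  cloneSchemaSafely(suspicious);
} catch (err) {
  console.log(err.message);
}
```

Rejecting by key name is a blunt instrument: a legitimate tool could define a property literally named constructor under properties, and this sketch would refuse it. Whether to reject, skip, or allow such keys is a policy decision the review leaves open.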

Performance Issues (1)

Severity Location Issue
🟠 Error contract:0
Output schema validation failed: must have required property 'issues'

Quality Issues (1)

Severity Location Issue
🟠 Error contract:0
Output schema validation failed: must have required property 'issues'

Last updated: 2026-05-29T11:41:48.194Z | Triggered by: pr_opened | Commit: e790db9

@buger buger merged commit 06fd929 into main May 29, 2026
15 of 17 checks passed
Development

Successfully merging this pull request may close these issues.

Gemini rejects tool schemas with required fields referencing undefined properties
