# feat(ai-monitoring): add sampling guidance and Node.js AI monitoring reference #87
**Merged** - 7 commits
Commits (all by sergical):

- `4bbdfca` feat(ai-monitoring): add sampling guidance and Node.js AI monitoring …
- `e496551` fix(ai-monitoring): trim verbose skill references, update model names
- `eb50926` fix(node-sdk): bring ai-monitoring reference up to Python's depth
- `ed41ce4` fix(node-sdk): add unsupported providers table to ai-monitoring
- `b12e09f` fix(node-sdk): remove Python-only providers from Node unsupported table
- `b5bbd95` fix(node-sdk): address PR review - remove handoff span, fix model name
- `c521fa7` fix(node-sdk): use gen_ai.request not gen_ai.chat in examples
# AI Monitoring - Sentry Node.js SDK

> Minimum SDK: `@sentry/node` >=10.28.0 (OpenAI, Anthropic, LangChain, LangGraph, Google GenAI). Vercel AI SDK: >=10.6.0.

## Prerequisites

Tracing must be enabled - AI spans require an active trace:

```typescript
Sentry.init({ dsn: "...", tracesSampleRate: 1.0 });
```

## Integration Matrix

| Integration | Min Library | Auto-Enabled | Status |
|-------------|-------------|--------------|--------|
| OpenAI (`openai`) | openai 4.0+ | Yes | Stable |
| Anthropic (`@anthropic-ai/sdk`) | 0.19.2+ | Yes | Stable |
| Vercel AI SDK (`ai`) | ai 3.0+ | Yes\* | Stable |
| LangChain (`@langchain/core`) | 0.1.0+ | Yes | Stable |
| LangGraph (`@langchain/langgraph`) | 0.1.0+ | Yes | Stable |
| Google GenAI (`@google/genai`) | 1.0+ | Yes | Stable |

\*Vercel AI SDK requires `experimental_telemetry: { isEnabled: true }` on every call.

## PII Control

`recordInputs` defaults to the value of `sendDefaultPii`; setting it explicitly overrides that default.

| `sendDefaultPii` | `recordInputs` | Prompts captured? |
|------------------|----------------|-------------------|
| `false` (default) | unset (defaults to `false`) | No |
| `false` | `true` (explicit) | Yes |
| `true` | unset (defaults to `true`) | Yes |
| `true` | `false` (explicit) | No |

## Configuration Examples

### Auto-enabled integrations

```typescript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  sendDefaultPii: true, // required to capture prompts/outputs
});
// OpenAI, Anthropic, LangChain, LangGraph, Google GenAI activate automatically
```

### Explicit configuration with recordInputs/recordOutputs override

```typescript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  integrations: [
    Sentry.openAIIntegration({ recordInputs: true, recordOutputs: true }),
    Sentry.vercelAIIntegration({ recordInputs: true, recordOutputs: true }),
  ],
});
```

### Vercel AI SDK per-call telemetry (required)

```typescript
await generateText({
  model: openai("gpt-4.1"),
  prompt: "Hello",
  experimental_telemetry: { isEnabled: true, recordInputs: true, recordOutputs: true },
});
```

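Because a forgotten `experimental_telemetry` silently drops spans, the defaults can be centralized in one place. A minimal sketch - the `withTelemetry` helper and its types are hypothetical, not part of the AI SDK or Sentry:

```typescript
// Hypothetical helper: merges telemetry defaults into Vercel AI SDK call
// options so `experimental_telemetry` is never forgotten at a call site.
type TelemetrySettings = {
  isEnabled?: boolean;
  recordInputs?: boolean;
  recordOutputs?: boolean;
};

type AICallOptions = {
  experimental_telemetry?: TelemetrySettings;
  [key: string]: unknown;
};

function withTelemetry(
  options: AICallOptions,
  defaults: TelemetrySettings = { isEnabled: true, recordInputs: true, recordOutputs: true },
): AICallOptions {
  // Explicit per-call settings win over the defaults.
  return {
    ...options,
    experimental_telemetry: { ...defaults, ...options.experimental_telemetry },
  };
}
```

Calls then become `await generateText(withTelemetry({ model, prompt }))`, with per-call overrides still possible.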
### Browser / Next.js client-side (manual wrapping required)

```typescript
import OpenAI from "openai";
import * as Sentry from "@sentry/nextjs"; // or @sentry/browser

const openai = Sentry.instrumentOpenAiClient(new OpenAI());
```

## Manual Instrumentation - `gen_ai.*` Spans

Use when the library isn't supported, or for wrapping custom AI logic.

### `gen_ai.request` - LLM call

```typescript
await Sentry.startSpan({
  op: "gen_ai.request",
  name: "chat claude-sonnet-4-6",
  attributes: { "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  span.setAttribute("gen_ai.request.messages", JSON.stringify(messages));
  const result = await myClient.chat(messages);
  span.setAttribute("gen_ai.usage.input_tokens", result.usage.inputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.usage.outputTokens);
  return result;
});
```

### `gen_ai.invoke_agent` - Agent lifecycle

```typescript
await Sentry.startSpan({
  op: "gen_ai.invoke_agent",
  name: "invoke_agent Weather Agent",
  attributes: { "gen_ai.agent.name": "Weather Agent", "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  const result = await myAgent.run(task);
  span.setAttribute("gen_ai.usage.input_tokens", result.totalInputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.totalOutputTokens);
  return result;
});
```

### `gen_ai.execute_tool` - Tool/function call

```typescript
await Sentry.startSpan({
  op: "gen_ai.execute_tool",
  name: "execute_tool get_weather",
  attributes: {
    "gen_ai.tool.name": "get_weather",
    "gen_ai.tool.type": "function",
    "gen_ai.tool.input": JSON.stringify({ location: "Paris" }),
  },
}, async (span) => {
  const result = await getWeather("Paris");
  span.setAttribute("gen_ai.tool.output", JSON.stringify(result));
  return result;
});
```

## Span Attribute Reference

### Common attributes

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `gen_ai.request.model` | string | Yes | Model identifier (e.g., `claude-sonnet-4-6`, `gemini-2.5-flash`) |
| `gen_ai.operation.name` | string | No | Human-readable operation label |
| `gen_ai.agent.name` | string | No | Agent name (for agent spans) |

### Content attributes (PII-gated - only when `sendDefaultPii: true` + `recordInputs/recordOutputs: true`)

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.request.messages` | string | **JSON-stringified** message array |
| `gen_ai.request.available_tools` | string | **JSON-stringified** tool definitions |
| `gen_ai.response.text` | string | **JSON-stringified** response array |
| `gen_ai.response.tool_calls` | string | **JSON-stringified** tool call array |

> Span attributes only accept primitives - arrays/objects must be JSON-stringified.

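The primitives-only rule can be enforced with a small normalizer applied before `span.setAttributes`. A sketch - the helper name is illustrative, not a Sentry API:

```typescript
// Hypothetical helper: normalize values into legal span attributes.
// Primitives pass through; arrays and objects are JSON-stringified,
// as the gen_ai.* content attributes require; null/undefined are omitted.
type Primitive = string | number | boolean;

function toSpanAttributes(
  values: Record<string, unknown>,
): Record<string, Primitive> {
  const out: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(values)) {
    if (value === null || value === undefined) continue; // omit empty values
    out[key] =
      typeof value === "object" ? JSON.stringify(value) : (value as Primitive);
  }
  return out;
}

// Usage inside a manual gen_ai span:
// span.setAttributes(toSpanAttributes({
//   "gen_ai.request.messages": messages,            // array -> JSON string
//   "gen_ai.usage.input_tokens": usage.inputTokens, // number passes through
// }));
```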
### Token usage attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.usage.input_tokens` | int | Total input tokens (including cached) |
| `gen_ai.usage.input_tokens.cached` | int | Subset served from cache |
| `gen_ai.usage.input_tokens.cache_write` | int | Tokens written to cache (Anthropic) |
| `gen_ai.usage.output_tokens` | int | Total output tokens (including reasoning) |
| `gen_ai.usage.output_tokens.reasoning` | int | Subset for chain-of-thought (o3, etc.) |
| `gen_ai.usage.total_tokens` | int | Sum of input + output |

> Cached and reasoning tokens are **subsets** of the totals, not additive. Reporting them additively produces wrong cost calculations.

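The subset semantics can be made concrete in a cost estimator. A sketch - the interfaces are hypothetical and the prices are illustrative placeholders, not real model pricing:

```typescript
// Sketch: cost estimation that treats cached tokens as a subset of the
// input total, as the attributes above are defined (never add them on top).
interface TokenUsage {
  inputTokens: number;        // gen_ai.usage.input_tokens (includes cached)
  cachedInputTokens?: number; // gen_ai.usage.input_tokens.cached (subset)
  outputTokens: number;       // gen_ai.usage.output_tokens (includes reasoning)
}

interface PricingUSDPerMTok {
  input: number;
  cachedInput: number; // cache reads are usually discounted
  output: number;
}

function estimateCostUSD(usage: TokenUsage, price: PricingUSDPerMTok): number {
  const cached = usage.cachedInputTokens ?? 0;
  const uncached = usage.inputTokens - cached; // subtract, do NOT add cached on top
  return (
    (uncached * price.input +
      cached * price.cachedInput +
      usage.outputTokens * price.output) /
    1_000_000
  );
}
```

Treating `cached` as additive here would double-count input tokens, which is exactly the "wrong cost calculations" failure mode called out above.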
## Agent Workflow Hierarchy

```
Transaction
└── gen_ai.invoke_agent "Weather Agent"
    ├── gen_ai.request "chat claude-sonnet-4-6"
    ├── gen_ai.execute_tool "get_weather"
    ├── gen_ai.request "chat claude-sonnet-4-6"   ← follow-up
    └── gen_ai.execute_tool "format_report"
```

## Streaming

| Integration | Streaming | Token counts in streams |
|-------------|-----------|-------------------------|
| OpenAI | Yes | Requires `stream_options: { include_usage: true }` |
| Anthropic | Yes | Automatic |
| Vercel AI SDK | Yes | Automatic (with `experimental_telemetry`) |
| LangChain | Yes | Tracked |
| Manual `gen_ai.*` | Yes | Set token counts after stream completes |

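For manual `gen_ai.*` spans over streamed responses, token counts only exist once the stream ends. With OpenAI, the final chunk carries a `usage` object when `stream_options: { include_usage: true }` is set (and its `choices` array is empty). A sketch of draining a stream and collecting that usage - the chunk shape follows the OpenAI streaming API, but `drainStream` itself is a hypothetical helper:

```typescript
// Sketch: drain a streamed chat completion, accumulating text and keeping
// the usage object that arrives on the final chunk.
interface StreamChunk {
  choices: Array<{ delta?: { content?: string } }>;
  usage?: { prompt_tokens: number; completion_tokens: number } | null;
}

async function drainStream(stream: AsyncIterable<StreamChunk>) {
  let text = "";
  let usage: { prompt_tokens: number; completion_tokens: number } | null = null;
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? "";
    if (chunk.usage) usage = chunk.usage; // only the final chunk carries usage
  }
  return { text, usage };
}

// After the stream completes, set the token attributes on the manual span:
// span.setAttribute("gen_ai.usage.input_tokens", usage.prompt_tokens);
// span.setAttribute("gen_ai.usage.output_tokens", usage.completion_tokens);
```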
## Unsupported Providers

| Provider | Workaround |
|----------|------------|
| Cohere | Manual `gen_ai.*` spans |
| AWS Bedrock | Manual `gen_ai.*` spans |
| Mistral | Manual `gen_ai.*` spans |
| Groq | Manual `gen_ai.*` spans |

## Sampling Strategy

If `tracesSampleRate` < 1.0, see the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md).

## Troubleshooting

| Issue | Solution |
|-------|----------|
| No AI spans appearing | Verify `tracesSampleRate > 0`; check SDK >=10.28.0 |
| Token counts missing in streams | Add `stream_options: { include_usage: true }` (OpenAI) |
| Vercel AI spans not tracked | Add `experimental_telemetry: { isEnabled: true }` per call |
| Browser OpenAI not traced | Use `Sentry.instrumentOpenAiClient()` - auto-instrumentation is server-only |
| Prompts not captured | Set `sendDefaultPii: true` or explicit `recordInputs: true` |
| AI Agents Dashboard empty | Ensure traces are being sent; check DSN and `tracesSampleRate` |
| Wrong cost calculations | Cached/reasoning tokens are subsets of totals, not additive |

---

# Sampling Strategy for AI Agent Spans

> `@sentry/node` >=9.x (`inheritOrSampleWith`), `sentry-sdk` >=2.x (`traces_sampler`)

||
| ## The Problem | ||
|
|
||
| Agent runs are span trees. Sampling decides at the root; children inherit. Drop the root, lose every child span. At any rate below 1.0, you lose entire agent executions. | ||
|
|
||
## How It Works

`tracesSampler` / `traces_sampler` only fires on **root spans**. Non-root spans (including `gen_ai.*` children) inherit the parent's decision unconditionally.

**Scenario 1: the gen_ai span IS the root** (cron, queue consumer, CLI). The sampler sees `gen_ai.*` directly - match it and return 1.0.

**Scenario 2: gen_ai spans are children of HTTP transactions** (most web apps). `POST /api/chat` is sampled before any AI code runs. Solution: sample AI routes at 1.0.

## JavaScript

```javascript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: ({ name, attributes, inheritOrSampleWith }) => {
    // Standalone gen_ai root spans
    if (attributes?.['sentry.op']?.startsWith('gen_ai.') || attributes?.['gen_ai.system']) {
      return 1.0;
    }
    // HTTP routes that trigger AI calls
    if (name?.includes('/api/chat') || name?.includes('/api/agent')) {
      return 1.0;
    }
    return inheritOrSampleWith(0.2); // adjust to your baseline
  },
});
```

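The matching logic above can be factored into a pure function so the sampling decision is unit-testable apart from `Sentry.init`. A sketch - `aiSampleRate` and the `aiRoutes` allowlist are hypothetical names, not SDK API:

```typescript
// Sketch: the sampling decision as a pure, testable function.
interface SampleInput {
  name?: string;
  attributes?: Record<string, unknown>;
}

const aiRoutes = ["/api/chat", "/api/agent"]; // adjust to your routes

function aiSampleRate(input: SampleInput, baseline = 0.2): number {
  const op = input.attributes?.["sentry.op"];
  if (typeof op === "string" && op.startsWith("gen_ai.")) return 1.0; // gen_ai root span
  if (input.attributes?.["gen_ai.system"] !== undefined) return 1.0;  // AI instrumentation marker
  if (aiRoutes.some((route) => input.name?.includes(route))) return 1.0; // AI-triggering HTTP route
  return baseline;
}

// Wired into Sentry.init:
// tracesSampler: (ctx) => {
//   const rate = aiSampleRate(ctx);
//   return rate === 1.0 ? rate : ctx.inheritOrSampleWith(rate);
// },
```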
## Python

```python
import sentry_sdk


def traces_sampler(sampling_context):
    tx = sampling_context.get("transaction_context", {})
    op, name = tx.get("op", ""), tx.get("name", "")

    if op.startswith("gen_ai."):
        return 1.0
    if op == "http.server" and any(p in name for p in ["/api/chat", "/api/agent"]):
        return 1.0

    parent = sampling_context.get("parent_sampled")
    if parent is not None:
        return float(parent)
    return 0.2


sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)
```

If AI is the core product, skip `tracesSampler` and use `tracesSampleRate: 1.0`.

## Fallback: Metrics + Logs

If 100% tracing isn't feasible, emit metrics and logs on every LLM call (independent of trace sampling):

```python
# Metrics - 100% coverage of cost/usage/latency
sentry_sdk.metrics.distribution("gen_ai.token_usage", usage.total_tokens,
    attributes={"model": model, "user_id": str(user.id)})
sentry_sdk.metrics.count("gen_ai.calls", 1,
    attributes={"model": model, "status": "error" if error else "success"})

# Logs - 100% searchable per-call records
sentry_sdk.logger.info("LLM call", model=model, input_tokens=usage.prompt_tokens,
    output_tokens=usage.completion_tokens, latency_ms=response_time_ms)
```

The JS equivalent uses `Sentry.metrics.*` and `Sentry.logger.*` with the same attribute patterns.

## Troubleshooting

| Issue | Solution |
|-------|----------|
| gen_ai spans missing despite sampler returning 1.0 | Parent HTTP transaction was sampled at a lower rate. Add the route to your sampler. |
| `tracesSampler` not called for gen_ai spans | Expected - it only runs on root spans. Sample the parent HTTP route instead. |
| All traces at 100% | Check the fallback rate in `inheritOrSampleWith()` / the default return value. |