6 changes: 6 additions & 0 deletions skills/sentry-nestjs-sdk/references/ai-monitoring.md
@@ -341,6 +341,12 @@ Access at **Sentry → AI → Agents** (or **Insights → AI**).

---

## Sampling Strategy

If your `tracesSampleRate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `tracesSampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

---

## Troubleshooting

| Issue | Solution |
6 changes: 6 additions & 0 deletions skills/sentry-nextjs-sdk/references/ai-monitoring.md
@@ -393,6 +393,12 @@ Access at **Sentry → AI → Agents** (or **Insights → AI**).

---

## Sampling Strategy

If your `tracesSampleRate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `tracesSampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

---

## Troubleshooting

| Issue | Solution |
213 changes: 213 additions & 0 deletions skills/sentry-node-sdk/references/ai-monitoring.md
@@ -0,0 +1,213 @@
# AI Monitoring - Sentry Node.js SDK

> Minimum SDK: `@sentry/node` >=10.28.0 (OpenAI, Anthropic, LangChain, LangGraph, Google GenAI). Vercel AI SDK: >=10.6.0.

## Prerequisites

Tracing must be enabled - AI spans require an active trace:

```typescript
Sentry.init({ dsn: "...", tracesSampleRate: 1.0 });
```

## Integration Matrix

| Integration | Min Library | Auto-Enabled | Status |
|-------------|-------------|-------------|--------|
| OpenAI (`openai`) | openai 4.0+ | Yes | Stable |
| Anthropic (`@anthropic-ai/sdk`) | 0.19.2+ | Yes | Stable |
| Vercel AI SDK (`ai`) | ai 3.0+ | Yes* | Stable |
| LangChain (`@langchain/core`) | 0.1.0+ | Yes | Stable |
| LangGraph (`@langchain/langgraph`) | 0.1.0+ | Yes | Stable |
| Google GenAI (`@google/genai`) | 1.0+ | Yes | Stable |

*Vercel AI SDK requires `experimental_telemetry: { isEnabled: true }` on every call.

## PII Control

| `sendDefaultPii` | `recordInputs` | Prompts captured? |
|-------------------|-----------------|-------------------|
| `false` (default) | `true` | No |
| `true` | `true` (default) | Yes |
| `true` | `false` | No |

## Configuration Examples

### Auto-enabled integrations

```typescript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  sendDefaultPii: true, // required to capture prompts/outputs
});
// OpenAI, Anthropic, LangChain, LangGraph, Google GenAI activate automatically
```

### Explicit configuration with recordInputs/recordOutputs override

```typescript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  integrations: [
    Sentry.openAIIntegration({ recordInputs: true, recordOutputs: true }),
    Sentry.vercelAIIntegration({ recordInputs: true, recordOutputs: true }),
  ],
});
```

### Vercel AI SDK per-call telemetry (required)

```typescript
await generateText({
  model: openai("gpt-4.1"),
  prompt: "Hello",
  experimental_telemetry: { isEnabled: true, recordInputs: true, recordOutputs: true },
});
```

### Browser / Next.js client-side (manual wrapping required)

```typescript
import OpenAI from "openai";
import * as Sentry from "@sentry/nextjs"; // or @sentry/browser

const openai = Sentry.instrumentOpenAiClient(new OpenAI());
```

## Manual Instrumentation - `gen_ai.*` Spans

Use when the library isn't supported, or for wrapping custom AI logic.

### `gen_ai.request` - LLM call

```typescript
await Sentry.startSpan({
  op: "gen_ai.request",
  name: "chat claude-sonnet-4-6",
  attributes: { "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  span.setAttribute("gen_ai.request.messages", JSON.stringify(messages));
  const result = await myClient.chat(messages);
  span.setAttribute("gen_ai.usage.input_tokens", result.usage.inputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.usage.outputTokens);
  return result;
});
```

### `gen_ai.invoke_agent` - Agent lifecycle

```typescript
await Sentry.startSpan({
  op: "gen_ai.invoke_agent",
  name: "invoke_agent Weather Agent",
  attributes: { "gen_ai.agent.name": "Weather Agent", "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  const result = await myAgent.run(task);
  span.setAttribute("gen_ai.usage.input_tokens", result.totalInputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.totalOutputTokens);
  return result;
});
```

### `gen_ai.execute_tool` - Tool/function call

```typescript
await Sentry.startSpan({
  op: "gen_ai.execute_tool",
  name: "execute_tool get_weather",
  attributes: {
    "gen_ai.tool.name": "get_weather",
    "gen_ai.tool.type": "function",
    "gen_ai.tool.input": JSON.stringify({ location: "Paris" }),
  },
}, async (span) => {
  const result = await getWeather("Paris");
  span.setAttribute("gen_ai.tool.output", JSON.stringify(result));
  return result;
});
```


## Span Attribute Reference

### Common attributes

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `gen_ai.request.model` | string | Yes | Model identifier (e.g., `claude-sonnet-4-6`, `gemini-2.5-flash`) |
| `gen_ai.operation.name` | string | No | Human-readable operation label |
| `gen_ai.agent.name` | string | No | Agent name (for agent spans) |

### Content attributes (PII-gated - only when `sendDefaultPii: true` + `recordInputs/recordOutputs: true`)

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.request.messages` | string | **JSON-stringified** message array |
| `gen_ai.request.available_tools` | string | **JSON-stringified** tool definitions |
| `gen_ai.response.text` | string | **JSON-stringified** response array |
| `gen_ai.response.tool_calls` | string | **JSON-stringified** tool call array |

> Span attributes only accept primitives - arrays/objects must be JSON-stringified.

### Token usage attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.usage.input_tokens` | int | Total input tokens (including cached) |
| `gen_ai.usage.input_tokens.cached` | int | Subset served from cache |
| `gen_ai.usage.input_tokens.cache_write` | int | Tokens written to cache (Anthropic) |
| `gen_ai.usage.output_tokens` | int | Total output tokens (including reasoning) |
| `gen_ai.usage.output_tokens.reasoning` | int | Subset for chain-of-thought (o3, etc.) |
| `gen_ai.usage.total_tokens` | int | Sum of input + output |

> Cached and reasoning tokens are **subsets** of totals, not additive. Incorrect reporting produces wrong cost calculations.
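
Getting the subset arithmetic right matters when deriving spend from these attributes. A minimal sketch (the `Usage` shape and the per-million-token prices are placeholder assumptions for illustration, not Sentry APIs or real rates):

```typescript
// Illustrative only: derive cost from gen_ai.usage.* values, treating
// cached tokens as a subset of input_tokens (never added on top).
interface Usage {
  inputTokens: number;       // gen_ai.usage.input_tokens (includes cached)
  cachedInputTokens: number; // gen_ai.usage.input_tokens.cached (subset)
  outputTokens: number;      // gen_ai.usage.output_tokens (includes reasoning)
}

// Placeholder per-million-token prices; substitute your model's real rates.
function estimateCostUsd(u: Usage, inPerM = 3.0, cachedPerM = 0.3, outPerM = 15.0): number {
  const freshInput = u.inputTokens - u.cachedInputTokens; // subtract the subset
  return (freshInput * inPerM + u.cachedInputTokens * cachedPerM + u.outputTokens * outPerM) / 1_000_000;
}
```

Adding `cachedInputTokens` on top of `inputTokens`, instead of subtracting it, is exactly the double-count the note above warns about.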

## Agent Workflow Hierarchy

```
Transaction
└── gen_ai.invoke_agent "Weather Agent"
    ├── gen_ai.request "chat claude-sonnet-4-6"
    ├── gen_ai.execute_tool "get_weather"
    ├── gen_ai.request "chat claude-sonnet-4-6"   ← follow-up
    └── gen_ai.execute_tool "format_report"
```
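
This tree falls out of nesting the `Sentry.startSpan` callbacks from the manual-instrumentation section: a span started inside another span's callback becomes its child. The toy recorder below (a hypothetical `startSpan`, not the Sentry API) demonstrates only that nesting mechanic:

```typescript
// Toy stand-in for Sentry.startSpan: records the span tree so the
// parent/child relationship from nested callbacks is visible.
type SpanNode = { op: string; name: string; children: SpanNode[] };

const root: SpanNode = { op: "transaction", name: "root", children: [] };
let current = root; // single-threaded toy; Sentry tracks this via async context

async function startSpan<T>(opts: { op: string; name: string }, fn: () => Promise<T>): Promise<T> {
  const node: SpanNode = { ...opts, children: [] };
  const parent = current;
  parent.children.push(node);
  current = node;      // spans started inside fn attach here
  try {
    return await fn();
  } finally {
    current = parent;  // restore on exit
  }
}

async function runAgent(): Promise<void> {
  await startSpan({ op: "gen_ai.invoke_agent", name: "invoke_agent Weather Agent" }, async () => {
    await startSpan({ op: "gen_ai.request", name: "chat claude-sonnet-4-6" }, async () => {});
    await startSpan({ op: "gen_ai.execute_tool", name: "execute_tool get_weather" }, async () => {});
    await startSpan({ op: "gen_ai.request", name: "chat claude-sonnet-4-6" }, async () => {});
    await startSpan({ op: "gen_ai.execute_tool", name: "execute_tool format_report" }, async () => {});
  });
}
```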

## Streaming

| Integration | Streaming | Token counts in streams |
|-------------|-----------|------------------------|
| OpenAI | Yes | Requires `stream_options: { include_usage: true }` |
| Anthropic | Yes | Automatic |
| Vercel AI SDK | Yes | Automatic (with `experimental_telemetry`) |
| LangChain | Yes | Tracked |
| Manual `gen_ai.*` | Yes | Set token counts after stream completes |
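
For the manual-span row, the pattern is: consume the stream, capture usage from the final chunk, and set the token attributes only after usage arrives. A self-contained sketch with a fake stream (the chunk shape is a hypothetical assumption; in real code `setAttribute` is the span from `Sentry.startSpan` and the usage field depends on your provider):

```typescript
// Assumed chunk shape: most providers attach usage to the final chunk only.
interface Chunk { text: string; usage?: { inputTokens: number; outputTokens: number } }

async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { text: "Hel" };
  yield { text: "lo" };
  yield { text: "", usage: { inputTokens: 12, outputTokens: 2 } }; // final chunk carries usage
}

async function consume(setAttribute: (key: string, value: number) => void): Promise<string> {
  let out = "";
  for await (const chunk of fakeStream()) {
    out += chunk.text;
    if (chunk.usage) { // token counts are only known once the stream ends
      setAttribute("gen_ai.usage.input_tokens", chunk.usage.inputTokens);
      setAttribute("gen_ai.usage.output_tokens", chunk.usage.outputTokens);
    }
  }
  return out;
}
```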

## Unsupported Providers

| Provider | Workaround |
|----------|-----------|
| Cohere | Manual `gen_ai.*` spans |
| AWS Bedrock | Manual `gen_ai.*` spans |
| Mistral | Manual `gen_ai.*` spans |
| Groq | Manual `gen_ai.*` spans |

## Sampling Strategy

If `tracesSampleRate` < 1.0, see the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md).

## Troubleshooting

| Issue | Solution |
|-------|----------|
| No AI spans appearing | Verify `tracesSampleRate > 0`; check SDK >=10.28.0 |
| Token counts missing in streams | Add `stream_options: { include_usage: true }` (OpenAI) |
| Vercel AI spans not tracked | Add `experimental_telemetry: { isEnabled: true }` per call |
| Browser OpenAI not traced | Use `Sentry.instrumentOpenAiClient()` - auto-instrumentation is server-only |
| Prompts not captured | Set `sendDefaultPii: true` or explicit `recordInputs: true` |
| AI Agents Dashboard empty | Ensure traces are being sent; check DSN and `tracesSampleRate` |
| Wrong cost calculations | Cached/reasoning tokens are subsets of totals, not additive |
4 changes: 4 additions & 0 deletions skills/sentry-python-sdk/references/ai-monitoring.md
@@ -281,6 +281,10 @@ sentry_sdk.ai.set_conversation_id("user-session-abc123")
| Groq | `LiteLLMIntegration` |
| Vertex AI | `GoogleGenAIIntegration` or `LiteLLMIntegration` |

## Sampling Strategy

If your `traces_sample_rate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `traces_sampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

## Troubleshooting

| Issue | Solution |
20 changes: 20 additions & 0 deletions skills/sentry-setup-ai-monitoring/SKILL.md
@@ -47,6 +47,26 @@ grep -E '"(openai|@anthropic-ai/sdk|ai|@langchain|@google/genai)"' package.json
grep -E '(openai|anthropic|langchain|huggingface)' requirements.txt pyproject.toml 2>/dev/null
```

## Sampling Check

After detecting AI SDKs, check the current sampling configuration:

```bash
# JavaScript
grep -E 'tracesSampleRate|tracesSampler' sentry.*.config.* instrument.* src/instrument.* app/instrument.* 2>/dev/null

# Python
grep -E 'traces_sample_rate|traces_sampler' *.py **/*.py 2>/dev/null
```

**If `tracesSampleRate` / `traces_sample_rate` is below 1.0 AND no `tracesSampler` / `traces_sampler` is configured:**

Ask the user:

> "Your current sample rate is {rate}. Agent runs are sampled as complete span trees — if the root span is dropped, all child gen_ai spans are lost. For full AI visibility, gen_ai-related transactions should be sampled at 100%. Would you like me to set up a `tracesSampler` that keeps AI traces at 100% while sampling other traffic at your current rate?"

If user confirms, read `${SKILL_ROOT}/references/sampling.md` for implementation patterns.

## Supported SDKs

### JavaScript
82 changes: 82 additions & 0 deletions skills/sentry-setup-ai-monitoring/references/sampling.md
@@ -0,0 +1,82 @@
# Sampling Strategy for AI Agent Spans

> `@sentry/node` >=9.x (`inheritOrSampleWith`), `sentry-sdk` >=2.x (`traces_sampler`)

## The Problem

Agent runs are span trees. Sampling decides at the root; children inherit. Drop the root, lose every child span. At any rate below 1.0, you lose entire agent executions.

## How It Works

`tracesSampler` / `traces_sampler` only fires on **root spans**. Non-root spans (including `gen_ai.*` children) inherit unconditionally.

**Scenario 1: gen_ai span IS the root** (cron, queue consumer, CLI). The sampler sees `gen_ai.*` directly. Match and return 1.0.

**Scenario 2: gen_ai spans are children of HTTP transactions** (most web apps). `POST /api/chat` is sampled before any AI code runs. Solution: sample AI routes at 1.0.

## JavaScript

```javascript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: ({ name, attributes, inheritOrSampleWith }) => {
    // Standalone gen_ai root spans
    if (attributes?.['sentry.op']?.startsWith('gen_ai.') || attributes?.['gen_ai.system']) {
      return 1.0;
    }
    // HTTP routes that trigger AI calls
    if (name?.includes('/api/chat') || name?.includes('/api/agent')) {
      return 1.0;
    }
    return inheritOrSampleWith(0.2); // adjust to your baseline
  },
});
```

## Python

```python
import sentry_sdk

def traces_sampler(sampling_context):
    tx = sampling_context.get("transaction_context", {})
    op, name = tx.get("op", ""), tx.get("name", "")

    if op.startswith("gen_ai."):
        return 1.0
    if op == "http.server" and any(p in name for p in ["/api/chat", "/api/agent"]):
        return 1.0

    parent = sampling_context.get("parent_sampled")
    if parent is not None:
        return float(parent)
    return 0.2

sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)
```

If AI is the core product, skip `tracesSampler` and use `tracesSampleRate: 1.0`.

## Fallback: Metrics + Logs

If 100% tracing isn't feasible, emit metrics and logs on every LLM call (independent of trace sampling):

```python
# Metrics - 100% coverage of cost/usage/latency
sentry_sdk.metrics.distribution("gen_ai.token_usage", usage.total_tokens,
                                attributes={"model": model, "user_id": str(user.id)})
sentry_sdk.metrics.count("gen_ai.calls", 1,
                         attributes={"model": model, "status": "error" if error else "success"})

# Logs - 100% searchable per-call records
sentry_sdk.logger.info("LLM call", model=model, input_tokens=usage.prompt_tokens,
                       output_tokens=usage.completion_tokens, latency_ms=response_time_ms)
```

JS equivalent uses `Sentry.metrics.*` and `Sentry.logger.*` with the same attribute patterns.

## Troubleshooting

| Issue | Solution |
|-------|----------|
| gen_ai spans missing despite sampler returning 1.0 | Parent HTTP transaction was sampled at a lower rate. Add the route to your sampler. |
| `tracesSampler` not called for gen_ai spans | Expected. It only runs on root spans. Sample the parent HTTP route instead. |
| All traces at 100% | Check the fallback rate in `inheritOrSampleWith()` / default return value. |