6 changes: 6 additions & 0 deletions skills/sentry-nestjs-sdk/references/ai-monitoring.md
@@ -341,6 +341,12 @@ Access at **Sentry → AI → Agents** (or **Insights → AI**).

---

## Sampling Strategy

If your `tracesSampleRate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `tracesSampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

---

## Troubleshooting

| Issue | Solution |
6 changes: 6 additions & 0 deletions skills/sentry-nextjs-sdk/references/ai-monitoring.md
@@ -393,6 +393,12 @@ Access at **Sentry → AI → Agents** (or **Insights → AI**).

---

## Sampling Strategy

If your `tracesSampleRate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `tracesSampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

---

## Troubleshooting

| Issue | Solution |
213 changes: 213 additions & 0 deletions skills/sentry-node-sdk/references/ai-monitoring.md
@@ -0,0 +1,213 @@
# AI Monitoring - Sentry Node.js SDK

> Minimum SDK: `@sentry/node` >=10.28.0 (OpenAI, Anthropic, LangChain, LangGraph, Google GenAI). Vercel AI SDK: >=10.6.0.

## Prerequisites

Tracing must be enabled - AI spans require an active trace:

```typescript
Sentry.init({ dsn: "...", tracesSampleRate: 1.0 });
```

## Integration Matrix

| Integration | Min Library | Auto-Enabled | Status |
|-------------|-------------|-------------|--------|
| OpenAI (`openai`) | openai 4.0+ | Yes | Stable |
| Anthropic (`@anthropic-ai/sdk`) | 0.19.2+ | Yes | Stable |
| Vercel AI SDK (`ai`) | ai 3.0+ | Yes* | Stable |
| LangChain (`@langchain/core`) | 0.1.0+ | Yes | Stable |
| LangGraph (`@langchain/langgraph`) | 0.1.0+ | Yes | Stable |
| Google GenAI (`@google/genai`) | 1.0+ | Yes | Stable |

*Vercel AI SDK requires `experimental_telemetry: { isEnabled: true }` on every call.

## PII Control

| `sendDefaultPii` | `recordInputs` | Prompts captured? |
|-------------------|-----------------|-------------------|
| `false` (default) | `true` | No |
| `true` | `true` (default) | Yes |
| `true` | `false` | No |

## Configuration Examples

### Auto-enabled integrations

```typescript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  sendDefaultPii: true, // required to capture prompts/outputs
});
// OpenAI, Anthropic, LangChain, LangGraph, Google GenAI activate automatically
```

### Explicit configuration with recordInputs/recordOutputs override

```typescript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  integrations: [
    Sentry.openAIIntegration({ recordInputs: true, recordOutputs: true }),
    Sentry.vercelAIIntegration({ recordInputs: true, recordOutputs: true }),
  ],
});
```

### Vercel AI SDK per-call telemetry (required)

```typescript
await generateText({
  model: openai("gpt-4.1"),
  prompt: "Hello",
  experimental_telemetry: { isEnabled: true, recordInputs: true, recordOutputs: true },
});
```

### Browser / Next.js client-side (manual wrapping required)

```typescript
import OpenAI from "openai";
import * as Sentry from "@sentry/nextjs"; // or @sentry/browser

const openai = Sentry.instrumentOpenAiClient(new OpenAI());
```

## Manual Instrumentation - `gen_ai.*` Spans

Use when the library isn't supported, or for wrapping custom AI logic.

### `gen_ai.request` - LLM call

```typescript
await Sentry.startSpan({
  op: "gen_ai.request",
  name: "chat claude-sonnet-4-6",
  attributes: { "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  span.setAttribute("gen_ai.request.messages", JSON.stringify(messages));
  const result = await myClient.chat(messages);
  span.setAttribute("gen_ai.usage.input_tokens", result.usage.inputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.usage.outputTokens);
  return result;
});
```

### `gen_ai.invoke_agent` - Agent lifecycle

```typescript
await Sentry.startSpan({
  op: "gen_ai.invoke_agent",
  name: "invoke_agent Weather Agent",
  attributes: { "gen_ai.agent.name": "Weather Agent", "gen_ai.request.model": "claude-sonnet-4-6" },
}, async (span) => {
  const result = await myAgent.run(task);
  span.setAttribute("gen_ai.usage.input_tokens", result.totalInputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.totalOutputTokens);
  return result;
});
```

### `gen_ai.execute_tool` - Tool/function call

```typescript
await Sentry.startSpan({
  op: "gen_ai.execute_tool",
  name: "execute_tool get_weather",
  attributes: {
    "gen_ai.tool.name": "get_weather",
    "gen_ai.tool.type": "function",
    "gen_ai.tool.input": JSON.stringify({ location: "Paris" }),
  },
}, async (span) => {
  const result = await getWeather("Paris");
  span.setAttribute("gen_ai.tool.output", JSON.stringify(result));
  return result;
});
```


## Span Attribute Reference

### Common attributes

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `gen_ai.request.model` | string | Yes | Model identifier (e.g., `claude-sonnet-4-6`, `gemini-2.5-flash`) |
| `gen_ai.operation.name` | string | No | Human-readable operation label |
| `gen_ai.agent.name` | string | No | Agent name (for agent spans) |

### Content attributes (PII-gated - only when `sendDefaultPii: true` + `recordInputs/recordOutputs: true`)

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.request.messages` | string | **JSON-stringified** message array |
| `gen_ai.request.available_tools` | string | **JSON-stringified** tool definitions |
| `gen_ai.response.text` | string | **JSON-stringified** response array |
| `gen_ai.response.tool_calls` | string | **JSON-stringified** tool call array |

> Span attributes only accept primitives - arrays/objects must be JSON-stringified.

### Token usage attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `gen_ai.usage.input_tokens` | int | Total input tokens (including cached) |
| `gen_ai.usage.input_tokens.cached` | int | Subset served from cache |
| `gen_ai.usage.input_tokens.cache_write` | int | Tokens written to cache (Anthropic) |
| `gen_ai.usage.output_tokens` | int | Total output tokens (including reasoning) |
| `gen_ai.usage.output_tokens.reasoning` | int | Subset for chain-of-thought (o3, etc.) |
| `gen_ai.usage.total_tokens` | int | Sum of input + output |

> Cached and reasoning tokens are **subsets** of totals, not additive. Incorrect reporting produces wrong cost calculations.
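
Getting the subset arithmetic right matters when deriving spend from these attributes. A minimal sketch (the `Usage` shape and the per-million-token prices are placeholder assumptions for illustration, not Sentry APIs or real rates):

```typescript
// Illustrative only: derive cost from gen_ai.usage.* values, treating
// cached tokens as a subset of input_tokens (never added on top).
interface Usage {
  inputTokens: number;       // gen_ai.usage.input_tokens (includes cached)
  cachedInputTokens: number; // gen_ai.usage.input_tokens.cached (subset)
  outputTokens: number;      // gen_ai.usage.output_tokens (includes reasoning)
}

// Placeholder per-million-token prices; substitute your model's real rates.
function estimateCostUsd(u: Usage, inPerM = 3.0, cachedPerM = 0.3, outPerM = 15.0): number {
  const freshInput = u.inputTokens - u.cachedInputTokens; // subtract the subset
  return (freshInput * inPerM + u.cachedInputTokens * cachedPerM + u.outputTokens * outPerM) / 1_000_000;
}
```

Adding `cachedInputTokens` on top of `inputTokens`, instead of subtracting it, is exactly the double-count the note above warns about.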

## Agent Workflow Hierarchy

```
Transaction
└── gen_ai.invoke_agent "Weather Agent"
    ├── gen_ai.request "chat claude-sonnet-4-6"
    ├── gen_ai.execute_tool "get_weather"
    ├── gen_ai.request "chat claude-sonnet-4-6"   ← follow-up
    └── gen_ai.execute_tool "format_report"
```
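
This tree falls out of nesting the `Sentry.startSpan` callbacks from the manual-instrumentation section: a span started inside another span's callback becomes its child. The toy recorder below (a hypothetical `startSpan`, not the Sentry API) demonstrates only that nesting mechanic:

```typescript
// Toy stand-in for Sentry.startSpan: records the span tree so the
// parent/child relationship from nested callbacks is visible.
type SpanNode = { op: string; name: string; children: SpanNode[] };

const root: SpanNode = { op: "transaction", name: "root", children: [] };
let current = root; // single-threaded toy; Sentry tracks this via async context

async function startSpan<T>(opts: { op: string; name: string }, fn: () => Promise<T>): Promise<T> {
  const node: SpanNode = { ...opts, children: [] };
  const parent = current;
  parent.children.push(node);
  current = node;      // spans started inside fn attach here
  try {
    return await fn();
  } finally {
    current = parent;  // restore on exit
  }
}

async function runAgent(): Promise<void> {
  await startSpan({ op: "gen_ai.invoke_agent", name: "invoke_agent Weather Agent" }, async () => {
    await startSpan({ op: "gen_ai.request", name: "chat claude-sonnet-4-6" }, async () => {});
    await startSpan({ op: "gen_ai.execute_tool", name: "execute_tool get_weather" }, async () => {});
    await startSpan({ op: "gen_ai.request", name: "chat claude-sonnet-4-6" }, async () => {});
    await startSpan({ op: "gen_ai.execute_tool", name: "execute_tool format_report" }, async () => {});
  });
}
```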

## Streaming

| Integration | Streaming | Token counts in streams |
|-------------|-----------|------------------------|
| OpenAI | Yes | Requires `stream_options: { include_usage: true }` |
| Anthropic | Yes | Automatic |
| Vercel AI SDK | Yes | Automatic (with `experimental_telemetry`) |
| LangChain | Yes | Tracked |
| Manual `gen_ai.*` | Yes | Set token counts after stream completes |
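
For the manual-span row, the pattern is: consume the stream, capture usage from the final chunk, and set the token attributes only after usage arrives. A self-contained sketch with a fake stream (the chunk shape is a hypothetical assumption; in real code `setAttribute` is the span from `Sentry.startSpan` and the usage field depends on your provider):

```typescript
// Assumed chunk shape: most providers attach usage to the final chunk only.
interface Chunk { text: string; usage?: { inputTokens: number; outputTokens: number } }

async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { text: "Hel" };
  yield { text: "lo" };
  yield { text: "", usage: { inputTokens: 12, outputTokens: 2 } }; // final chunk carries usage
}

async function consume(setAttribute: (key: string, value: number) => void): Promise<string> {
  let out = "";
  for await (const chunk of fakeStream()) {
    out += chunk.text;
    if (chunk.usage) { // token counts are only known once the stream ends
      setAttribute("gen_ai.usage.input_tokens", chunk.usage.inputTokens);
      setAttribute("gen_ai.usage.output_tokens", chunk.usage.outputTokens);
    }
  }
  return out;
}
```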

## Unsupported Providers

| Provider | Workaround |
|----------|-----------|
| Cohere | Manual `gen_ai.*` spans |
| AWS Bedrock | Manual `gen_ai.*` spans |
| Mistral | Manual `gen_ai.*` spans |
| Groq | Manual `gen_ai.*` spans |

## Sampling Strategy

If `tracesSampleRate` < 1.0, see the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md).

## Troubleshooting

| Issue | Solution |
|-------|----------|
| No AI spans appearing | Verify `tracesSampleRate > 0`; check SDK >=10.28.0 |
| Token counts missing in streams | Add `stream_options: { include_usage: true }` (OpenAI) |
| Vercel AI spans not tracked | Add `experimental_telemetry: { isEnabled: true }` per call |
| Browser OpenAI not traced | Use `Sentry.instrumentOpenAiClient()` - auto-instrumentation is server-only |
| Prompts not captured | Set `sendDefaultPii: true` or explicit `recordInputs: true` |
| AI Agents Dashboard empty | Ensure traces are being sent; check DSN and `tracesSampleRate` |
| Wrong cost calculations | Cached/reasoning tokens are subsets of totals, not additive |
4 changes: 4 additions & 0 deletions skills/sentry-python-sdk/references/ai-monitoring.md
@@ -281,6 +281,10 @@ sentry_sdk.ai.set_conversation_id("user-session-abc123")
| Groq | `LiteLLMIntegration` |
| Vertex AI | `GoogleGenAIIntegration` or `LiteLLMIntegration` |

## Sampling Strategy

If your `traces_sample_rate` is below 1.0, you may be losing entire agent runs. See the [AI sampling guide](../../sentry-setup-ai-monitoring/references/sampling.md) for `traces_sampler` patterns that keep 100% of gen_ai-related transactions while sampling other traffic at a lower rate.

## Troubleshooting

| Issue | Solution |
20 changes: 20 additions & 0 deletions skills/sentry-setup-ai-monitoring/SKILL.md
@@ -47,6 +47,26 @@ grep -E '"(openai|@anthropic-ai/sdk|ai|@langchain|@google/genai)"' package.json
grep -E '(openai|anthropic|langchain|huggingface)' requirements.txt pyproject.toml 2>/dev/null
```

## Sampling Check

After detecting AI SDKs, check the current sampling configuration:

```bash
# JavaScript
grep -E 'tracesSampleRate|tracesSampler' sentry.*.config.* instrument.* src/instrument.* app/instrument.* 2>/dev/null

# Python
grep -E 'traces_sample_rate|traces_sampler' *.py **/*.py 2>/dev/null
```

**If `tracesSampleRate` / `traces_sample_rate` is below 1.0 AND no `tracesSampler` / `traces_sampler` is configured:**

Ask the user:

> "Your current sample rate is {rate}. Agent runs are sampled as complete span trees — if the root span is dropped, all child gen_ai spans are lost. For full AI visibility, gen_ai-related transactions should be sampled at 100%. Would you like me to set up a `tracesSampler` that keeps AI traces at 100% while sampling other traffic at your current rate?"

If user confirms, read `${SKILL_ROOT}/references/sampling.md` for implementation patterns.

## Supported SDKs

### JavaScript
82 changes: 82 additions & 0 deletions skills/sentry-setup-ai-monitoring/references/sampling.md
@@ -0,0 +1,82 @@
# Sampling Strategy for AI Agent Spans

> `@sentry/node` >=9.x (`inheritOrSampleWith`), `sentry-sdk` >=2.x (`traces_sampler`)

## The Problem

Agent runs are span trees. Sampling decides at the root; children inherit. Drop the root, lose every child span. At any rate below 1.0, you lose entire agent executions.

## How It Works

`tracesSampler` / `traces_sampler` only fires on **root spans**. Non-root spans (including `gen_ai.*` children) inherit unconditionally.

**Scenario 1: gen_ai span IS the root** (cron, queue consumer, CLI). The sampler sees `gen_ai.*` directly. Match and return 1.0.

**Scenario 2: gen_ai spans are children of HTTP transactions** (most web apps). `POST /api/chat` is sampled before any AI code runs. Solution: sample AI routes at 1.0.

## JavaScript

```javascript
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: ({ name, attributes, inheritOrSampleWith }) => {
    // Standalone gen_ai root spans
    if (attributes?.['sentry.op']?.startsWith('gen_ai.') || attributes?.['gen_ai.system']) {
      return 1.0;
    }
    // HTTP routes that trigger AI calls
    if (name?.includes('/api/chat') || name?.includes('/api/agent')) {
      return 1.0;
    }
    return inheritOrSampleWith(0.2); // adjust to your baseline
  },
});
```

## Python

```python
import sentry_sdk

def traces_sampler(sampling_context):
    tx = sampling_context.get("transaction_context", {})
    op, name = tx.get("op", ""), tx.get("name", "")

    if op.startswith("gen_ai."):
        return 1.0
    if op == "http.server" and any(p in name for p in ["/api/chat", "/api/agent"]):
        return 1.0

    parent = sampling_context.get("parent_sampled")
    if parent is not None:
        return float(parent)
    return 0.2

sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)
```

If AI is the core product, skip `tracesSampler` and use `tracesSampleRate: 1.0`.

## Fallback: Metrics + Logs

If 100% tracing isn't feasible, emit metrics and logs on every LLM call (independent of trace sampling):

```python
# Metrics - 100% coverage of cost/usage/latency
sentry_sdk.metrics.distribution("gen_ai.token_usage", usage.total_tokens,
                                attributes={"model": model, "user_id": str(user.id)})
sentry_sdk.metrics.count("gen_ai.calls", 1,
                         attributes={"model": model, "status": "error" if error else "success"})

# Logs - 100% searchable per-call records
sentry_sdk.logger.info("LLM call", model=model, input_tokens=usage.prompt_tokens,
                       output_tokens=usage.completion_tokens, latency_ms=response_time_ms)
```

JS equivalent uses `Sentry.metrics.*` and `Sentry.logger.*` with the same attribute patterns.

## Troubleshooting

| Issue | Solution |
|-------|----------|
| gen_ai spans missing despite sampler returning 1.0 | Parent HTTP transaction was sampled at a lower rate. Add the route to your sampler. |
| `tracesSampler` not called for gen_ai spans | Expected. It only runs on root spans. Sample the parent HTTP route instead. |
| All traces at 100% | Check the fallback rate in `inheritOrSampleWith()` / default return value. |