Merged
18 changes: 9 additions & 9 deletions .agents/skills/probe/SKILL.md
@@ -1,11 +1,11 @@
---
name: probe
description: Diagnose what the agentops dashboard sees for a session or trace — empty Sessions page, missing trace, wrong user, missing attributes. Use when the user pastes a session/trace UUID, a `localhost:<port>/sessions/<id>` URL, or asks "why is this empty / why doesn't it show / what attributes are on this span". Trigger on bare UUIDs in the agentops repo without asking — the user expects a lookup.
description: Diagnose what the loupe dashboard sees for a session or trace — empty Sessions page, missing trace, wrong user, missing attributes. Use when the user pastes a session/trace UUID, a `localhost:<port>/sessions/<id>` URL, or asks "why is this empty / why doesn't it show / what attributes are on this span". Trigger on bare UUIDs in the loupe repo without asking — the user expects a lookup.
---

# Agentops probe
# loupe probe

This skill diagnoses what the agentops dashboard sees (or doesn't see) for a given session, trace, or the data stream as a whole. It's about debugging the _consumer side_ — "why is X empty / wrong / missing" — by comparing what the producer actually emitted against what agentops looks for.
This skill diagnoses what the loupe dashboard sees (or doesn't see) for a given session, trace, or the data stream as a whole. It's about debugging the _consumer side_ — "why is X empty / wrong / missing" — by comparing what the producer actually emitted against what loupe looks for.

## When this fires

@@ -19,7 +19,7 @@ User says:
- "this session doesn't appear"
- "check session X" / "look at trace X"

If you're in the agentops repo and see a bare UUID, treat it as a session/trace id and look it up. Don't ask for confirmation.
If you're in the loupe repo and see a bare UUID, treat it as a session/trace id and look it up. Don't ask for confirmation.

## How to run it

@@ -76,11 +76,11 @@ The JSON has this shape — focus on the diagnostic fields, not the timeline:
| `trace_ids: []` (empty) | No data for this id. Wrong id, wrong env, or outside the time window (3d AI / 7d OO). |
| `timeline` | Quick agent flow — invoke_agent, chat models, tool calls, purposes. Filters out generic HTTP / DB / queue spans by default. |
| `errors` | Spans with `success=false`. Cite the span name. |
| `key_drift.sessionId` (multiple entries) | Same concept (`thread_id`) appears under multiple key names. App Insights customDimensions can carry both `ag_ui.thread_id` (dotted) and `ag_ui_thread_id` (underscore) depending on which SDK wrote it; agentops' `aiCoalesce` must check both forms via `bothForms()`. |
| `key_drift.session_only_underscore` | Trace has only underscore form, no dotted. If agentops looks for dotted-only, the trace won't appear on the Sessions page. |
| `key_drift.sessionId` (multiple entries) | Same concept (`thread_id`) appears under multiple key names. App Insights customDimensions can carry both `ag_ui.thread_id` (dotted) and `ag_ui_thread_id` (underscore) depending on which SDK wrote it; loupe's `aiCoalesce` must check both forms via `bothForms()`. |
| `key_drift.session_only_underscore` | Trace has only underscore form, no dotted. If loupe looks for dotted-only, the trace won't appear on the Sessions page. |
| `purpose` field on a span | Standard key: `gen_ai.operation.purpose`. Legacy data may show `teammate.llm.purpose` (pre-refactor); new producer emits the standard key. |
| `key_drift.purpose_on_ancestor_not_on_chat` | Purpose lives on parent Activity, not on the `chat` span. `propagateInheritedAttrs` lifts it down automatically for the standard key. |
| `key_drift.unrecognized_session_keys` / `unrecognized_purpose_keys` | Producer emitted these keys but agentops won't read them under current config. Either add to `conventions.ts` (if standard) or set the matching `CUSTOM_*_FIELD` env var. |
| `key_drift.unrecognized_session_keys` / `unrecognized_purpose_keys` | Producer emitted these keys but loupe won't read them under current config. Either add to `conventions.ts` (if standard) or set the matching `CUSTOM_*_FIELD` env var. |
| `env_health` (per-session output) | Non-empty means a `CUSTOM_*_FIELD` env value contains chars that `field-config.ts` `ident()` silently drops (anything outside `[A-Za-z0-9_.]`). Fix the env value or relax the regex. |
| `tokens` | Per-trace LLM token total. Useful for "why is this run so expensive". |

@@ -90,8 +90,8 @@ The JSON has this shape — focus on the diagnostic fields, not the timeline:

1. Run `query.py --audit` first.
2. Check `env_health` — silent drops from `field-config.ts ident()` mean a `CUSTOM_*` override isn't taking effect even though it's set.
3. Check `emitted_keys_unrecognized_for_concept` — these are session/user/purpose keys the producer is emitting that agentops doesn't recognize. Top of that list is your fix target (add to `conventions.ts` or `CUSTOM_*_FIELD`).
4. Compare `traces_with_dotted` to `traces_with_only_underscore`. If underscore dominates, agentops' KQL coalesce is missing the underscore form — fix `aiCoalesce` in `src/lib/telemetry/conventions.ts` to run keys through `bothForms()`.
3. Check `emitted_keys_unrecognized_for_concept` — these are session/user/purpose keys the producer is emitting that loupe doesn't recognize. Top of that list is your fix target (add to `conventions.ts` or `CUSTOM_*_FIELD`).
4. Compare `traces_with_dotted` to `traces_with_only_underscore`. If underscore dominates, loupe's KQL coalesce is missing the underscore form — fix `aiCoalesce` in `src/lib/telemetry/conventions.ts` to run keys through `bothForms()`.
5. If `traces_in_listSessions_filter` is 0, the producer isn't emitting `gen_ai.operation.name`, `invoke_agent`, `execute_tool`, or `session.trigger_type` on any span — producer-side instrumentation issue.
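
Step 2 can be sketched as a standalone check. This is an assumption-laden illustration, not the code in `query.py` or `field-config.ts`: the function name `find_unsafe_overrides` is hypothetical, and the only facts taken from the text are the `CUSTOM_*_FIELD` naming pattern and the allowed character set `[A-Za-z0-9_.]`.

```python
import re

# Allowed characters as described for ident(): letters, digits, "_" and ".".
_IDENT_OK = re.compile(r"[A-Za-z0-9_.]+")


def find_unsafe_overrides(env: dict[str, str]) -> dict[str, str]:
    """Return CUSTOM_*_FIELD entries whose value has characters outside the set."""
    unsafe: dict[str, str] = {}
    for name, value in env.items():
        is_override = name.startswith("CUSTOM_") and name.endswith("_FIELD")
        if is_override and value and not _IDENT_OK.fullmatch(value):
            unsafe[name] = value
    return unsafe
```

A non-empty result corresponds to a non-empty `env_health`: the override is set in `.env` but is unlikely to take effect as written.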

### "This specific session/trace doesn't show"
22 changes: 11 additions & 11 deletions .agents/skills/probe/scripts/query.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
"""
Agentops debug query — pulls trace/session diagnostics from whichever
telemetry provider the agentops .env points at, and prints lean JSON.
loupe debug query — pulls trace/session diagnostics from whichever
telemetry provider the loupe .env points at, and prints lean JSON.

Usage:
query.py <session-or-trace-id> # per-session diagnostic
@@ -25,8 +25,8 @@
from pathlib import Path
from typing import Any

AGENTOPS_DIR = Path(__file__).resolve().parents[3].parent # skill is in agentops/.agents/skills/probe/scripts
ENV_FILE = AGENTOPS_DIR / ".env"
LOUPE_DIR = Path(__file__).resolve().parents[3].parent # skill is in loupe/.agents/skills/probe/scripts
ENV_FILE = LOUPE_DIR / ".env"

# What we consider session/user attrs across all OTel/AG-UI/MAF variants.
SESSION_KEYS = [
@@ -59,7 +59,7 @@


def load_env() -> dict[str, str]:
"""Read agentops .env. Returns a dict; missing file = empty dict."""
"""Read loupe .env. Returns a dict; missing file = empty dict."""
env: dict[str, str] = {}
if not ENV_FILE.exists():
return env
@@ -77,7 +77,7 @@ def detect_provider(env: dict[str, str]) -> str:


def recognized_keys(env: dict[str, str]) -> dict[str, set[str]]:
"""Keys agentops will actually look at, given conventions.ts + .env overrides."""
"""Keys loupe will actually look at, given conventions.ts + .env overrides."""
sess = set(SESSION_KEYS)
usr = set(USER_KEYS)
purp = set(PURPOSE_KEYS)
@@ -258,7 +258,7 @@ def diagnose_session_app_insights(env: dict[str, str], session_id: str, full: bo
t["timeline"].append(entry_tl)
t["span_count"] += 1

# Detect "purpose tag on ancestor, not on the chat LLM span" — agentops
# Detect "purpose tag on ancestor, not on the chat LLM span" — loupe
# propagateInheritedAttrs lifts it, but only if the key is recognized.
def find_ancestor_purpose(span_id: str) -> str | None:
cur = spans_by_id.get(span_id)
@@ -288,12 +288,12 @@ def find_ancestor_purpose(span_id: str) -> str | None:
drift = {}
if len(sk) > 1: drift["sessionId"] = sk
if len(pk) > 1: drift["purpose"] = pk
# Keys the producer emitted but agentops won't recognize given current config.
# Keys the producer emitted but loupe won't recognize given current config.
unrec_sess = [k for k in sk if k not in recog["session"]]
unrec_purp = [k for k in pk if k not in recog["purpose"]]
if unrec_sess: drift["unrecognized_session_keys"] = unrec_sess
if unrec_purp: drift["unrecognized_purpose_keys"] = unrec_purp
# Also flag agentops-style mismatch: only underscore form present, no dotted
# Also flag loupe-style mismatch: only underscore form present, no dotted
all_dotted = [k for k in sk if "." in k]
all_under = [k for k in sk if "_" in k and "." not in k]
if all_under and not all_dotted:
@@ -405,7 +405,7 @@ def audit_app_insights(env: dict[str, str]) -> dict[str, Any]:
"""
rows = query_app_insights(env, kql)
# Distinct keys appearing in customDimensions on AI-relevant spans, grouped
# by whether agentops will look at them under current config.
# by whether loupe will look at them under current config.
keys_kql = """
union dependencies, requests
| where timestamp > ago(3d)
@@ -428,7 +428,7 @@ def audit_app_insights(env: dict[str, str]) -> dict[str, Any]:
all_recog = recog["session"] | recog["user"] | recog["purpose"]
seen = [(r["k"], int(r.get("n") or 0)) for r in key_rows if r.get("k")]
# Categorize: only flag a key as "unrecognized" if it LOOKS like a session/
# user/purpose concept (the things agentops needs to match on) but isn't in
# user/purpose concept (the things loupe needs to match on) but isn't in
# the recognized set. Generic OTel keys like gen_ai.request.model aren't
# visibility-blocking and shouldn't show up as problems.
def concept(k: str) -> str | None:
10 changes: 5 additions & 5 deletions .claude/skills/maf-sandbox/SKILL.md
@@ -1,11 +1,11 @@
---
name: maf-sandbox
description: Generate test telemetry for agentops by firing requests at a local Microsoft Agent Framework (MAF) Python agent that emits OTel spans to local OpenObserve. Use whenever the user wants to fire traces, produce spans, exercise telemetry shapes, generate test data for the agentops dashboard, verify how a particular pattern renders (parallel tool calls, subagent handoff, MCP tools, scheduled tasks, errors, streaming, token usage), or test the local OpenAI Responses endpoint — even if they don't say "MAF" or "sandbox" explicitly. Improvise the input each invocation; don't repeat payloads. Skip this skill when the user wants to *read* existing traces (that's the openobserve skill) or *diagnose* what agentops shows for a specific session id (that's the probe skill).
description: Generate test telemetry for loupe by firing requests at a local Microsoft Agent Framework (MAF) Python agent that emits OTel spans to local OpenObserve. Use whenever the user wants to fire traces, produce spans, exercise telemetry shapes, generate test data for the loupe dashboard, verify how a particular pattern renders (parallel tool calls, subagent handoff, MCP tools, scheduled tasks, errors, streaming, token usage), or test the local OpenAI Responses endpoint — even if they don't say "MAF" or "sandbox" explicitly. Improvise the input each invocation; don't repeat payloads. Skip this skill when the user wants to *read* existing traces (that's the openobserve skill) or *diagnose* what loupe shows for a specific session id (that's the probe skill).
---

# MAF sandbox

Test rig for generating agent telemetry into local OpenObserve so we can inspect what agentops renders.
Test rig for generating agent telemetry into local OpenObserve so we can inspect what loupe renders.

## Quick start

@@ -14,11 +14,11 @@ Test rig for generating agent telemetry into local OpenObserve so we can inspect
./fire.py "your prompt here" --stream # SSE stream
```

`fire.py` handles everything: spawns `maf.py` via `uv` if not already running (logs → `/tmp/maf-sandbox.log`), discovers the entity_id, sends a correctly-shaped Responses API body, and returns the reply. The sandbox listens on `localhost:4280`, exports OTel to `http://localhost:5080/api/default` (OpenObserve), reads `OPENAI_API_KEY` from `agentops/.env.local`.
`fire.py` handles everything: spawns `maf.py` via `uv` if not already running (logs → `/tmp/maf-sandbox.log`), discovers the entity_id, sends a correctly-shaped Responses API body, and returns the reply. The sandbox listens on `localhost:4280`, exports OTel to `http://localhost:5080/api/default` (OpenObserve), reads `OPENAI_API_KEY` from `loupe/.env.local`.

## Optional: dual-emit to App Insights

agentops reads from App Insights by default — so to make sandbox traces visible in the agentops UI, also set `APPLICATIONINSIGHTS_CONNECTION_STRING` in `agentops/.env.local`. Without it, sandbox traces land only in OpenObserve and **agentops will not see them**; `maf.py`'s startup banner prints a warning in that case. AppInsights export is purely additive — OO emission continues either way.
loupe reads from App Insights by default — so to make sandbox traces visible in the loupe UI, also set `APPLICATIONINSIGHTS_CONNECTION_STRING` in `loupe/.env.local`. Without it, sandbox traces land only in OpenObserve and **loupe will not see them**; `maf.py`'s startup banner prints a warning in that case. AppInsights export is purely additive — OO emission continues either way.
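
The additive export rule can be summarized in a few lines. This is a hedged sketch, not the logic in `maf.py`: the function name `export_targets` and the target labels are hypothetical; only the `APPLICATIONINSIGHTS_CONNECTION_STRING` variable and the "OpenObserve always, App Insights when configured" behavior come from the text.

```python
import os
from collections.abc import Mapping


def export_targets(env: Mapping[str, str] = os.environ) -> list[str]:
    """List where sandbox spans would land for a given environment."""
    # OpenObserve emission is unconditional in the sandbox.
    targets = ["openobserve"]
    # App Insights is added only when the connection string is set and non-empty.
    if env.get("APPLICATIONINSIGHTS_CONNECTION_STRING"):
        targets.append("app_insights")
    return targets
```

When the result lacks `app_insights`, traces exist in OpenObserve but a loupe instance reading App Insights will show nothing for them.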

## What the sandbox agent can do

@@ -37,7 +37,7 @@ The agent (`sandbox-agent`) is wired to exercise these telemetry categories. **P

1. Fire a request: `./fire.py "..."` with an input chosen to exercise something interesting
2. Read the resulting spans via the `openobserve` skill, filtering `service_name=maf-sandbox`
3. Tell the user what attributes/shapes agentops would render — including anything missing, mangled, or that doesn't fit existing renderers
3. Tell the user what attributes/shapes loupe would render — including anything missing, mangled, or that doesn't fit existing renderers

## Files

6 changes: 3 additions & 3 deletions .claude/skills/maf-sandbox/maf.py
@@ -23,7 +23,7 @@

from dotenv import load_dotenv

# Load OPENAI_API_KEY (and anything else) from the agentops repo .env.local — gitignored.
# Load OPENAI_API_KEY (and anything else) from the loupe repo .env.local — gitignored.
_REPO_ROOT = Path(__file__).resolve().parents[3]
load_dotenv(_REPO_ROOT / ".env.local")
load_dotenv(_REPO_ROOT / ".env")
@@ -59,7 +59,7 @@
configure_otel_providers(enable_sensitive_data=True)

# Dual-emit to App Insights when configured — exercises the 8 KB
# customDimensions truncation that the agentops truncation-resilience branch
# customDimensions truncation that the loupe truncation-resilience branch
# is meant to handle. Silently skipped when the connection string isn't set,
# so the sandbox still works against OO alone.
_AI_CONN = os.environ.get("APPLICATIONINSIGHTS_CONNECTION_STRING")
@@ -411,7 +411,7 @@ async def process(self, context: ChatContext, call_next: Any) -> None:
print(f" OTel → {_ingest} (AppInsights)")
else:
print(" ⚠ AppInsights export OFF — set APPLICATIONINSIGHTS_CONNECTION_STRING in")
print(f" {_REPO_ROOT / '.env.local'} to enable. Without it, agentops")
print(f" {_REPO_ROOT / '.env.local'} to enable. Without it, loupe")
print(" (which reads AppInsights by default) will NOT see these traces.")
serve(
entities=[main_agent, weather_subagent],
2 changes: 1 addition & 1 deletion .cta.json
@@ -1,5 +1,5 @@
{
"projectName": "agentops",
"projectName": "loupe",
"mode": "file-router",
"typescript": true,
"packageManager": "npm",
6 changes: 3 additions & 3 deletions AGENTS.md
@@ -1,6 +1,6 @@
# agentops
# loupe

`TODO.md` is the running todo list. `docs/README.md` covers docs structure. **What attrs to emit / what agentops reads**: `docs/explanation/02-spec.md` is the canonical operating-set spec.
`TODO.md` is the running todo list. `docs/README.md` covers docs structure. **What attrs to emit / what loupe reads**: `docs/explanation/02-spec.md` is the canonical operating-set spec.

## Layout

@@ -15,4 +15,4 @@
- `/traces` has two tabs: Traces (end-to-end runs; utility traces filtered out) and Spans (`?tab=spans`, lazy-fetched) listing utility purpose-attr spans + sub-agent invocations (`invoke_agent` under `execute_tool`). When the inspector is open, cmd+k narrows to spans in that session (`exclusive` provider in `use-span-search.tsx`).
- Inspect drawer (`src/components/inspect/`, shared by sessions + traces): `drawer.tsx` Sheet shell · `overview.tsx` `InspectLayout` Spans-tab layout + inspector tabs · `tree.tsx` left pane (tree, palette) · `detail-panel.tsx` right pane (messages, tool calls, Make-prompt) · `context.tsx` Context-tab UI backed by pure `context-collectors.ts` · `context-segments.ts` stacked-bar math · `view-bar.tsx` `InspectViewBar`. Per-entity hosts that bind the right query live next to each route: `src/routes/sessions/-components/sessions-drawer-host.tsx`, `src/routes/traces/-components/trace-drawer-host.tsx`. Keep pure helpers in `.ts` siblings so tests don't pull `src/db` via React imports.
- Span/domain helpers: `src/lib/spans.ts`. Formatting: `src/lib/format.ts`.
- Ingest: `src/lib/classify-span.ts` (OTel bag → typed `Classification`); deep dive at `docs/explanation/03-classify-span.md`. Provider clients in `src/lib/telemetry/` — no local mirror DB. Attribute reference: `docs/reference/ai-attributes.md` (full OTel catalog). Convention spec (curated subset agentops reads + stamps, including `gen_ai.task.*` and `tag.tags`): `docs/explanation/02-spec.md`.
- Ingest: `src/lib/classify-span.ts` (OTel bag → typed `Classification`); deep dive at `docs/explanation/03-classify-span.md`. Provider clients in `src/lib/telemetry/` — no local mirror DB. Attribute reference: `docs/reference/ai-attributes.md` (full OTel catalog). Convention spec (curated subset loupe reads + stamps, including `gen_ai.task.*` and `tag.tags`): `docs/explanation/02-spec.md`.