Based on my analysis and the official documentation, here is a proposal for V2 of the Multi-Agent Observability System.
This version fundamentally re-architects the project to align with Claude Code's best practices, leveraging its native OpenTelemetry support for metrics and logs while repurposing the existing frontend/backend for high-value qualitative events. The result is a system that is orders of magnitude more efficient, far more data-rich, and aligned with industry standards for observability.
Here is the new README.md followed by the necessary code and configuration files.
README.md (V2)
Multi-Agent Observability System V2
This revised system provides a professional-grade, real-time monitoring solution for Claude Code agents. It leverages Claude Code's native OpenTelemetry (OTel) integration for quantitative metrics and logs, while reserving hooks for powerful, qualitative event-driven enhancements.
This V2 architecture corrects the critical flaws of the original approach by eliminating inefficient, per-event LLM calls and capturing far richer data directly from the agent's core.
🏗️ V2 Architecture
The system is now split into two complementary data pipelines:
1. Quantitative Observability (The Core): For metrics and logs.
Claude Code → OpenTelemetry Collector → Prometheus (Metrics) & Loki (Logs) → Grafana
2. Qualitative Events (The Enhancement): For rich, context-aware events.
Claude Code Hooks → Python Scripts → Bun Server → SQLite → WebSocket → Vue Client
✨ Key Improvements in V2
- ⚡ Extreme Efficiency: By removing the LLM summarizer from the hooks, the system is now orders of magnitude faster and cheaper. A simple ls command no longer triggers two expensive API calls.
- 📊 Richer Data: The native OTel pipeline captures critical data unavailable to hooks, including token counts, API costs, request latencies, and cache usage.
- 🛠️ Correct Use of Hooks: Hooks are now used for their intended purpose: providing deterministic control, capturing high-value qualitative data (like full session transcripts), and triggering real-time notifications (e.g., TTS).
- 📈 Industry-Standard Tooling: V2 is built on a standard, robust observability stack (OTel, Prometheus, Grafana, Loki) that is scalable and widely used in production environments.
- 🚀 One-Command Setup: The entire observability stack, including the frontend and backend, is now orchestrated with a single docker-compose up command.
🚀 Quick Start
Prerequisites:
1. Configure Environment Variables
Create a .env file in the project root and add your Anthropic API key. This will be used by both Claude Code and the hook scripts.
# .env
ANTHROPIC_API_KEY="sk-ant-..."
CLAUDE_CODE_ENABLE_TELEMETRY=1
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
OTEL_LOG_USER_PROMPTS=1 # Set to 1 to log full prompt text
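Before launching Claude Code, it can help to confirm that the file really defines every variable listed above. The following is a minimal sketch, assuming a simple KEY=VALUE format with optional quotes and trailing # comments; parse_env_text and missing_keys are hypothetical helper names, not part of the project.

```python
# Variables the Quick Start expects in .env (taken from the listing above).
REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "CLAUDE_CODE_ENABLE_TELEMETRY",
    "OTEL_METRICS_EXPORTER",
    "OTEL_LOGS_EXPORTER",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blank and comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ("'", '"'):
            # Quoted value: keep everything up to the matching closing quote.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing " # comment".
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

Reading the file and passing its text through parse_env_text and then missing_keys should give an empty list when the setup is complete.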
2. Launch the System
Start the entire observability stack, including the Vue frontend and Bun backend:
docker-compose up --build
3. Set Up Claude Code Hooks
Copy the improved .claude directory to any project you want to monitor:
cp -R .claude /path/to/your/project/
Note: The hook scripts have been updated to remove the inefficient LLM summarizer.
4. Start Coding!
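Run Claude Code in the configured project directory. Your terminal must have the environment variables from Step 1 exported. Note that a plain source .env only sets them in the current shell; run set -a; source .env; set +a so that Claude Code inherits them, or add export lines to your shell profile.
Then open Grafana at http://localhost:3000 (user: admin, pass: admin). The Claude Code dashboard will be pre-installed.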
🔧 Component Details
Observability Core (Docker Compose)
- OTel Collector: Receives OTel data from Claude Code and exports it to Prometheus and Loki.
- Prometheus: Stores all quantitative metrics (costs, token counts, etc.).
- Loki: Stores all logs and event data (API requests, tool usage, etc.).
- Grafana: Visualizes all data from Prometheus and Loki in a pre-built dashboard.
Qualitative Event System (Hooks + Vue App)
The original application now serves a more focused, powerful purpose.
- Hooks (.claude/hooks):
  - No longer call an LLM on every event.
  - The stop.py hook now captures the entire chat transcript at the end of a session, providing invaluable qualitative context.
  - The notification.py hook remains for real-time TTS alerts.
- Bun Server & Vue Client:
  - The Vue app now visualizes a stream of high-signal events like session completions (with full transcripts) and user notifications, complementing the quantitative data in Grafana.
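As a rough illustration of what capturing a transcript could involve, here is a minimal sketch. It assumes the transcript_path handed to the hook points to a JSON Lines file with one JSON object per line; read_transcript is a hypothetical helper and is not the project's existing chat logic.

```python
import json


def read_transcript(path):
    """Load a JSON Lines transcript, skipping blank or malformed lines."""
    entries = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                entries.append(json.loads(line))
            except json.JSONDecodeError:
                # A partially written line should not break the hook.
                continue
    return entries
```

Skipping unparseable lines rather than raising keeps a monitoring hook from failing the session it is observing.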
Implementation Files for V2
Here are the new and modified files required to implement this improved system.
1. docker-compose.yml (New)
This file orchestrates the entire system. Place it in the project root.
version: '3.8'
services:
  # Observability Stack
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./observability/otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "4317:4317" # OTLP gRPC
      - "4318:4318" # OTLP HTTP
    networks:
      - monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    command: ["--config.file=/etc/prometheus/prometheus.yml"]
    volumes:
      - ./observability/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    networks:
      - monitoring
  loki:
    image: grafana/loki:latest
    container_name: loki
    ports:
      - "3100:3100"
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - ./observability/grafana/provisioning:/etc/grafana/provisioning
      - grafana_data:/var/lib/grafana
    networks:
      - monitoring
    depends_on:
      - prometheus
      - loki
  # Original Application
  server:
    build:
      context: ./apps/server
    container_name: multi-agent-server
    ports:
      - "4000:4000"
    volumes:
      - ./apps/server/events.db:/app/events.db
    networks:
      - monitoring
  client:
    build:
      context: ./apps/client
    container_name: multi-agent-client
    ports:
      - "5173:5173"
    depends_on:
      - server
    networks:
      - monitoring
networks:
  monitoring:
    driver: bridge
volumes:
  prometheus_data:
  grafana_data:
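After docker-compose up, one quick way to confirm that the published ports are accepting connections is a small TCP probe. This is only a sketch: the port numbers are copied from the compose file above, while port_is_open and closed_services are hypothetical helper names. An open port shows that something is listening, not that the service behind it is healthy.

```python
import socket

# Host ports published in docker-compose.yml above.
SERVICE_PORTS = {
    "otel-collector (gRPC)": 4317,
    "otel-collector (HTTP)": 4318,
    "prometheus": 9090,
    "loki": 3100,
    "grafana": 3000,
    "server": 4000,
    "client": 5173,
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_services(host="localhost", ports=SERVICE_PORTS, probe=port_is_open):
    """Return the names of services whose port did not accept a connection."""
    return [name for name, port in ports.items() if not probe(host, port)]
```

Calling closed_services() with no arguments probes localhost; an empty list means every published port answered.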
2. OpenTelemetry & Grafana Configuration
Create a new directory observability in the project root to hold these files.
observability/otel-collector-config.yml:
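receivers:
  otlp:
    protocols:
      grpc:
        # Bind on all interfaces so the ports published by Docker are reachable;
        # recent collector releases default to localhost.
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
exporters:
  # The debug exporter replaces the logging exporter, which recent releases no longer ship.
  debug:
    verbosity: detailed
  prometheus:
    endpoint: "0.0.0.0:8889"
  # Assumption: the collector image in use still includes the loki exporter;
  # check the release notes when running :latest.
  loki:
    endpoint: "http://loki:3100/loki/api/v1/push"
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, prometheus]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, loki]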
observability/prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']
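Once Prometheus is scraping the collector, the stored metrics can also be read outside Grafana through Prometheus' HTTP API. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 as published in docker-compose.yml. The helper names are hypothetical, and the metric name you query depends on how the collector translates OTel metric names.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build a URL for Prometheus' instant-query endpoint (/api/v1/query)."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def first_sample_value(response_body):
    """Return the first sample of an instant-vector response as a float, or None."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return None
    results = payload.get("data", {}).get("result", [])
    if not results:
        return None
    # Each vector sample looks like {"metric": {...}, "value": [timestamp, "value"]}.
    return float(results[0]["value"][1])


def query_prometheus(promql, base_url="http://localhost:9090", timeout=5):
    """Run an instant query against a live Prometheus server."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return first_sample_value(response.read().decode("utf-8"))
```

For example, query_prometheus("sum(claude_code_cost_usage_total)") would return the same figure the cost panel displays, provided that metric name exists in your Prometheus instance.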
observability/grafana/provisioning/datasources/datasources.yml:
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    uid: Loki # matches the datasource uid referenced by the logs panel in the dashboard JSON
    access: proxy
    url: http://loki:3100
observability/grafana/provisioning/dashboards/dashboards.yml:
apiVersion: 1
providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /etc/grafana/provisioning/dashboards
observability/grafana/provisioning/dashboards/claude-code-dashboard.json:
(A minimal dashboard definition to get started. You can build this out in the Grafana UI.)
{
  "__inputs": [],
  "__requires": [],
  "annotations": { "list": [] },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": 1,
  "panels": [
    {
      "type": "stat",
      "title": "Total Cost (USD)",
      "gridPos": { "h": 4, "w": 6, "x": 0, "y": 0 },
      "targets": [{ "expr": "sum(claude_code_cost_usage_total)", "legendFormat": "Total Cost" }],
      "options": { "reduceOptions": { "calcs": ["last"], "fields": "" }, "textMode": "auto", "colorMode": "value", "graphMode": "area", "unit": "currencyUSD" }
    },
    {
      "type": "stat",
      "title": "Total Sessions",
      "gridPos": { "h": 4, "w": 6, "x": 6, "y": 0 },
      "targets": [{ "expr": "sum(claude_code_session_count_total)" }],
      "options": { "reduceOptions": { "calcs": ["last"], "fields": "" }, "textMode": "auto", "colorMode": "value", "graphMode": "area" }
    },
    {
      "type": "piechart",
      "title": "Token Usage by Type",
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 4 },
      "targets": [{ "expr": "sum by (type) (claude_code_token_usage_total)" }],
      "options": { "displayLabels": ["name", "percent"], "pieType": "donut" }
    },
    {
      "type": "logs",
      "title": "Latest Tool Decisions & API Errors",
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
      "targets": [{ "datasource": { "type": "loki", "uid": "Loki" }, "expr": "{job=\"otel-collector\"} | json | event_name=~\"claude_code.tool_decision|claude_code.api_error\"" }],
      "options": { "showTime": true, "showLabels": true, "wrapLines": true, "prettifyLogMessage": true }
    }
  ],
  "refresh": "10s",
  "time": { "from": "now-1h", "to": "now" },
  "title": "Claude Code Observability V2"
}
3. Modified Hook Scripts
The only change required is removing the inefficient summarizer.
.claude/hooks/send_event.py (Modified)
The --summarize flag and its logic should be removed.
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.8"
# dependencies = ["python-dotenv"]
# ///
import json
import sys
import os
import argparse
import urllib.request
import urllib.error
from datetime import datetime

# ... (keep send_event_to_server function as is) ...

def main():
    parser = argparse.ArgumentParser(description='Send Claude Code hook events')
    parser.add_argument('--source-app', required=True, help='Source application name')
    parser.add_argument('--event-type', required=True, help='Hook event type')
    parser.add_argument('--server-url', default='http://localhost:4000/events', help='Server URL')
    # The --add-chat flag is now primarily used by the Stop hook.
    parser.add_argument('--add-chat', action='store_true', help='Include chat transcript if available')
    args = parser.parse_args()

    # Claude Code passes the hook payload as JSON on stdin.
    try:
        input_data = json.load(sys.stdin)
    except json.JSONDecodeError as e:
        print(f"Failed to parse JSON input: {e}", file=sys.stderr)
        sys.exit(1)

    event_data = {
        'source_app': args.source_app,
        'session_id': input_data.get('session_id', 'unknown'),
        'hook_event_type': args.event_type,
        'payload': input_data,
        'timestamp': int(datetime.now().timestamp() * 1000)  # milliseconds since epoch
    }

    if args.add_chat and 'transcript_path' in input_data:
        # ... (keep existing chat transcript logic) ...
        pass  # placeholder so this excerpt parses; the existing logic goes here

    # Send to server (the summarizer call is now gone)
    send_event_to_server(event_data, args.server_url)
    sys.exit(0)

if __name__ == '__main__':
    main()
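The send_event_to_server function is kept from the original and is not reproduced here. For readers who want to exercise the server on its own, the following is a hedged stand-in built only on the standard library. build_event_request and post_event are hypothetical names, and the assumption that the server accepts a JSON POST at /events comes from the default --server-url above.

```python
import json
import sys
import urllib.error
import urllib.request


def build_event_request(event_data, server_url="http://localhost:4000/events"):
    """Create a JSON POST request for one hook event without sending it."""
    body = json.dumps(event_data).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_event(event_data, server_url="http://localhost:4000/events", timeout=5):
    """Send the event; return True on a 2xx reply, False on any network error."""
    request = build_event_request(event_data, server_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError) as exc:
        # Monitoring failures are reported but never raised to the caller.
        print(f"Failed to send event: {exc}", file=sys.stderr)
        return False
```

Separating request construction from sending makes the payload easy to inspect without a running server.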
.claude/settings.json (Modified)
Update the commands to remove the --summarize flag. The Stop hook, the most valuable qualitative event, passes --add-chat to capture the full transcript at the end of the session.
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "uv run .claude/hooks/pre_tool_use.py" },
          { "type": "command", "command": "uv run .claude/hooks/send_event.py --source-app cc-hooks-v2 --event-type PreToolUse" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "uv run .claude/hooks/send_event.py --source-app cc-hooks-v2 --event-type PostToolUse" }
        ]
      }
    ],
    "Notification": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "uv run .claude/hooks/notification.py --notify" },
          { "type": "command", "command": "uv run .claude/hooks/send_event.py --source-app cc-hooks-v2 --event-type Notification" }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "uv run .claude/hooks/send_event.py --source-app cc-hooks-v2 --event-type Stop --add-chat" }
        ]
      }
    ]
  }
}
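To double-check a migrated project, a minimal sketch such as the one below can list every hook command in a parsed settings file and flag any that still pass --summarize. It assumes the hooks layout shown above; iter_hook_commands, commands_with_flag and load_settings are hypothetical helpers, not part of the project.

```python
import json


def iter_hook_commands(settings):
    """Yield (event_name, command) for every command hook in the settings dict."""
    for event_name, groups in settings.get("hooks", {}).items():
        for group in groups:
            for hook in group.get("hooks", []):
                if hook.get("type") == "command":
                    yield event_name, hook.get("command", "")


def commands_with_flag(settings, flag="--summarize"):
    """Return the (event_name, command) pairs whose command still passes flag."""
    return [
        (event_name, command)
        for event_name, command in iter_hook_commands(settings)
        if flag in command.split()
    ]


def load_settings(path=".claude/settings.json"):
    """Parse the file with the strict json module, which rejects // comments."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

An empty list from commands_with_flag(load_settings()) suggests that no hook still requests the summarizer.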