[otel-advisor] replace non-standard gen_ai.provider.name with gen_ai.system + engine-to-system mapping #29199

@github-actions

Description

📡 OTel Instrumentation Improvement: use standard gen_ai.system attribute with engine-to-system mapping

Analysis Date: 2026-04-29
Priority: High
Effort: Small (< 2h)

Problem

The dedicated agent span emitted in sendJobConclusionSpan sets a non-standard attribute, gen_ai.provider.name (line 918, actions/setup/js/send_otlp_span.cjs), instead of the OTel GenAI semantic-convention attribute gen_ai.system. OTel backends such as Grafana, Datadog, Honeycomb, and Sentry use gen_ai.system to identify LLM spans and route them to native GenAI dashboards. With an unrecognized key, the agent spans are treated as plain INTERNAL spans, so LLM-specific dashboards, cost tracking, and latency panels show nothing for the ~200 workflow runs that use the copilot, claude, codex, and gemini engines.

In addition, the engine ID values ("copilot", "claude", "codex", "gemini") are gh-aw-internal identifiers, not the OTel-standardized system names ("github_models", "anthropic", "openai", "google_vertex_ai"). A mapping table is needed so backends can apply the correct detection logic.

Why This Matters (DevOps Perspective)

The OTel GenAI semantic conventions define gen_ai.system as the discriminating attribute for LLM spans. When it is absent or uses an unrecognized key name:

  • Grafana Tempo / Grafana Cloud: The built-in "LLM Observability" panel does not surface the span, so p50/p95 agent latency and token-cost dashboards are empty.
  • Datadog LLM Observability: The intake pipeline filters on gen_ai.system to classify traces as AI traces; without it, gen_ai.usage.* token attributes are stored but never aggregated into the cost explorer.
  • Honeycomb: The GenAI derived column recipes rely on gen_ai.system; queries like "group by AI provider" silently return no rows.
  • Mean-time-to-diagnose (MTTD): An on-call engineer looking at a timed-out workflow cannot filter traces by "claude vs copilot" without the standard attribute, adding several minutes of manual triage.

The fix is a one-time < 2-hour change that unlocks all of those dashboards without any backend reconfiguration.

Current Behavior
// actions/setup/js/send_otlp_span.cjs  lines 916–918
// Emit gen_ai.provider.name when engineId is available; it may be omitted when
// engine metadata is unavailable, so this span does not guarantee full GenAI spec compliance.
if (engineId) agentAttributes.push(buildAttr("gen_ai.provider.name", engineId));

The attribute name gen_ai.provider.name does not exist in the [OpenTelemetry GenAI semantic conventions](opentelemetry.io/redacted). No OTel backend will recognise it. The code comment acknowledges the gap ("this span does not guarantee full GenAI spec compliance") but leaves it unresolved.

Real engine IDs in use (from .github/workflows/*.lock.yml):

| gh-aw engine_id | Runs | OTel gen_ai.system standard value |
| --- | --- | --- |
| copilot | ~132 | github_models (or custom github_copilot) |
| claude | ~58 | anthropic |
| codex | ~12 | openai |
| gemini | ~1 | google_vertex_ai / google_ai_studio |

Proposed Change

Add a small mapping function near the top of send_otlp_span.cjs and replace the non-standard attribute:

// Proposed addition to actions/setup/js/send_otlp_span.cjs
// Maps gh-aw engine IDs to OTel GenAI semantic-convention gen_ai.system values.
// See: (opentelemetry.io/redacted)
const ENGINE_ID_TO_GEN_AI_SYSTEM = {
  claude:   "anthropic",
  codex:    "openai",
  gemini:   "google_vertex_ai",
  copilot:  "github_models",
};

// In sendJobConclusionSpan, replace (line ~918):
// if (engineId) agentAttributes.push(buildAttr("gen_ai.provider.name", engineId));
// with:
if (engineId) {
  const genAiSystem = ENGINE_ID_TO_GEN_AI_SYSTEM[engineId] || engineId;
  agentAttributes.push(buildAttr("gen_ai.system", genAiSystem));
}

The fallback || engineId means future engines (e.g. opencode, crush) remain queryable while work is done to confirm the correct OTel value.
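
One caveat with a plain-object lookup followed by `|| engineId` is that an engine ID such as "constructor" would resolve to an inherited `Object.prototype` member rather than falling back. The following is a minimal sketch of a stricter lookup, not the repository's code: `resolveGenAiSystem` is a hypothetical helper name, and the mapping values are copied from the table in this issue and should be checked against the semantic-convention version your backend targets.

```javascript
// Sketch only: values copied from this issue's mapping table (unverified
// against any particular semantic-convention release).
const ENGINE_ID_TO_GEN_AI_SYSTEM = Object.freeze({
  claude: "anthropic",
  codex: "openai",
  gemini: "google_vertex_ai",
  copilot: "github_models",
});

/**
 * Hypothetical helper (not in the repository): resolve a gh-aw engine ID to a
 * gen_ai.system value, falling back to the raw engine ID for unmapped engines.
 * Returns undefined when no usable engine ID is supplied.
 */
function resolveGenAiSystem(engineId) {
  if (typeof engineId !== "string" || engineId === "") return undefined;
  // Own-property check so IDs like "constructor" do not pick up inherited members.
  return Object.prototype.hasOwnProperty.call(ENGINE_ID_TO_GEN_AI_SYSTEM, engineId)
    ? ENGINE_ID_TO_GEN_AI_SYSTEM[engineId]
    : engineId;
}

console.log(resolveGenAiSystem("claude"));   // anthropic
console.log(resolveGenAiSystem("opencode")); // opencode (fallback to raw ID)
```

With a helper like this, the call site would only push the attribute when the result is not undefined, which keeps the existing "omit when engine metadata is unavailable" behavior.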

Expected Outcome

After this change:

  • In Grafana / Honeycomb / Datadog: Agent spans will be classified as LLM spans; built-in GenAI dashboards (cost per provider, latency by model, token usage trends) become functional with zero backend reconfiguration.
  • In the JSONL mirror: /tmp/gh-aw/otel.jsonl entries will contain "key": "gen_ai.system", "value": {"stringValue": "anthropic"} instead of the unrecognized gen_ai.provider.name, making local artifact inspection consistent with what the backend receives.
  • For on-call engineers: Filtering traces by AI provider (e.g. gen_ai.system = anthropic) in Sentry, Grafana, or Honeycomb will return results instead of zero rows.
  • For cost attribution: OTel backends that auto-detect token costs by gen_ai.system + gen_ai.request.model will start producing cost estimates automatically.
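
The JSONL outcome above can be spot-checked locally. The sketch below counts gen_ai.system values in JSONL text; it walks each entry recursively because the exact layout of the mirror entries is not described in this issue, and it assumes only the `{"key": ..., "value": {"stringValue": ...}}` attribute shape quoted above. The function names and the sample lines are hypothetical.

```javascript
/**
 * Recursively collect every attribute object whose `key` matches.
 * Walking the whole tree avoids assuming a particular span layout.
 */
function collectAttributes(node, key, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectAttributes(item, key, found);
  } else if (node !== null && typeof node === "object") {
    if (node.key === key && node.value !== undefined) found.push(node);
    for (const child of Object.values(node)) collectAttributes(child, key, found);
  }
  return found;
}

/** Count gen_ai.system string values across the lines of a JSONL document. */
function countGenAiSystems(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than abort the scan
    }
    for (const attr of collectAttributes(entry, "gen_ai.system")) {
      const value = attr.value && attr.value.stringValue;
      if (typeof value === "string") counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Hypothetical sample lines; real mirror entries may be nested differently.
const sample = [
  '{"name":"agent","attributes":[{"key":"gen_ai.system","value":{"stringValue":"anthropic"}}]}',
  '{"name":"agent","attributes":[{"key":"gen_ai.provider.name","value":{"stringValue":"claude"}}]}',
  "not json",
].join("\n");
console.log([...countGenAiSystems(sample)]);
```

To run it against a real artifact, read /tmp/gh-aw/otel.jsonl as UTF-8 text and pass the contents to `countGenAiSystems`; an empty result would indicate the spans still carry the old key.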
Implementation Steps
  • Add the ENGINE_ID_TO_GEN_AI_SYSTEM mapping object near the top of actions/setup/js/send_otlp_span.cjs (alongside the existing SPAN_KIND_* constants)
  • Replace buildAttr("gen_ai.provider.name", engineId) with buildAttr("gen_ai.system", ENGINE_ID_TO_GEN_AI_SYSTEM[engineId] || engineId) on line ~918
  • Update actions/setup/js/action_otlp.test.cjs to assert that the agent span contains gen_ai.system (not gen_ai.provider.name) when GH_AW_INFO_ENGINE_ID is set
  • Run cd actions/setup/js && npx vitest run (or make test-unit) to confirm all tests pass
  • Run make fmt to ensure formatting
  • Open a PR referencing this issue
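
The test update in the steps above could take roughly the following shape. This is a framework-free sketch: `checkAgentAttributes` is a hypothetical helper, and the `buildAttr` stand-in assumes the attribute shape shown in the JSONL example in this issue rather than the repository's actual implementation.

```javascript
// Stand-in for the repository's buildAttr; its real output shape is an
// assumption inferred from the JSONL example in this issue.
function buildAttr(key, value) {
  return { key, value: { stringValue: String(value) } };
}

/**
 * Hypothetical check: returns a list of problems with an agent span's
 * attributes. An empty list means the span carries gen_ai.system with the
 * expected value and no longer carries gen_ai.provider.name.
 */
function checkAgentAttributes(attributes, expectedSystem) {
  const problems = [];
  const system = attributes.find(a => a.key === "gen_ai.system");
  if (!system) {
    problems.push("missing gen_ai.system");
  } else if (system.value?.stringValue !== expectedSystem) {
    problems.push(`gen_ai.system is "${system.value?.stringValue}", expected "${expectedSystem}"`);
  }
  if (attributes.some(a => a.key === "gen_ai.provider.name")) {
    problems.push("gen_ai.provider.name still present");
  }
  return problems;
}

const before = [buildAttr("gen_ai.provider.name", "claude")];
const after = [buildAttr("gen_ai.system", "anthropic")];
console.log(checkAgentAttributes(before, "anthropic")); // two problems reported
console.log(checkAgentAttributes(after, "anthropic"));  // []
```

In action_otlp.test.cjs, the same two conditions would more likely be written as direct assertions on the captured agent span; returning a list of problems here just keeps the sketch independent of the test runner.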
Evidence from Live Sentry Data

The Sentry MCP tool returned an empty tools manifest ([]) during this analysis run, so no live span payload could be sampled. The finding is based entirely on static code analysis of actions/setup/js/send_otlp_span.cjs (line 918) and cross-referenced against the OTel GenAI semantic convention specification.

The code comment on line 916 itself acknowledges the gap:

// Emit gen_ai.provider.name when engineId is available; it may be omitted when
// engine metadata is unavailable, so this span does not guarantee full GenAI spec compliance.

The gap is confirmed by static analysis only; no live data was available to corroborate or contradict it.

Related Files
  • actions/setup/js/send_otlp_span.cjs — line 918 (attribute to rename + mapping to add)
  • actions/setup/js/action_otlp.test.cjs — add assertion for gen_ai.system
  • actions/setup/js/handle_agent_failure.cjs — lines 824–829 (ENGINE_ID_TO_LABEL already enumerates the four engine IDs; reuse the same list for the new mapping)

Generated by the Daily OTel Instrumentation Advisor workflow

  • expires on May 6, 2026, 9:40 PM UTC
