feat(prometheus): add metrics for AI cache hits, misses, bypasses, and embedding latency #13659

Open
janiussyafiq wants to merge 1 commit into apache:master from janiussyafiq:feat/ai-cache-prometheus-metrics

Conversation

janiussyafiq commented Jul 3, 2026

Contributor

Description

Adds Prometheus metrics for the ai-cache plugin (#13578, #13632, #13644), integrated in prometheus/exporter.lua following the existing llm_* metric pattern:

  • apisix_ai_cache_hits_total (with layer="exact"|"semantic"), apisix_ai_cache_misses_total, apisix_ai_cache_bypasses_total — counters sharing the llm_* label set (route_id, service_id, consumer, node, request_type, request_llm_model, llm_model)
  • apisix_ai_cache_embedding_latency — histogram of embedding-call latency in milliseconds (the issue sketched _seconds, but the exporter's latency histograms use milliseconds with DEFAULT_BUCKETS, so this follows the house convention)
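
As a rough illustration of the metric shapes listed above, the sketch below models the shared label set and a millisecond histogram in plain Python. It is not the exporter code (which lives in Lua in prometheus/exporter.lua). The helpers `series_key` and `observe_ms` are hypothetical, and the bucket bounds are made up for this example; they are not the exporter's DEFAULT_BUCKETS.

```python
from bisect import bisect_left

# Label set shared with the llm_* metrics, as listed in the PR description.
BASE_LABELS = ("route_id", "service_id", "consumer", "node",
               "request_type", "request_llm_model", "llm_model")

# Hypothetical millisecond bucket bounds, for illustration only.
EXAMPLE_BUCKETS_MS = (1, 5, 10, 50, 100, 500, 1000)


def series_key(metric, labels, layer=None):
    """Build a series identity; only the hits counter carries `layer`."""
    names = BASE_LABELS
    if metric == "apisix_ai_cache_hits_total":
        names = names + ("layer",)
    values = dict(labels)
    if layer is not None:
        values["layer"] = layer
    # Missing labels default to an empty string in this sketch.
    return (metric,) + tuple((name, values.get(name, "")) for name in names)


def observe_ms(bucket_counts, latency_seconds):
    """Convert a latency in seconds to milliseconds and count it in a bucket.

    bucket_counts holds per-bucket (non-cumulative) counts; the last slot is
    the overflow bucket for values above the largest bound.
    """
    latency_ms = latency_seconds * 1000.0
    idx = bisect_left(EXAMPLE_BUCKETS_MS, latency_ms)
    bucket_counts[idx] += 1
    return latency_ms
```

In this model a 0.25 s embedding call is recorded as 250 ms and lands in the bucket bounded by 500, which is the practical effect of choosing milliseconds over the `_seconds` naming the issue sketched.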

Semantics: Redis fail-open lookups count as a MISS; fail_mode: error rejections record nothing; recording happens in the log phase and is a silent no-op when the prometheus plugin is disabled.

Docs: label sections added to prometheus.md (en + zh).
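
The recording semantics described above can be sketched as a small decision function. This is a plain-Python model under stated assumptions, not the plugin's Lua implementation: the outcome names (`hit_exact`, `redis_fail_open`, `rejected_fail_mode_error`, and so on) and the `metric_for_outcome` helper are hypothetical and exist only for this example.

```python
def metric_for_outcome(outcome, prometheus_enabled=True):
    """Return (metric_name, extra_labels) to record, or None to record nothing.

    The outcome strings are hypothetical names chosen for this sketch.
    """
    if not prometheus_enabled:
        # Silent no-op when the prometheus plugin is disabled.
        return None
    if outcome in ("hit_exact", "hit_semantic"):
        layer = outcome.split("_", 1)[1]  # "exact" or "semantic"
        return ("apisix_ai_cache_hits_total", {"layer": layer})
    if outcome in ("miss", "redis_fail_open"):
        # A fail-open Redis lookup is counted as a miss.
        return ("apisix_ai_cache_misses_total", {})
    if outcome == "bypass":
        return ("apisix_ai_cache_bypasses_total", {})
    # fail_mode: error rejections (and any unknown outcome) record nothing.
    return None
```

One consequence of this mapping, if it matches the implementation, is that a hit ratio computed from the hits and misses counters treats Redis outages as misses, while requests rejected under fail_mode: error do not appear in either counter.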

Which issue(s) this PR fixes:

Fixes #13290

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

dosubot (bot) added the size:XL and enhancement labels Jul 3, 2026
Labels

enhancement New feature or request size:XL This PR changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

feat: add ai-cache plugin for LLM semantic caching

1 participant