Skip to content

feat: inject prometheus registry#3106

Open
paegun wants to merge 7 commits into
authzed:mainfrom
paegun:feature/inject-prometheus-registry
Open

feat: inject prometheus registry#3106
paegun wants to merge 7 commits into
authzed:mainfrom
paegun:feature/inject-prometheus-registry

Conversation

@paegun
Copy link
Copy Markdown

@paegun paegun commented May 8, 2026

Description

Throughout, use of prometheus.Register and prometheus.Unregister lead to test isolation being compromised. This change makes the Prometheus Registerer an injectable.

As expected w/i #2829 , promauto use also needed to be removed.

I did not revert #2957 which worked around the test isolation issue.

Testing

[X] Execute existing unit tests.
[X] Manually test as follows:

  1. run the test server
  2. load a dataset
  3. query the dataset
  4. check metrics.

References

#2829

@paegun paegun requested a review from a team as a code owner May 8, 2026 23:37
@github-actions github-actions Bot added area/cli Affects the command line area/api v1 Affects the v1 API area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) area/dispatch Affects dispatching of requests area/api http Affects the HTTP Gateway API labels May 8, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 11, 2026

Codecov Report

❌ Patch coverage is 54.86726% with 153 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.09%. Comparing base (ee7c9a7) to head (d7fd42d).

Files with missing lines Patch % Lines
internal/datastore/mysql/datastore.go 42.86% 7 Missing and 1 partial ⚠️
pkg/cache/metrics.go 20.00% 6 Missing and 2 partials ⚠️
internal/datastore/postgres/postgres.go 46.16% 6 Missing and 1 partial ⚠️
pkg/x509util/certwatcher.go 56.25% 5 Missing and 2 partials ⚠️
internal/datastore/mysql/connection.go 0.00% 6 Missing ⚠️
internal/datastore/proxy/checkingreplicated.go 60.00% 4 Missing and 2 partials ⚠️
internal/datastore/proxy/observable.go 45.46% 4 Missing and 2 partials ⚠️
...nal/datastore/proxy/schemacaching/watchingcache.go 64.71% 4 Missing and 2 partials ⚠️
internal/datastore/proxy/strictreplicated.go 57.15% 4 Missing and 2 partials ⚠️
internal/dispatch/caching/caching.go 68.43% 6 Missing ⚠️
... and 20 more

❌ Your project check has failed because the head coverage (70.09%) is below the target coverage (75.00%). You can increase the head coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (ee7c9a7) and HEAD (d7fd42d). Click for more details.

HEAD has 4 uploads less than BASE
Flag BASE (ee7c9a7) HEAD (d7fd42d)
25 21
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3106      +/-   ##
==========================================
- Coverage   75.52%   70.09%   -5.42%     
==========================================
  Files         503      503              
  Lines       61820    62064     +244     
==========================================
- Hits        46683    43499    -3184     
- Misses      11722    15410    +3688     
+ Partials     3415     3155     -260     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@tstirrat15 tstirrat15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a couple of comments. I generally like where this is going but there's a couple of things I'd want to touch up. I also want to get another couple of pairs of eyes on this - I'll poke some folks.

There's also a handful of tests failing. The unit tests are easy to run, but you can use the mage targets to make the e2e and integration tests easy to run. golangci-lint will also catch these kinds of things across the rest of the codebase.

for _, c := range []prometheus.Collector{connectionsPerCRDBNodeCountGauge, pruningTimeHistogram} {
if err := registerer.Register(c); err != nil {
var alreadyRegistered prometheus.AlreadyRegisteredError
if !errors.As(err, &alreadyRegistered) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer AsType here

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 428b7c3

Comment on lines +68 to +71
if registerer == nil {
registerer = prometheus.DefaultRegisterer
}
for _, c := range []prometheus.Collector{connectionsPerCRDBNodeCountGauge, pruningTimeHistogram} {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this logic could be wrapped with a function like:

func util.RegisterPrometheusMetrics(registerer prometheus.Registerer, errMessage string, metrics... prometheus.Metric) error {}

So that we can cut down on the duplication a bit.

Could also have it return the list of metrics for unregistration, or else have it return a closure that you can call in a Close function that does all of the unregistration for you.

Maybe package all that up in a struct? I'm thinking out loud here; take whatever parts of it make sense to you and work in the context.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 8af3a73

RegisterPrometheusMetrics returning a closure.

Comment thread internal/dispatch/caching/caching.go Outdated
Comment on lines +466 to +474
if cd.registerer != nil {
cd.registerer.Unregister(cd.checkTotalCounter)
cd.registerer.Unregister(cd.checkFromCacheCounter)
cd.registerer.Unregister(cd.lookupResourcesTotalCounter)
cd.registerer.Unregister(cd.lookupResourcesFromCacheCounter)
cd.registerer.Unregister(cd.lookupSubjectsFromCacheCounter)
cd.registerer.Unregister(cd.lookupSubjectsTotalCounter)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be cleaner if we have a struct that tracks these and has an Unregister method, or else the registration function returns a function for unregistering.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have need of the counters, ie to increment, so would be due to wrap all metrics interactions, not just register/unregister.

paegun added 4 commits May 12, 2026 06:57
* Address linter, generated options.
* Inject prometheus registerer.
* Add prometheus register utility, returning an unregister function.
* Address failing tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api http Affects the HTTP Gateway API area/api v1 Affects the v1 API area/cli Affects the command line area/datastore Affects the storage system area/dispatch Affects dispatching of requests area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants