experimental/stats: Expose Telemetry Label Callback#8877

Open
seth-epps wants to merge 15 commits into grpc:master from seth-epps:i-8682/expose-recorder

Conversation

seth-epps commented Feb 2, 2026

Fixes #8682

Expose a new experimental API for registering a telemetry label callback function.

Some clients are not instrumented with OpenTelemetry, which prevents valuable label information from reaching their stats handlers. This change lets those clients collect the OTel labels themselves: they register a label callback on the context and consume the labels in their own stats handlers.

RELEASE NOTES:

  • experimental/stats: Expose a new experimental API for registering a telemetry label callback function

@seth-epps seth-epps force-pushed the i-8682/expose-recorder branch from 09e6612 to 6dda7c7 Compare February 2, 2026 22:27
@arjan-bal arjan-bal requested a review from mbissa February 3, 2026 07:44
@arjan-bal arjan-bal added the Type: Feature New features or improvements in behavior label Feb 3, 2026
@arjan-bal arjan-bal added this to the 1.80 Release milestone Feb 3, 2026
codecov Bot commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 97.05882% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 83.11%. Comparing base (c1a9239) to head (9f65763).
⚠️ Report is 103 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/testutils/balancer.go | 66.66% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8877      +/-   ##
==========================================
+ Coverage   80.40%   83.11%   +2.71%     
==========================================
  Files         416      414       -2     
  Lines       33495    33484      -11     
==========================================
+ Hits        26930    27830     +900     
+ Misses       4682     4230     -452     
+ Partials     1883     1424     -459     
| Files with missing lines | Coverage Δ |
|---|---|
| experimental/stats/telemetry/labels.go | 100.00% <100.00%> (ø) |
| internal/stats/labels.go | 100.00% <100.00%> (ø) |
| internal/xds/balancer/clusterimpl/picker.go | 91.11% <100.00%> (-4.02%) ⬇️ |
| stats/opentelemetry/client_metrics.go | 88.81% <100.00%> (+0.70%) ⬆️ |
| internal/testutils/balancer.go | 82.28% <66.66%> (ø) |

... and 101 files with indirect coverage changes

Comment thread internal/xds/balancer/clusterimpl/picker.go Outdated
Comment thread experimental/stats/telemetry_test.go Outdated
Comment thread internal/stats/labels_test.go
Comment thread experimental/stats/telemetry.go Outdated
Comment thread experimental/stats/telemetry.go Outdated
@mbissa mbissa assigned seth-epps and unassigned mbissa Feb 5, 2026
mbissa

This comment was marked as outdated.

@mbissa mbissa requested review from arjan-bal and dfawley February 6, 2026 08:37
@mbissa mbissa assigned arjan-bal and dfawley and unassigned seth-epps Feb 6, 2026
@arjan-bal arjan-bal assigned mbissa and unassigned dfawley and arjan-bal Feb 10, 2026
Comment thread experimental/stats/telemetry.go Outdated
Comment thread experimental/stats/telemetry.go Outdated
Comment thread internal/stats/labels.go Outdated
labels["grpc.lb.locality"] = xdsinternal.LocalityString(lID)
labels["grpc.lb.backend_service"] = d.clusterName
}
stats.UpdateLabels(
Contributor

Can we restrict UpdateLabels to only one invocation?

Author

I did this in two separate invocations because I wanted to ensure the callbacks are executed even when the RPC is dropped by the error path above. If I combine the two, we'd potentially miss the label updates.

mbissa (Contributor) commented Feb 16, 2026

Hey @seth-epps , apologies for the delay and thanks for your effort. After discussing with other maintainers, I have added a few more comments. Please have a look and let me know if you have any questions around them.

@mbissa mbissa assigned seth-epps and unassigned mbissa Feb 16, 2026
@easwars easwars assigned seth-epps and unassigned easwars Mar 25, 2026
seth-epps (Author) commented

That test failure looks like a transient / unrelated flake.

@Pranjali-2501 Pranjali-2501 assigned easwars and unassigned seth-epps Mar 30, 2026
seth-epps (Author) commented

@easwars Any chance you've had some time to re-review? 🙏

Comment thread internal/stats/labels.go Outdated
Comment thread internal/stats/labels.go
Comment thread internal/stats/labels.go Outdated
Comment thread stats/opentelemetry/client_metrics.go
Comment thread internal/stats/labels.go
Comment on lines +61 to +62
// To ensure callbacks do not mutate the state of the provided label map it is copied
// before execution.
Contributor

Can we drop this copy and instead document on LabelCallback that callbacks should not mutate the provided label map?

Author

This was done based on the Gemini review, which I may have put too much stock in.
#8877 (comment)

My only reservation with dropping the copy is that, without it, someone who registers an additional callback and ignores the recommendation not to modify the map can interfere with the OTel client metrics. That said, I don't know how many guardrails need to be in place to keep people from making that kind of mistake. I don't mind removing it; let me know if you think the tradeoff is reasonable.
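The tradeoff under discussion can be sketched like this. It is an illustration only: `runCallbacks` is a hypothetical helper, and whether the PR copies the map once or once per callback is not stated in this thread. `maps.Clone` is from the Go standard library (Go 1.21 and later).

```go
package main

import (
	"fmt"
	"maps"
)

// LabelCallback is a hypothetical callback type, as in the review discussion.
type LabelCallback func(labels map[string]string)

// runCallbacks hands each callback its own shallow copy of labels, so a
// callback that mutates its argument cannot affect the caller's map or
// what later callbacks observe. Hypothetical helper.
func runCallbacks(labels map[string]string, cbs []LabelCallback) {
	for _, cb := range cbs {
		cb(maps.Clone(labels))
	}
}

func main() {
	labels := map[string]string{"grpc.lb.locality": "example-locality"}
	var observed string
	runCallbacks(labels, []LabelCallback{
		// A misbehaving callback that ignores the "do not mutate" guidance.
		func(l map[string]string) { l["grpc.lb.locality"] = "tampered" },
		// A well-behaved callback, standing in for the OTel metrics recorder.
		func(l map[string]string) { observed = l["grpc.lb.locality"] },
	})
	fmt.Println(observed, labels["grpc.lb.locality"])
}
```

Dropping the clone saves an allocation per callback on every update, at the cost of relying on documentation alone to keep callbacks from interfering with one another.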

Comment thread internal/stats/labels_test.go Outdated
Comment thread internal/stats/labels_test.go Outdated
Comment thread internal/stats/labels_test.go
easwars (Contributor) commented Apr 14, 2026

> @easwars Any chance you've had some time to re-review? 🙏

Apologies for the delay. I was finally able to make a full pass. Mostly minor comments this time around. We should be able to wrap this one up quickly. Thanks.

github-actions commented
This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions Bot added the stale label Apr 21, 2026
seth-epps (Author) commented

@easwars I've addressed most of the feedback, but left a couple notes for the larger discussion points, namely around the maps copy / replacement in the otel callbacks. Let me know what you think!

seth-epps (Author) commented

@easwars Just wanted to check in on the latest changes to see if there was any additional feedback / if anything needs to be tweaked

easwars (Contributor) left a comment

LGTM

One super nit in this round and you now have a merge conflict to resolve.

Apologies for the long turnaround time on this review.

// Package telemetry defines stats APIs for interacting with
// telemetry labels.
//
// All APIs in this package are experimental.
Contributor

Nit: Could you please make this notice match some of the other notices that we have:

// # Experimental
//
// Notice: All APIs in this package are EXPERIMENTAL and may be changed or
// removed in a later release.

Labels

  • Status: Requires Reporter Clarification
  • Type: Feature (New features or improvements in behavior)

Development

Successfully merging this pull request may close these issues.

Expose the labels package to use with custom stats handler

7 participants