Added `jetstream_total_tokens_in_current_batch` metric #128

Bslabe123 · 2024-08-08T17:53:36Z

Do not merge until #127

FanhaiLu1 · 2024-08-09T03:23:14Z

jetstream/core/metrics/prometheus.py

@@ -255,3 +255,12 @@ def get_request_output_length(self):

  def get_request_success_count_metric(self):
    return self._request_success_count.labels(id=self._id)
+
+  _total_tokens_in_current_batch = Gauge(
+      name="jetstream_total_tokens_in_current_batch",


We have padding in each batch, so I feel this metric is neither critical nor useful. We care more about total response tokens per request. @JoeZijunZhou, please share your thoughts on this metric.

Following up in the internal doc. The GKE team added this metric in the design doc.
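To make the padding concern above concrete, here is a minimal sketch of how a per-batch token count differs depending on whether padded slots are included. It is not the code in this PR: `count_batch_tokens`, `PAD_ID`, and the batch layout (rows of token ids padded to equal length) are all hypothetical names and assumptions chosen for illustration.

```python
from typing import Sequence, Tuple

PAD_ID = 0  # hypothetical padding token id, assumed for this sketch


def count_batch_tokens(
    batch: Sequence[Sequence[int]], pad_id: int = PAD_ID
) -> Tuple[int, int]:
    """Return (real_tokens, padded_slots) for one batch of token-id rows."""
    total_slots = sum(len(row) for row in batch)
    # Count only the slots that hold an actual token, not padding.
    real_tokens = sum(1 for row in batch for tok in row if tok != pad_id)
    return real_tokens, total_slots - real_tokens


# Two requests padded to length 4: the batch has 8 slots but only 4 real tokens.
batch = [
    [11, 12, 13, PAD_ID],
    [21, PAD_ID, PAD_ID, PAD_ID],
]
real, padded = count_batch_tokens(batch)
print(f"real={real} padded={padded}")
```

If the gauge were set from the raw slot count, it would report the padded size of the batch rather than the tokens actually being processed, which is one reading of the objection raised in the review.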

first commit

0d4b583

Bslabe123 requested a review from vipannalla as a code owner August 8, 2024 17:53

Bslabe123 added 3 commits August 8, 2024 17:57

missing idx

9748a00

temporary missing if statement

0b109ce

fmt

e7190ce

FanhaiLu1 reviewed Aug 9, 2024

JoeZijunZhou Aug 14, 2024
