docs: add prometheus + grafana deployment guide #1019
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRole | ||
metadata: | ||
name: inference-gateway-metrics-reader | ||
rules: | ||
- nonResourceURLs: | ||
- /metrics | ||
- /debug/pprof/* | ||
verbs: | ||
- get | ||
--- | ||
apiVersion: v1 | ||
kind: ServiceAccount | ||
metadata: | ||
name: inference-gateway-sa-metrics-reader | ||
namespace: monitoring | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRoleBinding | ||
metadata: | ||
name: inference-gateway-sa-metrics-reader-role-binding | ||
namespace: monitoring | ||
subjects: | ||
- kind: ServiceAccount | ||
name: inference-gateway-sa-metrics-reader | ||
namespace: monitoring | ||
roleRef: | ||
apiGroup: rbac.authorization.k8s.io | ||
kind: ClusterRole | ||
Reviewer comment: Most of these resources are already covered in https://gateway-api-inference-extension.sigs.k8s.io/guides/metrics-and-observability/#scrape-metrics-pprof-profiles. To avoid confusion and misconfiguration, this file should contain only the ServiceAccount and an updated ClusterRoleBinding that includes an additional subject:
- kind: ServiceAccount
  name: inference-gateway-sa-metrics-reader
  namespace: monitoring
Maybe use
||
name: inference-gateway-metrics-reader | ||
--- |
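A minimal sketch of what the reduced `rbac.yaml` suggested in the review comment above might look like. It assumes the ClusterRole `inference-gateway-metrics-reader` is already created by the linked guide, and that the binding keeps the name used in this diff. Any subjects the linked guide already defines on that binding are not reproduced here and would have to stay listed next to the new one, since applying a binding replaces its subject list. Because a ClusterRoleBinding is cluster-scoped, `metadata.namespace` is omitted.

```yaml
# ServiceAccount the Prometheus server runs as
# (values.yaml in this PR sets serviceAccounts.server.create: false and refers to this name).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: inference-gateway-sa-metrics-reader
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # Cluster-scoped resource: no namespace field here.
  name: inference-gateway-sa-metrics-reader-role-binding
subjects:
# Assumption: subjects already defined by the linked guide would also be listed here.
- kind: ServiceAccount
  name: inference-gateway-sa-metrics-reader
  namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: inference-gateway-metrics-reader
```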
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
serviceAccounts: | ||
server: | ||
create: false | ||
name: inference-gateway-sa-metrics-reader | ||
|
||
extraScrapeConfigs: | | ||
- job_name: 'inference-extension-epp' | ||
authorization: | ||
credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token | ||
scrape_interval: 5s | ||
kubernetes_sd_configs: | ||
- role: endpoints | ||
relabel_configs: | ||
- source_labels: [__meta_kubernetes_service_name] | ||
action: keep | ||
regex: .*-epp$ | ||
- source_labels: [__meta_kubernetes_pod_container_port_number] | ||
action: keep | ||
regex: "9090" | ||
- job_name: vllm | ||
scrape_interval: 5s | ||
kubernetes_sd_configs: | ||
- role: pod | ||
relabel_configs: | ||
- source_labels: [__meta_kubernetes_pod_label_app] | ||
action: keep | ||
regex: vllm-llama3-8b-instruct |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -126,6 +126,66 @@ PROFILE_NAME=heap | |
curl -H "Authorization: Bearer $TOKEN" localhost:9090/debug/pprof/$PROFILE_NAME -o profile.out | ||
go tool pprof -png profile.out | ||
``` | ||
## Setting Up Grafana + Prometheus | ||
|
||
### Grafana | ||
|
||
A simple Grafana deployment can be done with the following commands: | ||
|
||
```bash | ||
helm repo add grafana https://grafana.github.io/helm-charts | ||
helm install grafana grafana/grafana --namespace monitoring --create-namespace | ||
``` | ||
|
||
Get the Grafana URL to visit by running these commands in the same shell: | ||
|
||
```bash | ||
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}") | ||
kubectl --namespace monitoring port-forward $POD_NAME 3000 | ||
``` | ||
Comment on lines +142 to +145:
Simplify with a single command that port-forwards the grafana deployment: `kubectl -n monitoring port-forward deploy/grafana 3000`. Since Grafana is not configured to use the default
Add a note such as "You can now access the Grafana UI from http://127.0.0.1"
||
|
||
### Prometheus | ||
|
||
Two types of Prometheus deployment are currently documented: | ||
|
||
1. Self-hosted, using the Prometheus Helm chart | ||
2. Using Google Managed Prometheus | ||
|
||
=== "Self-Hosted" | ||
|
||
Create Necessary ServiceAccount and RBAC Resources: | ||
Reviewer comment: nit: s/Necessary/the necessary/ and s/Resources/resources/ or if you update s/Create Necessary ServiceAccount and RBAC Resources:/Create the necessary ServiceAccount resource:/ and then include the
||
|
||
```bash | ||
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/observability/prometheus/rbac.yaml | ||
``` | ||
|
||
Add the prometheus-community helm repository: | ||
|
||
```bash | ||
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts | ||
``` | ||
|
||
Deploy the prometheus helm chart using this command: | ||
```bash | ||
helm install prometheus prometheus-community/prometheus \ | ||
--namespace monitoring \ | ||
-f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/observability/prometheus/values.yaml | ||
``` | ||
|
||
You can add the Prometheus data source to Grafana by following [this guide](https://grafana.com/docs/grafana/latest/administration/data-source-management/). | ||
By default, the Prometheus server host is `http://prometheus-server`. | ||
|
||
Note that the provided values file is deliberately simple and works as-is after following the [Getting Started Guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/); you might need to modify it for other setups. | ||
|
||
=== "Google Managed" | ||
|
||
If you run the inference gateway with [Google Managed Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus), please follow the [instructions](https://cloud.google.com/stackdriver/docs/managed-prometheus/query) | ||
to configure Google Managed Prometheus as the data source for the Grafana dashboard. | ||
|
||
## Load Inference Extension dashboard into Grafana | ||
|
||
Please follow the [Grafana instructions](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) to load the dashboard JSON. | ||
The dashboard can be found here: [Grafana Dashboard](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/tools/dashboards/inference_gateway.json). | ||
|
||
## Prometheus Alerts | ||
|
||
|
Reviewer comment: ClusterRoleBinding is a cluster-scoped resource, so remove the namespace field (to avoid confusion).
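
For the data source step in the self-hosted Prometheus instructions of the guide diff, the data source could instead be provisioned through the Grafana Helm chart values rather than added by hand in the UI. The following is a minimal sketch, not part of this PR. It assumes the chart's `datasources` value, a Prometheus release named `prometheus` in the same `monitoring` namespace (which gives the default `http://prometheus-server` host mentioned in the guide), and a hypothetical file name `grafana-values.yaml`.

```yaml
# grafana-values.yaml (hypothetical file name), passed with -f to
# `helm install grafana grafana/grafana --namespace monitoring`.
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        # Grafana's backend proxies the queries, so the in-cluster service name resolves.
        access: proxy
        # Assumption: default server service of a release named "prometheus" in the same namespace.
        url: http://prometheus-server
        isDefault: true
```

With this approach the data source exists as soon as Grafana starts, so only the dashboard JSON still has to be imported.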