Description
What happened?
When using kube-prometheus to monitor the cluster, I found that the two endpoints (8443 and 9443) configured in the ServiceMonitor for the kube-state-metrics service behave inconsistently. Specifically, when Prometheus scrapes https://10.232.145.231:8443/metrics, the scrape times out (context deadline exceeded) and the target is marked DOWN, while the other endpoint of the same service, https://10.232.145.231:9443/metrics, works properly and shows as UP.
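The ServiceMonitor itself is not pasted below; for reference, here is a minimal sketch of the default kube-prometheus ServiceMonitor for kube-state-metrics 2.0.0 (the interval, scrapeTimeout, and relabeling values are the upstream defaults, assumed rather than copied from the live object):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.0.0
  name: kube-state-metrics
  namespace: base-services
spec:
  endpoints:
  # main metrics endpoint, served through kube-rbac-proxy-main on 8443
  - port: https-main
    scheme: https
    interval: 30s        # upstream default, assumed
    scrapeTimeout: 30s   # upstream default, assumed
    honorLabels: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    tlsConfig:
      insecureSkipVerify: true
    relabelings:
    - action: labeldrop
      regex: (pod|service|endpoint|namespace)
  # self/telemetry endpoint, served through kube-rbac-proxy-self on 9443
  - port: https-self
    scheme: https
    interval: 30s        # upstream default, assumed
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: kube-state-metrics
      app.kubernetes.io/part-of: kube-prometheus

Apart from honorLabels, the relabeling rule, and the explicit scrapeTimeout on https-main, both endpoints are scraped with the same scheme, credentials, and TLS settings, which is why the asymmetric behavior between 8443 and 9443 is unexpected.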
Did you expect to see something different?
I expected all configured endpoints to work normally without timeout errors. In particular, different endpoints of the same service should behave consistently unless there are explicit configuration differences or network issues causing such discrepancies.
How to reproduce it (as minimally and precisely as possible):
- Ensure that kube-prometheus and the related components (Prometheus, kube-state-metrics, etc.) are deployed correctly.
- Configure the ServiceMonitor to scrape both endpoints of the kube-state-metrics service, 8443 and 9443 (see the sketch above).
- Check Prometheus' scrape logs and targets page to confirm that port 8443 reports a timeout error while port 9443 works normally.
Environment
k8s: v1.26.3
kube-state-metrics-services.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.0.0
  name: kube-state-metrics
  namespace: base-services
spec:
  clusterIP: None
  ports:
  - name: https-main
    port: 8443
    targetPort: https-main
  - name: https-self
    port: 9443
    targetPort: https-self
  selector:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
kube-state-metrics-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.0.0
  name: kube-state-metrics
  namespace: base-services
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: kube-state-metrics
      app.kubernetes.io/part-of: kube-prometheus
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: kube-state-metrics
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/part-of: kube-prometheus
        app.kubernetes.io/version: 2.0.0
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: kube-state-metrics
            topologyKey: "kubernetes.io/hostname"
            namespaces:
            - "base-services"
      priorityClassName: system-cluster-critical
      containers:
      - args:
        - --host=127.0.0.1
        - --port=8081
        - --telemetry-host=127.0.0.1
        - --telemetry-port=8082
        image: platform/k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0
        name: kube-state-metrics
        resources:
          limits:
            cpu: 100m
            memory: 250Mi
          requests:
            cpu: 10m
            memory: 190Mi
        securityContext:
          runAsUser: 65534
      - args:
        - --logtostderr
        - --secure-listen-address=:8443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
        - --upstream=http://127.0.0.1:8081/
        image: platform/quay.io/brancz/kube-rbac-proxy:v0.8.0
        name: kube-rbac-proxy-main
        ports:
        - containerPort: 8443
          name: https-main
        resources:
          limits:
            cpu: 40m
            memory: 40Mi
          requests:
            cpu: 20m
            memory: 20Mi
        securityContext:
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65532
      - args:
        - --logtostderr
        - --secure-listen-address=:9443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
        - --upstream=http://127.0.0.1:8082/
        image: platform/quay.io/brancz/kube-rbac-proxy:v0.8.0
        name: kube-rbac-proxy-self
        ports:
        - containerPort: 9443
          name: https-self
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
        securityContext:
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65532
      nodeSelector:
        kubernetes.io/os: linux
        role-master: "true"
      serviceAccountName: kube-state-metrics
---
apiVersion: "policy/v1"
kind: "PodDisruptionBudget"
metadata:
  name: "kube-state-budget"
  namespace: "base-services"
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics