Problem description
The query a.*b*.c.d.* times out after a minute.
carbonapi's version
v0.16.0
Did this happen before?
Irrelevant (I didn't have an earlier version).
carbonapi's config
Will provide if relevant.
backend software and config
Go-Carbon 0.17.3.
Config will be provided if needed.
carbonapi performance metrics
- Please enable expvars in the config and provide its output
I believe this is the required expvars config:
expvar:
enabled: true
pprofEnabled: true
listen: ""
I'm not sure which output this question refers to.
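For reference, this is roughly how I'd try to pull that output, assuming expvar data is exposed on the standard Go /debug/vars endpoint of carbonapi's listener; the host and port below are placeholders, not values from our setup:

# Assumption: with listen: "" the expvar handler is served from carbonapi's main HTTP port.
# Replace host:port with the actual listener address.
curl -s http://localhost:8081/debug/vars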
- What's the system load on the server? What's the memory consumption?
top doesn't seem to indicate any kind of throttling. I believe the user-space CPU load is somewhat higher during the query, but still very low. Memory (free -h output):
me@graphite-dev:~$ free -h
total used free shared buff/cache available
Mem: 30Gi 3.1Gi 628Mi 1.0Mi 27Gi 27Gi
Swap: 0B 0B 0B
This is our dev host.
- If possible, please provide the profiler's output (see https://blog.golang.org/pprof for instructions on how to do that; debug handlers can be enabled in carbonapi's config on a separate port). Go allows different types of profiles. It's always better to provide a CPU profile, a memory profile, and a lock profile, but that depends on the symptoms.
Two CPU profiles captured during the request are attached.
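In case it helps, the profiles were captured roughly along these lines (a sketch; the pprof port is a placeholder for whatever debug port is configured, and the 30-second window is arbitrary, chosen to overlap the slow request):

# Assumption: pprof debug handlers enabled on a separate port (7070 is a placeholder).
# Capture a 30-second CPU profile while the slow render request is in flight:
curl -s -o cpu.prof "http://localhost:7070/debug/pprof/profile?seconds=30"
# Inspect the result locally:
go tool pprof cpu.prof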
- Debug logs could be helpful as well
Setting the log level to debug has immense implications on performance, so for now I've set it to the "info" level. The only two relevant log entries for this request (grepped for a\.\*b\*\.c\.d\.\*) are:
"level":"ERROR","timestamp":"2024-02-18T13:51:21.454Z","logger":"access","message":"request failed","data":{"handler":"render","carbonapi_uuid":"d1118526-3647-4c52-99ea-49d19b549f74","url":"/render","peer_ip":"10.42.136.247","host":"{our graphite dev host}:4040","format":"json","use_cache":true,"targets":["a.*b*.c.d.*"],"cache_timeout":180,"runtime":59.89556336,"http_code":503,"reason":"timeout while fetching Response","from":1708263920,"until":1708264222,"from_raw":"1708263920","until_raw":"1708264222","uri":"/render","from_cache":false,"used_backend_cache":false,"request_headers":{"X-Grafana-Org-Id":"1","X-Panel-Id":"1"}}}
and a tens-of-megabytes-long entry which starts like this:
{"level":"WARN","timestamp":"2024-02-18T13:52:40.013Z","logger":"zipper","message":"errors occurred while getting results","type":"protoV2Group","name":"http://127.0.0.1:8080","type":"fetch","request":"&MultiFetchRequest{Metrics:[]FetchRequest{FetchRequest{Name:
{{ removed for brevity }}
,"carbonapi_uuid":"d1118526-3647-4c52-99ea-49d19b549f74","errors":"max tries exceeded","errorsVerbose":"max tries exceeded\nHTTP Code: 500\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:25\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\n\nCaused By: failed to fetch data from server/group\nHTTP Code: 500\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:27\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\n\nCaused By: error while fetching Response\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:34\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"
Query that causes problems
- Grafana queries the metrics; it's a render query. The query target is a.*b*.c.d.* (a direct reproduction with curl is sketched after this list).
- According to go-carbon logs, there are ~65.5K files (i.e. metrics) returned.
- Resolution: I think it's irrelevant.
- Datapoint count: also irrelevant, to my mind?
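For what it's worth, the failing request can also be reproduced against carbonapi directly, bypassing Grafana. The parameters below are taken from the access log above; the host is a placeholder:

# Reproduce the failing render request; replace the host with your carbonapi endpoint.
# target/from/until/format are copied from the access-log entry above.
curl -s "http://graphite-dev:4040/render?target=a.*b*.c.d.*&from=1708263920&until=1708264222&format=json"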
Additional context
The carbonapi log reports a status code of 503. The nginx reverse proxy logs a 499, which according to the internet means the client has closed the connection. Either way, one minute feels too long.