[ This issue was auto-migrated from DA's internal repo (DACH-NY/canton-network-node#12870). Original author: @martinflorian-da ]
What is this about?
Synthetic monitoring would allow us to run "ping"-like checks to enhance our monitoring coverage. Off the top of my head, we already want this today for:
- monitoring sequencer availability: (see DA-internal issue)
- "The Scan UI is working"-type checks that we currently abuse Scala-based preflight checks for (see also DACH-NY/canton-network-node#11928)
We don't want to use Grafana Cloud, though, and we also need to reach hosts that are not on the open Internet. blackbox_exporter seems like a good option that we can deploy in our clusters ourselves.
As an experiment we could:
- deploy an instance as part of the infra stack
- wire it up with Prometheus
- configure it to ping our scan UIs (for example)
- make sure alerts are active if pings/checks fail
...and if it works well, keep it, of course.
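To make the experiment a bit more concrete, here is a rough sketch of the Prometheus side of the wiring, written as Python that builds the config structures (they would still need to be serialized into whatever format our Prometheus setup expects). It follows the usual blackbox_exporter pattern: Prometheus scrapes the exporter's `/probe` endpoint and passes the real target as a URL parameter via relabeling, and an alert fires when `probe_success` is 0. The exporter address `blackbox-exporter:9115`, the Scan UI URL, the alert name, the severity label, and both function names are assumptions for illustration only; `http_2xx` is the module name from blackbox_exporter's example config and may differ in our deployment.

```python
import json


def blackbox_scrape_job(targets, exporter_address, module="http_2xx"):
    """Build a Prometheus scrape job that probes `targets` through blackbox_exporter.

    `exporter_address` is a hypothetical in-cluster host:port of the exporter.
    """
    return {
        "job_name": "blackbox-" + module,
        "metrics_path": "/probe",
        "params": {"module": [module]},
        "static_configs": [{"targets": list(targets)}],
        "relabel_configs": [
            # Pass the listed target to the exporter as ?target=<url>.
            {"source_labels": ["__address__"], "target_label": "__param_target"},
            # Keep the probed URL as the `instance` label on the resulting series.
            {"source_labels": ["__param_target"], "target_label": "instance"},
            # Send the actual scrape to the exporter instead of the target.
            {"target_label": "__address__", "replacement": exporter_address},
        ],
    }


def probe_failed_alert(job_name, for_duration="5m"):
    """Build an alerting rule that fires while a probe for `job_name` keeps failing."""
    return {
        "alert": "SyntheticProbeFailed",  # hypothetical alert name
        "expr": 'probe_success{job="%s"} == 0' % job_name,
        "for": for_duration,
        "labels": {"severity": "warning"},  # assumed severity scheme
        "annotations": {"summary": "Probe failed for {{ $labels.instance }}"},
    }


if __name__ == "__main__":
    # Placeholder Scan UI URL; replace with a real one.
    job = blackbox_scrape_job(
        ["https://scan.example.invalid"], "blackbox-exporter:9115"
    )
    rule = probe_failed_alert(job["job_name"])
    print(json.dumps({"scrape_config": job, "alert_rule": rule}, indent=2))
```

The same job shape should work for the sequencer availability case by swapping the target list and, if needed, the module (for example a TCP or gRPC prober instead of HTTP), assuming the exporter can reach those hosts from inside the cluster.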
How important is this and why?
Seems like a good improvement to our monitoring, and it could let us reduce the number of checks we run in preflights.