Improve API availability

Currently we have a few major points of failure. Let's discuss potential solutions to address them.

1. Hetzner load balancer for our 2 API nodes. If the LB goes down, we go down. A global LB like Cloudflare Load Balancing would potentially be much more reliable, and it would also give us a lot of new stats and more options for anti-abuse and attack protection.
It would also allow us to split the nodes between multiple providers, not just Hetzner.

2. Redis. We have multiple Redis instances running for different needs, and they all run on a single server. A potential first step would be to move the most critical instances, like probe sync, to a multi-node cluster. Preferably this would not be Redis itself: either a fork or a cloud option like Cloudflare KV, Durable Objects, D1, Queues, or SQLite. There are lots of products that could fit our needs.


I don't think there is anything else. Our measurement DBs won't impact the API if they go down.
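
Whichever global LB we pick, it will need a health signal per API node to decide when to stop routing to it. Below is a rough sketch of how that signal could separate critical from non-critical dependencies, so that a Redis outage pulls a node out of rotation while a measurement DB outage does not. All names here (`DependencyCheck`, `evaluateHealth`, the dependency labels) are hypothetical and not taken from our codebase, and which dependencies count as critical is an assumption based on the points above.

```typescript
// Hypothetical sketch: names and dependency labels are illustrative only.
type DependencyCheck = {
  name: string;
  critical: boolean; // true: a failure should take this node out of rotation
  healthy: boolean;
};

type HealthResult = {
  statusCode: 200 | 503; // what a /health endpoint would return to the LB
  failing: string[];     // names of all unhealthy dependencies, for diagnostics
};

// Report 503 only when a critical dependency is down; non-critical
// failures are listed but do not change the status code.
function evaluateHealth(checks: DependencyCheck[]): HealthResult {
  const failing = checks.filter((c) => !c.healthy).map((c) => c.name);
  const criticalDown = checks.some((c) => c.critical && !c.healthy);
  return { statusCode: criticalDown ? 503 : 200, failing };
}

// Assumed example: probe sync storage is critical, the measurement DB is not.
const degraded = evaluateHealth([
  { name: 'probe-sync-store', critical: true, healthy: true },
  { name: 'measurements-db', critical: false, healthy: false },
]);

const down = evaluateHealth([
  { name: 'probe-sync-store', critical: true, healthy: false },
  { name: 'measurements-db', critical: false, healthy: true },
]);
```

In this sketch `degraded` keeps the node in rotation and `down` removes it. The real split between critical and non-critical dependencies would need to be agreed on separately.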


Improve API availability #800
