Improve API availability #800

@jimaek

Currently we have a few major points of failure. Let's discuss potential solutions to address them.

  1. Hetzner load balancer for our 2 API nodes. If the LB goes down, we go down. A global LB like Cloudflare's would potentially be much more reliable, and it would also give us a lot of new stats and more options for anti-abuse and attack protection.
    It would also allow us to split the nodes between multiple providers, not just Hetzner.

  2. Redis. We run multiple Redis instances for different needs, and they all run on a single server. A potential first step would be to move the most critical instances, like probe sync, to a multi-node cluster, preferably not Redis itself: either a fork or a cloud option like Cloudflare KV, Durable Objects, D1, Queues, or SQLite. Lots of products could fit our needs.
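
To make point 1 more concrete, here is a rough sketch of the behavior we would want from a global LB: route only to API nodes that pass health checks, regardless of which provider hosts them. Everything here is hypothetical (the `Origin` type, `pickOrigin`, the host names, and the weights are made up for illustration); it is not how Cloudflare or Hetzner implement this, just the idea under those assumptions.

```typescript
// Hypothetical names throughout: Origin, pickOrigin, and the example hosts
// are illustrative assumptions, not part of the existing codebase.
type Origin = {
  host: string;     // API node address
  provider: string; // hosting provider, e.g. "hetzner"
  healthy: boolean; // result of the most recent health check
  weight: number;   // relative share of traffic when healthy
};

// Select an origin by weight among the healthy ones. `roll` is a number in
// [0, 1) supplied by the caller so the function stays deterministic.
function pickOrigin(origins: Origin[], roll: number): Origin | undefined {
  const healthy = origins.filter((o) => o.healthy && o.weight > 0);
  const total = healthy.reduce((sum, o) => sum + o.weight, 0);
  if (total === 0) return undefined; // every origin is down: nothing to route to

  let threshold = roll * total;
  for (const origin of healthy) {
    threshold -= origin.weight;
    if (threshold < 0) return origin;
  }
  return healthy[healthy.length - 1]; // guards against floating-point edge cases
}
```

With one node on Hetzner and one on a second provider, a failed health check on either node sends all traffic to the other, so no single provider outage takes the API down. In practice `roll` would come from `Math.random()`.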

I don't think there is anything else. Our measurements DBs won't impact the API if they go down.
