Ansible playbooks to deploy self-hosted Run:ai on a GPU-accelerated Kubernetes cluster on Brev instances.
You need at least 24 total CPUs across your cluster. Example configurations:
| Setup | Control Plane | Workers | Total CPUs |
|---|---|---|---|
| 3 nodes | 1x 8 CPU | 2x 8 CPU | 24 |
| 2 nodes | 1x 12 CPU | 1x 12 CPU | 24 |
| Minimal | 1x 8 CPU | 2x 8 CPU (GPU) | 24 |
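As a quick sanity check on the 24-CPU minimum, the per-node counts can be summed before provisioning. The following is a minimal sketch; `total_cpus` and `meets_minimum` are hypothetical helper names and are not part of the playbooks.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this repo): sum per-node CPU counts
# and compare the total against the 24-CPU minimum stated above.
MIN_TOTAL_CPUS=24

total_cpus() {
  local sum=0 n
  for n in "$@"; do
    sum=$((sum + n))
  done
  echo "$sum"
}

meets_minimum() {
  [ "$(total_cpus "$@")" -ge "$MIN_TOTAL_CPUS" ]
}

# Example: one 8-CPU control plane plus two 8-CPU workers.
if meets_minimum 8 8 8; then
  echo "OK: $(total_cpus 8 8 8) CPUs"
else
  echo "Too few CPUs: $(total_cpus 8 8 8)"
fi
```

On a running node, `nproc` reports the CPU count to plug in for that node.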
Recommended: Use the same cloud platform and region for all nodes to ensure network compatibility.
Open these ports in the Brev UI before setup:
| Node | Ports |
|---|---|
| Control Plane | 6443, 80, 443 |
| Workers | 10250 |
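The port table above can be encoded in a small script and probed from another machine. This is a sketch only: `ports_for_role` and `check_ports` are hypothetical helpers, and the probe assumes `nc` (netcat) is installed. A port only answers once something is listening on it, so the probe is most informative after the cluster is up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a node role to the ports listed in the table above.
ports_for_role() {
  case "$1" in
    control) echo "6443 80 443" ;;
    worker)  echo "10250" ;;
    *)       echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Probe each port for a role on a host. Assumes `nc` (netcat) is available.
check_ports() {
  local host="$1" role="$2" port ports
  ports="$(ports_for_role "$role")" || return 1
  for port in $ports; do
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "$host:$port open"
    else
      echo "$host:$port closed or filtered"
    fi
  done
}

# Usage (replace the placeholder with a real address):
#   check_ports <control-plane-ip> control
```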
ssh ubuntu@<control-plane-ip>
sudo apt update && sudo apt install -y git ansible python3-pip
sudo snap install yq
git clone https://github.com/chloecrozier/runai_on_brev.git
cd runai_on_brev/
cp config.yaml.example config.yaml
nano config.yaml
Fill in these required values:
Tip: Run brev shell <instance-name>, then hostname -I, to get internal IPs.
all:
  hosts:
    runai-control-plane:
      ansible_host: ""   # Control plane internal IP
      external_ip: ""    # Control plane external IP listed in the Brev UI
      node_role: control
    runai-worker-01:
      ansible_host: ""   # Worker internal IP
      node_role: worker
    runai-worker-02:
      ansible_host: ""   # Worker internal IP
      node_role: worker
  vars:
    runai_jfrog_token: ""  # Your JFrog token
Run the deployment:
ansible-playbook -i config.yaml deployment/bring_up_cluster.yaml
On your local computer, add the hosts entry (the sed -i '' form below is for macOS; on Linux, use sed -i without the empty quotes):
sudo sed -i '' '/runai.local/d' /etc/hosts && sudo bash -c 'echo "<EXTERNAL_IP> runai.local" >> /etc/hosts'
Then visit: https://runai.local
Default credentials: [email protected] / Abcd!234
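The hosts entry step above differs between macOS and Linux because of how `sed -i` takes its argument. A portable alternative is sketched below, assuming only `grep`, `mktemp`, and `cat`. The function name `set_hosts_entry` is hypothetical, and the addresses in the example come from a documentation-only range, not from a real deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop any line mentioning a hostname from a hosts-style
# file, then append a fresh "<ip> <name>" entry. Safe to run repeatedly.
set_hosts_entry() {
  local file="$1" ip="$2" name="$3" tmp
  tmp="$(mktemp)" || return 1
  # Keep every line that does not mention the hostname.
  # grep exits non-zero when it keeps nothing, which is not an error here.
  grep -vF -- "$name" "$file" > "$tmp" || true
  printf '%s %s\n' "$ip" "$name" >> "$tmp"
  # Write back through the original path so owner and permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example on a scratch file (editing /etc/hosts itself requires root).
scratch="$(mktemp)"
printf '127.0.0.1 localhost\n203.0.113.5 runai.local\n' > "$scratch"
set_hosts_entry "$scratch" 203.0.113.10 runai.local
cat "$scratch"
rm -f "$scratch"
```

Trying it on a scratch copy first avoids accidental damage to the real hosts file.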
In the Run:ai UI:
- Go to Clusters → New Cluster
- Enter a cluster name
- Select "Run:ai control plane is on the same cluster"
- Enter cluster URL: https://runai.local
- Copy the Helm command and add --set global.customCA.enabled=true:
  helm upgrade -i runai-cluster runai/runai-cluster -n runai \
    --set controlPlane.url=runai.local \
    --set controlPlane.clientSecret=<FROM_UI> \
    --set cluster.uid=<FROM_UI> \
    --set cluster.url=runai.local \
    --version="<FROM_UI>" \
    --create-namespace \
    --set global.customCA.enabled=true # <-- ADD THIS LINE
- Watch pods come up:
  watch -n 2 'kubectl get pods -n runai'
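Instead of watching the pod list by eye, the number of pods that have not settled can be counted from `kubectl get pods --no-headers` output. This is a rough sketch with a hypothetical helper name, `count_pending_pods`. It only looks at the STATUS column, so a pod that is Running but not yet Ready still counts as settled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE) and print how many pods are
# neither Running nor Completed.
count_pending_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage against a live cluster:
#   kubectl get pods -n runai --no-headers | count_pending_pods
```

A result of 0 suggests every pod in the namespace has at least started.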
To add another worker node:
- Add to config.yaml:
  runai-worker-03:
    ansible_host: <new-ip>
    node_role: worker
- Generate a fresh join token:
  kubeadm token create --print-join-command | sudo tee /root/kubeadm_join.sh
- Run the deployment for just the new worker:
  ansible-playbook -i config.yaml deployment/bring_up_cluster.yaml --limit runai-worker-03
- Restart Run:ai to detect new GPUs:
  kubectl rollout restart deployment -n runai runai-agent
  kubectl rollout restart deployment -n runai metrics-exporter
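The restart step above returns immediately, so it can help to wait for each rollout to finish. The sketch below wraps `kubectl rollout restart` and `kubectl rollout status`. The function name `restart_and_wait`, the `KUBECTL` override, and the 300-second timeout are assumptions for illustration; the deployment names are the ones used in the restart step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart each named deployment in a namespace and
# block until its rollout completes. The timeout is an assumed value.
# Set KUBECTL=echo to print the commands instead of running them.
restart_and_wait() {
  local ns="$1" runner="${KUBECTL:-kubectl}" d
  shift
  for d in "$@"; do
    $runner rollout restart "deployment/$d" -n "$ns" || return 1
    $runner rollout status "deployment/$d" -n "$ns" --timeout=300s || return 1
  done
}

# Usage against a live cluster:
#   restart_and_wait runai runai-agent metrics-exporter
```

Running it with `KUBECTL=echo` first shows exactly which commands would be issued.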