Run:ai on Brev

Ansible playbooks to deploy self-hosted Run:ai on a GPU-accelerated Kubernetes cluster on Brev instances.

Instance Selection

You need at least 24 total CPUs across your cluster. Example configurations:

| Setup   | Control Plane | Workers        | Total CPUs |
|---------|---------------|----------------|------------|
| 3 nodes | 1x 8 CPU      | 2x 8 CPU       | 24         |
| 2 nodes | 1x 12 CPU     | 1x 12 CPU      | 24         |
| Minimal | 1x 8 CPU      | 2x 8 CPU (GPU) | 24         |

Recommended: Use the same cloud platform and region for all nodes to ensure network compatibility.
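
Before moving on, it can help to confirm each instance actually reports the expected CPU count. A minimal check over SSH (the IPs below are placeholders, and it assumes the default ubuntu user):

# Print the CPU count reported by each node (replace the placeholder IPs)
for ip in <control-plane-ip> <worker-1-ip> <worker-2-ip>; do
  echo -n "$ip: "
  ssh ubuntu@"$ip" nproc
done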

Network Setup

Open these ports in Brev UI before setup:

| Node          | Ports         |
|---------------|---------------|
| Control Plane | 6443, 80, 443 |
| Workers       | 10250         |
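
After opening the ports, a quick reachability check can catch firewall mistakes early. A sketch using netcat (assumes nc is available and that you substitute the real IPs):

# From a worker (or your laptop), check the control plane's API and HTTPS ports
nc -vz <control-plane-ip> 6443
nc -vz <control-plane-ip> 443
# From the control plane, check a worker's kubelet port
nc -vz <worker-ip> 10250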

Quick Start

1. Install Dependencies (on Control Plane)

ssh ubuntu@<control-plane-ip>
sudo apt update && sudo apt install -y git ansible python3-pip
sudo snap install yq
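
A quick version check confirms everything landed on the PATH before continuing:

# Verify the required tooling installed correctly
git --version
ansible --version
yq --version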

2. Clone Repository

git clone https://github.com/chloecrozier/runai_on_brev.git
cd runai_on_brev/

3. Configure

cp config.yaml.example config.yaml
nano config.yaml

Fill in these required values:

Tip: Run brev shell <instance-name> then hostname -I to get internal IPs.

all:
  hosts:
    runai-control-plane:
      ansible_host: ""      # Control plane internal IP
      external_ip: ""       # Control plane external IP listed in the Brev UI
      node_role: control
    runai-worker-01:
      ansible_host: ""      # Worker internal IP
      node_role: worker
    runai-worker-02:
      ansible_host: ""      # Worker internal IP
      node_role: worker
  vars:
    runai_jfrog_token: ""   # Your JFrog token
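
Before deploying, it can help to confirm the file parses and nothing was left blank. A sketch using the snap-installed yq (v4 syntax):

# List the configured hosts and flag any with an empty ansible_host
yq '.all.hosts | keys' config.yaml
yq '.all.hosts[] | select(.ansible_host == "")' config.yaml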

4. Deploy

ansible-playbook -i config.yaml deployment/bring_up_cluster.yaml
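
If the playbook fails to reach a host, a connectivity check can separate SSH problems from playbook problems (this assumes key-based SSH as the ubuntu user):

# Confirm Ansible can reach every host defined in config.yaml
ansible -i config.yaml all -m ping -u ubuntu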

5. Access Run:ai UI

On your local computer, add the hosts entry (the sed -i '' syntax below is for macOS; on Linux, drop the empty quotes):

sudo sed -i '' '/runai.local/d' /etc/hosts && sudo bash -c 'echo "<EXTERNAL_IP> runai.local" >> /etc/hosts'

Then visit: https://runai.local

Default credentials: [email protected] / Abcd!234
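
If the page doesn't load, two quick checks from your local machine usually narrow it down (the -k flag skips TLS verification, assuming the control plane serves a certificate your machine does not yet trust):

# Confirm the hosts entry exists and the ingress answers
grep runai.local /etc/hosts
curl -k -s -o /dev/null -w '%{http_code}\n' https://runai.local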

6. Register the Cluster

In the Run:ai UI:

  1. Go to Clusters → New Cluster
  2. Enter a cluster name
  3. Select "Run:ai control plane is on the same cluster"
  4. Enter cluster URL: https://runai.local
  5. Copy the Helm command and add --set global.customCA.enabled=true:
helm upgrade -i runai-cluster runai/runai-cluster -n runai \
  --set controlPlane.url=runai.local \
  --set controlPlane.clientSecret=<FROM_UI> \
  --set cluster.uid=<FROM_UI> \
  --set cluster.url=runai.local \
  --version="<FROM_UI>" \
  --create-namespace \
  --set global.customCA.enabled=true  # <-- ADD THIS LINE
  6. Watch pods come up:
watch -n 2 'kubectl get pods -n runai'
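
Once the pods are running, you can confirm Kubernetes sees the GPUs. This sketch assumes the NVIDIA device plugin or GPU Operator exposes the standard nvidia.com/gpu resource:

# Show per-node GPU capacity as reported to Kubernetes
kubectl get nodes -o custom-columns='NODE:.metadata.name,GPUS:.status.capacity.nvidia\.com/gpu'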

Adding More Workers

  1. Add to config.yaml:

    runai-worker-03:
      ansible_host: <new-ip>
      node_role: worker
  2. Generate a fresh join token on the control plane:

    sudo kubeadm token create --print-join-command | sudo tee /root/kubeadm_join.sh
  3. Run the deployment for just the new worker:

    ansible-playbook -i config.yaml deployment/bring_up_cluster.yaml --limit runai-worker-03
  4. Restart Run:ai to detect new GPUs:

    kubectl rollout restart deployment -n runai runai-agent
    kubectl rollout restart deployment -n runai metrics-exporter
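
After the playbook and restarts finish, a quick check confirms the new worker joined and the Run:ai pods came back up:

# The new worker should appear as Ready and the runai pods should be Running
kubectl get nodes
kubectl get pods -n runai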
