Merged
119 changes: 119 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,119 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Repository Overview

This is a self-hosted infrastructure repository using Pulumi for Infrastructure-as-Code (IaC) and Ansible for K3s cluster bootstrapping. The primary use case is running a Foundry VTT server for D&D sessions, exposed via Cloudflare Tunnel with Zero Trust authentication.

## Development Commands

### Ansible (K3s Cluster Bootstrap)

```bash
# Bootstrap K3s cluster on the server
cd ansible
ansible-playbook playbook.yml -i inventory.yml -kK

# Note: May require export ANSIBLE_BECOME_EXE=sudo.ws due to Ansible issue #85837
export ANSIBLE_BECOME_EXE=sudo.ws
```

### Pulumi (Infrastructure Management)

```bash
# Navigate to pulumi directory
cd pulumi

# Install dependencies
npm install

# Preview infrastructure changes
pulumi preview --stack homelab

# Deploy infrastructure changes
pulumi up --stack homelab

# View current stack outputs
pulumi stack output --stack homelab

# Destroy infrastructure
pulumi destroy --stack homelab
```

### TypeScript Development

```bash
# Compile TypeScript
cd pulumi
npx tsc

# Type check
npx tsc --noEmit
```

## Architecture

### Dual Network Strategy

- **Cloudflare Tunnel**: Public access for specific applications (Foundry VTT) with Zero Trust authentication
- **Tailscale**: Private access for administrative tasks and all other services

### Infrastructure Components

The Pulumi code manages two main areas:

1. **Cloudflare Resources** (src/cloudflare-cloud.ts):
- Creates a Cloudflare Tunnel for secure public access
- Sets up DNS records pointing to the tunnel
- Configures Zero Trust Access Application with email-based policies
- Stores tunnel token as a Kubernetes Secret

2. **Kubernetes Resources**:
- **Cloudflared Deployment** (src/cloudflared.ts): Runs the Cloudflare tunnel daemon in K8s with 2 replicas
- **Foundry VTT Application** (src/foundry.ts): Complete application stack including:
- Secrets for Foundry credentials
- PersistentVolume backed by local hostPath on the node "new-bermuda"
- PersistentVolumeClaim requesting 25Gi
- Service exposing port 30000
- Deployment running felddy/foundryvtt:13 image
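
The storage pieces described above correspond roughly to the following manifests — a sketch assembled from the details in this file, not the actual objects, which are created programmatically in src/foundry.ts (the `foundry-data` names are illustrative):

```yaml
# Sketch of the PV/PVC pair described above; the real objects come from Pulumi.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: foundry-data   # illustrative name
spec:
  capacity:
    storage: 25Gi
  accessModes: ["ReadWriteOnce"]
  storageClassName: manual
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /home/jack/foundrydata
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["new-bermuda"]
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foundry-data   # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: manual
  resources:
    requests:
      storage: 25Gi
```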

### Configuration Flow

The main entry point (src/index.ts) orchestrates resource creation:
1. Loads configuration from Pulumi config (stack yaml files)
2. Creates Cloudflare resources and obtains tunnel token
3. Deploys cloudflared daemon to K8s with tunnel token
4. Deploys Foundry VTT application to K8s

Configuration is typed via src/types.ts and stored in Pulumi stack files (encrypted).
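
A simplified sketch of that orchestration — function and type names here are illustrative, not the repo's actual exports; see src/index.ts for the real wiring:

```typescript
// Sketch of the orchestration in src/index.ts; names are illustrative.
import * as pulumi from "@pulumi/pulumi";
import { createCloudflareResources } from "./cloudflare-cloud";
import { deployCloudflared } from "./cloudflared";
import { deployFoundry } from "./foundry";
import { HomelabConfig } from "./types";

// 1. Load typed configuration from the (encrypted) stack file
const config = new pulumi.Config().requireObject<HomelabConfig>("homelab");

// 2. Tunnel, DNS record, and Zero Trust app; yields the tunnel token
const { tunnelToken } = createCloudflareResources(config);

// 3. cloudflared daemon consumes the token via a Kubernetes Secret
deployCloudflared(tunnelToken);

// 4. Foundry VTT stack: secrets, PV/PVC, Service, Deployment
deployFoundry(config);
```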

### Ansible Structure

The Ansible playbook:
- Imports the k3s-io/k3s-ansible collection playbook for cluster setup
- Installs additional dependencies like Helm
- Uses inventory.yml to define the cluster (single server node: "new-bermuda")
- Automatically merges kubeconfig for kubectl access
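
The inventory roughly follows the k3s-io/k3s-ansible layout. A sketch, assuming that layout — the address and pinned version below are placeholders, the real values live in ansible/inventory.yml:

```yaml
# Sketch of ansible/inventory.yml; values are illustrative placeholders.
k3s_cluster:
  children:
    server:
      hosts:
        new-bermuda:
          ansible_host: 192.168.1.10   # placeholder address
  vars:
    k3s_version: v1.30.2+k3s1          # actual pin is in inventory.yml
```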

## Important Details

### Storage
- Foundry uses a local PersistentVolume at `/home/jack/foundrydata` on the node "new-bermuda"
- PV has node affinity to ensure it only runs on that specific node
- Uses manual storage class with Retain reclaim policy

### Networking
- Foundry is exposed on NodePort 30000 internally
- Cloudflare tunnel routes `foundry.<domain>` to `http://foundry:30000`
- Zero Trust policy allows access only to whitelisted email addresses
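
Conceptually, the tunnel routing corresponds to a cloudflared ingress rule like the one below — illustrative only, since this repo configures the tunnel remotely via Pulumi and a tunnel token rather than a local config file:

```yaml
# Equivalent cloudflared config.yml ingress rules (illustrative)
ingress:
  - hostname: foundry.<domain>
    service: http://foundry:30000
  - service: http_status:404   # catch-all rule required by cloudflared
```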

### K3s Configuration
- Single server node setup (can be extended to include agents)
- K3s version specified in ansible/inventory.yml
- KUBECONFIG is automatically configured for the ansible_user
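
To use the cluster from a workstation, a common pattern is copying the K3s kubeconfig off the server and merging it locally — the hostname and local paths below are illustrative:

```bash
# Copy the K3s kubeconfig and point it at the server (paths illustrative)
scp new-bermuda:/etc/rancher/k3s/k3s.yaml ~/.kube/new-bermuda.yaml
sed -i 's/127.0.0.1/new-bermuda/' ~/.kube/new-bermuda.yaml

# Merge into the default kubeconfig
export KUBECONFIG=~/.kube/config:~/.kube/new-bermuda.yaml
kubectl config view --flatten > ~/.kube/merged && mv ~/.kube/merged ~/.kube/config
```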

### Git Branches
- Main branch: `main`
- Current working branch: `pulumi`
- Recent work involved migrating from Argo to pure Pulumi management
110 changes: 76 additions & 34 deletions README.md
@@ -3,66 +3,108 @@
- Seems to require `export ANSIBLE_BECOME_EXE=sudo.ws` due to [this issue](https://github.com/ansible/ansible/issues/85837)
- Run with `ansible-playbook playbook.yml -i inventory.yml -kK`, where `-k` prompts for the SSH password and `-K` for the become (sudo) password

## Argo
- Have to manually configure repo connection / secret
- Could bypass with [SealedSecrets](https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#repositories), but don't feel like it yet
# Repo Structure
- `ansible/` - Contains Ansible playbook to bootstrap K3s
- `pulumi/` - IaC for managing cloud & k8s resources

## Foundry Service
- Need to manually create foundry-creds secret
# Architecture Notes

## Cloudflared Service
- Need to manually create tunnel-token secret
## Diagram

## Tailscaled
- Had to manually add OAuth client ID / secret in Argo UI for the Helm chart
- Probably a better way to do this
```mermaid
flowchart TB
subgraph Internet["Internet"]
Users["D&D Players"]
end

# Repo Structure
- `ansible/` - Contains Ansible playbook to bootstrap K3s with ArgoCD onto a new machine
- `argo/` - Contains Argo resource definitions
- `argo/applications/` - Resource definition for Argo applications, mostly referencing the resources mirrored in the `k3s` directory
- `k3s/` - Kubernetes resource definitions, grouped into directory by their application
- `pulumi` - IaC for managing cloud resources - in my case, Cloudflare tunnels and zero-trust applications
subgraph Cloudflare["Cloudflare"]
DNS["DNS<br/>foundry.domain.com"]
ZT["Zero Trust<br/>Email Auth"]
Tunnel["Cloudflare Tunnel"]
end

subgraph Tailscale["Tailscale Network"]
TS_Client["Tailscale Client<br/>(Admin/Private Access)"]
end

subgraph Server["Server: new-bermuda"]
subgraph K3s["K3s Cluster"]
subgraph Deployments["Deployments"]
CFD["cloudflared<br/>(2 replicas)"]
Foundry["Foundry VTT<br/>:30000"]
Glance["Glance Dashboard<br/>:8080"]
end

subgraph Storage["Storage"]
PV["PersistentVolume<br/>/home/jack/foundrydata"]
end

subgraph Secrets["Secrets"]
TunnelToken["tunnel-token"]
FoundryCreds["foundry-creds"]
end
end
end

subgraph Management["Management Tools"]
Ansible["Ansible<br/>K3s Bootstrap"]
Pulumi["Pulumi<br/>IaC"]
end

%% Internet flow
Users --> DNS
DNS --> ZT
ZT --> Tunnel
Tunnel --> CFD
CFD --> Foundry

%% Tailscale flow
TS_Client -.->|"Private Access"| Foundry
TS_Client -.->|"Private Access"| Glance

%% Internal connections
CFD --> TunnelToken
Foundry --> FoundryCreds
Foundry --> PV

%% Management
Ansible -.->|"Bootstrap"| K3s
Pulumi -.->|"Manage"| Cloudflare
Pulumi -.->|"Manage"| K3s
```

# Architecture Notes
## Networking
- Cloudflare for 'application' access - in my case, Foundry for D&D sessions
- Tailscale for everything else
- [Tailscale K8s operator pod](https://tailscale.com/kb/1236/kubernetes-operator#setup)

## Pulumi
- Used to manage Cloudflare resources
- Creates tunnel & DNS records
- Creates zero-trust application
- Configure and deploy ArgoCD helm chart
- Also creates Kubernetes resources, generally a file per application
- Effectively bootstraps the K8s cluster
- There is some overlap between Pulumi and Argo, as the Cloudflare resource creates a k8s secret to be used by the cloudflared deployment

## ArgoCD
- Continuous delivery of k8s resources, with the repo as source-of-truth
- Cloudflared tunnel deployment
- NOTE: in the future, probably will get more hardware
- perhaps a stack per machine? That may not make sense, though, if a cluster is machine-agnostic

# TODO
## Infra
- [x] Write Ansible playbook to bootstrap a server
- [Ref](https://www.reddit.com/r/selfhosted/s/ryBd8BYD8Y)
- [x] K3s
- [x] Set up ArgoCD in declarative manner
- https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/
- [x] Apply w/ Ansible during bootstrap?
- [ ] Set up Tailscale w/ Argo in K8s cluster
- [X] Service annotation
- [ ] MFA
- [x] Set up Tailscale w/ Pulumi in K8s cluster
- [x] Service annotation
- [x] MFA
- [ ] Add server itself to [Tailscale](https://login.tailscale.com/admin/machines/new-linux)
- [x] Make Argo available to Tailscale
- [ ] Fix HTTPS
- [ ] Have Argo [manage itself](https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#manage-argo-cd-using-argo-cd)
- [ ] [Tailnet Lock](https://tailscale.com/kb/1226/tailnet-lock)
- [X] ==Migrate to Pulumi from Argo==
- [ ] ==Split Pulumi stacks out into separate stacks==

## Docs
- [X] Document repo structure in README
- [ ] Make network/arch diagram
- [x] Make network/arch diagram
- [ ] Update board
- [ ] Sanitize and make repo public

## Hardware
- [ ] Get a NAS for backups
13 changes: 0 additions & 13 deletions ansible/playbook.yml
@@ -12,16 +12,3 @@
ansible.builtin.apt:
name: helm
state: present
- name: Install ArgoCD
hosts: k3s_cluster
tasks:
- name: Add chart repo
kubernetes.core.helm_repository:
repo_url: "https://argoproj.github.io/argo-helm"
name: argo
- name: Install ArgoCD Helm Chart
kubernetes.core.helm:
name: argo
chart_ref: argo/argo-cd
release_namespace: argo-cd
create_namespace: true
29 changes: 0 additions & 29 deletions argo/applications/cloudflare.yml

This file was deleted.

29 changes: 0 additions & 29 deletions argo/applications/foundry.yml

This file was deleted.

16 changes: 0 additions & 16 deletions argo/applications/tailscale.yaml

This file was deleted.
