thetechhustle · bdorlus · May 18, 2026 · May 14, 2026
diff --git a/docs/course/capstone.md b/docs/course/capstone.md
@@ -1,22 +1,123 @@
 # Capstone: Operate a Small Linux Service
 
-The capstone turns the lessons into one realistic workflow.
+The capstone turns the lessons into one realistic workflow. Treat it like a
+small production change: plan first, work in a disposable environment, capture
+evidence as you go, and leave a handoff that another operator could use.
 
 ## Scenario
 
 You joined a small team that needs a Linux VM to host an internal status page. Your job is to build it, secure it, monitor it, document it, and prove it can recover from common failures.
 
+## Operating constraints
+
+- Use a lab VM, cloud trial instance, or local container that you can delete.
+- Do not run the capstone on a workstation, shared server, or irreplaceable VM.
+- Take a snapshot before the failure drill if your platform supports snapshots.
+- Keep notes in a text file as you work; do not rely on shell history alone.
+- Avoid storing real secrets, customer data, or personal credentials in evidence.
+
+## Suggested stack
+
+Pick one path and keep it simple:
+
+- Ubuntu Server or Debian with Nginx or Apache.
+- Rocky Linux, AlmaLinux, or Fedora Server with Nginx or Apache.
+- A local VM from VirtualBox, VMware, UTM, Multipass, or a short-lived cloud VM.
+
+The exact distribution matters less than the habits: inspect before changing,
+make one change at a time, verify it, and record what happened.
+
 ## Requirements
 
-- Provision a disposable Linux VM.
-- Create a non-root admin user with SSH access.
-- Install and run a web server.
-- Serve a static status page.
-- Configure a firewall so only SSH and HTTP/HTTPS are reachable.
-- Capture logs for the web service and SSH.
-- Add a basic monitoring check.
-- Simulate one failure and recover from it.
-- Write an operator handoff.
+- Provision a disposable Linux system and record how it was created.
+- Create a non-root admin user with SSH access and sudo rights.
+- Update package metadata and install a web server.
+- Serve a static status page with a hostname, owner, and last-updated note.
+- Configure a firewall so only SSH and HTTP or HTTPS are reachable.
+- Capture service logs and SSH authentication logs.
+- Add a basic monitoring check for the web service and disk space.
+- Simulate one failure, recover from it, and show before/after evidence.
+- Write an operator handoff with rollback and next maintenance steps.
+
+## Work phases
+
+### 1. Baseline the host
+
+Capture enough information to prove what system you started with:
+
+```bash
+hostnamectl
+uname -a
+ip address
+ip route
+df -h
+systemctl --failed
+```
+
+Also record the distribution release and its configured package sources:
+
+```bash
+cat /etc/os-release
+apt policy 2>/dev/null || dnf repolist 2>/dev/null || yum repolist 2>/dev/null
+```
+
+### 2. Build the service
+
+Install the web server, enable it, and verify it locally before opening the
+firewall:
+
+```bash
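+sudo systemctl enable --now nginx 2>/dev/null || sudo systemctl enable --now apache2 2>/dev/null || sudo systemctl enable --now httpd
+systemctl status --no-pager nginx 2>/dev/null || systemctl status --no-pager apache2 2>/dev/null || systemctl status --no-pager httpd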
+curl -I http://localhost/
+```
+
+Replace the default page with a small status page that includes the service
+purpose, operator contact, and update timestamp. Keep the page boring and easy
+to inspect.
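+
+A minimal sketch of generating such a page, assuming Bash. The function name
+`render_status_page`, the owner address, and the purpose text are placeholders
+invented for this example. It writes to a scratch file so you can inspect the
+result before copying it into your web root (often `/var/www/html` on
+Debian-family systems, but confirm the path in your server configuration):
+
+```bash
+# Print a small HTML status page to stdout.
+# Arguments: OWNER PURPOSE (values are not HTML-escaped; keep them plain text).
+render_status_page() {
+  local owner="$1" purpose="$2"
+  local host updated
+  host="$(uname -n)"
+  updated="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+  cat <<EOF
+<!doctype html>
+<html lang="en">
+<head><meta charset="utf-8"><title>Status: ${host}</title></head>
+<body>
+<h1>Status: ${host}</h1>
+<p>Purpose: ${purpose}</p>
+<p>Owner: ${owner}</p>
+<p>Last updated: ${updated}</p>
+</body>
+</html>
+EOF
+}
+
+# Render into a scratch file first; nothing is written to the web root here.
+scratch="$(mktemp)"
+render_status_page "ops@example.invalid" "Internal status page (lab)" > "$scratch"
+echo "Review $scratch, then copy it into your web root as index.html."
+```
+
+Regenerating the page after each change keeps the last-updated note honest
+without hand-editing HTML.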
+
+### 3. Secure the access path
+
+Show that access is intentional:
+
+```bash
+id
+sudo -l
+sudo ss -tulpn   # root is needed to see process names for other users' sockets
+sudo ufw status verbose 2>/dev/null || sudo firewall-cmd --list-all 2>/dev/null
+```
+
+Only SSH and the web service should be reachable unless your lab platform
+requires another management port. If you use a cloud VM, record both the Linux
+firewall state and the cloud security group or firewall rule summary.
+
+### 4. Add lightweight monitoring
+
+Monitoring can be a simple script, cron job, systemd timer, or manual check
+documented in the handoff. It should answer two questions:
+
+- Is the web service responding?
+- Is the host at risk from obvious resource pressure?
+
+Example checks:
+
+```bash
+curl -fsS http://localhost/ >/dev/null
+df -h /
+systemctl is-active nginx apache2 httpd  # exits 0 if at least one unit is active
+```
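+
+The checks above can be combined into one script, sketched here under a few
+assumptions: the function names, the `URL` and `DISK_LIMIT` variables, and the
+90% threshold are illustrative choices, not course requirements. The block
+only defines functions; call `run_checks` yourself from a shell, cron job, or
+systemd timer.
+
+```bash
+# Return 0 if the URL answers with a successful HTTP status within 5 seconds.
+check_web() {
+  curl -fsS --max-time 5 "$1" >/dev/null
+}
+
+# Print the used percentage of the filesystem holding $1 as a bare integer.
+disk_used_percent() {
+  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
+}
+
+# Run both checks, print one line per check, and return non-zero on any failure.
+run_checks() {
+  local url="${URL:-http://localhost/}"   # assumed default target
+  local limit="${DISK_LIMIT:-90}"         # assumed threshold, in percent
+  local status=0 used
+
+  if check_web "$url"; then
+    echo "OK   web responding at $url"
+  else
+    echo "FAIL web not responding at $url"
+    status=1
+  fi
+
+  used="$(disk_used_percent /)"
+  if [ "$used" -lt "$limit" ]; then
+    echo "OK   disk / at ${used}% (limit ${limit}%)"
+  else
+    echo "FAIL disk / at ${used}% (limit ${limit}%)"
+    status=1
+  fi
+
+  return "$status"
+}
+```
+
+A non-zero return from `run_checks` is the signal to alert on; the printed
+lines double as evidence for the handoff.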
+
+### 5. Run the failure drill
+
+Choose one reversible failure:
+
+- Stop the web service, detect the outage, and restart it.
+- Move the status page aside, detect the bad response, and restore it.
+- Fill a temporary directory on the monitored filesystem until your disk-space
+  check fires, then clean it up.
+
+Record the failure time, symptom, detection command, recovery command, and final
+proof that the service is healthy again.
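+
+One way to capture those fields as you go is a pair of small Bash helpers,
+sketched below. The names `note` and `run_and_note`, the `DRILL_LOG` variable,
+and the labels are hypothetical and exist only for this example; the commented
+usage assumes the Nginx stop/start drill.
+
+```bash
+# File that receives the drill notes; override DRILL_LOG to change it.
+DRILL_LOG="${DRILL_LOG:-$HOME/capstone-drill.log}"
+
+# Usage: note LABEL text...
+# Appends one UTC-timestamped line to the drill log.
+note() {
+  local label="$1"; shift
+  printf '%s  %-9s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$label" "$*" >> "$DRILL_LOG"
+}
+
+# Usage: run_and_note LABEL command...
+# Runs the command, records it with its exit code, and returns that exit code.
+run_and_note() {
+  local label="$1"; shift
+  local rc=0
+  "$@" || rc=$?
+  note "$label" "cmd='$*' exit=$rc"
+  return "$rc"
+}
+
+# Example drill (left commented out so nothing is stopped by accident):
+#   note SYMPTOM "stopping nginx to simulate an outage"
+#   run_and_note FAILURE sudo systemctl stop nginx
+#   run_and_note DETECT  curl -fsS -o /dev/null http://localhost/
+#   run_and_note RECOVER sudo systemctl start nginx
+#   run_and_note PROOF   curl -fsS -o /dev/null http://localhost/
+```
+
+The detection step is expected to log a non-zero exit code during the outage
+and the proof step a zero, which gives you the before/after evidence in one
+file.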
 
 ## Evidence to submit
 
@@ -27,6 +128,37 @@ You joined a small team that needs a Linux VM to host an internal status page. Y
 - Log excerpts that prove the service worked.
 - Handoff note with rollback and next maintenance step.
 
+## Handoff template
+
+Use this structure for the final operator note:
+
+```text
+Service:
+Host:
+Owner:
+Purpose:
+Build summary:
+Access method:
+Firewall policy:
+Monitoring check:
+Failure tested:
+Recovery steps:
+Rollback plan:
+Known risks:
+Next maintenance:
+Evidence location:
+```
+
+## Completion checks
+
+Before you call the capstone complete, verify these are true:
+
+- A fresh SSH session can log in with the non-root admin account.
+- The web page responds locally and from the expected network path.
+- The firewall state matches the intended exposure.
+- Logs show the service start, access test, failure, and recovery.
+- The handoff explains how to operate, recover, and safely retire the lab.
+
 ## Review rubric
 
 | Area | Good evidence |