Homelab Disaster Recovery in Under an Hour
Disk dies. Power surge fries the board. You fat-finger an rm -rf. Whatever the cause, the question is always the same: how fast can I get back to normal? For my homelab, the answer is under an hour — because I designed for exactly this scenario.
The Recovery Timeline
Here's how the hour breaks down:
- 0–15 min: Install Ubuntu Server, configure static IP, firewall, and install Docker
- 15–20 min: Clone the backup repo and verify inventory
- 20–30 min: Rehydrate secrets and inject into config files
- 30–40 min: Run compose-based restore (dry-run first, then apply)
- 40–55 min: Bring up services in dependency order
- 55–60 min: Validate, re-enable backup automation
Phase 1: Host Bootstrap
Start with a minimal Ubuntu Server install. The fresh-install runbook covers every step:
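# Post-install essentials
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git jq net-tools htop vim
# Static IP via netplan
sudo vim /etc/netplan/00-installer-config.yaml
sudo netplan apply
# Firewall: allow SSH before enabling so a remote session is not cut off
sudo ufw allow 22/tcp   # SSH
sudo ufw allow 80/tcp   # HTTP
sudo ufw allow 443/tcp  # HTTPS
sudo ufw allow 53/tcp   # DNS
sudo ufw allow 53/udp   # DNS
sudo ufw enable
# Docker: add the signing key and the apt repository, then install
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker $USER   # log out and back in for the group change to apply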
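Editing the netplan file by hand is the slowest part of this phase. A minimal sketch of generating it instead, assuming a single wired interface; the function name and every value in the example call (interface, address, gateway, DNS server) are illustrative assumptions, not taken from the runbook:

```shell
# Sketch: print a static-IP netplan config to stdout.
# Arguments: interface, address in CIDR form, default gateway, DNS server.
render_netplan() {
  local iface="$1" addr_cidr="$2" gateway="$3" dns="$4"
  cat <<EOF
network:
  version: 2
  ethernets:
    ${iface}:
      dhcp4: false
      addresses: [${addr_cidr}]
      routes:
        - to: default
          via: ${gateway}
      nameservers:
        addresses: [${dns}]
EOF
}

# Example (hypothetical values, not run here):
#   render_netplan enp3s0 192.168.1.10/24 192.168.1.1 1.1.1.1 \
#     | sudo tee /etc/netplan/00-installer-config.yaml > /dev/null
#   sudo netplan try   # rolls back automatically unless confirmed
```

Using `netplan try` rather than applying blindly gives a safety net if the new address makes the host unreachable.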
Phase 2: Clone + Secrets
# Clone backup repo
git clone https://github.com/AnasSemesmieh/homelab.git ~/homelab-backup
cd ~/homelab-backup
# Verify inventory
head -20 inventory/include-paths.txt
Then create the local secrets file (~/.homelab-secrets.env) with all API keys, passwords, and tokens. The SECRETS.md catalog documents every secret: where it's used, how to source it, format validation, and rotation procedures.
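Before injecting anything, it can help to confirm the secrets file is complete. The sketch below checks the file for non-empty `NAME=value` lines without sourcing it. The function name is hypothetical, and apart from `PLEX_ONLINE_TOKEN` the variable names in the example are placeholders; the real list would come from SECRETS.md.

```shell
# Sketch: print the name of every required secret that is absent or empty.
# Usage: missing_secrets <env-file> NAME [NAME...]
missing_secrets() {
  local file="$1" name
  shift
  for name in "$@"; do
    # Present means: NAME=<something>, optionally prefixed with "export"
    # and optionally quoted. NAME= and NAME="" both count as missing.
    if ! grep -Eq "^(export[[:space:]]+)?${name}=[\"']?[^\"'[:space:]]" "$file"; then
      printf '%s\n' "$name"
    fi
  done
}

# Example (second name is a placeholder, not run here):
#   missing_secrets ~/.homelab-secrets.env PLEX_ONLINE_TOKEN SOME_OTHER_KEY
```

An empty result means every listed secret has a value; any printed name needs to be filled in before moving on.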
# Inject secrets into config files
source ~/.homelab-secrets.env
sed -i "s|PlexOnlineToken=\"REDACTED\"|PlexOnlineToken=\"$PLEX_ONLINE_TOKEN\"|g" \
  configs/plex/config/Library/Application\ Support/Plex\ Media\ Server/Preferences.xml
# Verify all REDACTED values replaced
grep -r "REDACTED" configs/ && echo "ERROR" || echo "OK: All secrets injected"
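One caveat with the sed substitution: it uses `|` as the delimiter, so a secret containing `|`, `&`, or a backslash would corrupt the replacement. A hedged sketch of escaping the value first; both helper names are hypothetical, and the sketch assumes single-line secrets and the `KEY="REDACTED"` placeholder format shown above:

```shell
# Sketch: escape characters that are special in a sed replacement
# when | is the delimiter: backslash, |, and & (the whole match).
sed_escape_replacement() {
  printf '%s' "$1" | sed -e 's/[\\|&]/\\&/g'
}

# Sketch: replace KEY="REDACTED" with KEY="<value>" in a file.
# Writes to a temp file and moves it into place instead of using sed -i.
inject_secret() {
  local file="$1" key="$2" value
  value="$(sed_escape_replacement "$3")"
  sed "s|${key}=\"REDACTED\"|${key}=\"${value}\"|g" "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}

# Example (not run here):
#   inject_secret path/to/Preferences.xml PlexOnlineToken "$PLEX_ONLINE_TOKEN"
```

This does not escape characters that are special to the target file format itself, such as a double quote inside an XML attribute.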
Phase 3: Compose-Based Restore
This is the clever part. Instead of manually copying files, I built a Docker Compose restore helper that maps the backup into the right directories:
# Preview what will be copied (dry-run)
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=true restore-configs
# Apply (no overwrite, so existing files are left untouched)
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=false -e OVERWRITE=false restore-configs
The restore container mounts configs/ read-only and copies files to /home/anas/, preserving the directory structure. Each service gets its docker-compose.yml and config files placed exactly where they need to be.
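The restore script itself is not shown here, so the following is only a sketch of how a copy step honoring `DRY_RUN` and `OVERWRITE` could behave, not the actual helper. The function name and the defaults (dry-run on, overwrite off) are assumptions, and file names containing newlines are not handled.

```shell
# Sketch: copy every file under <src> to the same relative path under <dst>.
# DRY_RUN=true  -> only report what would happen (assumed default)
# OVERWRITE=true -> replace files that already exist (assumed default: false)
restore_tree() {
  local src="$1" dst="$2" rel target
  ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
    rel="${rel#./}"
    target="$dst/$rel"
    if [ -e "$target" ] && [ "${OVERWRITE:-false}" != "true" ]; then
      echo "skip (exists): $rel"
      continue
    fi
    if [ "${DRY_RUN:-true}" = "true" ]; then
      echo "would copy: $rel"
    else
      mkdir -p "$(dirname "$target")"
      cp -p "$src/$rel" "$target"
      echo "copied: $rel"
    fi
  done
}

# Example (not run here):
#   DRY_RUN=true OVERWRITE=false restore_tree ./configs /home/anas
```

Defaulting to a dry run means a forgotten flag produces a report rather than a write, which suits a procedure run under time pressure.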
Phase 4: Ordered Bring-up
Infrastructure first, then applications:
# 1. Traefik (reverse proxy + TLS)
cd ~/traefik && docker compose up -d
# 2. Pi-hole (DNS)
cd ~/pihole && docker compose up -d
# 3. Applications
cd ~/homepage && docker compose up -d
cd ~/plex && docker compose up -d
cd ~/tautulli && docker compose up -d
cd ~/arrstack && docker compose up -d
cd ~/immich && docker compose up -d
cd ~/homeassistant && docker compose up -d
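The same ordering can be wrapped in a loop that halts at the first failure, so a broken Traefik or Pi-hole does not get buried under application errors. A minimal sketch, assuming each service lives in a directory of the same name under `$HOME` as in the commands above; the function name is hypothetical:

```shell
# Sketch: start services in the order given, stopping at the first failure.
bring_up() {
  local svc
  for svc in "$@"; do
    echo "==> $svc"
    # Subshell keeps the cd from leaking into the caller's working directory.
    ( cd "$HOME/$svc" && docker compose up -d ) || {
      echo "failed to start $svc; stopping here" >&2
      return 1
    }
  done
}

# Example (not run here):
#   bring_up traefik pihole homepage plex tautulli arrstack immich homeassistant
```

`docker compose up -d` returning success only means the containers were started, not that they are healthy, so validation is still a separate step.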
Phase 5: Validation
# All containers running?
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Service connectivity
docker exec traefik nslookup pihole
docker exec homepage curl -s http://traefik:8080/ping
# Re-enable backup automation
crontab -e # Add daily + weekly cron jobs
cd ~/homelab-backup && ./scripts/backup-and-push.sh # Test first run
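Scanning the `docker ps` table by eye is easy to get wrong with a dozen containers. A small sketch that filters for anything not cleanly up; the function name is hypothetical, and it assumes the usual status strings such as `Up 2 minutes (healthy)`, `Up 3 minutes (unhealthy)`, and `Exited (1) 3 minutes ago`:

```shell
# Sketch: read "<name><TAB><status>" lines on stdin and print the names of
# containers that are not up, are unhealthy, or are still starting.
not_healthy() {
  awk -F '\t' '$2 !~ /^Up/ || $2 ~ /unhealthy|health: starting/ { print $1 }'
}

# Example (not run here):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | not_healthy
```

No output means every container reports as up and, where a health check is defined, healthy.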
Restoring from a Specific Snapshot
If I need a point-in-time restore instead of latest:
# List available snapshots
git tag -l "backup-*"
# Checkout specific snapshot
git checkout tags/backup-2026-05-09
# Restore from that snapshot
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=false -e OVERWRITE=true restore-configs
# Return to latest
git checkout main
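When the goal is "the state just before something broke", picking the tag can be automated. A sketch that selects the newest snapshot on or before a given date, assuming every tag follows the `backup-YYYY-MM-DD` pattern seen above so that plain string ordering matches date ordering; the function name is hypothetical:

```shell
# Sketch: read tag names on stdin and print the newest one dated on or
# before the given day. Prints nothing if no snapshot is old enough.
# Usage: git tag -l "backup-*" | snapshot_on_or_before YYYY-MM-DD
snapshot_on_or_before() {
  local want="backup-$1"
  LC_ALL=C sort | awk -v want="$want" '
    $0 <= want { best = $0 }
    END { if (best != "") print best }
  '
}

# Example (not run here):
#   tag="$(git tag -l "backup-*" | snapshot_on_or_before 2026-05-10)"
#   [ -n "$tag" ] && git checkout "tags/$tag"
```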
Rollback Procedures
The repo also supports rollbacks for individual services (revert a broken config change) and database state recovery for stateful services like Immich's PostgreSQL. Every scenario is documented in the runbook with exact commands.
Next: Secrets Management for Homelabs — the full lifecycle of redaction, scanning, and rehydration.