Homelab Disaster Recovery in Under an Hour

By Anas Semesmieh · May 7, 2026 · Homelab · Disaster Recovery

Disk dies. Power surge fries the board. You fat-finger an rm -rf. Whatever the cause, the question is always the same: how fast can I get back to normal? For my homelab, the answer is under an hour — because I designed for exactly this scenario.

Full runbooks: FRESH-INSTALL.md · RESTORE.md

The Recovery Timeline

The hour breaks down into five phases:

Phase 1: Host Bootstrap

Start with a minimal Ubuntu Server install. The fresh-install runbook covers every step:

# Post-install essentials
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git jq net-tools htop vim

# Static IP via netplan (edit the config, then apply it)
sudo vim /etc/netplan/00-installer-config.yaml
sudo netplan apply

# Firewall (add the allow rules first so enabling it cannot cut off your SSH session)
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 80/tcp    # HTTP
sudo ufw allow 443/tcp   # HTTPS
sudo ufw allow 53/tcp    # DNS
sudo ufw allow 53/udp    # DNS
sudo ufw enable

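# Docker (add the signing key and the apt repository, then install)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker $USER   # log out and back in for the group change to take effect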
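
Before moving on to Phase 2, it can be worth confirming that the bootstrap actually left the expected tools on the PATH. The following is a minimal sketch, not something taken from the runbook; the helper name missing_tools is hypothetical.

```shell
# Sketch: report any command that is not available on the PATH.
# missing_tools is a hypothetical helper, not part of the homelab repo.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  # Non-zero exit status if at least one tool was not found.
  return "$rc"
}

# Usage:
# missing_tools curl wget git jq htop vim docker ufw
```

The function only checks that each command resolves; it does not verify versions or that the Docker daemon is running.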

Phase 2: Clone + Secrets

# Clone backup repo
git clone https://github.com/AnasSemesmieh/homelab.git ~/homelab-backup
cd ~/homelab-backup

# Verify inventory
head -20 inventory/include-paths.txt

Then create the local secrets file (~/.homelab-secrets.env) with all API keys, passwords, and tokens. The SECRETS.md catalog documents every secret: where it's used, how to source it, format validation, and rotation procedures.
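
A quick pre-flight check can catch a missing or empty secret before anything is injected. This is a minimal sketch, assuming the secrets file consists of plain KEY=value lines that a shell can source; check_secrets is a hypothetical helper, and PLEX_ONLINE_TOKEN is the only variable name taken from this post.

```shell
# Sketch: fail early if any expected secret is missing or empty.
# check_secrets is a hypothetical helper; pass the env file path
# (containing a slash, e.g. ~/.homelab-secrets.env) and the variable names.
check_secrets() {
  env_file="$1"; shift
  missing=0
  for name in "$@"; do
    # Source the file in a subshell so secrets never enter the caller's shell.
    value="$( . "$env_file" >/dev/null 2>&1; eval "printf '%s' \"\${$name:-}\"" )"
    if [ -z "$value" ]; then
      echo "MISSING: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# check_secrets ~/.homelab-secrets.env PLEX_ONLINE_TOKEN
```

Only presence is checked here; format validation of each secret is left to whatever SECRETS.md specifies.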

# Inject secrets into config files
source ~/.homelab-secrets.env
sed -i "s|PlexOnlineToken=\"REDACTED\"|PlexOnlineToken=\"$PLEX_ONLINE_TOKEN\"|g" \
  configs/plex/config/Library/Application\ Support/Plex\ Media\ Server/Preferences.xml

# Verify all REDACTED values replaced
grep -r "REDACTED" configs/ && echo "ERROR" || echo "OK: All secrets injected"
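
Two details of the injection step are easy to trip over: a token that contains &, | or a backslash will be misread by sed, and the verification one-liner prints a message but gives a script nothing to branch on. The sketch below shows one way to handle both; the function names are hypothetical and this is not the script from the repo.

```shell
# Sketch: escape a value so it is safe as the replacement in sed "s|...|...|".
# Backslash, & and the | delimiter each get a leading backslash.
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Sketch: return non-zero and list the offending files if any
# REDACTED placeholder is still present under the given directory.
assert_no_placeholders() {
  dir="$1"
  # -r recurses, -l prints file names only, so no config contents are echoed.
  leftovers="$(grep -rl "REDACTED" "$dir" 2>/dev/null || true)"
  if [ -n "$leftovers" ]; then
    echo "ERROR: placeholders remain in:" >&2
    echo "$leftovers" >&2
    return 1
  fi
  echo "OK: All secrets injected"
}

# Usage:
# safe_token="$(escape_sed_replacement "$PLEX_ONLINE_TOKEN")"
# assert_no_placeholders configs/ || exit 1
```

Values containing newlines are not handled by this escaping.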

Phase 3: Compose-Based Restore

This is the clever part. Instead of manually copying files, I built a Docker Compose restore helper that maps the backup into the right directories:

# Preview what will be copied (dry-run)
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=true restore-configs

# Apply (no overwrite — safe for existing files)
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=false -e OVERWRITE=false restore-configs

The restore container mounts configs/ read-only and copies files to /home/anas/, preserving the directory structure. Each service gets its docker-compose.yml and config files placed exactly where they need to be.
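
The restore script itself is not shown in this post, so the following is only a sketch of copy logic that would match the behaviour described: walk the source tree, skip existing files unless OVERWRITE=true, and only report what would happen while DRY_RUN=true. The function name restore_tree and the exact interpretation of the two variables are assumptions.

```shell
# Sketch of the copy logic a restore container could run.
# restore_tree and its handling of DRY_RUN / OVERWRITE are assumptions
# for illustration, not the actual script from the repo.
restore_tree() {
  src="$1"; dest="$2"
  ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
    rel="${rel#./}"
    target="$dest/$rel"
    # Never clobber an existing file unless explicitly asked to.
    if [ -e "$target" ] && [ "${OVERWRITE:-false}" != "true" ]; then
      echo "skip   $rel (exists)"
      continue
    fi
    # Default to dry-run so a bare call changes nothing.
    if [ "${DRY_RUN:-true}" = "true" ]; then
      echo "would copy $rel"
    else
      mkdir -p "$(dirname "$target")"
      cp -p "$src/$rel" "$target"
      echo "copied $rel"
    fi
  done
}

# Usage:
# DRY_RUN=true restore_tree configs /home/anas
```

Defaulting both variables to the safe value means a forgotten flag results in a preview rather than an overwrite.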

Phase 4: Ordered Bring-up

Infrastructure first, then applications:

# 1. Traefik (reverse proxy + TLS)
cd ~/traefik && docker compose up -d

# 2. Pi-hole (DNS)
cd ~/pihole && docker compose up -d

# 3. Applications
cd ~/homepage && docker compose up -d
cd ~/plex && docker compose up -d
cd ~/tautulli && docker compose up -d
cd ~/arrstack && docker compose up -d
cd ~/immich && docker compose up -d
cd ~/homeassistant && docker compose up -d
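
The same ordering can be expressed as a loop that stops at the first stack that fails to start, so a broken Traefik does not get buried under six more compose runs. This is a sketch; bring_up and the COMPOSE override variable are hypothetical names introduced here.

```shell
# Sketch: start stacks in the given order and stop at the first failure.
# COMPOSE is an overridable command (an assumption of this sketch) so the
# loop can be rehearsed without Docker, e.g. COMPOSE=echo.
bring_up() {
  base="$1"; shift
  for svc in "$@"; do
    echo "==> $svc"
    ( cd "$base/$svc" && ${COMPOSE:-docker compose} up -d ) || {
      echo "FAILED: $svc" >&2
      return 1
    }
  done
}

# Usage (same order as above):
# bring_up "$HOME" traefik pihole homepage plex tautulli arrstack immich homeassistant
```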

Phase 5: Validation

# All containers running?
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

# Service connectivity
docker exec traefik nslookup pihole
docker exec homepage curl -s http://traefik:8080/ping

# Re-enable backup automation
crontab -e  # Add daily + weekly cron jobs
./scripts/backup-and-push.sh  # Test first run
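
Reading the docker ps table by eye works for a handful of containers; for a scripted check, the status column can be filtered instead. A minimal sketch, assuming tab-separated name and status lines on stdin; report_unhealthy is a hypothetical helper.

```shell
# Sketch: read "name<TAB>status" lines on stdin and flag anything whose
# status does not start with "Up". Exits non-zero if any were flagged.
# Note: "Up ... (unhealthy)" still counts as running in this sketch.
report_unhealthy() {
  awk -F '\t' '
    $2 !~ /^Up/ { print "NOT RUNNING: " $1 " (" $2 ")"; bad = 1 }
    END { exit bad }
  '
}

# Usage (-a includes stopped containers):
# docker ps -a --format '{{.Names}}\t{{.Status}}' | report_unhealthy
```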

Restoring from a Specific Snapshot

If I need a point-in-time restore instead of latest:

# List available snapshots
git tag -l "backup-*"

# Checkout specific snapshot
git checkout tags/backup-2026-05-09

# Restore from that snapshot
docker compose -f restore/docker-compose.restore.yml run --rm \
  -e DRY_RUN=false -e OVERWRITE=true restore-configs

# Return to latest
git checkout main
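
When the exact snapshot date is not known, the tag list can be filtered for the newest snapshot on or before a given day. This sketch assumes every tag follows the backup-YYYY-MM-DD pattern shown above, which sorts correctly as plain text; snapshot_on_or_before is a hypothetical helper.

```shell
# Sketch: from "backup-YYYY-MM-DD" tags on stdin, print the newest one that
# is not later than the given date. Exits non-zero if none qualifies.
# Assumes tags carry no suffix after the date.
snapshot_on_or_before() {
  cutoff="backup-$1"
  LC_ALL=C sort | awk -v cutoff="$cutoff" '
    $0 <= cutoff { pick = $0 }
    END { if (pick != "") print pick; else exit 1 }
  '
}

# Usage:
# tag="$(git tag -l 'backup-*' | snapshot_on_or_before 2026-05-09)" && git checkout "tags/$tag"
```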

Rollback Procedures

The repo also supports rollbacks for individual services (revert a broken config change) and database state recovery for stateful services like Immich's PostgreSQL. Every scenario is documented in the runbook with exact commands.
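
The runbook's exact rollback commands are not reproduced here. As an illustration of the single-service case only, git can restore one directory from a tagged snapshot while leaving everything else on the current commit. The sketch assumes service configs live under configs/<service> in the backup repo, as in the Plex path earlier; rollback_service is a hypothetical helper.

```shell
# Sketch: restore one service's config directory from a tagged snapshot,
# leaving the rest of the working tree on its current commit.
# Files added after the snapshot are not deleted by this command.
rollback_service() {
  repo="$1"; tag="$2"; svc="$3"
  git -C "$repo" checkout "$tag" -- "configs/$svc"
}

# Usage:
# rollback_service ~/homelab-backup backup-2026-05-09 plex
```

The restored files still need to be copied back into place (for example with the restore helper from Phase 3) and the service restarted.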

The real value isn't the restore time — it's the confidence. Knowing I can rebuild everything from scratch means I experiment freely. Break things on Saturday, restore on Sunday.

Next: Secrets Management for Homelabs — the full lifecycle of redaction, scanning, and rehydration.