Management Scripts
Automated scripts for managing client infrastructure.
Prerequisites
Set SOPS Age key location (optional, scripts use default):
export SOPS_AGE_KEY_FILE="./keys/age-key.txt"
Note: The Hetzner API token is now automatically loaded from SOPS-encrypted secrets/shared.sops.yaml. No need to manually set HCLOUD_TOKEN.
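As a minimal sketch of the "optional, scripts use default" behavior, a script might resolve the key location as shown here. The function name resolve_age_key_file is hypothetical, and the default path is assumed to be the one from the export above; the real scripts may do this differently.

```shell
# Hypothetical helper: print the Age key path a script would use.
# Falls back to ./keys/age-key.txt when SOPS_AGE_KEY_FILE is unset or empty.
resolve_age_key_file() {
  printf '%s\n' "${SOPS_AGE_KEY_FILE:-./keys/age-key.txt}"
}

# Make the resolved path visible to sops and to child processes.
SOPS_AGE_KEY_FILE="$(resolve_age_key_file)"
export SOPS_AGE_KEY_FILE
```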
Scripts
1. Deploy Fresh Client
Purpose: Deploy a brand new client from scratch
Usage:
./scripts/deploy-client.sh <client_name>
What it does (automatically):
- Generates SSH key (if missing) - Unique per-client key pair
- Creates secrets file (if missing) - From template, opens in editor
- Provisions VPS server (if not exists)
- Sets up base system (Docker, Traefik)
- Deploys Authentik + Nextcloud
- Configures SSO integration automatically
Time: ~10-15 minutes
Example:
# Just run the script - it handles everything!
./scripts/deploy-client.sh newclient
# Script will:
# 1. Generate keys/ssh/newclient + keys/ssh/newclient.pub
# 2. Copy secrets/clients/template.sops.yaml → secrets/clients/newclient.sops.yaml
# 3. Open SOPS editor for you to customize secrets
# 4. Continue with deployment
Requirements:
- Client must be defined in tofu/terraform.tfvars
- SOPS Age key available at keys/age-key.txt (or set SOPS_AGE_KEY_FILE)
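The "if missing" checks for the SSH key and secrets file can be sketched roughly as follows. This only reports what is missing; it does not generate anything. The function name is hypothetical, and the paths are assumed from the example above rather than taken from deploy-client.sh itself.

```shell
# Sketch only: report which per-client files are still missing.
# $1 is the repository root, $2 is the client name.
# The body runs in a subshell, so its variables stay local.
missing_client_files() (
  root="$1"
  client="$2"
  [ -f "$root/keys/ssh/$client" ] || echo "ssh-key"
  [ -f "$root/secrets/clients/$client.sops.yaml" ] || echo "secrets-file"
  exit 0
)

# Example: list what a deployment of "newclient" would still need to create.
for item in $(missing_client_files . newclient); do
  echo "would create: $item"
done
```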
2. Rebuild Client
Purpose: Destroy and recreate a client's infrastructure from scratch
Usage:
./scripts/rebuild-client.sh <client_name>
What it does:
- Destroys existing infrastructure (asks for confirmation)
- Provisions new VPS server
- Sets up base system
- Deploys applications
- Configures SSO
Time: ~10-15 minutes
Example:
./scripts/rebuild-client.sh test
Warning: This is destructive - all data on the server will be lost!
3. Destroy Client
Purpose: Completely remove a client's infrastructure
Usage:
./scripts/destroy-client.sh <client_name>
What it does:
- Stops and removes all Docker containers
- Removes all Docker volumes
- Destroys VPS server via OpenTofu
- Removes DNS records
Time: ~2-3 minutes
Example:
./scripts/destroy-client.sh test
Warning: This is destructive and irreversible! All data will be lost.
Note: Secrets file is preserved after destruction.
Workflow Examples
Deploy a New Client (Fully Automated)
# 1. Add to terraform.tfvars
vim tofu/terraform.tfvars
# Add:
# newclient = {
# server_type = "cx22"
# location = "fsn1"
# subdomain = "newclient"
# apps = ["authentik", "nextcloud"]
# }
# 2. Deploy (script handles SSH key + secrets automatically)
./scripts/deploy-client.sh newclient
# That's it! Script will:
# - Generate SSH key if missing
# - Create secrets file from template if missing (opens editor)
# - Deploy everything
Test Changes (Rebuild)
# Make changes to Ansible roles/playbooks
# Test by rebuilding
./scripts/rebuild-client.sh test
# Verify changes worked
Clean Up
# Remove test infrastructure
./scripts/destroy-client.sh test
Script Output
All scripts provide:
- ✓ Colored output (green = success, yellow = warning, red = error)
- Progress indicators for each step
- Total time taken
- Service URLs and credentials
- Next steps guidance
Error Handling
Scripts will exit if:
- Required environment variables not set
- Secrets file doesn't exist
- Confirmation not provided (for destructive operations)
- Any command fails (set -e)
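A minimal sketch of such preflight checks, assuming helper names (require_env, require_file) that are illustrative and not taken from the scripts:

```shell
# Sketch only. A real script would typically also start with `set -e`
# so that any failing command aborts the run.

# Fail unless the named variable is exported and non-empty.
require_env() {
  if [ -z "$(printenv "$1")" ]; then
    echo "Error: $1 is not set" >&2
    return 1
  fi
}

# Fail unless the given file exists.
require_file() {
  if [ ! -f "$1" ]; then
    echo "Error: file not found: $1" >&2
    return 1
  fi
}

# Example use inside a script:
#   require_file "secrets/clients/$CLIENT_NAME.sops.yaml" || exit 1
```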
Safety Features
Destroy Script
- Requires typing client name to confirm
- Shows what will be deleted
- Preserves secrets file
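The "type the client name to confirm" check could look roughly like this. It is a sketch: the prompt wording and the function name confirm_by_name are assumptions, not the actual implementation in destroy-client.sh.

```shell
# Sketch only: succeed only if the line read from stdin equals the client name.
# The body runs in a subshell, so its variables stay local.
confirm_by_name() (
  client="$1"
  printf 'Type "%s" to confirm destruction: ' "$client" >&2
  # On EOF, read fails and answer stays empty, so confirmation is refused.
  IFS= read -r answer || true
  [ "$answer" = "$client" ]
)

# Example use inside a script:
#   confirm_by_name "$CLIENT_NAME" || { echo "Aborted." >&2; exit 1; }
```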
Rebuild Script
- Asks for confirmation before destroying
- 10-second delay after destroy before rebuilding
- Shows existing infrastructure before proceeding
Deploy Script
- Checks for existing infrastructure
- Skips provisioning if server exists
- Validates secrets file exists
Integration with CI/CD
These scripts can be used in automation:
# Non-interactive deployment
export SOPS_AGE_KEY_FILE="..."
./scripts/deploy-client.sh production
For rebuild, skipping the confirmation requires first modifying rebuild-client.sh to accept a --yes flag:
# After modifying rebuild-client.sh to accept a --yes flag
./scripts/rebuild-client.sh production --yes
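One way such a --yes flag might be parsed is sketched below. The function and variable names (parse_rebuild_args, CLIENT_NAME, ASSUME_YES) are hypothetical; this illustrates the suggested modification and does not describe how rebuild-client.sh currently behaves.

```shell
# Sketch only: parse "<client_name> [--yes]" with the flag in any position.
# Sets CLIENT_NAME and ASSUME_YES; returns 2 on invalid arguments.
parse_rebuild_args() {
  CLIENT_NAME=""
  ASSUME_YES=0
  for arg in "$@"; do
    case "$arg" in
      --yes) ASSUME_YES=1 ;;
      -*) echo "Unknown option: $arg" >&2; return 2 ;;
      *) CLIENT_NAME="$arg" ;;
    esac
  done
  if [ -z "$CLIENT_NAME" ]; then
    echo "Usage: rebuild-client.sh <client_name> [--yes]" >&2
    return 2
  fi
}

# Example use inside a script:
#   parse_rebuild_args "$@" || exit $?
#   [ "$ASSUME_YES" -eq 1 ] || ask_for_confirmation   # hypothetical prompt function
```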
Troubleshooting
Script fails with "HCLOUD_TOKEN not set"
The token should be automatically loaded from SOPS. If this fails:
- Ensure SOPS Age key is available:
export SOPS_AGE_KEY_FILE="./keys/age-key.txt"
ls -la keys/age-key.txt
- Verify token is in shared secrets:
sops -d secrets/shared.sops.yaml | grep hcloud_token
- Manually load secrets:
source scripts/load-secrets-env.sh
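If the token needs to be exported by hand, the value can be pulled out of the decrypted YAML along these lines. This is a sketch that assumes hcloud_token is a flat top-level key in secrets/shared.sops.yaml; the function name is hypothetical, and load-secrets-env.sh may extract it differently.

```shell
# Sketch only: read decrypted YAML on stdin and print the value of a
# top-level "hcloud_token:" key, with any quote characters removed.
extract_hcloud_token() {
  sed -n 's/^hcloud_token:[[:space:]]*//p' | tr -d "\"'" | head -n 1
}

# Example use (requires sops and the Age key):
#   HCLOUD_TOKEN="$(sops -d secrets/shared.sops.yaml | extract_hcloud_token)"
#   export HCLOUD_TOKEN
```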
Script fails with "Secrets file not found"
Create the secrets file:
cp secrets/clients/template.sops.yaml secrets/clients/<client>.sops.yaml
sops secrets/clients/<client>.sops.yaml
Server not reachable during destroy
This is normal if server is already destroyed. The script will skip Docker cleanup and proceed to OpenTofu destroy.
OpenTofu state conflicts
If multiple people are managing infrastructure:
cd tofu
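# Save a copy of the current state (the file name is only an example)
tofu state pull > current.tfstate
# Push a state file back; tofu state push requires a path argument
tofu state push current.tfstate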
Consider using remote state (S3, Terraform Cloud, etc.)
Performance
Typical timings:
| Operation | Time |
|---|---|
| Deploy fresh | 10-15 min |
| Rebuild | 10-15 min |
| Destroy | 2-3 min |
Breakdown:
- Infrastructure provisioning: 2 min
- Server initialization: 1 min
- Base system setup: 3 min
- Application deployment: 5-7 min
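As a quick check, the phases in the breakdown add up to 11-13 minutes, which falls inside the 10-15 minute range given for a fresh deploy:

```shell
# Sum the per-phase estimates (in minutes) from the breakdown above.
min_total=$((2 + 1 + 3 + 5))
max_total=$((2 + 1 + 3 + 7))
echo "Estimated total: ${min_total}-${max_total} min"
```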
See Also
- AUTOMATION_STATUS.md - Full automation details
- sso-automation.md - SSO integration workflow
- architecture-decisions.md - Design decisions