diff --git a/docs/maintenance-tracking.md b/docs/maintenance-tracking.md new file mode 100644 index 0000000..b48314f --- /dev/null +++ b/docs/maintenance-tracking.md @@ -0,0 +1,477 @@ +# Maintenance and Version Tracking + +Comprehensive guide to tracking software versions, maintenance history, and detecting version drift across all deployed clients. + +## Overview + +The infrastructure tracks: +- **Software versions** - Authentik, Nextcloud, Traefik, Ubuntu +- **Maintenance dates** - Last update, security patches, OS updates +- **Version drift** - Clients running different versions +- **Update history** - Audit trail of changes + +All version and maintenance data is stored in [`clients/registry.yml`](../clients/registry.yml). + +## Registry Structure + +Each client tracks versions and maintenance: + +```yaml +clients: + myclient: + versions: + authentik: "2025.10.3" + nextcloud: "30.0.17" + traefik: "v3.0" + ubuntu: "24.04" + + maintenance: + last_full_update: 2026-01-17 + last_security_patch: 2026-01-17 + last_os_update: 2026-01-17 + last_backup_verified: null +``` + +## Version Management Scripts + +### Collect Client Versions + +Query actual deployed versions from a running server: + +```bash +# Collect versions from dev client +./scripts/collect-client-versions.sh dev +``` + +This script: +- Connects to the server via Ansible +- Queries Docker container image tags +- Queries Ubuntu OS version +- Updates the registry automatically + +**Output:** +``` +Collecting versions for client: dev + +Querying deployed versions... +Collecting Docker container versions... 
+✓ Versions collected + +Collected versions: + Authentik: 2025.10.3 + Nextcloud: 30.0.17 + Traefik: v3.0 + Ubuntu: 24.04 + +✓ Registry updated +``` + +**Requirements:** +- Server must be deployed and reachable +- `HCLOUD_TOKEN` environment variable set +- Ansible configured with dynamic inventory + +### Check All Client Versions + +Compare versions across all clients: + +```bash +# Default: Table format with color coding +./scripts/check-client-versions.sh + +# Export as CSV +./scripts/check-client-versions.sh --format=csv + +# Export as JSON +./scripts/check-client-versions.sh --format=json + +# Show only clients with outdated versions +./scripts/check-client-versions.sh --outdated +``` + +**Table output:** +``` +═══════════════════════════════════════════════════════════════════════════════ + CLIENT VERSION REPORT +═══════════════════════════════════════════════════════════════════════════════ + +CLIENT STATUS AUTHENTIK NEXTCLOUD TRAEFIK UBUNTU +────────────────────────────────────────────────────────────────────────────────────────────── +dev deployed 2025.10.3 30.0.17 v3.0 24.04 +client1 deployed 2025.10.2 30.0.16 v3.0 24.04 + +Latest versions: + Authentik: 2025.10.3 + Nextcloud: 30.0.17 + Traefik: v3.0 + Ubuntu: 24.04 + +Note: Red indicates outdated version +``` + +**CSV output:** +```csv +client,status,authentik,nextcloud,traefik,ubuntu,last_update,outdated +dev,deployed,2025.10.3,30.0.17,v3.0,24.04,2026-01-17,no +client1,deployed,2025.10.2,30.0.16,v3.0,24.04,2026-01-10,yes +``` + +**JSON output:** +```json +{ + "latest_versions": { + "authentik": "2025.10.3", + "nextcloud": "30.0.17", + "traefik": "v3.0", + "ubuntu": "24.04" + }, + "clients": [ + { + "name": "dev", + "status": "deployed", + "versions": { + "authentik": "2025.10.3", + "nextcloud": "30.0.17", + "traefik": "v3.0", + "ubuntu": "24.04" + }, + "last_update": "2026-01-17", + "outdated": false + } + ] +} +``` + +### Detect Version Drift + +Identify clients with outdated versions: + +```bash +# 
Default: Check all deployed clients +./scripts/detect-version-drift.sh + +# Check clients not updated in 30+ days +./scripts/detect-version-drift.sh --threshold=30 + +# Check specific application only +./scripts/detect-version-drift.sh --app=authentik + +# Summary output for monitoring +./scripts/detect-version-drift.sh --format=summary +``` + +**Output when drift detected:** +``` +⚠ VERSION DRIFT DETECTED + +Clients with outdated versions: + +• client1 + Authentik: 2025.10.2 → 2025.10.3 + Nextcloud: 30.0.16 → 30.0.17 + +• client2 + Last update: 2025-12-15 (>30 days ago) + +Recommended actions: + +1. Test updates on canary server first: + ./scripts/rebuild-client.sh dev + +2. Verify canary health: + ./scripts/client-status.sh dev + +3. Update outdated clients: + ./scripts/rebuild-client.sh client1 + ./scripts/rebuild-client.sh client2 +``` + +**Exit codes:** +- `0` - No drift detected (all clients up to date) +- `1` - Drift detected (action needed) +- `2` - Error (script failure) + +**Summary format** (useful for monitoring): +``` +Status: DRIFT DETECTED +Drift: Yes +Clients checked: 5 +Clients with outdated versions: 2 +Clients not updated in 30 days: 1 +Affected clients: client1 client2 +``` + +## Automatic Version Collection + +Version collection is **automatically performed** after deployments: + +### On New Deployment + +`./scripts/deploy-client.sh myclient`: +1. Provisions infrastructure +2. Deploys applications +3. Updates registry with server info +4. **Collects and records versions** ← Automatic + +### On Rebuild + +`./scripts/rebuild-client.sh myclient`: +1. Destroys old infrastructure +2. Provisions new infrastructure +3. Deploys applications +4. Updates registry +5. 
**Collects and records versions** ← Automatic + +If automatic collection fails (server not ready, network issue): +``` +⚠ Could not collect versions automatically +Run manually later: ./scripts/collect-client-versions.sh myclient +``` + +## Maintenance Workflows + +### Security Update Workflow + +1. **Check current state** + ```bash + ./scripts/check-client-versions.sh + ``` + +2. **Update canary first** (dev server) + ```bash + ./scripts/rebuild-client.sh dev + ``` + +3. **Verify canary** + ```bash + # Check health + ./scripts/client-status.sh dev + + # Verify versions updated + ./scripts/collect-client-versions.sh dev + ``` + +4. **Detect drift** (identify outdated clients) + ```bash + ./scripts/detect-version-drift.sh + ``` + +5. **Roll out to production** + ```bash + # Update each client + ./scripts/rebuild-client.sh client1 + ./scripts/rebuild-client.sh client2 + + # Or batch update (be careful!) + for client in $(./scripts/list-clients.sh --role=production --format=csv | tail -n +2 | cut -d, -f1); do + ./scripts/rebuild-client.sh "$client" + sleep 300 # Wait 5 minutes between updates + done + ``` + +6. **Verify all updated** + ```bash + ./scripts/detect-version-drift.sh + ``` + +### Monthly Maintenance Check + +Run these checks monthly: + +```bash +# 1. Version report +./scripts/check-client-versions.sh > reports/versions-$(date +%Y-%m).txt + +# 2. Drift detection +./scripts/detect-version-drift.sh --threshold=30 + +# 3. Client health +for client in $(./scripts/list-clients.sh --status=deployed --format=csv | tail -n +2 | cut -d, -f1); do + ./scripts/client-status.sh "$client" +done +``` + +### Update Maintenance Dates + +Deployment scripts automatically update `last_full_update`. 
For other maintenance: + +```bash +# After security patches (OS level) +yq eval -i ".clients.myclient.maintenance.last_security_patch = \"$(date +%Y-%m-%d)\"" clients/registry.yml + +# After OS updates +yq eval -i ".clients.myclient.maintenance.last_os_update = \"$(date +%Y-%m-%d)\"" clients/registry.yml + +# After backup verification +yq eval -i ".clients.myclient.maintenance.last_backup_verified = \"$(date +%Y-%m-%d)\"" clients/registry.yml + +# Commit changes +git add clients/registry.yml +git commit -m "chore: Update maintenance dates" +git push +``` + +## Integration with Monitoring + +### Continuous Drift Detection + +Set up a cron job or CI pipeline: + +```bash +#!/bin/bash +# check-drift.sh - Run daily + +cd /path/to/infrastructure + +# Check for drift +if ! ./scripts/detect-version-drift.sh --format=summary; then + # Send alert (Slack, email, etc.) + ./scripts/detect-version-drift.sh | mail -s "Version Drift Detected" ops@example.com +fi +``` + +### Export for External Tools + +```bash +# Export version data as JSON for monitoring tools +./scripts/check-client-versions.sh --format=json > /var/monitoring/client-versions.json + +# Export drift status +./scripts/detect-version-drift.sh --format=summary > /var/monitoring/drift-status.txt +``` + +### Prometheus Metrics + +Convert to Prometheus format: + +```bash +#!/bin/bash +# export-metrics.sh + +# Count clients by drift status +total=$(./scripts/list-clients.sh --status=deployed --format=csv | tail -n +2 | wc -l) +outdated=$(./scripts/check-client-versions.sh --format=csv --outdated | tail -n +2 | wc -l) +uptodate=$((total - outdated)) + +echo "# HELP clients_total Total number of deployed clients" +echo "# TYPE clients_total gauge" +echo "clients_total $total" + +echo "# HELP clients_outdated Number of clients with outdated versions" +echo "# TYPE clients_outdated gauge" +echo "clients_outdated $outdated" + +echo "# HELP clients_uptodate Number of clients with latest versions" +echo "# TYPE clients_uptodate 
gauge" +echo "clients_uptodate $uptodate" +``` + +## Version Pinning + +To prevent automatic updates, pin versions in Ansible roles: + +```yaml +# roles/authentik/defaults/main.yml +authentik_version: "2025.10.3" # Pinned version + +# To update: +# 1. Change pinned version +# 2. Update canary: ./scripts/rebuild-client.sh dev +# 3. Verify and roll out +``` + +## Troubleshooting + +### Version Collection Fails + +**Problem:** `collect-client-versions.sh` cannot reach server + +**Solutions:** +1. Check server is deployed and running: + ```bash + ./scripts/client-status.sh myclient + ``` + +2. Verify HCLOUD_TOKEN is set: + ```bash + echo $HCLOUD_TOKEN + ``` + +3. Test Ansible connectivity: + ```bash + cd ansible + ansible -i hcloud.yml myclient -m ping + ``` + +4. Check Docker containers are running: + ```bash + ansible -i hcloud.yml myclient -m shell -a "docker ps" + ``` + +### Incorrect Version Reported + +**Problem:** Registry shows wrong version + +**Solutions:** +1. Re-collect versions manually: + ```bash + ./scripts/collect-client-versions.sh myclient + ``` + +2. Verify Docker images: + ```bash + ansible -i hcloud.yml myclient -m shell -a "docker images" + ``` + +3. Check container inspect: + ```bash + ansible -i hcloud.yml myclient -m shell -a "docker inspect authentik-server | jq '.[0].Config.Image'" + ``` + +### Version Drift False Positives + +**Problem:** Drift detected for canary with intentionally different version + +**Solution:** Use `--app` filter to check specific applications: +```bash +# Check only production-critical apps +./scripts/detect-version-drift.sh --app=authentik +./scripts/detect-version-drift.sh --app=nextcloud +``` + +## Best Practices + +1. **Always test on canary first** + - Update `dev` client before production + - Verify health before wider rollout + +2. **Stagger production updates** + - Don't update all clients simultaneously + - Wait 5-10 minutes between updates + - Monitor each update for issues + +3. 
**Track maintenance in registry** + - Keep `last_full_update` current + - Record `last_security_patch` dates + - Document backup verification + +4. **Regular drift checks** + - Run weekly: `detect-version-drift.sh` + - Address drift within 7 days + - Maintain version consistency + +5. **Document version changes** + - Add notes to registry when pinning versions + - Commit registry changes with descriptive messages + - Track major version upgrades separately + +6. **Automate reporting** + - Export weekly version reports + - Alert on drift detection + - Dashboard for version overview + +## Related Documentation + +- [Client Registry](client-registry.md) - Registry system overview +- [Deployment Guide](deployment.md) - Deployment procedures +- [SSH Key Management](ssh-key-management.md) - Security and access diff --git a/scripts/check-client-versions.sh b/scripts/check-client-versions.sh new file mode 100755 index 0000000..bd8128d --- /dev/null +++ b/scripts/check-client-versions.sh @@ -0,0 +1,251 @@ +#!/usr/bin/env bash +# +# Report software versions across all clients +# +# Usage: ./scripts/check-client-versions.sh [options] +# +# Options: +# --format=table Show as colorized table (default) +# --format=csv Export as CSV +# --format=json Export as JSON +# --app= Filter by application (authentik|nextcloud|traefik|ubuntu) +# --outdated Show only clients with outdated versions + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +# Script directory +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_ROOT="$(dirname "$SCRIPT_DIR")" +REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml" + +# Default options +FORMAT="table" +FILTER_APP="" +SHOW_OUTDATED=false + +# Parse arguments +for arg in "$@"; do + case $arg in + --format=*) + FORMAT="${arg#*=}" + ;; + --app=*) + FILTER_APP="${arg#*=}" + ;; + --outdated) + SHOW_OUTDATED=true + ;; + *) + echo 
"Unknown option: $arg" + echo "Usage: $0 [--format=table|csv|json] [--app=] [--outdated]" + exit 1 + ;; + esac +done + +# Check if yq is available +if ! command -v yq &> /dev/null; then + echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}" + exit 1 +fi + +# Check if registry exists +if [ ! -f "$REGISTRY_FILE" ]; then + echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}" + exit 1 +fi + +# Get list of clients +CLIENTS=$(yq eval '.clients | keys | .[]' "$REGISTRY_FILE" 2>/dev/null) + +if [ -z "$CLIENTS" ]; then + echo -e "${YELLOW}No clients found in registry${NC}" + exit 0 +fi + +# Determine latest versions (highest version recorded across all clients, ignoring null/unknown) +declare -A LATEST_VERSIONS +LATEST_VERSIONS[authentik]=$(yq eval '.clients | to_entries | .[].value.versions.authentik' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[nextcloud]=$(yq eval '.clients | to_entries | .[].value.versions.nextcloud' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[traefik]=$(yq eval '.clients | to_entries | .[].value.versions.traefik' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[ubuntu]=$(yq eval '.clients | to_entries | .[].value.versions.ubuntu' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) + +# Function to check if version is outdated +is_outdated() { + local app=$1 + local version=$2 + local latest=${LATEST_VERSIONS[$app]} + + if [ "$version" != "$latest" ] && [ "$version" != "null" ] && [ "$version" != "unknown" ]; then + return 0 + else + return 1 + fi +} + +case $FORMAT in + table) + echo -e "${BLUE}═══════════════════════════════════════════════════════════════════════════════${NC}" + echo -e "${BLUE} CLIENT VERSION REPORT${NC}" + echo -e "${BLUE}═══════════════════════════════════════════════════════════════════════════════${NC}" + echo "" + + # Header + printf "${CYAN}%-15s %-15s %-15s %-15s %-15s %-15s${NC}\n" \ + "CLIENT" "STATUS" "AUTHENTIK" "NEXTCLOUD" "TRAEFIK" "UBUNTU" + echo -e "${CYAN}$(printf '─%.0s' {1..90})${NC}" + + # Rows + for 
client in $CLIENTS; do + status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE") + authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE") + nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE") + traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE") + ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE") + + # Skip if filtering by outdated and not outdated + if [ "$SHOW_OUTDATED" = true ]; then + has_outdated=false + is_outdated "authentik" "$authentik" && has_outdated=true + is_outdated "nextcloud" "$nextcloud" && has_outdated=true + is_outdated "traefik" "$traefik" && has_outdated=true + is_outdated "ubuntu" "$ubuntu" && has_outdated=true + + if [ "$has_outdated" = false ]; then + continue + fi + fi + + # Colorize versions (red if outdated) + authentik_color=$NC + is_outdated "authentik" "$authentik" && authentik_color=$RED + + nextcloud_color=$NC + is_outdated "nextcloud" "$nextcloud" && nextcloud_color=$RED + + traefik_color=$NC + is_outdated "traefik" "$traefik" && traefik_color=$RED + + ubuntu_color=$NC + is_outdated "ubuntu" "$ubuntu" && ubuntu_color=$RED + + # Status color + status_color=$GREEN + [ "$status" != "deployed" ] && status_color=$YELLOW + + printf "%-15s ${status_color}%-15s${NC} ${authentik_color}%-15s${NC} ${nextcloud_color}%-15s${NC} ${traefik_color}%-15s${NC} ${ubuntu_color}%-15s${NC}\n" \ + "$client" "$status" "$authentik" "$nextcloud" "$traefik" "$ubuntu" + done + + echo "" + echo -e "${CYAN}Latest versions:${NC}" + echo " Authentik: ${LATEST_VERSIONS[authentik]}" + echo " Nextcloud: ${LATEST_VERSIONS[nextcloud]}" + echo " Traefik: ${LATEST_VERSIONS[traefik]}" + echo " Ubuntu: ${LATEST_VERSIONS[ubuntu]}" + echo "" + echo -e "${YELLOW}Note: ${RED}Red${NC} indicates outdated version${NC}" + echo "" + ;; + + csv) + # CSV header + echo "client,status,authentik,nextcloud,traefik,ubuntu,last_update,outdated" + + # CSV rows + for client in 
$CLIENTS; do + status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE") + authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE") + nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE") + traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE") + ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE") + last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE") + + # Check if any version is outdated + outdated="no" + is_outdated "authentik" "$authentik" && outdated="yes" + is_outdated "nextcloud" "$nextcloud" && outdated="yes" + is_outdated "traefik" "$traefik" && outdated="yes" + is_outdated "ubuntu" "$ubuntu" && outdated="yes" + + # Skip if filtering by outdated + if [ "$SHOW_OUTDATED" = true ] && [ "$outdated" = "no" ]; then + continue + fi + + echo "$client,$status,$authentik,$nextcloud,$traefik,$ubuntu,$last_update,$outdated" + done + ;; + + json) + # Build JSON array + echo "{" + echo " \"latest_versions\": {" + echo " \"authentik\": \"${LATEST_VERSIONS[authentik]}\"," + echo " \"nextcloud\": \"${LATEST_VERSIONS[nextcloud]}\"," + echo " \"traefik\": \"${LATEST_VERSIONS[traefik]}\"," + echo " \"ubuntu\": \"${LATEST_VERSIONS[ubuntu]}\"" + echo " }," + echo " \"clients\": [" + + first=true + for client in $CLIENTS; do + status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE") + authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE") + nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE") + traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE") + ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE") + last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE") + + # Check if any version is outdated + outdated=false + is_outdated "authentik" "$authentik" && outdated=true + is_outdated "nextcloud" "$nextcloud" 
&& outdated=true + is_outdated "traefik" "$traefik" && outdated=true + is_outdated "ubuntu" "$ubuntu" && outdated=true + + # Skip if filtering by outdated + if [ "$SHOW_OUTDATED" = true ] && [ "$outdated" = false ]; then + continue + fi + + if [ "$first" = false ]; then + echo " ," + fi + first=false + + cat < +# +# Queries the deployed server for actual running versions: +# - Docker container image versions +# - Ubuntu OS version +# - Updates the client registry with collected versions + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' # No Color + +# Script directory +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_ROOT="$(dirname "$SCRIPT_DIR")" +REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml" + +# Check arguments +if [ $# -ne 1 ]; then + echo -e "${RED}Error: Client name required${NC}" + echo "Usage: $0 " + echo "" + echo "Example: $0 dev" + exit 1 +fi + +CLIENT_NAME="$1" + +# Check if yq is available +if ! command -v yq &> /dev/null; then + echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}" + exit 1 +fi + +# Check required environment variables +if [ -z "${HCLOUD_TOKEN:-}" ]; then + echo -e "${RED}Error: HCLOUD_TOKEN environment variable not set${NC}" + echo "Export your Hetzner Cloud API token:" + echo " export HCLOUD_TOKEN='your-token-here'" + exit 1 +fi + +# Check if registry exists +if [ ! -f "$REGISTRY_FILE" ]; then + echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}" + exit 1 +fi + +# Check if client exists in registry (compare the whole result; a nested null field such as last_backup_verified must not count as missing) +if [ "$(yq eval ".clients.\"$CLIENT_NAME\"" "$REGISTRY_FILE")" = "null" ]; then + echo -e "${RED}Error: Client '$CLIENT_NAME' not found in registry${NC}" + exit 1 +fi + +echo -e "${BLUE}Collecting versions for client: $CLIENT_NAME${NC}" +echo "" + +cd "$PROJECT_ROOT/ansible" + +# Check if server is reachable +if ! 
timeout 10 ~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m ping -o &>/dev/null; then + echo -e "${RED}Error: Cannot reach server for client '$CLIENT_NAME'${NC}" + echo "Server may not be deployed or network is unreachable" + exit 1 +fi + +echo -e "${YELLOW}Querying deployed versions...${NC}" +echo "" + +# Query Docker container versions +echo "Collecting Docker container versions..." + +# Function to extract version from image tag +extract_version() { + local image=$1 + # Extract version after the colon, or return "latest" + if [[ $image == *":"* ]]; then + echo "$image" | awk -F: '{print $2}' + else + echo "latest" + fi +} + +# Collect Authentik version +AUTHENTIK_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect authentik-server 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}') +AUTHENTIK_VERSION=$(extract_version "$AUTHENTIK_IMAGE") + +# Collect Nextcloud version +NEXTCLOUD_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect nextcloud 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}') +NEXTCLOUD_VERSION=$(extract_version "$NEXTCLOUD_IMAGE") + +# Collect Traefik version +TRAEFIK_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect traefik 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}') +TRAEFIK_VERSION=$(extract_version "$TRAEFIK_IMAGE") + +# Collect Ubuntu version +UBUNTU_VERSION=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "lsb_release -rs 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}') + +echo -e "${GREEN}✓ Versions collected${NC}" +echo "" + +# Display collected versions +echo "Collected versions:" +echo " Authentik: $AUTHENTIK_VERSION" +echo " Nextcloud: $NEXTCLOUD_VERSION" +echo " Traefik: 
$TRAEFIK_VERSION" +echo " Ubuntu: $UBUNTU_VERSION" +echo "" + +# Update registry +echo -e "${YELLOW}Updating registry...${NC}" + +# Update versions in registry +yq eval -i ".clients.\"$CLIENT_NAME\".versions.authentik = \"$AUTHENTIK_VERSION\"" "$REGISTRY_FILE" +yq eval -i ".clients.\"$CLIENT_NAME\".versions.nextcloud = \"$NEXTCLOUD_VERSION\"" "$REGISTRY_FILE" +yq eval -i ".clients.\"$CLIENT_NAME\".versions.traefik = \"$TRAEFIK_VERSION\"" "$REGISTRY_FILE" +yq eval -i ".clients.\"$CLIENT_NAME\".versions.ubuntu = \"$UBUNTU_VERSION\"" "$REGISTRY_FILE" + +echo -e "${GREEN}✓ Registry updated${NC}" +echo "" +echo "Updated: $REGISTRY_FILE" +echo "" +echo "To view registry:" +echo " ./scripts/client-status.sh $CLIENT_NAME" diff --git a/scripts/deploy-client.sh b/scripts/deploy-client.sh index 22360d8..aa70b12 100755 --- a/scripts/deploy-client.sh +++ b/scripts/deploy-client.sh @@ -211,6 +211,16 @@ echo "" echo -e "${GREEN}✓ Registry updated${NC}" echo "" +# Collect deployed versions +echo -e "${YELLOW}Collecting deployed versions...${NC}" + +"$SCRIPT_DIR/collect-client-versions.sh" "$CLIENT_NAME" 2>/dev/null || { + echo -e "${YELLOW}⚠ Could not collect versions automatically${NC}" + echo "Run manually later: ./scripts/collect-client-versions.sh $CLIENT_NAME" +} + +echo "" + # Calculate duration END_TIME=$(date +%s) DURATION=$((END_TIME - START_TIME)) diff --git a/scripts/detect-version-drift.sh b/scripts/detect-version-drift.sh new file mode 100755 index 0000000..36cf9c0 --- /dev/null +++ b/scripts/detect-version-drift.sh @@ -0,0 +1,228 @@ +#!/usr/bin/env bash +# +# Detect version drift between clients +# +# Usage: ./scripts/detect-version-drift.sh [options] +# +# Options: +# --threshold= Only report clients not updated in X days (default: 30) +# --app= Check specific app only (authentik|nextcloud|traefik|ubuntu) +# --format=table Show as table (default) +# --format=summary Show summary only +# +# Exit codes: +# 0 - No drift detected +# 1 - Drift detected +# 2 - Error + 
+set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +# Script directory +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_ROOT="$(dirname "$SCRIPT_DIR")" +REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml" + +# Default options +THRESHOLD_DAYS=30 +FILTER_APP="" +FORMAT="table" + +# Parse arguments +for arg in "$@"; do + case $arg in + --threshold=*) + THRESHOLD_DAYS="${arg#*=}" + ;; + --app=*) + FILTER_APP="${arg#*=}" + ;; + --format=*) + FORMAT="${arg#*=}" + ;; + *) + echo "Unknown option: $arg" + echo "Usage: $0 [--threshold=] [--app=] [--format=table|summary]" + exit 2 + ;; + esac +done + +# Check if yq is available +if ! command -v yq &> /dev/null; then + echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}" + exit 2 +fi + +# Check if registry exists +if [ ! -f "$REGISTRY_FILE" ]; then + echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}" + exit 2 +fi + +# Get list of deployed clients only +CLIENTS=$(yq eval '.clients | to_entries | map(select(.value.status == "deployed")) | .[].key' "$REGISTRY_FILE" 2>/dev/null) + +if [ -z "$CLIENTS" ]; then + echo -e "${YELLOW}No deployed clients found${NC}" + exit 0 +fi + +# Determine latest versions (highest version recorded across all clients, ignoring null/unknown) +declare -A LATEST_VERSIONS +LATEST_VERSIONS[authentik]=$(yq eval '.clients | to_entries | .[].value.versions.authentik' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[nextcloud]=$(yq eval '.clients | to_entries | .[].value.versions.nextcloud' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[traefik]=$(yq eval '.clients | to_entries | .[].value.versions.traefik' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) +LATEST_VERSIONS[ubuntu]=$(yq eval '.clients | to_entries | .[].value.versions.ubuntu' "$REGISTRY_FILE" | awk '$0 != "null" && $0 != "unknown"' | sort -V | tail -1) + +# Calculate date threshold +if command -v gdate &> /dev/null; then + # macOS with GNU coreutils + THRESHOLD_DATE=$(gdate -d 
"$THRESHOLD_DAYS days ago" +%Y-%m-%d) +elif date --version &> /dev/null 2>&1; then + # GNU date (Linux) + THRESHOLD_DATE=$(date -d "$THRESHOLD_DAYS days ago" +%Y-%m-%d) +else + # BSD date (macOS default) + THRESHOLD_DATE=$(date -v-${THRESHOLD_DAYS}d +%Y-%m-%d) +fi + +# Counters +DRIFT_FOUND=0 +OUTDATED_COUNT=0 +STALE_COUNT=0 + +# Arrays to store drift details +declare -a DRIFT_CLIENTS +declare -a DRIFT_DETAILS + +# Analyze each client +for client in $CLIENTS; do + authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE") + nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE") + traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE") + ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE") + last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE") + + has_drift=false + drift_reasons=() + + # Check version drift + if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "authentik" ]; then + if [ "$authentik" != "${LATEST_VERSIONS[authentik]}" ] && [ "$authentik" != "null" ] && [ "$authentik" != "unknown" ]; then + has_drift=true + drift_reasons+=("Authentik: $authentik → ${LATEST_VERSIONS[authentik]}") + fi + fi + + if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "nextcloud" ]; then + if [ "$nextcloud" != "${LATEST_VERSIONS[nextcloud]}" ] && [ "$nextcloud" != "null" ] && [ "$nextcloud" != "unknown" ]; then + has_drift=true + drift_reasons+=("Nextcloud: $nextcloud → ${LATEST_VERSIONS[nextcloud]}") + fi + fi + + if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "traefik" ]; then + if [ "$traefik" != "${LATEST_VERSIONS[traefik]}" ] && [ "$traefik" != "null" ] && [ "$traefik" != "unknown" ]; then + has_drift=true + drift_reasons+=("Traefik: $traefik → ${LATEST_VERSIONS[traefik]}") + fi + fi + + if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "ubuntu" ]; then + if [ "$ubuntu" != "${LATEST_VERSIONS[ubuntu]}" ] && [ "$ubuntu" != "null" ] && [ "$ubuntu" != "unknown" ]; then + 
has_drift=true + drift_reasons+=("Ubuntu: $ubuntu → ${LATEST_VERSIONS[ubuntu]}") + fi + fi + + # Check if update is stale (older than threshold) + is_stale=false + if [ "$last_update" != "null" ] && [ -n "$last_update" ]; then + if [[ "$last_update" < "$THRESHOLD_DATE" ]]; then + is_stale=true + drift_reasons+=("Last update: $last_update (>$THRESHOLD_DAYS days ago)") + fi + fi + + # Record drift + if [ "$has_drift" = true ] || [ "$is_stale" = true ]; then + DRIFT_FOUND=1 + DRIFT_CLIENTS+=("$client") + DRIFT_DETAILS+=("$(IFS='; '; echo "${drift_reasons[*]}")") + + [ "$has_drift" = true ] && ((OUTDATED_COUNT++)) || true + [ "$is_stale" = true ] && ((STALE_COUNT++)) || true + fi +done + +# Output results +case $FORMAT in + table) + if [ $DRIFT_FOUND -eq 0 ]; then + echo -e "${GREEN}✓ No version drift detected${NC}" + echo "" + echo "All deployed clients are running latest versions:" + echo " Authentik: ${LATEST_VERSIONS[authentik]}" + echo " Nextcloud: ${LATEST_VERSIONS[nextcloud]}" + echo " Traefik: ${LATEST_VERSIONS[traefik]}" + echo " Ubuntu: ${LATEST_VERSIONS[ubuntu]}" + echo "" + else + echo -e "${RED}⚠ VERSION DRIFT DETECTED${NC}" + echo "" + echo -e "${CYAN}Clients with outdated versions:${NC}" + echo "" + + for i in "${!DRIFT_CLIENTS[@]}"; do + client="${DRIFT_CLIENTS[$i]}" + details="${DRIFT_DETAILS[$i]}" + + echo -e "${YELLOW}• $client${NC}" + IFS=';' read -ra REASONS <<< "$details" + for reason in "${REASONS[@]}"; do + echo " $reason" + done + echo "" + done + + echo -e "${CYAN}Recommended actions:${NC}" + echo "" + echo "1. Test updates on canary server first:" + echo -e " ${BLUE}./scripts/rebuild-client.sh dev${NC}" + echo "" + echo "2. Verify canary health:" + echo -e " ${BLUE}./scripts/client-status.sh dev${NC}" + echo "" + echo "3. 
Update outdated clients:" + for client in "${DRIFT_CLIENTS[@]}"; do + echo -e " ${BLUE}./scripts/rebuild-client.sh $client${NC}" + done + echo "" + fi + ;; + + summary) + if [ $DRIFT_FOUND -eq 0 ]; then + echo "Status: OK" + echo "Drift: No" + echo "Clients checked: $(echo "$CLIENTS" | wc -l | xargs)" + else + echo "Status: DRIFT DETECTED" + echo "Drift: Yes" + echo "Clients checked: $(echo "$CLIENTS" | wc -l | xargs)" + echo "Clients with outdated versions: $OUTDATED_COUNT" + echo "Clients not updated in $THRESHOLD_DAYS days: $STALE_COUNT" + echo "Affected clients: ${DRIFT_CLIENTS[*]}" + fi + ;; +esac + +exit $DRIFT_FOUND diff --git a/scripts/rebuild-client.sh b/scripts/rebuild-client.sh index 24a53be..e570a38 100755 --- a/scripts/rebuild-client.sh +++ b/scripts/rebuild-client.sh @@ -212,6 +212,16 @@ echo "" echo -e "${GREEN}✓ Registry updated${NC}" echo "" +# Collect deployed versions +echo -e "${YELLOW}Collecting deployed versions...${NC}" + +"$SCRIPT_DIR/collect-client-versions.sh" "$CLIENT_NAME" 2>/dev/null || { + echo -e "${YELLOW}⚠ Could not collect versions automatically${NC}" + echo "Run manually later: ./scripts/collect-client-versions.sh $CLIENT_NAME" +} + +echo "" + # Calculate duration END_TIME=$(date +%s) DURATION=$((END_TIME - START_TIME))
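A note on consuming the exit codes documented for `detect-version-drift.sh` (`0` no drift, `1` drift, `2` error): a caller that only tests for non-zero status cannot tell real drift from a script failure. The sketch below is one possible way to separate the two cases. The helper names `drift_status` and `check_drift` are hypothetical and not part of this change, and the stand-in commands at the end only simulate the three exit codes.

```bash
#!/usr/bin/env bash

# Hypothetical helper: translate the documented exit codes of
# detect-version-drift.sh into a status word.
drift_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "drift" ;;
        *) echo "error" ;;   # 2 (script error) and anything unexpected
    esac
}

# Hypothetical helper: run a command, capture its exit code without
# tripping `set -e` in the caller, and print the matching status word.
check_drift() {
    local rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    drift_status "$rc"
}

# Dry run with stand-in commands instead of the real script
check_drift true              # simulates exit code 0
check_drift false             # simulates exit code 1
check_drift sh -c 'exit 2'    # simulates exit code 2
```

In a daily cron job, `check_drift ./scripts/detect-version-drift.sh --format=summary` could then pick a different alert subject for `drift` and `error`.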