feat: Add version tracking and maintenance monitoring (issue #15)

Complete implementation of automatic version tracking and drift detection:

New Scripts:
- scripts/collect-client-versions.sh: Query deployed versions from Docker
  - Connects via Ansible to running servers
  - Extracts versions from container images
  - Updates registry automatically

- scripts/check-client-versions.sh: Compare versions across clients
  - Multiple formats: table (colorized), CSV, JSON
  - Filter by outdated versions
  - Highlights drift with color coding

- scripts/detect-version-drift.sh: Identify version differences
  - Detects clients with outdated versions
  - Threshold-based staleness detection (default 30 days)
  - Actionable recommendations
  - Exit code 1 if drift detected (CI/monitoring friendly)

Updated Scripts:
- scripts/deploy-client.sh: Auto-collect versions after deployment
- scripts/rebuild-client.sh: Auto-collect versions after rebuild

Documentation:
- docs/maintenance-tracking.md: Complete maintenance guide
  - Version management workflows
  - Security update procedures
  - Monitoring integration examples
  - Troubleshooting guide

Features:
 Automatic version collection from deployed servers
 Multi-client version comparison reports
 Version drift detection with recommendations
 Integration with deployment workflows
 Export to CSV/JSON for external tools
 Canary-first update workflow support

Usage Examples:
```bash
# Collect versions
./scripts/collect-client-versions.sh dev

# Compare all clients
./scripts/check-client-versions.sh

# Detect drift
./scripts/detect-version-drift.sh

# Export for monitoring
./scripts/check-client-versions.sh --format=json
```

Closes #15

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Pieter 2026-01-17 20:53:15 +01:00
parent bf4659f662
commit 0c4d536246
6 changed files with 1108 additions and 0 deletions

View file

@ -0,0 +1,477 @@
# Maintenance and Version Tracking
Comprehensive guide to tracking software versions, maintenance history, and detecting version drift across all deployed clients.
## Overview
The infrastructure tracks:
- **Software versions** - Authentik, Nextcloud, Traefik, Ubuntu
- **Maintenance dates** - Last update, security patches, OS updates
- **Version drift** - Clients running different versions
- **Update history** - Audit trail of changes
All version and maintenance data is stored in [`clients/registry.yml`](../clients/registry.yml).
## Registry Structure
Each client tracks versions and maintenance:
```yaml
clients:
myclient:
versions:
authentik: "2025.10.3"
nextcloud: "30.0.17"
traefik: "v3.0"
ubuntu: "24.04"
maintenance:
last_full_update: 2026-01-17
last_security_patch: 2026-01-17
last_os_update: 2026-01-17
last_backup_verified: null
```
## Version Management Scripts
### Collect Client Versions
Query actual deployed versions from a running server:
```bash
# Collect versions from dev client
./scripts/collect-client-versions.sh dev
```
This script:
- Connects to the server via Ansible
- Queries Docker container image tags
- Queries Ubuntu OS version
- Updates the registry automatically
**Output:**
```
Collecting versions for client: dev
Querying deployed versions...
Collecting Docker container versions...
✓ Versions collected
Collected versions:
Authentik: 2025.10.3
Nextcloud: 30.0.17
Traefik: v3.0
Ubuntu: 24.04
✓ Registry updated
```
**Requirements:**
- Server must be deployed and reachable
- `HCLOUD_TOKEN` environment variable set
- Ansible configured with dynamic inventory
### Check All Client Versions
Compare versions across all clients:
```bash
# Default: Table format with color coding
./scripts/check-client-versions.sh
# Export as CSV
./scripts/check-client-versions.sh --format=csv
# Export as JSON
./scripts/check-client-versions.sh --format=json
# Show only clients with outdated versions
./scripts/check-client-versions.sh --outdated
```
**Table output:**
```
═══════════════════════════════════════════════════════════════════════════════
CLIENT VERSION REPORT
═══════════════════════════════════════════════════════════════════════════════
CLIENT STATUS AUTHENTIK NEXTCLOUD TRAEFIK UBUNTU
──────────────────────────────────────────────────────────────────────────────────────────────
dev deployed 2025.10.3 30.0.17 v3.0 24.04
client1 deployed 2025.10.2 30.0.16 v3.0 24.04
Latest versions:
Authentik: 2025.10.3
Nextcloud: 30.0.17
Traefik: v3.0
Ubuntu: 24.04
Note: Red indicates outdated version
```
**CSV output:**
```csv
client,status,authentik,nextcloud,traefik,ubuntu,last_update,outdated
dev,deployed,2025.10.3,30.0.17,v3.0,24.04,2026-01-17,no
client1,deployed,2025.10.2,30.0.16,v3.0,24.04,2026-01-10,yes
```
**JSON output:**
```json
{
"latest_versions": {
"authentik": "2025.10.3",
"nextcloud": "30.0.17",
"traefik": "v3.0",
"ubuntu": "24.04"
},
"clients": [
{
"name": "dev",
"status": "deployed",
"versions": {
"authentik": "2025.10.3",
"nextcloud": "30.0.17",
"traefik": "v3.0",
"ubuntu": "24.04"
},
"last_update": "2026-01-17",
"outdated": false
}
]
}
```
### Detect Version Drift
Identify clients with outdated versions:
```bash
# Default: Check all deployed clients
./scripts/detect-version-drift.sh
# Check clients not updated in 30+ days
./scripts/detect-version-drift.sh --threshold=30
# Check specific application only
./scripts/detect-version-drift.sh --app=authentik
# Summary output for monitoring
./scripts/detect-version-drift.sh --format=summary
```
**Output when drift detected:**
```
⚠ VERSION DRIFT DETECTED
Clients with outdated versions:
• client1
Authentik: 2025.10.2 → 2025.10.3
Nextcloud: 30.0.16 → 30.0.17
• client2
Last update: 2025-12-15 (>30 days ago)
Recommended actions:
1. Test updates on canary server first:
./scripts/rebuild-client.sh dev
2. Verify canary health:
./scripts/client-status.sh dev
3. Update outdated clients:
./scripts/rebuild-client.sh client1
./scripts/rebuild-client.sh client2
```
**Exit codes:**
- `0` - No drift detected (all clients up to date)
- `1` - Drift detected (action needed)
- `2` - Error (script failure)
**Summary format** (useful for monitoring):
```
Status: DRIFT DETECTED
Drift: Yes
Clients checked: 5
Clients with outdated versions: 2
Clients not updated in 30 days: 1
Affected clients: client1 client2
```
## Automatic Version Collection
Version collection is **automatically performed** after deployments:
### On New Deployment
`./scripts/deploy-client.sh myclient`:
1. Provisions infrastructure
2. Deploys applications
3. Updates registry with server info
4. **Collects and records versions** ← Automatic
### On Rebuild
`./scripts/rebuild-client.sh myclient`:
1. Destroys old infrastructure
2. Provisions new infrastructure
3. Deploys applications
4. Updates registry
5. **Collects and records versions** ← Automatic
If automatic collection fails (server not ready, network issue):
```
⚠ Could not collect versions automatically
Run manually later: ./scripts/collect-client-versions.sh myclient
```
## Maintenance Workflows
### Security Update Workflow
1. **Check current state**
```bash
./scripts/check-client-versions.sh
```
2. **Update canary first** (dev server)
```bash
./scripts/rebuild-client.sh dev
```
3. **Verify canary**
```bash
# Check health
./scripts/client-status.sh dev
# Verify versions updated
./scripts/collect-client-versions.sh dev
```
4. **Detect drift** (identify outdated clients)
```bash
./scripts/detect-version-drift.sh
```
5. **Roll out to production**
```bash
# Update each client
./scripts/rebuild-client.sh client1
./scripts/rebuild-client.sh client2
# Or batch update (be careful!)
for client in $(./scripts/list-clients.sh --role=production --format=csv | tail -n +2 | cut -d, -f1); do
./scripts/rebuild-client.sh "$client"
sleep 300 # Wait 5 minutes between updates
done
```
6. **Verify all updated**
```bash
./scripts/detect-version-drift.sh
```
### Monthly Maintenance Check
Run these checks monthly:
```bash
# 1. Version report
./scripts/check-client-versions.sh > reports/versions-$(date +%Y-%m).txt
# 2. Drift detection
./scripts/detect-version-drift.sh --threshold=30
# 3. Client health
for client in $(./scripts/list-clients.sh --status=deployed --format=csv | tail -n +2 | cut -d, -f1); do
./scripts/client-status.sh "$client"
done
```
### Update Maintenance Dates
Deployment scripts automatically update `last_full_update`. For other maintenance:
```bash
# After security patches (OS level)
yq eval -i ".clients.myclient.maintenance.last_security_patch = \"$(date +%Y-%m-%d)\"" clients/registry.yml
# After OS updates
yq eval -i ".clients.myclient.maintenance.last_os_update = \"$(date +%Y-%m-%d)\"" clients/registry.yml
# After backup verification
yq eval -i ".clients.myclient.maintenance.last_backup_verified = \"$(date +%Y-%m-%d)\"" clients/registry.yml
# Commit changes
git add clients/registry.yml
git commit -m "chore: Update maintenance dates"
git push
```
## Integration with Monitoring
### Continuous Drift Detection
Set up a cron job or CI pipeline:
```bash
#!/bin/bash
# check-drift.sh - Run daily
cd /path/to/infrastructure
# Check for drift
if ! ./scripts/detect-version-drift.sh --format=summary; then
# Send alert (Slack, email, etc.)
./scripts/detect-version-drift.sh | mail -s "Version Drift Detected" ops@example.com
fi
```
### Export for External Tools
```bash
# Export version data as JSON for monitoring tools
./scripts/check-client-versions.sh --format=json > /var/monitoring/client-versions.json
# Export drift status
./scripts/detect-version-drift.sh --format=summary > /var/monitoring/drift-status.txt
```
### Prometheus Metrics
Convert to Prometheus format:
```bash
#!/bin/bash
# export-metrics.sh
# Count clients by drift status
total=$(./scripts/list-clients.sh --status=deployed --format=csv | tail -n +2 | wc -l)
outdated=$(./scripts/check-client-versions.sh --format=csv --outdated | tail -n +2 | wc -l)
uptodate=$((total - outdated))
echo "# HELP clients_total Total number of deployed clients"
echo "# TYPE clients_total gauge"
echo "clients_total $total"
echo "# HELP clients_outdated Number of clients with outdated versions"
echo "# TYPE clients_outdated gauge"
echo "clients_outdated $outdated"
echo "# HELP clients_uptodate Number of clients with latest versions"
echo "# TYPE clients_uptodate gauge"
echo "clients_uptodate $uptodate"
```
## Version Pinning
To prevent automatic updates, pin versions in Ansible roles:
```yaml
# roles/authentik/defaults/main.yml
authentik_version: "2025.10.3" # Pinned version
# To update:
# 1. Change pinned version
# 2. Update canary: ./scripts/rebuild-client.sh dev
# 3. Verify and roll out
```
## Troubleshooting
### Version Collection Fails
**Problem:** `collect-client-versions.sh` cannot reach server
**Solutions:**
1. Check server is deployed and running:
```bash
./scripts/client-status.sh myclient
```
2. Verify HCLOUD_TOKEN is set:
```bash
echo $HCLOUD_TOKEN
```
3. Test Ansible connectivity:
```bash
cd ansible
ansible -i hcloud.yml myclient -m ping
```
4. Check Docker containers are running:
```bash
ansible -i hcloud.yml myclient -m shell -a "docker ps"
```
### Incorrect Version Reported
**Problem:** Registry shows wrong version
**Solutions:**
1. Re-collect versions manually:
```bash
./scripts/collect-client-versions.sh myclient
```
2. Verify Docker images:
```bash
ansible -i hcloud.yml myclient -m shell -a "docker images"
```
3. Check container inspect:
```bash
ansible -i hcloud.yml myclient -m shell -a "docker inspect authentik-server | jq '.[0].Config.Image'"
```
### Version Drift False Positives
**Problem:** Drift detected for canary with intentionally different version
**Solution:** Use `--app` filter to check specific applications:
```bash
# Check only production-critical apps
./scripts/detect-version-drift.sh --app=authentik
./scripts/detect-version-drift.sh --app=nextcloud
```
## Best Practices
1. **Always test on canary first**
- Update `dev` client before production
- Verify health before wider rollout
2. **Stagger production updates**
- Don't update all clients simultaneously
- Wait 5-10 minutes between updates
- Monitor each update for issues
3. **Track maintenance in registry**
- Keep `last_full_update` current
- Record `last_security_patch` dates
- Document backup verification
4. **Regular drift checks**
- Run weekly: `detect-version-drift.sh`
- Address drift within 7 days
- Maintain version consistency
5. **Document version changes**
- Add notes to registry when pinning versions
- Commit registry changes with descriptive messages
- Track major version upgrades separately
6. **Automate reporting**
- Export weekly version reports
- Alert on drift detection
- Dashboard for version overview
## Related Documentation
- [Client Registry](client-registry.md) - Registry system overview
- [Deployment Guide](deployment.md) - Deployment procedures
- [SSH Key Management](ssh-key-management.md) - Security and access

251
scripts/check-client-versions.sh Executable file
View file

@ -0,0 +1,251 @@
#!/usr/bin/env bash
#
# Report software versions across all clients
#
# Usage: ./scripts/check-client-versions.sh [options]
#
# Options:
# --format=table Show as colorized table (default)
# --format=csv Export as CSV
# --format=json Export as JSON
# --app=<name> Filter by application (authentik|nextcloud|traefik|ubuntu)
# --outdated Show only clients with outdated versions
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color
# Script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml"
# Default options
FORMAT="table"
FILTER_APP=""
SHOW_OUTDATED=false
# Parse arguments
for arg in "$@"; do
case $arg in
--format=*)
FORMAT="${arg#*=}"
;;
--app=*)
FILTER_APP="${arg#*=}"
;;
--outdated)
SHOW_OUTDATED=true
;;
*)
echo "Unknown option: $arg"
echo "Usage: $0 [--format=table|csv|json] [--app=<name>] [--outdated]"
exit 1
;;
esac
done
# Check if yq is available
if ! command -v yq &> /dev/null; then
echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}"
exit 1
fi
# Check if registry exists
if [ ! -f "$REGISTRY_FILE" ]; then
echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}"
exit 1
fi
# Get list of clients
CLIENTS=$(yq eval '.clients | keys | .[]' "$REGISTRY_FILE" 2>/dev/null)
if [ -z "$CLIENTS" ]; then
echo -e "${YELLOW}No clients found in registry${NC}"
exit 0
fi
# Determine latest versions (from canary/dev or most common)
declare -A LATEST_VERSIONS
LATEST_VERSIONS[authentik]=$(yq eval '.clients | to_entries | .[].value.versions.authentik' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[nextcloud]=$(yq eval '.clients | to_entries | .[].value.versions.nextcloud' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[traefik]=$(yq eval '.clients | to_entries | .[].value.versions.traefik' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[ubuntu]=$(yq eval '.clients | to_entries | .[].value.versions.ubuntu' "$REGISTRY_FILE" | sort -V | tail -1)
# Function to check if version is outdated
is_outdated() {
local app=$1
local version=$2
local latest=${LATEST_VERSIONS[$app]}
if [ "$version" != "$latest" ] && [ "$version" != "null" ] && [ "$version" != "unknown" ]; then
return 0
else
return 1
fi
}
case $FORMAT in
table)
echo -e "${BLUE}═══════════════════════════════════════════════════════════════════════════════${NC}"
echo -e "${BLUE} CLIENT VERSION REPORT${NC}"
echo -e "${BLUE}═══════════════════════════════════════════════════════════════════════════════${NC}"
echo ""
# Header
printf "${CYAN}%-15s %-15s %-15s %-15s %-15s %-15s${NC}\n" \
"CLIENT" "STATUS" "AUTHENTIK" "NEXTCLOUD" "TRAEFIK" "UBUNTU"
echo -e "${CYAN}$(printf '─%.0s' {1..90})${NC}"
# Rows
for client in $CLIENTS; do
status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE")
authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE")
nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE")
traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE")
ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE")
# Skip if filtering by outdated and not outdated
if [ "$SHOW_OUTDATED" = true ]; then
has_outdated=false
is_outdated "authentik" "$authentik" && has_outdated=true
is_outdated "nextcloud" "$nextcloud" && has_outdated=true
is_outdated "traefik" "$traefik" && has_outdated=true
is_outdated "ubuntu" "$ubuntu" && has_outdated=true
if [ "$has_outdated" = false ]; then
continue
fi
fi
# Colorize versions (red if outdated)
authentik_color=$NC
is_outdated "authentik" "$authentik" && authentik_color=$RED
nextcloud_color=$NC
is_outdated "nextcloud" "$nextcloud" && nextcloud_color=$RED
traefik_color=$NC
is_outdated "traefik" "$traefik" && traefik_color=$RED
ubuntu_color=$NC
is_outdated "ubuntu" "$ubuntu" && ubuntu_color=$RED
# Status color
status_color=$GREEN
[ "$status" != "deployed" ] && status_color=$YELLOW
printf "%-15s ${status_color}%-15s${NC} ${authentik_color}%-15s${NC} ${nextcloud_color}%-15s${NC} ${traefik_color}%-15s${NC} ${ubuntu_color}%-15s${NC}\n" \
"$client" "$status" "$authentik" "$nextcloud" "$traefik" "$ubuntu"
done
echo ""
echo -e "${CYAN}Latest versions:${NC}"
echo " Authentik: ${LATEST_VERSIONS[authentik]}"
echo " Nextcloud: ${LATEST_VERSIONS[nextcloud]}"
echo " Traefik: ${LATEST_VERSIONS[traefik]}"
echo " Ubuntu: ${LATEST_VERSIONS[ubuntu]}"
echo ""
echo -e "${YELLOW}Note: ${RED}Red${NC} indicates outdated version${NC}"
echo ""
;;
csv)
# CSV header
echo "client,status,authentik,nextcloud,traefik,ubuntu,last_update,outdated"
# CSV rows
for client in $CLIENTS; do
status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE")
authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE")
nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE")
traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE")
ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE")
last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE")
# Check if any version is outdated
outdated="no"
is_outdated "authentik" "$authentik" && outdated="yes"
is_outdated "nextcloud" "$nextcloud" && outdated="yes"
is_outdated "traefik" "$traefik" && outdated="yes"
is_outdated "ubuntu" "$ubuntu" && outdated="yes"
# Skip if filtering by outdated
if [ "$SHOW_OUTDATED" = true ] && [ "$outdated" = "no" ]; then
continue
fi
echo "$client,$status,$authentik,$nextcloud,$traefik,$ubuntu,$last_update,$outdated"
done
;;
json)
# Build JSON array
echo "{"
echo " \"latest_versions\": {"
echo " \"authentik\": \"${LATEST_VERSIONS[authentik]}\","
echo " \"nextcloud\": \"${LATEST_VERSIONS[nextcloud]}\","
echo " \"traefik\": \"${LATEST_VERSIONS[traefik]}\","
echo " \"ubuntu\": \"${LATEST_VERSIONS[ubuntu]}\""
echo " },"
echo " \"clients\": ["
first=true
for client in $CLIENTS; do
status=$(yq eval ".clients.\"$client\".status" "$REGISTRY_FILE")
authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE")
nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE")
traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE")
ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE")
last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE")
# Check if any version is outdated
outdated=false
is_outdated "authentik" "$authentik" && outdated=true
is_outdated "nextcloud" "$nextcloud" && outdated=true
is_outdated "traefik" "$traefik" && outdated=true
is_outdated "ubuntu" "$ubuntu" && outdated=true
# Skip if filtering by outdated
if [ "$SHOW_OUTDATED" = true ] && [ "$outdated" = false ]; then
continue
fi
if [ "$first" = false ]; then
echo " ,"
fi
first=false
cat <<EOF
{
"name": "$client",
"status": "$status",
"versions": {
"authentik": "$authentik",
"nextcloud": "$nextcloud",
"traefik": "$traefik",
"ubuntu": "$ubuntu"
},
"last_update": "$last_update",
"outdated": $outdated
}
EOF
done
echo ""
echo " ]"
echo "}"
;;
*)
echo -e "${RED}Error: Unknown format '$FORMAT'${NC}"
echo "Valid formats: table, csv, json"
exit 1
;;
esac

View file

@ -0,0 +1,132 @@
#!/usr/bin/env bash
#
# Collect deployed software versions from a client and update registry
#
# Usage: ./scripts/collect-client-versions.sh <client_name>
#
# Queries the deployed server for actual running versions:
# - Docker container image versions
# - Ubuntu OS version
# - Updates the client registry with collected versions
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml"
# Check arguments
if [ $# -ne 1 ]; then
echo -e "${RED}Error: Client name required${NC}"
echo "Usage: $0 <client_name>"
echo ""
echo "Example: $0 dev"
exit 1
fi
CLIENT_NAME="$1"
# Check if yq is available
if ! command -v yq &> /dev/null; then
echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}"
exit 1
fi
# Check required environment variables
if [ -z "${HCLOUD_TOKEN:-}" ]; then
echo -e "${RED}Error: HCLOUD_TOKEN environment variable not set${NC}"
echo "Export your Hetzner Cloud API token:"
echo " export HCLOUD_TOKEN='your-token-here'"
exit 1
fi
# Check if registry exists
if [ ! -f "$REGISTRY_FILE" ]; then
echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}"
exit 1
fi
# Check if client exists in registry
if yq eval ".clients.\"$CLIENT_NAME\"" "$REGISTRY_FILE" | grep -q "null"; then
echo -e "${RED}Error: Client '$CLIENT_NAME' not found in registry${NC}"
exit 1
fi
echo -e "${BLUE}Collecting versions for client: $CLIENT_NAME${NC}"
echo ""
cd "$PROJECT_ROOT/ansible"
# Check if server is reachable
if ! timeout 10 ~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m ping -o &>/dev/null; then
echo -e "${RED}Error: Cannot reach server for client '$CLIENT_NAME'${NC}"
echo "Server may not be deployed or network is unreachable"
exit 1
fi
echo -e "${YELLOW}Querying deployed versions...${NC}"
echo ""
# Query Docker container versions
echo "Collecting Docker container versions..."
# Function to extract version from image tag
extract_version() {
local image=$1
# Extract version after the colon, or return "latest"
if [[ $image == *":"* ]]; then
echo "$image" | awk -F: '{print $2}'
else
echo "latest"
fi
}
# Collect Authentik version
AUTHENTIK_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect authentik-server 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}')
AUTHENTIK_VERSION=$(extract_version "$AUTHENTIK_IMAGE")
# Collect Nextcloud version
NEXTCLOUD_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect nextcloud 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}')
NEXTCLOUD_VERSION=$(extract_version "$NEXTCLOUD_IMAGE")
# Collect Traefik version
TRAEFIK_IMAGE=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "docker inspect traefik 2>/dev/null | jq -r '.[0].Config.Image' 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}')
TRAEFIK_VERSION=$(extract_version "$TRAEFIK_IMAGE")
# Collect Ubuntu version
UBUNTU_VERSION=$(~/.local/bin/ansible -i hcloud.yml "$CLIENT_NAME" -m shell -a "lsb_release -rs 2>/dev/null || echo 'unknown'" -o 2>/dev/null | tail -1 | awk '{print $NF}')
echo -e "${GREEN}✓ Versions collected${NC}"
echo ""
# Display collected versions
echo "Collected versions:"
echo " Authentik: $AUTHENTIK_VERSION"
echo " Nextcloud: $NEXTCLOUD_VERSION"
echo " Traefik: $TRAEFIK_VERSION"
echo " Ubuntu: $UBUNTU_VERSION"
echo ""
# Update registry
echo -e "${YELLOW}Updating registry...${NC}"
# Update versions in registry
yq eval -i ".clients.\"$CLIENT_NAME\".versions.authentik = \"$AUTHENTIK_VERSION\"" "$REGISTRY_FILE"
yq eval -i ".clients.\"$CLIENT_NAME\".versions.nextcloud = \"$NEXTCLOUD_VERSION\"" "$REGISTRY_FILE"
yq eval -i ".clients.\"$CLIENT_NAME\".versions.traefik = \"$TRAEFIK_VERSION\"" "$REGISTRY_FILE"
yq eval -i ".clients.\"$CLIENT_NAME\".versions.ubuntu = \"$UBUNTU_VERSION\"" "$REGISTRY_FILE"
echo -e "${GREEN}✓ Registry updated${NC}"
echo ""
echo "Updated: $REGISTRY_FILE"
echo ""
echo "To view registry:"
echo " ./scripts/client-status.sh $CLIENT_NAME"

View file

@ -211,6 +211,16 @@ echo ""
echo -e "${GREEN}✓ Registry updated${NC}"
echo ""
# Collect deployed versions
echo -e "${YELLOW}Collecting deployed versions...${NC}"
"$SCRIPT_DIR/collect-client-versions.sh" "$CLIENT_NAME" 2>/dev/null || {
echo -e "${YELLOW}⚠ Could not collect versions automatically${NC}"
echo "Run manually later: ./scripts/collect-client-versions.sh $CLIENT_NAME"
}
echo ""
# Calculate duration
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))

228
scripts/detect-version-drift.sh Executable file
View file

@ -0,0 +1,228 @@
#!/usr/bin/env bash
#
# Detect version drift between clients
#
# Usage: ./scripts/detect-version-drift.sh [options]
#
# Options:
# --threshold=<days> Only report clients not updated in X days (default: 30)
# --app=<name> Check specific app only (authentik|nextcloud|traefik|ubuntu)
# --format=table Show as table (default)
# --format=summary Show summary only
#
# Exit codes:
# 0 - No drift detected
# 1 - Drift detected
# 2 - Error
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color
# Script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
REGISTRY_FILE="$PROJECT_ROOT/clients/registry.yml"
# Default options
THRESHOLD_DAYS=30
FILTER_APP=""
FORMAT="table"
# Parse arguments
for arg in "$@"; do
case $arg in
--threshold=*)
THRESHOLD_DAYS="${arg#*=}"
;;
--app=*)
FILTER_APP="${arg#*=}"
;;
--format=*)
FORMAT="${arg#*=}"
;;
*)
echo "Unknown option: $arg"
echo "Usage: $0 [--threshold=<days>] [--app=<name>] [--format=table|summary]"
exit 2
;;
esac
done
# Check if yq is available
if ! command -v yq &> /dev/null; then
echo -e "${RED}Error: 'yq' not found. Install with: brew install yq${NC}"
exit 2
fi
# Check if registry exists
if [ ! -f "$REGISTRY_FILE" ]; then
echo -e "${RED}Error: Registry file not found: $REGISTRY_FILE${NC}"
exit 2
fi
# Get list of deployed clients only
CLIENTS=$(yq eval '.clients | to_entries | map(select(.value.status == "deployed")) | .[].key' "$REGISTRY_FILE" 2>/dev/null)
if [ -z "$CLIENTS" ]; then
echo -e "${YELLOW}No deployed clients found${NC}"
exit 0
fi
# Determine latest versions
declare -A LATEST_VERSIONS
LATEST_VERSIONS[authentik]=$(yq eval '.clients | to_entries | .[].value.versions.authentik' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[nextcloud]=$(yq eval '.clients | to_entries | .[].value.versions.nextcloud' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[traefik]=$(yq eval '.clients | to_entries | .[].value.versions.traefik' "$REGISTRY_FILE" | sort -V | tail -1)
LATEST_VERSIONS[ubuntu]=$(yq eval '.clients | to_entries | .[].value.versions.ubuntu' "$REGISTRY_FILE" | sort -V | tail -1)
# Calculate date threshold
if command -v gdate &> /dev/null; then
# macOS with GNU coreutils
THRESHOLD_DATE=$(gdate -d "$THRESHOLD_DAYS days ago" +%Y-%m-%d)
elif date --version &> /dev/null 2>&1; then
# GNU date (Linux)
THRESHOLD_DATE=$(date -d "$THRESHOLD_DAYS days ago" +%Y-%m-%d)
else
# BSD date (macOS default)
THRESHOLD_DATE=$(date -v-${THRESHOLD_DAYS}d +%Y-%m-%d)
fi
# Counters
DRIFT_FOUND=0
OUTDATED_COUNT=0
STALE_COUNT=0
# Arrays to store drift details
declare -a DRIFT_CLIENTS
declare -a DRIFT_DETAILS
# Analyze each client
for client in $CLIENTS; do
authentik=$(yq eval ".clients.\"$client\".versions.authentik" "$REGISTRY_FILE")
nextcloud=$(yq eval ".clients.\"$client\".versions.nextcloud" "$REGISTRY_FILE")
traefik=$(yq eval ".clients.\"$client\".versions.traefik" "$REGISTRY_FILE")
ubuntu=$(yq eval ".clients.\"$client\".versions.ubuntu" "$REGISTRY_FILE")
last_update=$(yq eval ".clients.\"$client\".maintenance.last_full_update" "$REGISTRY_FILE")
has_drift=false
drift_reasons=()
# Check version drift
if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "authentik" ]; then
if [ "$authentik" != "${LATEST_VERSIONS[authentik]}" ] && [ "$authentik" != "null" ] && [ "$authentik" != "unknown" ]; then
has_drift=true
drift_reasons+=("Authentik: $authentik${LATEST_VERSIONS[authentik]}")
fi
fi
if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "nextcloud" ]; then
if [ "$nextcloud" != "${LATEST_VERSIONS[nextcloud]}" ] && [ "$nextcloud" != "null" ] && [ "$nextcloud" != "unknown" ]; then
has_drift=true
drift_reasons+=("Nextcloud: $nextcloud${LATEST_VERSIONS[nextcloud]}")
fi
fi
if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "traefik" ]; then
if [ "$traefik" != "${LATEST_VERSIONS[traefik]}" ] && [ "$traefik" != "null" ] && [ "$traefik" != "unknown" ]; then
has_drift=true
drift_reasons+=("Traefik: $traefik${LATEST_VERSIONS[traefik]}")
fi
fi
if [ -z "$FILTER_APP" ] || [ "$FILTER_APP" = "ubuntu" ]; then
if [ "$ubuntu" != "${LATEST_VERSIONS[ubuntu]}" ] && [ "$ubuntu" != "null" ] && [ "$ubuntu" != "unknown" ]; then
has_drift=true
drift_reasons+=("Ubuntu: $ubuntu${LATEST_VERSIONS[ubuntu]}")
fi
fi
# Check if update is stale (older than threshold)
is_stale=false
if [ "$last_update" != "null" ] && [ -n "$last_update" ]; then
if [[ "$last_update" < "$THRESHOLD_DATE" ]]; then
is_stale=true
drift_reasons+=("Last update: $last_update (>$THRESHOLD_DAYS days ago)")
fi
fi
# Record drift
if [ "$has_drift" = true ] || [ "$is_stale" = true ]; then
DRIFT_FOUND=1
DRIFT_CLIENTS+=("$client")
DRIFT_DETAILS+=("$(IFS='; '; echo "${drift_reasons[*]}")")
[ "$has_drift" = true ] && ((OUTDATED_COUNT++)) || true
[ "$is_stale" = true ] && ((STALE_COUNT++)) || true
fi
done
# Output results
case $FORMAT in
table)
if [ $DRIFT_FOUND -eq 0 ]; then
echo -e "${GREEN}✓ No version drift detected${NC}"
echo ""
echo "All deployed clients are running latest versions:"
echo " Authentik: ${LATEST_VERSIONS[authentik]}"
echo " Nextcloud: ${LATEST_VERSIONS[nextcloud]}"
echo " Traefik: ${LATEST_VERSIONS[traefik]}"
echo " Ubuntu: ${LATEST_VERSIONS[ubuntu]}"
echo ""
else
echo -e "${RED}⚠ VERSION DRIFT DETECTED${NC}"
echo ""
echo -e "${CYAN}Clients with outdated versions:${NC}"
echo ""
for i in "${!DRIFT_CLIENTS[@]}"; do
client="${DRIFT_CLIENTS[$i]}"
details="${DRIFT_DETAILS[$i]}"
echo -e "${YELLOW}$client${NC}"
IFS=';' read -ra REASONS <<< "$details"
for reason in "${REASONS[@]}"; do
echo " $reason"
done
echo ""
done
echo -e "${CYAN}Recommended actions:${NC}"
echo ""
echo "1. Test updates on canary server first:"
echo " ${BLUE}./scripts/rebuild-client.sh dev${NC}"
echo ""
echo "2. Verify canary health:"
echo " ${BLUE}./scripts/client-status.sh dev${NC}"
echo ""
echo "3. Update outdated clients:"
for client in "${DRIFT_CLIENTS[@]}"; do
echo " ${BLUE}./scripts/rebuild-client.sh $client${NC}"
done
echo ""
fi
;;
summary)
if [ $DRIFT_FOUND -eq 0 ]; then
echo "Status: OK"
echo "Drift: No"
echo "Clients checked: $(echo "$CLIENTS" | wc -l | xargs)"
else
echo "Status: DRIFT DETECTED"
echo "Drift: Yes"
echo "Clients checked: $(echo "$CLIENTS" | wc -l | xargs)"
echo "Clients with outdated versions: $OUTDATED_COUNT"
echo "Clients not updated in $THRESHOLD_DAYS days: $STALE_COUNT"
echo "Affected clients: ${DRIFT_CLIENTS[*]}"
fi
;;
esac
exit $DRIFT_FOUND

View file

@ -212,6 +212,16 @@ echo ""
echo -e "${GREEN}✓ Registry updated${NC}"
echo ""
# Collect deployed versions
echo -e "${YELLOW}Collecting deployed versions...${NC}"
"$SCRIPT_DIR/collect-client-versions.sh" "$CLIENT_NAME" 2>/dev/null || {
echo -e "${YELLOW}⚠ Could not collect versions automatically${NC}"
echo "Run manually later: ./scripts/collect-client-versions.sh $CLIENT_NAME"
}
echo ""
# Calculate duration
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))