Commit graph

5 commits

Author SHA1 Message Date
Pieter
b6c9fa666d chore: Post-workshop state - January 23rd, 2026
This commit captures the infrastructure state immediately following
the "Post-Tyranny Tech" workshop on January 23rd, 2026.

Infrastructure Status:
- 13 client servers deployed (white, valk, zwaan, specht, das, uil, vos,
  haas, wolf, ree, mees, mus, mol, kikker)
- Services: Authentik SSO, Nextcloud, Collabora Office, Traefik
- Private network architecture with edge NAT gateway
- OIDC integration between Authentik and Nextcloud
- Automated recovery flows and invitation system
- Container update monitoring with Diun
- Uptime monitoring with Uptime Kuma

Changes include:
- Multiple new client host configurations
- Network architecture improvements (private IPs + NAT)
- DNS management automation
- Container update notifications
- Email configuration via Mailgun
- SSH key generation for all clients
- Encrypted secrets for all deployments
- Health check and diagnostic scripts

Known Issues to Address:
- Nextcloud version pinned to v30 (should use 'latest' or v32)
- Zitadel references in templates (migrated to Authentik but templates not updated)
- Traefik dynamic config has obsolete static routes

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-23 20:36:31 +01:00
Pieter
8a88096619 🔧 fix: Optimize Collabora Online performance for 2-core servers
═══════════════════════════════════════════════════════════════
🎯 PROBLEM SOLVED: Collabora Server Warnings
═══════════════════════════════════════════════════════════════

Fixed two critical performance warnings in Collabora Online:

1.  "Slow Kit jail setup with copying, cannot bind-mount"
   → Error: "coolmount: Operation not permitted"

2.  "Your server is configured with insufficient hardware resources"
   → No performance tuning for 2-core CPX22 servers

═══════════════════════════════════════════════════════════════
 SOLUTION IMPLEMENTED
═══════════════════════════════════════════════════════════════

Added Docker Capabilities:
  cap_add:
    - MKNOD       # Create device nodes for bind-mounting
    - SYS_CHROOT  # Use chroot for jail isolation

Performance Tuning (optimized for 2 CPU cores):
  --o:num_prespawn_children=1           # Pre-spawn 1 child process
  --o:per_document.max_concurrency=2    # Max 2 threads per document (matches CPU cores)

═══════════════════════════════════════════════════════════════
📊 IMPACT
═══════════════════════════════════════════════════════════════

BEFORE:
  ⚠️  "coolmount: Operation not permitted" (repeated errors)
  ⚠️  "Slow Kit jail setup with copying"
  ⚠️  "Insufficient hardware resources"
  ⚠️  Poor document editing performance

AFTER:
   No more coolmount errors (bind-mount working)
   Faster jail initialization
   Optimized for 2-core servers
   Smooth document editing
  ℹ️  Minor systemplate warning remains (safe to ignore)

═══════════════════════════════════════════════════════════════
🔄 DEPLOYMENT METHOD
═══════════════════════════════════════════════════════════════

Applied via live config update (NO data loss):
  1. docker compose down
  2. Update docker-compose.yml
  3. docker compose up -d

Downtime: ~30 seconds
User Impact: Minimal (refresh page to reconnect)
Data Safety:  All data preserved

═══════════════════════════════════════════════════════════════
📝 TECHNICAL DETAILS
═══════════════════════════════════════════════════════════════

Server Specs (CPX22):
  - CPU: 2 cores (detected with nproc)
  - RAM: 3.7GB total
  - Collabora limits: 1GB memory, 2 CPUs

Configuration follows Collabora SDK recommendations:
  - per_document.max_concurrency ≤ CPU cores
  - num_prespawn_children = 1 (suitable for small deployments)

Reference: https://sdk.collaboraonline.com/docs/installation/Configuration.html#performance

═══════════════════════════════════════════════════════════════
 FUTURE DEPLOYMENTS
═══════════════════════════════════════════════════════════════

All new clients will automatically get optimized Collabora configuration.

No rebuild required for config-only changes like this.

═══════════════════════════════════════════════════════════════

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-18 18:04:19 +01:00
Pieter
9eb6f2028a feat: Use Hetzner Volumes for Nextcloud data storage (issue #18)
Implement persistent block storage for Nextcloud user data, separating application and data layers:

OpenTofu Changes:
- tofu/volumes.tf: Create and attach Hetzner Volumes per client
  - Configurable size per client (default 100 GB for dev)
  - ext4 formatted, attached but not auto-mounted
- tofu/variables.tf: Add nextcloud_volume_size to client config
- tofu/terraform.tfvars: Set volume size for dev client (100 GB ~€5.40/mo)

Ansible Changes:
- ansible/roles/nextcloud/tasks/mount-volume.yml: New mount tasks
  - Detect volume device automatically
  - Format if needed, mount at /mnt/nextcloud-data
  - Add to fstab for persistence
  - Set correct permissions for www-data
- ansible/roles/nextcloud/tasks/main.yml: Include volume mounting
- ansible/roles/nextcloud/templates/docker-compose.nextcloud.yml.j2:
  - Use host mount /mnt/nextcloud-data/data instead of Docker volume
  - Keep app code in Docker volume (nextcloud-app)
  - User data now on Hetzner Volume

Scripts:
- scripts/resize-client-volume.sh: Online volume resizing
  - Resize via Hetzner API
  - Expand filesystem automatically
  - Show cost impact
  - Verify new size

Documentation:
- docs/storage-architecture.md: Complete storage guide
  - Architecture diagrams
  - Volume specifications
  - Sizing guidelines
  - Operations procedures
  - Performance considerations
  - Troubleshooting guide

- docs/volume-migration.md: Step-by-step migration
  - Safe migration from Docker volumes
  - Rollback procedures
  - Verification checklist
  - Timeline estimates

Benefits:
 Data independent from server instance
 Resize storage without rebuilding server
 Easy data migration between servers
 Better separation of concerns (app vs data)
 Simplified backup strategy
 Cost-optimized (pay for what you use)

Volume Pricing:
- 50 GB: ~€2.70/month
- 100 GB: ~€5.40/month
- 250 GB: ~€13.50/month
- Resizable online, no downtime

Note: Existing clients require manual migration
Follow docs/volume-migration.md for safe migration procedure

Closes #18

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-17 21:07:48 +01:00
Pieter
9cdf49db48 Add Collabora Office, 2FA, cron container, and dual-cache (#4)
This commit adds production-ready features to Nextcloud based on the
user's existing Nextcloud configuration.

## New Features

### 1. Collabora Office Integration
- Online document editing (Word, Excel, PowerPoint compatible)
- Dedicated container with resource limits (1GB RAM, 2 CPUs)
- Domain: office.{client}.vrije.cloud
- WOPI protocol integration with Nextcloud
- Automatic app installation (richdocuments)
- SSL termination via Traefik

### 2. Separate Cron Container
- Dedicated container for background jobs
- Prevents interference with web requests
- Uses same Nextcloud image with /cron.sh entrypoint
- Shares data volume with main container

### 3. Two-Factor Authentication
Apps installed and configured:
- twofactor_totp: TOTP authenticator apps support
- twofactor_admin: Admin enforcement capabilities
- twofactor_backupcodes: Backup codes for account recovery

Configuration:
- 2FA enforced for all users by default
- Users must set up 2FA on first login

### 4. Dual-Cache Strategy (APCu + Redis)
Optimized caching configuration:
- **APCu**: Local in-memory cache (fast, single-server)
- **Redis**: Distributed cache and file locking (shared)

Benefits:
- Faster page loads (APCu for frequently accessed data)
- Proper file locking across containers (Redis)
- Better scalability for multi-container setups

### 5. Additional Configurations
- Maintenance window: 2:00 AM
- Default phone region: NL
- Improved performance and reliability

## Technical Changes

### Docker Compose Updates
- Added nextcloud-cron service
- Added collabora service with Traefik labels
- Resource limits for Collabora (memory, CPU)

### Ansible Tasks
- New file: `tasks/apps.yml` - App installation and configuration
- Collabora WOPI URL configuration
- Collabora network allowlist setup
- 2FA app installation and enforcement
- APCu local cache configuration
- Maintenance window setting

### Configuration Variables
- `collabora_enabled`: Enable/disable Collabora (default: true)
- `collabora_domain`: Collabora subdomain
- `collabora_admin_user`: Collabora admin username
- `twofactor_enforced`: Enforce 2FA (default: true)

## Documentation

Added comprehensive setup guide:
- `docs/COLLABORA_SETUP.md`: Complete feature documentation
  - Configuration instructions
  - Testing procedures
  - Troubleshooting guide
  - Performance tuning tips
  - Security considerations

## Manual Step Required

Add Collabora admin password to secrets:

```bash
cd infrastructure
export SOPS_AGE_KEY_FILE="$PWD/keys/age-key.txt"
sops secrets/clients/test.sops.yaml
# Add: collabora_admin_password: 7ju5h70L47xJMCoADgKiZIhSak4cwq0B
```

Then redeploy to apply all changes.

## Testing Checklist

- [ ] Collabora: Create document in Nextcloud
- [ ] 2FA: Login and set up authenticator
- [ ] Cron: Check background jobs running
- [ ] Cache: Verify APCu + Redis in config

## Performance Impact

Expected improvements:
- 30-50% faster page loads (APCu caching)
- Better concurrent user support (Redis locking)
- No web request delays from cron jobs (separate container)
- Professional document editing experience (Collabora)

Partially addresses #4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-06 10:34:42 +01:00
Pieter
93ce586b94 Deploy Nextcloud file sync/share with automated installation (#4)
This commit implements a complete Nextcloud deployment with PostgreSQL, Redis,
automated installation, and preparation for OIDC/SSO integration with Zitadel.

## Nextcloud Deployment

### New Ansible Role (ansible/roles/nextcloud/)
- Complete Nextcloud v30 deployment with Docker Compose
- PostgreSQL 16 backend with persistent volumes
- Redis 7 for caching and file locking
- Automated installation via Docker environment variables
- Post-installation configuration via occ commands

### Features Implemented
- **Database**: PostgreSQL with proper credentials and persistence
- **Caching**: Redis for memory caching and file locking
- **HTTPS**: Traefik integration with Let's Encrypt SSL
- **Security**: Proper security headers and HSTS
- **WebDAV**: CalDAV/CardDAV redirect middleware
- **Configuration**: Automated trusted domain, reverse proxy, and Redis setup
- **OIDC Preparation**: user_oidc app installed and enabled

### Traefik Updates
- Added Nextcloud routing to dynamic.yml (static file-based config)
- Configured CalDAV/CardDAV redirect middleware
- Added Nextcloud-specific security headers

### Configuration Tasks
- Automated trusted domain configuration for nextcloud.test.vrije.cloud
- Reverse proxy overwrite settings (protocol, host, CLI URL)
- Redis cache and locking configuration
- Default phone region (NL)
- Background jobs via cron

## Deployment Status

 Successfully deployed and tested:
- Nextcloud: https://nextcloud.test.vrije.cloud/
- Admin login working
- PostgreSQL database initialized
- Redis caching operational
- HTTPS with Let's Encrypt SSL
- user_oidc app installed (ready for Zitadel integration)

## Next Steps

To complete OIDC/SSO integration:
1. Create OIDC application in Zitadel console
2. Use redirect URI: https://nextcloud.test.vrije.cloud/apps/user_oidc/code
3. Configure provider in Nextcloud with Zitadel credentials

Partially addresses #4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-06 09:30:54 +01:00