Updates to Uptime Kuma monitoring setup:
DNS Configuration:
- Added DNS A record for status.vrije.cloud -> 94.130.231.155
- Updated Uptime Kuma container to use status.vrije.cloud domain
- HTTPS access via nginx-proxy with Let's Encrypt SSL
Automated Monitor Management:
- Created scripts/add-client-to-monitoring.sh
- Created scripts/remove-client-from-monitoring.sh
- Integrated monitoring into deploy-client.sh (step 5/5)
- Integrated monitoring into destroy-client.sh (step 0/7)
- Deployment now prompts to add monitors after success
- Destruction now prompts to remove monitors before deletion
Email Notification Setup:
- Created docs/uptime-kuma-email-setup.md with complete guide
- SMTP configuration using smtp.strato.com
- Credentials: server@postxsociety.org
- Alerts sent to mail@postxsociety.org
Documentation:
- Updated docs/monitoring.md with new domain
- Added email setup reference
- Replaced all URLs to use status.vrije.cloud
Benefits:
✅ Friendly domain instead of IP address
✅ HTTPS access with auto-SSL
✅ Automated monitoring reminders on deploy/destroy
✅ Complete email notification guide
✅ Streamlined workflow for monitor management
Note: Monitor creation/deletion is currently manual (API automation planned)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Resolves #17
Deployed Uptime Kuma on external monitoring server for centralized
monitoring of all PTT client services.
Implementation:
- Deployed Uptime Kuma v1 on external server (94.130.231.155)
- Configured Docker Compose with nginx-proxy integration
- Created comprehensive monitoring documentation
Architecture:
- Independent monitoring server (not part of PTT infrastructure)
- Can monitor infrastructure failures and dev server
- Access: http://94.130.231.155:3001
- Future DNS: https://status.postxsociety.cloud
Monitors to configure (manual setup required):
- HTTP(S) endpoint monitoring for Authentik and Nextcloud
- SSL certificate expiration monitoring
- Per-client monitors for: dev, green
Documentation:
- Complete setup guide in docs/monitoring.md
- Monitor configuration instructions
- Management and troubleshooting procedures
- Integration guidelines for deployment scripts
Next Steps:
1. Access http://94.130.231.155:3001 to create admin account
2. Configure monitors for each client as per docs/monitoring.md
3. Set up email notifications for alerts
4. (Optional) Configure DNS for status.postxsociety.cloud
5. (Future) Automate monitor creation via Uptime Kuma API
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implement persistent block storage for Nextcloud user data, separating application and data layers:
OpenTofu Changes:
- tofu/volumes.tf: Create and attach Hetzner Volumes per client
  - Configurable size per client (default 100 GB for dev)
  - ext4 formatted, attached but not auto-mounted
- tofu/variables.tf: Add nextcloud_volume_size to client config
- tofu/terraform.tfvars: Set volume size for dev client (100 GB ~€5.40/mo)
Ansible Changes:
- ansible/roles/nextcloud/tasks/mount-volume.yml: New mount tasks
  - Detect volume device automatically
  - Format if needed, mount at /mnt/nextcloud-data
  - Add to fstab for persistence
  - Set correct permissions for www-data
- ansible/roles/nextcloud/tasks/main.yml: Include volume mounting
- ansible/roles/nextcloud/templates/docker-compose.nextcloud.yml.j2:
  - Use host mount /mnt/nextcloud-data/data instead of Docker volume
  - Keep app code in Docker volume (nextcloud-app)
  - User data now on Hetzner Volume
Scripts:
- scripts/resize-client-volume.sh: Online volume resizing
  - Resize via Hetzner API
  - Expand filesystem automatically
  - Show cost impact
  - Verify new size
Documentation:
- docs/storage-architecture.md: Complete storage guide
  - Architecture diagrams
  - Volume specifications
  - Sizing guidelines
  - Operations procedures
  - Performance considerations
  - Troubleshooting guide
- docs/volume-migration.md: Step-by-step migration
  - Safe migration from Docker volumes
  - Rollback procedures
  - Verification checklist
  - Timeline estimates
Benefits:
✅ Data independent from server instance
✅ Resize storage without rebuilding server
✅ Easy data migration between servers
✅ Better separation of concerns (app vs data)
✅ Simplified backup strategy
✅ Cost-optimized (pay for what you use)
Volume Pricing:
- 50 GB: ~€2.70/month
- 100 GB: ~€5.40/month
- 250 GB: ~€13.50/month
- Resizable online, no downtime
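The price points above are consistent with a flat rate of about €0.054 per GB per month (2.70 / 50). A minimal sketch for estimating other sizes, assuming the rate stays linear; the rate constant is derived from the list above rather than from a Hetzner price sheet, and the function name is hypothetical:

```python
from decimal import Decimal

# Assumed flat rate, derived from the price points above (EUR 2.70 / 50 GB).
EUR_PER_GB_MONTH = Decimal("0.054")


def monthly_volume_cost(size_gb: int) -> Decimal:
    """Estimate the monthly cost in EUR for a volume of size_gb gigabytes."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    # Round to whole cents for display.
    return (EUR_PER_GB_MONTH * size_gb).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for size in (50, 100, 250):
        print(f"{size} GB: ~EUR {monthly_volume_cost(size)}/month")
```

Using Decimal instead of float keeps the cent rounding exact, which matters if the estimate is shown next to an invoice.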
Note: Existing clients require manual migration.
Follow docs/volume-migration.md for the safe migration procedure.
Closes #18
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive client registry for tracking all deployed infrastructure:
Registry System:
- Single source of truth in clients/registry.yml
- Tracks status, server specs, versions, maintenance history
- Supports canary deployment workflow
- Automatic updates via deployment scripts
New Scripts:
- scripts/list-clients.sh: List/filter clients (table/json/csv/summary)
- scripts/client-status.sh: Detailed client info with health checks
- scripts/update-registry.sh: Manual registry updates
Updated Scripts:
- scripts/deploy-client.sh: Auto-updates registry on deploy
- scripts/rebuild-client.sh: Auto-updates registry on rebuild
- scripts/destroy-client.sh: Marks clients as destroyed
Documentation:
- docs/client-registry.md: Complete registry reference
- clients/README.md: Quick start guide
Status tracking: pending → deployed → maintenance → destroyed
Role support: canary (dev) and production clients
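The status lifecycle above can be read as a small state machine. The sketch below is illustrative only: the status names come from this commit, but the set of allowed transitions (for example maintenance back to deployed, or destroying a pending client) is an assumption, and `transition` is a hypothetical helper, not part of the registry scripts.

```python
# Status names are taken from the commit message; the allowed transitions
# (including maintenance -> deployed) are assumptions for illustration.
ALLOWED_TRANSITIONS = {
    "pending": {"deployed", "destroyed"},
    "deployed": {"maintenance", "destroyed"},
    "maintenance": {"deployed", "destroyed"},
    "destroyed": set(),  # terminal state
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if current not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown status: {current}")
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current} to {new}")
    return new


if __name__ == "__main__":
    status = "pending"
    for step in ("deployed", "maintenance", "deployed", "destroyed"):
        status = transition(status, step)
        print(status)
```

Rejecting invalid moves up front would keep a destroyed client from being marked deployed again by accident.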
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added Authentik as the identity provider for SSO authentication:
Why Authentik:
- MIT license (truly open source, most permissive)
- Simple Docker Compose deployment (no manual wizards)
- Lightweight Python-based architecture
- Comprehensive protocol support (SAML, OAuth2/OIDC, LDAP, RADIUS)
- No Redis required as of v2025.10 (all caching in PostgreSQL)
- Active development and strong community
Implementation:
- Created complete Authentik Ansible role
- Docker Compose with server + worker architecture
- PostgreSQL 16 database backend
- Traefik integration with Let's Encrypt SSL
- Bootstrap tasks for initial setup guidance
- Health checks and proper service dependencies
Architecture decisions updated:
- Documented comparison: Authentik vs Zitadel vs Keycloak
- Explained Zitadel removal (FirstInstance bugs)
- Added deployment example and configuration notes
Next steps:
- Update documentation (PROJECT_REFERENCE.md, README.md)
- Create Authentik agent configuration
- Add secrets template
- Test deployment on test server
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Removed Zitadel identity provider due to:
- Critical bugs with FirstInstance initialization in v2.63.7
- Requirement for manual setup (not scalable for multi-tenant)
- User preference for Authentik in future
Changes:
- Removed entire Zitadel Ansible role and all tasks
- Removed Zitadel agent configuration (.claude/agents/zitadel.md)
- Updated deploy.yml playbook (removed Zitadel role)
- Updated architecture decisions document
- Updated PROJECT_REFERENCE.md (removed Zitadel sections)
- Updated README.md (removed Zitadel references)
- Cleaned up Zitadel deployment from test server
- Updated secrets file (removed Zitadel credentials)
Architecture now focuses on:
- Nextcloud as standalone file sync/collaboration platform
- May add Authentik or other identity provider in future if needed
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Security fixes:
- Remove hardcoded Collabora password from COLLABORA_SETUP.md
- Replace with placeholder and password generation instructions
- Rotate exposed Collabora password in test.sops.yaml
- New password: NX3NEpOMogUOcADjB0B2y1QGuRTSeDUn (SOPS encrypted)
The old password was exposed in documentation and needs to be
rotated on the test server. Future deployments will use the new
password from the encrypted secrets file.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit eliminates all manual configuration steps for OIDC/SSO setup,
making the infrastructure fully scalable to dozens or hundreds of servers.
## Automation Overview
The deployment now automatically:
1. Authenticates with Zitadel using admin credentials
2. Creates OIDC application via Zitadel Management API
3. Retrieves client ID and secret
4. Configures Nextcloud OIDC provider
**Zero manual steps required!**
## New Components
### Zitadel OIDC Automation
- `files/get_admin_token.sh`: OAuth2 authentication script
- `files/create_oidc_app.py`: Python script for OIDC app creation via API
- `tasks/oidc-apps.yml`: Ansible orchestration for full automation
### API Integration
- Uses Zitadel Management API v1
- Resource Owner Password Credentials flow for admin auth
- Creates OIDC apps with proper security settings:
  - Authorization Code + Refresh Token grants
  - JWT access tokens
  - Role and UserInfo assertions enabled
  - Proper redirect URI configuration
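As an illustration of the settings listed above, the following sketch builds a request body for creating an OIDC app. It is not the content of `files/create_oidc_app.py`. The field and enum names follow the Zitadel Management API v1 as best understood here and should be checked against the API reference for the deployed version; the function name and the example redirect path are hypothetical.

```python
import json


def build_oidc_app_payload(name: str, redirect_uri: str) -> dict:
    """Build a request body for an OIDC web app (no network calls are made)."""
    return {
        "name": name,
        "redirectUris": [redirect_uri],
        "responseTypes": ["OIDC_RESPONSE_TYPE_CODE"],
        # Authorization Code + Refresh Token grants
        "grantTypes": [
            "OIDC_GRANT_TYPE_AUTHORIZATION_CODE",
            "OIDC_GRANT_TYPE_REFRESH_TOKEN",
        ],
        "appType": "OIDC_APP_TYPE_WEB",
        "authMethodType": "OIDC_AUTH_METHOD_TYPE_BASIC",
        # JWT access tokens
        "accessTokenType": "OIDC_TOKEN_TYPE_JWT",
        # Role and UserInfo assertions
        "accessTokenRoleAssertion": True,
        "idTokenRoleAssertion": True,
        "idTokenUserinfoAssertion": True,
    }


if __name__ == "__main__":
    # The redirect path is an assumption; use the callback URL of the
    # Nextcloud OIDC app that is actually installed.
    payload = build_oidc_app_payload(
        "nextcloud-dev", "https://nextcloud.dev.vrije.cloud/apps/user_oidc/code"
    )
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call makes the security settings easy to review and test without touching a live Zitadel instance.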
### Nextcloud Integration
- Updated `tasks/oidc.yml` to auto-configure provider
- Receives credentials from Zitadel automation
- Configures discovery URI automatically
- Handles idempotency (skips if already configured)
## Scalability Benefits
### Before (Manual)
```
1. Deploy infrastructure
2. Log in to Zitadel console
3. Create OIDC app manually
4. Copy client ID/secret
5. SSH to server
6. Run occ command with credentials
```
**Time per server: ~10-15 minutes**
### After (Automated)
```
1. Deploy infrastructure
```
**Time per server: ~0 minutes (fully automated)**
### Impact
- 10 servers: Save ~2 hours of manual work
- 50 servers: Save ~10 hours of manual work
- 100 servers: Save ~20 hours of manual work
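The figures above follow from the manual estimate of 10-15 minutes per server. A short sketch of the arithmetic, where the function name is hypothetical and the per-server range is the only input taken from this message:

```python
# Manual effort per server in minutes, from "Time per server" above.
MANUAL_MINUTES_PER_SERVER = (10, 15)


def hours_saved(servers: int) -> tuple:
    """Return the (low, high) estimate of manual hours saved for N servers."""
    low, high = MANUAL_MINUTES_PER_SERVER
    return (servers * low / 60, servers * high / 60)


if __name__ == "__main__":
    for n in (10, 50, 100):
        low, high = hours_saved(n)
        print(f"{n} servers: {low:.1f}-{high:.1f} hours")
```

The rounded values in the list sit near the middle of each computed range.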
## Security
- Admin credentials encrypted with SOPS
- Access tokens are ephemeral (generated per deployment)
- Client secrets never logged (`no_log: true`)
- All API calls over HTTPS only
- Credentials passed via Ansible facts (memory only)
## Documentation
Added comprehensive documentation:
- `docs/OIDC_AUTOMATION.md`: Full automation guide
  - How it works
  - Technical implementation details
  - Troubleshooting guide
  - Security considerations
## Testing
The automation is idempotent and handles:
- ✅ First-time setup (creates app)
- ✅ Subsequent runs (skips if exists)
- ✅ Error handling (fails gracefully)
- ✅ Credential validation
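The "create on first run, skip on later runs" behaviour can be sketched as a check-then-create helper. This is a simplified model, not the actual Ansible task logic: `ensure_oidc_app`, the in-memory `existing` mapping, and the `create` callback are all hypothetical stand-ins for the real API lookups.

```python
from typing import Callable, Dict, Tuple


def ensure_oidc_app(
    existing: Dict[str, dict],
    name: str,
    create: Callable[[str], dict],
) -> Tuple[dict, bool]:
    """Return (app, created); call create() only if no app with this name exists."""
    if name in existing:
        return existing[name], False  # subsequent run: nothing to do
    app = create(name)  # first run: create and remember the app
    existing[name] = app
    return app, True


if __name__ == "__main__":
    apps: Dict[str, dict] = {}
    fake_create = lambda n: {"name": n, "clientId": "example-id"}
    print(ensure_oidc_app(apps, "nextcloud-dev", fake_create))
    print(ensure_oidc_app(apps, "nextcloud-dev", fake_create))
```

Returning the `created` flag lets a caller report "changed" only on the first run, which is how idempotent Ansible tasks are usually expected to behave.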
## Next Steps
Users can log in via SSO immediately after deployment:
1. Visit https://nextcloud.{client}.vrije.cloud
2. Click "Login with Zitadel"
3. Enter Zitadel credentials
4. Automatically logged into Nextcloud
Closes #4
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>