Post-Tyranny-Tech-Infrastru.../ansible
Pieter 8866411ef3 Implement fully automated OIDC/SSO provisioning (#4)
This commit eliminates all manual configuration steps for OIDC/SSO setup,
making the infrastructure fully scalable to dozens or hundreds of servers.

## Automation Overview

The deployment now automatically:
1. Authenticates with Zitadel using admin credentials
2. Creates OIDC application via Zitadel Management API
3. Retrieves client ID and secret
4. Configures Nextcloud OIDC provider

**Zero manual steps required!**

## New Components

### Zitadel OIDC Automation
- `files/get_admin_token.sh`: OAuth2 authentication script
- `files/create_oidc_app.py`: Python script for OIDC app creation via API
- `tasks/oidc-apps.yml`: Ansible orchestration for full automation

### API Integration
- Uses Zitadel Management API v1
- Resource Owner Password Credentials flow for admin auth
- Creates OIDC apps with proper security settings:
  - Authorization Code + Refresh Token grants
  - JWT access tokens
  - Role and UserInfo assertions enabled
  - Proper redirect URI configuration

### Nextcloud Integration
- Updated `tasks/oidc.yml` to auto-configure provider
- Receives credentials from Zitadel automation
- Configures discovery URI automatically
- Handles idempotency (skips if already configured)

## Scalability Benefits

### Before (Manual)
```
1. Deploy infrastructure
2. Login to Zitadel console
3. Create OIDC app manually
4. Copy client ID/secret
5. SSH to server
6. Run occ command with credentials
```

**Time per server: ~10-15 minutes**

### After (Automated)
```
1. Deploy infrastructure
```

**Time per server: ~0 minutes (fully automated)**

### Impact
- 10 servers: Save ~2 hours of manual work
- 50 servers: Save ~10 hours of manual work
- 100 servers: Save ~20 hours of manual work

## Security

- Admin credentials encrypted with SOPS
- Access tokens are ephemeral (generated per deployment)
- Client secrets never logged (`no_log: true`)
- All API calls over HTTPS only
- Credentials passed via Ansible facts (memory only)

## Documentation

Added comprehensive documentation:
- `docs/OIDC_AUTOMATION.md`: Full automation guide
- How it works
- Technical implementation details
- Troubleshooting guide
- Security considerations

## Testing

The automation is idempotent and handles:
-  First-time setup (creates app)
-  Subsequent runs (skips if exists)
-  Error handling (fails gracefully)
-  Credential validation

## Next Steps

Users can immediately login via SSO after deployment:
1. Visit https://nextcloud.{client}.vrije.cloud
2. Click "Login with Zitadel"
3. Enter Zitadel credentials
4. Automatically logged into Nextcloud

Closes #4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-06 09:49:16 +01:00
..
playbooks Deploy Nextcloud file sync/share with automated installation (#4) 2026-01-06 09:30:54 +01:00
roles Implement fully automated OIDC/SSO provisioning (#4) 2026-01-06 09:49:16 +01:00
ansible.cfg Complete Ansible base configuration (#2) 2025-12-27 14:13:15 +01:00
hcloud.yml WIP: Ansible base configuration - common role (#2) 2025-12-27 14:00:22 +01:00
README.md WIP: Ansible base configuration - common role (#2) 2025-12-27 14:00:22 +01:00

Ansible Configuration Management

Ansible playbooks and roles for configuring and managing the multi-tenant VPS infrastructure.

Prerequisites

1. Install Ansible (via pipx - isolated environment)

Why pipx? Isolates Ansible in its own Python environment, preventing conflicts.

# Install pipx
brew install pipx
pipx ensurepath

# Install Ansible
pipx install --include-deps ansible

# Install required dependencies
pipx inject ansible requests python-dateutil

# Verify installation
ansible --version

2. Install Ansible Collections

ansible-galaxy collection install hetzner.hcloud community.sops community.general

3. Set Hetzner Cloud API Token

export HCLOUD_TOKEN="your-hetzner-cloud-api-token"

Or add to your shell profile (~/.zshrc or ~/.bashrc):

export HCLOUD_TOKEN="your-token-here"

Quick Start

Test Dynamic Inventory

cd ansible
ansible-inventory --graph

You should see your servers grouped by labels.

Ping All Servers

ansible all -m ping

Run Setup Playbook

# Full setup (common + docker + traefik)
ansible-playbook playbooks/setup.yml

# Specific server
ansible-playbook playbooks/setup.yml --limit test

# Dry run (check mode)
ansible-playbook playbooks/setup.yml --check

Directory Structure

ansible/
├── ansible.cfg              # Ansible configuration
├── hcloud.yml               # Hetzner Cloud dynamic inventory
├── playbooks/               # Playbook definitions
│   ├── setup.yml            # Initial server setup
│   ├── deploy.yml           # Deploy/update applications
│   └── upgrade.yml          # System upgrades
├── roles/                   # Role definitions
│   ├── common/              # Base system hardening
│   ├── docker/              # Docker + Docker Compose
│   ├── traefik/             # Reverse proxy
│   ├── zitadel/             # Identity provider
│   ├── nextcloud/           # File sync/share
│   └── backup/              # Restic backup
└── group_vars/              # Group variables
    └── all.yml              # Variables for all hosts

Roles

common

Base system configuration and security hardening:

  • SSH hardening (key-only auth, no root password)
  • UFW firewall configuration
  • Fail2ban for SSH protection
  • Automatic security updates
  • Timezone and locale setup

Variables (roles/common/defaults/main.yml):

  • common_timezone: System timezone (default: Europe/Amsterdam)
  • common_ssh_port: SSH port (default: 22)
  • common_ufw_allowed_ports: List of allowed firewall ports

docker

Docker and Docker Compose installation:

  • Latest Docker Engine from official repository
  • Docker Compose V2
  • Docker daemon configuration
  • User permissions for Docker

traefik

Reverse proxy with automatic SSL:

  • Traefik v3 with Docker provider
  • Let's Encrypt automatic certificate generation
  • HTTP to HTTPS redirection
  • Dashboard (optional)

zitadel

Identity provider deployment (see Zitadel Agent for details)

nextcloud

File sync/share deployment (see Nextcloud Agent for details)

backup

Restic backup configuration to Hetzner Storage Box

Playbooks

setup.yml

Initial server provisioning and configuration:

ansible-playbook playbooks/setup.yml

Runs roles in order:

  1. common - Base hardening
  2. docker - Container platform
  3. traefik - Reverse proxy

deploy.yml

Deploy or update applications:

ansible-playbook playbooks/deploy.yml

Runs application-specific roles based on server labels.

Dynamic Inventory

The hcloud.yml inventory automatically queries Hetzner Cloud API for servers.

Server Grouping:

  • By client: client_test, client_alpha
  • By role: role_app_server
  • By location: location_fsn1, location_nbg1

View inventory:

ansible-inventory --graph
ansible-inventory --list
ansible-inventory --host test

Common Tasks

Check Server Connectivity

ansible all -m ping

Run Ad-hoc Command

ansible all -a "uptime"
ansible all -a "df -h"

Update All Packages

ansible all -m apt -a "update_cache=yes upgrade=dist"

Restart Service

ansible all -m service -a "name=docker state=restarted"

Limit to Specific Hosts

# Single host
ansible-playbook playbooks/setup.yml --limit test

# Multiple hosts
ansible-playbook playbooks/setup.yml --limit "test,alpha"

# Group
ansible-playbook playbooks/setup.yml --limit client_test

Development Workflow

Creating a New Role

cd ansible/roles
mkdir -p newrole/{tasks,handlers,templates,defaults,files}

Minimum structure:

  • defaults/main.yml - Default variables
  • tasks/main.yml - Main task list
  • handlers/main.yml - Service handlers (optional)
  • templates/ - Jinja2 templates (optional)

Testing Changes

# Syntax check
ansible-playbook playbooks/setup.yml --syntax-check

# Dry run (no changes)
ansible-playbook playbooks/setup.yml --check

# Limit to test server
ansible-playbook playbooks/setup.yml --limit test

# Verbose output
ansible-playbook playbooks/setup.yml -v
ansible-playbook playbooks/setup.yml -vvv  # Very verbose

Troubleshooting

"No inventory was parsed"

  • Ensure HCLOUD_TOKEN environment variable is set
  • Verify token has read access
  • Check hcloud.yml syntax

"Failed to connect to host"

  • Verify server is running: tofu show
  • Check SSH key is correct: ssh -i ~/.ssh/ptt_infrastructure root@<ip>
  • Verify firewall allows SSH from your IP

"Permission denied (publickey)"

  • Ensure ~/.ssh/ptt_infrastructure private key exists
  • Check ansible.cfg points to correct key
  • Verify public key was added to server via OpenTofu

"Module not found"

  • Install missing Ansible collection:
    ansible-galaxy collection install <collection-name>
    

Ansible is slow

  • Enable SSH pipelining (already configured in ansible.cfg)
  • Use --forks to increase parallelism: ansible-playbook playbooks/setup.yml --forks 20
  • Enable fact caching (already configured)

Security Notes

  • Ansible connects as root user via SSH key
  • No passwords are used anywhere
  • SSH hardening applied automatically via common role
  • UFW firewall enabled by default
  • Fail2ban protects SSH
  • Automatic security updates enabled

Next Steps

After initial setup:

  1. Deploy Zitadel: Follow Zitadel Agent instructions
  2. Deploy Nextcloud: Follow Nextcloud Agent instructions
  3. Configure backups: Use backup role
  4. Set up monitoring: Configure Uptime Kuma

Resources