Collects posts, replies, and mentions for a list of Bluesky accounts, runs AI-powered toxicity analysis across 12 categories, and presents results on a web dashboard. Everything runs in Docker.
## Architecture
```
                 ┌────────────────────────────────────────────┐
                 │               Docker Compose               │
                 │                                            │
accounts.yml ───▶│  collector ──▶ PostgreSQL ◀── web (Flask)  │
                 │      │              ▲                      │
                 │      ▼              │                      │
                 │  analyzer ──────────┘                      │
                 │      │                                     │
                 │      ▼                                     │
                 │  OpenAI API                                │
                 │                                            │
                 │  scheduler (Ofelia) ── cron triggers       │
                 └────────────────────────────────────────────┘
```
Four services:
- **db** — PostgreSQL 16 (Alpine), stores all data
- **collector** — Python async service that fetches posts and mentions from Bluesky
- **scheduler** — [Ofelia](https://github.com/mcuadros/ofelia) cron that triggers collection (every 4h) and analysis (every 4h + 30min offset)
- **web** — Flask + Gunicorn dashboard on port 5001
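
To make the scheduler timing concrete, here is a minimal Python sketch that lists when collection and analysis would start over one day. It assumes the first collection fires exactly at the start of the window; the real alignment depends on the Ofelia schedule expressions in use, and `daily_schedule` is a hypothetical helper, not part of this project.

```python
from datetime import datetime, timedelta

COLLECT_INTERVAL = timedelta(hours=4)   # collection interval from the service list
ANALYZE_OFFSET = timedelta(minutes=30)  # analysis trails each collection by 30 min


def daily_schedule(day_start: datetime) -> list[tuple[datetime, datetime]]:
    """Return (collection, analysis) start times for one 24-hour window.

    Assumption: the first collection fires exactly at day_start.
    """
    runs = []
    t = day_start
    while t < day_start + timedelta(days=1):
        runs.append((t, t + ANALYZE_OFFSET))
        t += COLLECT_INTERVAL
    return runs


for collect_at, analyze_at in daily_schedule(datetime(2024, 1, 1)):
    print(f"collect {collect_at:%H:%M}  ->  analyze {analyze_at:%H:%M}")
```

Under these assumptions there are six collection runs per day, each followed by an analysis run half an hour later.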
## Quick Start
```bash
# 1. Copy and edit your environment config
cp .env.example .env
# Fill in: BSKY_HANDLE, BSKY_APP_PASSWORD, OPENAI_API_KEY
# 2. Add your target accounts to config/accounts.yml
| `sexism` | Gender-based discrimination or harassment |
| `homophobia` | Anti-LGBTQ+ rhetoric |
| `insult` | Personal attacks, name-calling |
| `dehumanization` | Comparing people to animals, vermin, disease |
| `extremism` | Far-right/left rhetoric, Nazi glorification, Great Replacement |
| `ableism` | Disability-targeting language, mental health slurs |
The prompt is tuned for Dutch political discourse and recognizes coded terms such as "gelukszoekers", "kutmarokkanen", "landverrader", and "linkse ratten". Political disagreement and criticism are not scored as toxic; only genuine hostility, hate, and threats are.
### Batch Processing
Posts are sent to the API in batches (default 10 per call) to reduce both cost and the number of API calls. The ~500-token system prompt is sent once per batch instead of once per post, cutting input token cost by ~60%.
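
The batching and the cost arithmetic can be sketched in a few lines of Python. This is an illustration rather than the analyzer's actual code: `batches` and `input_token_saving` are hypothetical helpers, and the figure of 250 input tokens per post (text plus any per-post wrapper) is an assumption chosen to show that the ~60% saving is consistent with a 500-token prompt and batches of 10.

```python
from typing import Iterator, Sequence

SYSTEM_PROMPT_TOKENS = 500  # approximate prompt size quoted above
BATCH_SIZE = 10             # default batch size quoted above


def batches(posts: Sequence[str], size: int = BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` posts."""
    for start in range(0, len(posts), size):
        yield posts[start:start + size]


def input_token_saving(n_posts: int, tokens_per_post: int, size: int = BATCH_SIZE) -> float:
    """Fraction of input tokens saved by batching versus one call per post."""
    unbatched = n_posts * (SYSTEM_PROMPT_TOKENS + tokens_per_post)
    n_calls = -(-n_posts // size)  # ceiling division
    batched = n_calls * SYSTEM_PROMPT_TOKENS + n_posts * tokens_per_post
    return 1 - batched / unbatched


posts = [f"post {i}" for i in range(25)]
print([len(b) for b in batches(posts)])         # [10, 10, 5]
# Assumed 250 input tokens per post; not a measured value.
print(round(input_token_saving(1000, 250), 2))  # 0.6
```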
The scheduled cron job runs the analyzer automatically every 4 hours (30 minutes after each collection), so new posts are scored without manual intervention.
## Web Dashboard
Access at `http://localhost:5001` (or your configured `WEB_PORT`).
"SELECT id, started_at, status, posts_scored, mentions_scored, cost_usd FROM analysis_runs ORDER BY started_at DESC LIMIT 5;"
```
### Rebuilding After Code Changes
```bash
docker compose build collector web
docker compose up -d
```
### Add/Remove Accounts
Edit `config/accounts.yml`; changes take effect on the next scheduled or manual run. Removed accounts are marked inactive, but their data is preserved.
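
The add/remove behaviour described above amounts to a set comparison between the handles in the config file and the handles currently active in the database. The sketch below is a hypothetical illustration of that idea; `reconcile_accounts` and the example handles are assumptions, and the collector's real logic and schema may differ.

```python
def reconcile_accounts(configured: set[str], active_in_db: set[str]) -> dict[str, set[str]]:
    """Compare configured handles with the handles currently active in storage."""
    return {
        "activate": configured - active_in_db,    # new or re-added handles
        "deactivate": active_in_db - configured,  # dropped from config; rows are kept
        "unchanged": configured & active_in_db,
    }


plan = reconcile_accounts(
    configured={"alice.bsky.social", "carol.bsky.social"},
    active_in_db={"alice.bsky.social", "bob.bsky.social"},
)
print(plan["activate"])    # {'carol.bsky.social'}
print(plan["deactivate"])  # {'bob.bsky.social'}
```

Deactivating rather than deleting is what keeps previously collected posts and scores available after an account is removed from the list.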
### First Run / Backfill
The first run pages back through up to `MAX_PAGES_PER_ACCOUNT` pages of history per account (5000 posts with the default setting). For a deeper backfill, temporarily increase this value: