Initial commit: Bluesky collector with toxicity analysis

- Bluesky post collector with mention tracking
- PostgreSQL database for storage
- OpenAI-based toxicity analysis
- Web UI for viewing and analyzing posts
- Docker compose setup for deployment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-08 13:54:36 +01:00 · 2026-02-08 13:54:36 +01:00 · b1fd78e0c1
commit b1fd78e0c1
46 changed files with 7324 additions and 0 deletions
--- a/.dockerignore
+++ b/.dockerignore
@ -0,0 +1,7 @@
 __pycache__
 *.pyc
 .env
 logs/
 .git
 .gitignore
 README.md
--- a/.env.example
+++ b/.env.example
@ -0,0 +1,27 @@
 # PostgreSQL
 POSTGRES_USER=bluesky
 POSTGRES_PASSWORD=changeme
 POSTGRES_PORT=5433
 # Collector settings
 LOG_LEVEL=INFO
 MAX_PAGES_PER_ACCOUNT=50
 MENTION_LOOKBACK_HOURS=12
 # Bluesky authentication (required for mention search)
 # Create an app password at: Settings → App Passwords
 BSKY_HANDLE=
 BSKY_APP_PASSWORD=
 # Toxicity Analyzer (OpenAI)
 # Get a key at: https://platform.openai.com/api-keys
 OPENAI_API_KEY=
 ANALYZER_MODEL=gpt-4.1-nano
 ANALYZER_CONCURRENCY=3
 ANALYZER_BATCH_SIZE=10
 # Web UI
 WEB_PORT=5001
 # Scheduling is controlled by ofelia cron in docker-compose.yml
 # Default: every 4 hours ("0 0 */4 * * *")
--- a/.gitignore
+++ b/.gitignore
@ -0,0 +1,38 @@
 # Environment variables and secrets
 .env
 .env.local
 *.key
 *.pem
 # Python
 __pycache__/
 *.pyc
 *.pyo
 *.pyd
 .Python
 *.so
 *.egg
 *.egg-info/
 dist/
 build/
 .pytest_cache/
 .coverage
 htmlcov/
 # Logs and data
 logs/
 *.log
 # OS files
 .DS_Store
 Thumbs.db
 # IDE
 .vscode/
 .idea/
 *.swp
 *.swo
 *~
 # Docker volumes (if any)
 postgres_data/
--- a/Dockerfile
+++ b/Dockerfile
@ -0,0 +1,12 @@
 FROM python:3.12-slim
 WORKDIR /app
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 COPY src/ ./src/
 # Keep container alive for ofelia job-exec scheduling.
 # To run manually: docker compose exec collector python -m src
 CMD ["tail", "-f", "/dev/null"]
--- a/README.md
+++ b/README.md
@ -0,0 +1,281 @@
 # Bluesky Account Monitor
 Collects posts, replies, and mentions for a list of Bluesky accounts, runs AI-powered toxicity analysis across 12 categories, and presents results on a web dashboard. Everything runs in Docker.
 ## Architecture
 ```
                    ┌─────────────────────────────────────────┐
                    │               Docker Compose             │
                    │                                          │
  accounts.yml ───▶│  collector ──▶ PostgreSQL ◀── web (Flask) │
                    │      │              ▲                     │
                    │      ▼              │                     │
                    │  analyzer ──────────┘                     │
                    │      │                                    │
                    │      ▼                                    │
                    │  OpenAI API                               │
                    │                                          │
                    │  scheduler (Ofelia) ── cron triggers     │
                    └─────────────────────────────────────────┘
 ```
 Four services:
 - **db** — PostgreSQL 16 (Alpine), stores all data
 - **collector** — Python async service that fetches posts and mentions from Bluesky
 - **scheduler** — [Ofelia](https://github.com/mcuadros/ofelia) cron that triggers collection (every 4h) and analysis (every 4h + 30min offset)
 - **web** — Flask + Gunicorn dashboard on port 5001
 ## Quick Start
 ```bash
 # 1. Copy and edit your environment config
 cp .env.example .env
 # Fill in: BSKY_HANDLE, BSKY_APP_PASSWORD, OPENAI_API_KEY
 # 2. Add your target accounts to config/accounts.yml
 # 3. Start everything
 docker compose up -d
 # 4. Run the toxicity schema migration
 docker compose exec -T db psql -U bluesky -d bluesky < scripts/02-toxicity.sql
 # 5. Trigger an immediate first collection
 docker compose exec collector python -m src
 # 6. Run a test toxicity analysis (100 posts)
 docker compose exec -e ANALYZER_LIMIT=100 collector python -m src.analyzer
 # 7. Open the dashboard
 open http://localhost:5001
 ```
 ## Collection
 ### What Gets Collected
 | Source | API Endpoint | Stored In |
 |--------|-------------|-----------|
 | User's own posts & replies | `getAuthorFeed` (public) | `posts` table |
 | Posts mentioning a user | `searchPosts` (requires auth) | `mentions` table |
 All records include a `raw_json` JSONB column with the full API response for future-proof analysis.
 ### How It Works
 - **Scheduled polling** via Ofelia — runs every 4 hours by default
 - **Incremental collection** — only fetches posts newer than the last run
 - **Rate limit aware** — reads API response headers and sleeps when approaching limits
 - **Deduplication** — posts are upserted by URI; engagement counts are refreshed on re-encounters
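The deduplication step above can be sketched as an upsert keyed on `uri`. This is a minimal illustration rather than the collector's actual code: it assumes Python's built-in `sqlite3` as a stand-in for PostgreSQL (both accept `INSERT ... ON CONFLICT (uri) DO UPDATE`), a cut-down `posts` table, and a hypothetical `upsert_post` helper.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (uri TEXT PRIMARY KEY, text TEXT, like_count INTEGER DEFAULT 0)"
)


def upsert_post(conn: sqlite3.Connection, uri: str, text: str, like_count: int) -> None:
    """Hypothetical helper: insert a post, or refresh its like count if the URI exists."""
    conn.execute(
        """
        INSERT INTO posts (uri, text, like_count)
        VALUES (?, ?, ?)
        ON CONFLICT (uri) DO UPDATE SET like_count = excluded.like_count
        """,
        (uri, text, like_count),
    )


# The same post seen in two collection runs: still one row, with the newer count.
upsert_post(conn, "at://did:example/app.bsky.feed.post/1", "hello", 3)
upsert_post(conn, "at://did:example/app.bsky.feed.post/1", "hello", 7)
rows = conn.execute("SELECT uri, like_count FROM posts").fetchall()
```

Keying on the URI means a re-encountered post never creates a second row, while its engagement counters track the latest API response.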
 ## Toxicity Analysis
 The analyzer classifies every post and mention using OpenAI's GPT-4.1-nano, scoring content on 12 categories from 0.0 (absent) to 1.0 (extreme):
 | Category | What it detects |
 |----------|----------------|
 | `toxic` | Rude, disrespectful, or aggressive language |
 | `threat` | Violence, harm, intimidation, calls to action |
 | `hate_speech` | Targeting any protected characteristic |
 | `racism` | Race/ethnicity-based hostility |
 | `antisemitism` | Anti-Jewish hate, conspiracy theories, coded language |
 | `islamophobia` | Anti-Muslim hate, "omvolking" narratives |
 | `sexism` | Gender-based discrimination or harassment |
 | `homophobia` | Anti-LGBTQ+ rhetoric |
 | `insult` | Personal attacks, name-calling |
 | `dehumanization` | Comparing people to animals, vermin, disease |
 | `extremism` | Far-right/left rhetoric, Nazi glorification, Great Replacement |
 | `ableism` | Disability-targeting language, mental health slurs |
 The prompt is tuned for Dutch political discourse, recognizing coded terms like "gelukszoekers", "kutmarokkanen", "landverrader", "linkse ratten", etc. Political disagreement and criticism are not scored as toxic — only genuine hostility, hate, and threats.
 ### Batch Processing
 Posts are sent to the API in batches (default 10 per call) to minimize cost and API calls. The ~500-token system prompt is sent once per batch instead of once per post, cutting input token cost by ~60%.
 | | 1 post/call | 10 posts/call (default) |
 |---|---|---|
 | API calls for 60K posts | 60,000 | 6,000 |
 | Estimated cost | ~$5.10 | ~$2.40 |
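The call counts in the table follow from simple arithmetic, sketched below. The helper `prompt_overhead` is hypothetical, and it counts only the roughly 500 system-prompt tokens per call; post text and output tokens are ignored, so it does not reproduce the dollar estimates.

```python
import math

SYSTEM_PROMPT_TOKENS = 500  # approximate prompt size quoted above
TOTAL_POSTS = 60_000


def prompt_overhead(total_posts: int, batch_size: int) -> tuple[int, int]:
    """Return (api_calls, system_prompt_tokens) for a given batch size."""
    calls = math.ceil(total_posts / batch_size)
    return calls, calls * SYSTEM_PROMPT_TOKENS


single = prompt_overhead(TOTAL_POSTS, 1)
batched = prompt_overhead(TOTAL_POSTS, 10)
print(single)   # (60000, 30000000)
print(batched)  # (6000, 3000000)
```

Under these assumptions, batching ten posts per call removes nine tenths of the repeated system-prompt tokens.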
 ### Running the Analyzer
 ```bash
 # Test run (100 posts)
 docker compose exec -e ANALYZER_LIMIT=100 collector python -m src.analyzer
 # Full run (all unscored posts)
 docker compose exec collector python -m src.analyzer
 # Check logs
 docker compose logs collector | grep analyzer
 cat logs/analyzer.log
 ```
 The scheduled cron runs the analyzer automatically every 4 hours (30 minutes after each collection), so new posts are scored without manual intervention.
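The 30-minute offset can be checked by listing the times of day each job fires. This sketch assumes the six-field schedule strings in `docker-compose.yml` (`0 0 */4 * * *` and `0 30 */4 * * *`) are read as second, minute, hour, then the date fields; `daily_fire_times` is a hypothetical helper, not part of Ofelia.

```python
from datetime import time


def daily_fire_times(minute: int, hour_step: int) -> list[time]:
    """Times of day matched by a `0 <minute> */<hour_step> * * *` schedule."""
    return [time(hour=h, minute=minute) for h in range(0, 24, hour_step)]


collect = daily_fire_times(minute=0, hour_step=4)
analyze = daily_fire_times(minute=30, hour_step=4)

# Each analysis run is paired with the collection run that precedes it.
for c, a in zip(collect, analyze):
    print(f"collect {c:%H:%M} -> analyze {a:%H:%M}")
```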
 ## Web Dashboard
 Access at `http://localhost:5001` (or your configured `WEB_PORT`).
 ### Pages
 - **Dashboard** — Overview of collection runs, account count, post/mention totals
 - **Accounts** — List of tracked accounts with post counts and last activity
 - **Statuses** — Browse all collected posts with filters and search
 - **Mentions** — Browse mentions of tracked accounts
 - **Analysis** — Toxicity overview: trend charts, category breakdown, recent analysis runs
 - **Flagged Content** — Posts scoring above the flag threshold (default 0.5), filterable by category and type
 - **Account Toxicity** — Per-account toxicity breakdown with comparative charts
 - **Export** — Download data as CSV
 ## Configuration
 ### accounts.yml
 ```yaml
 accounts:
  - handle: alice.bsky.social
  - handle: bob.bsky.social
  - handle: some-org.bsky.social
 ```
 ### Environment Variables
 #### Collection
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `POSTGRES_PASSWORD` | `changeme` | Database password |
 | `POSTGRES_PORT` | `5432` | Exposed PostgreSQL port |
 | `LOG_LEVEL` | `INFO` | Python log level |
 | `MAX_PAGES_PER_ACCOUNT` | `50` | Max API pages per account per run (50 pages = 5000 posts) |
 | `MENTION_LOOKBACK_HOURS` | `12` | How far back to search mentions on first run |
 | `BSKY_HANDLE` | — | Your Bluesky handle (required for mention search) |
 | `BSKY_APP_PASSWORD` | — | App password from Settings > App Passwords |
 #### Toxicity Analysis
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `OPENAI_API_KEY` | — | OpenAI API key (required) |
 | `ANALYZER_MODEL` | `gpt-4.1-nano` | OpenAI model for classification |
 | `ANALYZER_CONCURRENCY` | `3` | Max concurrent API calls (batches in flight) |
 | `ANALYZER_BATCH_SIZE` | `10` | Posts per API call |
 | `ANALYZER_LIMIT` | `0` | Max posts to process per run (0 = all) |
 | `ANALYZER_FLAG_THRESHOLD` | `0.5` | Score above which a post is flagged |
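How the batch size, concurrency, and flag threshold interact can be sketched with `asyncio`. This is an illustration under stated assumptions, not the analyzer itself: `fake_classify` is a hypothetical stand-in for the OpenAI call, and flagging is assumed to mean a single score strictly above the threshold.

```python
import asyncio

BATCH_SIZE = 10       # ANALYZER_BATCH_SIZE
CONCURRENCY = 3       # ANALYZER_CONCURRENCY
FLAG_THRESHOLD = 0.5  # ANALYZER_FLAG_THRESHOLD


def make_batches(items: list, batch_size: int) -> list[list]:
    """Split a flat list into sublists of at most batch_size."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


async def fake_classify(batch: list[str]) -> list[float]:
    """Hypothetical stand-in for the API call: all-caps text scores 1.0, the rest 0.0."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [1.0 if text.isupper() else 0.0 for text in batch]


async def run(posts: list[str]) -> list[tuple[str, float, bool]]:
    semaphore = asyncio.Semaphore(CONCURRENCY)  # caps the number of batches in flight

    async def process(batch: list[str]) -> list[tuple[str, float, bool]]:
        async with semaphore:
            scores = await fake_classify(batch)
        return [(text, score, score > FLAG_THRESHOLD) for text, score in zip(batch, scores)]

    results = await asyncio.gather(*(process(b) for b in make_batches(posts, BATCH_SIZE)))
    return [row for batch_rows in results for row in batch_rows]


posts = [f"post {i}" for i in range(24)] + ["ANGRY POST"]
scored = asyncio.run(run(posts))
flagged = [text for text, _, is_flagged in scored if is_flagged]
```

With 25 posts and a batch size of 10 this produces three batches, and the semaphore lets at most three of them wait on the classifier at once.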
 #### Web UI
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `WEB_PORT` | `5001` | Exposed web dashboard port |
 ## Database Schema
 ### Key Tables
 - **`accounts`** — Tracked accounts (DID, handle, collection timestamps)
 - **`posts`** — Posts from tracked accounts (text, timestamps, engagement counts, post type, raw JSON)
 - **`mentions`** — Posts from anyone that mention a tracked account
 - **`collection_runs`** — Audit trail of each collection run (timing, counts, errors)
 - **`collection_state`** — Per-account bookmarks for incremental collection
 - **`toxicity_scores`** — Per-post scores across all 12 categories + overall + flagged
 - **`mention_toxicity_scores`** — Same structure for mentions
 - **`analysis_runs`** — Audit trail of analyzer runs (timing, counts, cost, errors)
 ### Useful Queries
 ```sql
 -- Recent posts by a specific account
 SELECT created_at, post_type, text, like_count
 FROM posts
 WHERE author_did = (SELECT did FROM accounts WHERE handle = 'alice.bsky.social')
 ORDER BY created_at DESC LIMIT 20;
 -- All mentions of a tracked account
 SELECT m.post_text, m.post_created_at, m.mentioning_did
 FROM mentions m
 JOIN accounts a ON a.did = m.mentioned_did
 WHERE a.handle = 'alice.bsky.social'
 ORDER BY m.post_created_at DESC;
 -- Most toxic posts (overall score)
 SELECT p.text, t.overall, t.toxic, t.threat, t.hate_speech, t.racism
 FROM toxicity_scores t
 JOIN posts p ON p.uri = t.uri
 WHERE t.flagged = true
 ORDER BY t.overall DESC LIMIT 20;
 -- Toxicity by account
 SELECT a.handle, avg(t.overall) AS avg_toxicity, count(*) AS scored_posts
 FROM toxicity_scores t
 JOIN posts p ON p.uri = t.uri
 JOIN accounts a ON a.did = p.author_did
 GROUP BY a.handle
 ORDER BY avg_toxicity DESC;
 -- Analysis run history
 SELECT id, started_at, status, posts_scored, mentions_scored, cost_usd
 FROM analysis_runs ORDER BY started_at DESC LIMIT 10;
 -- Collection run history
 SELECT id, started_at, status, posts_collected, mentions_collected, duration_secs
 FROM collection_runs ORDER BY started_at DESC LIMIT 10;
 ```
 ## Operations
 ### Manual Runs
 ```bash
 # Collect posts
 docker compose exec collector python -m src
 # Run toxicity analysis
 docker compose exec collector python -m src.analyzer
 ```
 ### Monitoring
 ```bash
 # Follow logs
 docker compose logs -f collector
 # Quick data counts
 docker compose exec -T db psql -U bluesky -d bluesky -c \
  "SELECT (SELECT count(*) FROM posts) AS posts, (SELECT count(*) FROM mentions) AS mentions, (SELECT count(*) FROM toxicity_scores) AS scored;"
 # Last analysis run
 docker compose exec -T db psql -U bluesky -d bluesky -c \
  "SELECT id, started_at, status, posts_scored, mentions_scored, cost_usd FROM analysis_runs ORDER BY started_at DESC LIMIT 5;"
 ```
 ### Rebuilding After Code Changes
 ```bash
 docker compose build collector web
 docker compose up -d
 ```
 ### Add/Remove Accounts
 Edit `config/accounts.yml` — changes take effect on the next scheduled or manual run. Removed accounts are marked inactive but their data is preserved.
 ### First Run / Backfill
 The first run pages back up to `MAX_PAGES_PER_ACCOUNT` pages (default 50, about 5,000 posts). For a deeper backfill, temporarily increase this value:
 ```bash
 docker compose exec -e MAX_PAGES_PER_ACCOUNT=200 collector python -m src
 ```
 ### Backup
 The `pgdata` volume persists across container restarts. Back it up with standard PostgreSQL tools:
 ```bash
 docker compose exec -T db pg_dump -U bluesky bluesky > backup.sql
 ```
--- a/config/accounts.yml
+++ b/config/accounts.yml
@ -0,0 +1,171 @@
 # Bluesky accounts to track
 # Source: Bluesky.ods (Dutch politicians, parties, parodies)
 # Total accounts: see below
 accounts:
  # ── Politicians ────────────────────────────────────────
  - handle: franstimmermans.groenlinkspvda.nl  # Frans Timmermans (GroenLinks-PvdA)
  - handle: lisawesterveld.bsky.social  # Lisa Westerveld (GroenLinks-PvdA)
  - handle: laurensdassen.voltnederland.org  # Laurens Dassen (Volt)
  - handle: femkehalsema.bsky.social  # Femke Halsema (GroenLinks)
  - handle: estherouwehand.bsky.social  # Esther Ouwehand (Partij voor de Dieren)
  - handle: laurabromet.bsky.social  # Laura Bromet (GroenLinks-PvdA)
  - handle: henri.cda.nl  # Henri Bontenbal (CDA)
  - handle: sylvanasimons.bsky.social  # Sylvana Simons (BIJ1)
  - handle: jesseklaver.groenlinkspvda.nl  # Jesse Klaver (GroenLinks-PvdA)
  - handle: robjetten.bsky.social  # Rob Jetten (D66)
  - handle: derk.cda.nl  # Derk Boswijk (CDA)
  - handle: marjoleinmoorman.bsky.social  # Marjolein Moorman (GroenLinks-PvdA)
  - handle: mariekekoekkoek.voltnederland.org  # Marieke Koekkoek (Volt)
  - handle: esmahlahlah.bsky.social  # Esmah Lahlah (GroenLinks-PvdA)
  - handle: habtamudehoop.bsky.social  # Habtamu de Hoop (GroenLinks-PvdA)
  - handle: jimmydijk.sp.nl  # Jimmy Dijk (SP)
  - handle: ineskostic.bsky.social  # Ines Kostic (Partij voor de Dieren)
  - handle: suzannekroger.bsky.social  # Suzanne Kroger (GroenLinks-PvdA)
  - handle: tomvanderlee.bsky.social  # Tom van der Lee (GroenLinks-PvdA)
  - handle: barbarakathmann.bsky.social  # Barbara Kathmann (GroenLinks-PvdA)
  - handle: fweisglas.bsky.social  # Frans Weisglas (VVD)
  - handle: rubenbrekelmans.bsky.social  # Ruben Brekelmans (VVD)
  - handle: katipiri.bsky.social  # Kati Piri (GroenLinks-PvdA)
  - handle: jpaternotte.bsky.social  # Jan Paternotte (D66)
  - handle: omtzigt.bsky.social  # Pieter Omtzigt (NSC)
  - handle: erikpjverweij.bsky.social  # Erik Verweij (VVD)
  - handle: kimvsparrentak.bsky.social  # Kim van Sparrentak (GroenLinks-PvdA)
  - handle: mirjambikker.bsky.social  # Mirjam Bikker (ChristenUnie)
  - handle: daniellehirsch.bsky.social  # Danielle Hirsch (GroenLinks-PvdA)
  - handle: lucstultiens.bsky.social  # Luc Stultiens (GroenLinks-PvdA)
  - handle: annakrijger.bsky.social  # Anna Krijger (Partij voor de Dieren)
  - handle: christinepvdd.bsky.social  # Christine Teunissen (Partij voor de Dieren)
  - handle: annemarijke.bsky.social  # Anne-Marijke Podt (D66)
  - handle: ticiaverveer.bsky.social  # Ticia Verveer (VVD)
  - handle: jelgerg.bsky.social  # Jelger Groeneveld (D66)
  - handle: annejessicalise.bsky.social  # An-Jes Oudshoorn (D66)
  - handle: bartgroothuis.bsky.social  # Bart Groothuis (VVD)
  - handle: sneller.bsky.social  # Joost Sneller (D66)
  - handle: pietergrinwis.bsky.social  # Pieter Grinwis (ChristenUnie)
  - handle: julianbushoff.bsky.social  # Julian Bushoff (GroenLinks-PvdA)
  - handle: daanroovers.bsky.social  # Daan Roovers (GroenLinks-PvdA)
  - handle: dijkhoff.bsky.social  # Klaas Dijkhoff (VVD)
  - handle: patijn.bsky.social  # Mariette Patijn (GroenLinks-PvdA)
  - handle: mikaltseggai.bsky.social  # Mikal Tseggai (GroenLinks-PvdA)
  - handle: momohandis.bsky.social  # Mo Mohandis (GroenLinks-PvdA)
  - handle: javijlbrief.bsky.social  # Hans Vijlbrief (D66)
  - handle: maritmaij.bsky.social  # Marit Maij (GroenLinks-PvdA)
  - handle: robertdenhaag.bsky.social  # Robert van Asten (D66)
  - handle: ilanarooderkerk.bsky.social  # Ilana Rooderkerk (D66)
  - handle: casparvandenberg.bsky.social  # Caspar van den Berg (Onafhankelijk)
  - handle: metmarleen.bsky.social  # Marleen Haage (GroenLinks-PvdA)
  - handle: gerbrandy.bsky.social  # Gerben-Jan Gerbrandy (D66)
  - handle: nvanvroonhoven.bsky.social  # Nicolien van Vroonhoven (NSC)
  - handle: janschoonis.bsky.social  # Jan Schoonis (D66)
  - handle: lisaginneken.bsky.social  # Lisa van Ginneken (D66)
  - handle: leoniegerritsen.bsky.social  # Leonie Gerritsen (Partij voor de Dieren)
  - handle: maikel.lukkezen.name  # Maikel Lukkezen (D66)
  - handle: paulblom.nl  # Paul Blom (Partij voor de Dieren)
  - handle: gertjansegers.bsky.social  # Gert-Jan Segers (ChristenUnie)
  - handle: fatihyaabdi.bsky.social  # Fatihya Abdi (GroenLinks-PvdA)
  - handle: sandrabeckerman.bsky.social  # Sandra Beckerman (SP)
  - handle: johannesprakken.bsky.social  # Johannes Prakken (D66)
  - handle: raquelgarciaher.bsky.social  # Raquel Garcia Hermida-vdWalle (D66)
  - handle: ericholterhues.bsky.social  # Eric Holterhues (ChristenUnie)
  - handle: arendkisteman.bsky.social  # Arend Kisteman (VVD)
  - handle: jessesixdijkstra.bsky.social  # Jesse Six Dijkstra (NSC)
  - handle: mariannethieme.bsky.social  # Marianne Thieme (Partij voor de Dieren)
  - handle: andrepoortman.bsky.social  # Andre Poortman (CDA)
  - handle: wiekepaulusma.bsky.social  # Wieke Paulusma (D66)
  - handle: svanoosterhout.bsky.social  # Sjoukje van Oosterhout (GroenLinks-PvdA)
  - handle: danielle-jansen.bsky.social  # Danielle Jansen (NSC)
  - handle: hinddekker.bsky.social  # Hind Dekker-Abdulaziz (D66)
  - handle: dogukanergin.bsky.social  # Dogukan Ergin (DENK)
  - handle: gabyperingopie.voltnederland.org  # Gaby Perin-Gopie (Volt)
  - handle: lisavliegenthart.bsky.social  # Lisa Vliegenthart (GroenLinks-PvdA)
  - handle: tiemenjan.bsky.social  # Tiemen Jan van Dijk (VVD)
  - handle: jerkesetz.bsky.social  # Jerke Setz (ChristenUnie)
  - handle: songulmutluer.bsky.social  # Songul Mutluer (GroenLinks-PvdA)
  - handle: faridazarkan.bsky.social  # Farid Azarkan (DENK)
  - handle: dirkgotink.bsky.social  # Dirk Gotink (NSC)
  - handle: liesvanaelst.bsky.social  # Lies van Aelst (SP)
  - handle: martijnbuijsse.bsky.social  # Martijn Buijsse (VVD)
  - handle: sandra-alberts.bsky.social  # Sandra Alberts (VVD)
  - handle: rikvanwijk.bsky.social  # Rik van Wijk (D66)
  - handle: evertbob.bsky.social  # Evert Bobeldijk (D66)
  - handle: frankwiertz.bsky.social  # Frank Wiertz (D66)
  - handle: andrewvanesch.bsky.social  # Andrew van Esch (D66)
  - handle: jantinezwinkels.bsky.social  # Jantine Zwinkels (CDA)
  - handle: kuneburgers.bsky.social  # Kune Burgers (VVD)
  - handle: pepijnpi.bsky.social  # Pepijn Pi Van de Venne (D66)
  - handle: chris10govaert.bsky.social  # Christine Govaert (BBB)
  - handle: marietbosman62.bsky.social  # Mariet Bosman (BBB)
  - handle: natalienauta.bsky.social  # Natalie Nauta (BBB)
  - handle: nielsoosterom.bsky.social  # Niels Oosterom (BBB)
  - handle: stephanvanbaarle.bsky.social  # Stephan van Baarle (DENK)
  - handle: ananninga.bsky.social  # Annabel Nanninga (JA21)
  - handle: djhvandijk.bsky.social  # Diederik van Dijk (SGP)
  - handle: carladikfaber.bsky.social  # Carla Dik-Faber (ChristenUnie)
  - handle: willemrutjens.bsky.social  # Willem Rutjens (JA21)
  - handle: petraverdonk.bsky.social  # Petra Verdonk (GroenLinks-PvdA)
  - handle: rnbarker.bsky.social  # Robert Barker (Partij voor de Dieren)
  - handle: bertnederveen.bsky.social  # Bert Nederveen (ChristenUnie)
  - handle: kirstenalblas.bsky.social  # Kirsten Alblas (ChristenUnie)
  - handle: benbloem.bsky.social  # Ben Bloem (ChristenUnie)
  # ── Party & Organization Accounts ─────────────────────
  - handle: partijvoordedieren.nl  # Partij voor de Dieren (Partij voor de Dieren)
  - handle: d66.nl  # D66 (D66)
  - handle: vvdonline.bsky.social  # VVD (VVD)
  - handle: christenunie.bsky.social  # ChristenUnie (ChristenUnie)
  - handle: spnederland.bsky.social  # SP (SP)
  - handle: cda.nl  # CDA (CDA)
  - handle: pvdddenhaag.bsky.social  # PvdD Den Haag (Partij voor de Dieren)
  - handle: partijnsc.bsky.social  # NSC (NSC)
  - handle: pvdd-eerstekamer.bsky.social  # PvdD Eerste Kamer (Partij voor de Dieren)
  - handle: pvddeindhoven.bsky.social  # PvdD Eindhoven (Partij voor de Dieren)
  - handle: vvdeuropa.bsky.social  # VVD Europa (VVD)
  - handle: nieuwevvders.bsky.social  # Nieuwe VVD'ers (VVD)
  - handle: pvddfryslan.bsky.social  # PvdD Fryslan (Partij voor de Dieren)
  - handle: perspectief.bsky.social  # PerspectieF (CU jongeren) (ChristenUnie)
  - handle: pvddamsterdam.bsky.social  # PvdD Amsterdam (Partij voor de Dieren)
  - handle: ngpfoundation.bsky.social  # NGP Foundation (PvdD) (Partij voor de Dieren)
  - handle: vvd-overijssel.bsky.social  # VVD Overijssel (VVD)
  - handle: almeresp.bsky.social  # SP Almere (SP)
  - handle: d66vught.bsky.social  # D66 Vught (D66)
  - handle: boerburgerbeweging.bsky.social  # BoerBurgerBeweging (BBB)
  - handle: d66voorschoten.bsky.social  # D66 Voorschoten (D66)
  - handle: sp033.bsky.social  # SP Amersfoort (SP)
  - handle: d66onderwijs.bsky.social  # D66 Onderwijs & Wetenschap (D66)
  - handle: sp-eerstekamer.bsky.social  # SP Eerste Kamerfractie (SP)
  - handle: nsc-limburg.bsky.social  # NSC Limburg (NSC)
  - handle: cda-rotterdam.bsky.social  # CDA Rotterdam (CDA)
  - handle: vvdzaltbommel.bsky.social  # VVD Zaltbommel (VVD)
  - handle: forumvdemocratie.bsky.social  # FvD (niet officieel) (FvD)
  - handle: spamsterdam.bsky.social  # SP Amsterdam (SP)
  - handle: d66leudal.bsky.social  # D66 Leudal (D66)
  - handle: bbbzeeland.bsky.social  # BBB Zeeland (BBB)
  - handle: d66brabant.bsky.social  # D66 Brabant (D66)
  - handle: cda-sd.bsky.social  # CDA Schouwen-Duiveland (CDA)
  - handle: spflevoland.bsky.social  # SP Flevoland (SP)
  - handle: d66hoogeveen.bsky.social  # D66 Hoogeveen (D66)
  - handle: vvdcastricum.bsky.social  # VVD Castricum (VVD)
  - handle: groenlinks-pvda.bsky.social  # GroenLinks-PvdA (GroenLinks-PvdA)
  - handle: voltnederland.org  # Volt Nederland (Volt)
  - handle: social.bij1.org  # BIJ1 (BIJ1)
  - handle: juisteantwoord.bsky.social  # JA21 (JA21)
  - handle: sgpnieuws.bsky.social  # SGP (SGP)
  - handle: charge-volt.bsky.social  # Charge (Volt wetensch.) (Volt)
  - handle: spgelderland.bsky.social  # SP Gelderland (SP)
  - handle: spnoordholland.bsky.social  # SP Noord-Holland (SP)
  - handle: ja21overijssel.bsky.social  # JA21 Overijssel (JA21)
  - handle: groenlinkspvda072.bsky.social  # GL-PvdA Alkmaar (GroenLinks-PvdA)
  - handle: groenlinksassen.nl  # GL Assen (GroenLinks-PvdA)
  - handle: denhaagbij1.bsky.social  # Den Haag BIJ1 (BIJ1)
  - handle: volthouten.bsky.social  # Volt Houten (Volt)
  # ── Parodies & Impersonations ─────────────────────────
  - handle: pvvfaber.bsky.social  # Marjolein Faber (parodie)
  - handle: geertwiiders.bsky.social  # Geert Wiiders (parodie)
  - handle: partijvddieren.bsky.social  # PvdD (impersonation)
  - handle: forumvdemocratie.bsky.social  # FvD (niet officieel)
  - handle: nwsoccontract.bsky.social  # NSC (name squatter)
  - handle: pieter-omtzigt.bsky.social  # Pieter Omtzigt (2nd)
  - handle: dilans-geweten.bsky.social  # Stelt Dilan landsbelang al boven partij?
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -0,0 +1,72 @@
 services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: bluesky
      POSTGRES_USER: ${POSTGRES_USER:-bluesky}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-changeme}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./scripts/init.sql:/docker-entrypoint-initdb.d/01-init.sql
    ports:
      - "${POSTGRES_PORT:-5432}:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-bluesky}"]
      interval: 5s
      retries: 5
    restart: unless-stopped
  collector:
    build: .
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER:-bluesky}:${POSTGRES_PASSWORD:-changeme}@db:5432/bluesky
      BSKY_PUBLIC_API: https://public.api.bsky.app
      LOG_LEVEL: ${LOG_LEVEL:-INFO}
      MAX_PAGES_PER_ACCOUNT: ${MAX_PAGES_PER_ACCOUNT:-50}
      MENTION_LOOKBACK_HOURS: ${MENTION_LOOKBACK_HOURS:-12}
      BSKY_HANDLE: ${BSKY_HANDLE:-}
      BSKY_APP_PASSWORD: ${BSKY_APP_PASSWORD:-}
      OPENAI_API_KEY: ${OPENAI_API_KEY:-}
      ANALYZER_MODEL: ${ANALYZER_MODEL:-gpt-4.1-nano}
      ANALYZER_CONCURRENCY: ${ANALYZER_CONCURRENCY:-3}
      ANALYZER_BATCH_SIZE: ${ANALYZER_BATCH_SIZE:-10}
       ANALYZER_LIMIT: ${ANALYZER_LIMIT:-0}
       ANALYZER_FLAG_THRESHOLD: ${ANALYZER_FLAG_THRESHOLD:-0.5}
    volumes:
      - ./config:/app/config:ro
      - ./logs:/app/logs
      - ./scripts:/app/scripts:ro
    labels:
      ofelia.enabled: "true"
      ofelia.job-exec.collect.schedule: "0 0 */4 * * *"
      ofelia.job-exec.collect.command: "python -m src"
      ofelia.job-exec.analyze.schedule: "0 30 */4 * * *"
      ofelia.job-exec.analyze.command: "python -m src.analyzer"
    restart: unless-stopped
  scheduler:
    image: mcuadros/ofelia:latest
    depends_on:
      - collector
    command: daemon --docker
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    restart: unless-stopped
  web:
    build:
      context: .
      dockerfile: web.Dockerfile
    depends_on:
      db:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER:-bluesky}:${POSTGRES_PASSWORD:-changeme}@db:5432/bluesky
    ports:
      - "${WEB_PORT:-5001}:5001"
    restart: unless-stopped
 volumes:
  pgdata:
--- a/requirements-web.txt
+++ b/requirements-web.txt
@ -0,0 +1,3 @@
 flask>=3.0
 gunicorn>=22.0
 psycopg2-binary>=2.9
--- a/requirements.txt
+++ b/requirements.txt
@ -0,0 +1,6 @@
 atproto>=0.0.55
 asyncpg>=0.30.0
 pyyaml>=6.0
 tenacity>=8.2
 httpx>=0.27.0
 openai>=1.60.0
--- a/scripts/02-toxicity.sql
+++ b/scripts/02-toxicity.sql
@ -0,0 +1,65 @@
 -- Toxicity Analysis Schema
 -- Stores per-post and per-mention toxicity scores from LLM classification.
 -- Toxicity scores for posts (from tracked accounts' feeds)
 CREATE TABLE IF NOT EXISTS toxicity_scores (
    uri             TEXT PRIMARY KEY REFERENCES posts(uri) ON DELETE CASCADE,
    overall         REAL NOT NULL,
    toxic           REAL NOT NULL DEFAULT 0,
    threat          REAL NOT NULL DEFAULT 0,
    hate_speech     REAL NOT NULL DEFAULT 0,
    racism          REAL NOT NULL DEFAULT 0,
    antisemitism    REAL NOT NULL DEFAULT 0,
    islamophobia    REAL NOT NULL DEFAULT 0,
    sexism          REAL NOT NULL DEFAULT 0,
    homophobia      REAL NOT NULL DEFAULT 0,
    insult          REAL NOT NULL DEFAULT 0,
    dehumanization  REAL NOT NULL DEFAULT 0,
    extremism       REAL NOT NULL DEFAULT 0,
    ableism         REAL NOT NULL DEFAULT 0,
    flagged         BOOLEAN NOT NULL DEFAULT false,
    model           TEXT NOT NULL DEFAULT 'gpt-4.1-nano',
    scored_at       TIMESTAMPTZ NOT NULL DEFAULT now()
 );
 CREATE INDEX IF NOT EXISTS idx_tox_flagged  ON toxicity_scores (flagged) WHERE flagged = true;
 CREATE INDEX IF NOT EXISTS idx_tox_overall  ON toxicity_scores (overall DESC);
 CREATE INDEX IF NOT EXISTS idx_tox_scored   ON toxicity_scores (scored_at DESC);
 -- Toxicity scores for mentions (posts about tracked accounts)
 CREATE TABLE IF NOT EXISTS mention_toxicity_scores (
    mention_id      BIGINT PRIMARY KEY REFERENCES mentions(id) ON DELETE CASCADE,
    overall         REAL NOT NULL,
    toxic           REAL NOT NULL DEFAULT 0,
    threat          REAL NOT NULL DEFAULT 0,
    hate_speech     REAL NOT NULL DEFAULT 0,
    racism          REAL NOT NULL DEFAULT 0,
    antisemitism    REAL NOT NULL DEFAULT 0,
    islamophobia    REAL NOT NULL DEFAULT 0,
    sexism          REAL NOT NULL DEFAULT 0,
    homophobia      REAL NOT NULL DEFAULT 0,
    insult          REAL NOT NULL DEFAULT 0,
    dehumanization  REAL NOT NULL DEFAULT 0,
    extremism       REAL NOT NULL DEFAULT 0,
    ableism         REAL NOT NULL DEFAULT 0,
    flagged         BOOLEAN NOT NULL DEFAULT false,
    model           TEXT NOT NULL DEFAULT 'gpt-4.1-nano',
    scored_at       TIMESTAMPTZ NOT NULL DEFAULT now()
 );
 CREATE INDEX IF NOT EXISTS idx_mtox_flagged ON mention_toxicity_scores (flagged) WHERE flagged = true;
 CREATE INDEX IF NOT EXISTS idx_mtox_overall ON mention_toxicity_scores (overall DESC);
 -- Analysis run audit trail
 CREATE TABLE IF NOT EXISTS analysis_runs (
    id              BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    started_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    finished_at     TIMESTAMPTZ,
    status          TEXT NOT NULL DEFAULT 'running',   -- running | completed | failed | partial
    posts_scored    INTEGER NOT NULL DEFAULT 0,
    mentions_scored INTEGER NOT NULL DEFAULT 0,
    errors          INTEGER NOT NULL DEFAULT 0,
    model           TEXT NOT NULL,
    cost_usd        NUMERIC(10,6) DEFAULT 0,
    duration_secs   NUMERIC
 );
--- a/scripts/init.sql
+++ b/scripts/init.sql
@ -0,0 +1,86 @@
 -- Bluesky Collector Schema
 -- Tracks accounts, their posts/replies, and mentions from other users.
 -- Tracked accounts
 CREATE TABLE accounts (
    did                     TEXT PRIMARY KEY,
    handle                  TEXT NOT NULL,
    display_name            TEXT,
    added_at                TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_feed_collected     TIMESTAMPTZ,
    last_mention_collected  TIMESTAMPTZ,
    active                  BOOLEAN NOT NULL DEFAULT true
 );
 CREATE UNIQUE INDEX idx_accounts_handle ON accounts (handle);
 -- Collected posts (from tracked accounts' feeds)
 CREATE TABLE posts (
    uri             TEXT PRIMARY KEY,
    cid             TEXT NOT NULL,
    author_did      TEXT NOT NULL,
    text            TEXT,
    created_at      TIMESTAMPTZ,
    indexed_at      TIMESTAMPTZ,
    collected_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    reply_parent    TEXT,
    reply_root      TEXT,
    post_type       TEXT NOT NULL DEFAULT 'post',   -- post | reply | repost
    has_media       BOOLEAN DEFAULT false,
    has_embed       BOOLEAN DEFAULT false,
    like_count      INTEGER DEFAULT 0,
    reply_count     INTEGER DEFAULT 0,
    repost_count    INTEGER DEFAULT 0,
    quote_count     INTEGER DEFAULT 0,
    langs           TEXT[],
    raw_json        JSONB NOT NULL
 );
 CREATE INDEX idx_posts_author      ON posts (author_did);
 CREATE INDEX idx_posts_created     ON posts (created_at DESC);
 CREATE INDEX idx_posts_type        ON posts (post_type);
 CREATE INDEX idx_posts_collected   ON posts (collected_at DESC);
 CREATE INDEX idx_posts_reply_root  ON posts (reply_root) WHERE reply_root IS NOT NULL;
 -- Mentions: posts from *anyone* that mention a tracked account
 CREATE TABLE mentions (
    id              BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    post_uri        TEXT NOT NULL,
    mentioned_did   TEXT NOT NULL,
    mentioning_did  TEXT,
    post_text       TEXT,
    post_created_at TIMESTAMPTZ,
    collected_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    raw_json        JSONB NOT NULL,
    UNIQUE (post_uri, mentioned_did)
 );
 CREATE INDEX idx_mentions_mentioned ON mentions (mentioned_did);
 CREATE INDEX idx_mentions_created   ON mentions (post_created_at DESC);
 -- Collection run audit trail
 CREATE TABLE collection_runs (
    id                  BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    started_at          TIMESTAMPTZ NOT NULL DEFAULT now(),
    finished_at         TIMESTAMPTZ,
    status              TEXT NOT NULL DEFAULT 'running',   -- running | completed | failed | partial
    accounts_total      INTEGER NOT NULL DEFAULT 0,
    accounts_done       INTEGER NOT NULL DEFAULT 0,
    posts_collected     INTEGER NOT NULL DEFAULT 0,
    mentions_collected  INTEGER NOT NULL DEFAULT 0,
    errors              JSONB DEFAULT '[]'::jsonb,
    duration_secs       NUMERIC
 );
 -- Per-account collection bookmark (survives restarts)
 CREATE TABLE collection_state (
    account_did     TEXT NOT NULL,
    collection_type TEXT NOT NULL,       -- feed | mentions
    last_post_at    TIMESTAMPTZ,
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (account_did, collection_type)
 );
--- a/src/__init__.py
+++ b/src/__init__.py
--- a/src/__main__.py
+++ b/src/__main__.py
@ -0,0 +1,4 @@
 """Allow running as: python -m src"""
 from src.collector import main
 main()
--- a/src/analyzer/__init__.py
+++ b/src/analyzer/__init__.py
--- a/src/analyzer/__main__.py
+++ b/src/analyzer/__main__.py
@ -0,0 +1,5 @@
 """Allow running as: python -m src.analyzer"""
 from src.analyzer.analyzer import main
 main()
--- a/src/analyzer/analyzer.py
+++ b/src/analyzer/analyzer.py
@ -0,0 +1,282 @@
 """Main toxicity analysis orchestrator.
 Runs as a one-shot batch process: fetches unscored posts and mentions,
 classifies them in batches with the configured OpenAI model (default:
 gpt-4.1-nano), and stores the scores in PostgreSQL.
 Usage:
    python -m src.analyzer
    ANALYZER_LIMIT=100 python -m src.analyzer   # cap at 100 posts and 100 mentions
 """
 from __future__ import annotations
 import asyncio
 import logging
 import os
 import sys
 import time
 from .classifier import ToxicityClassifier
 from .config import AnalyzerConfig
 from .db import AnalyzerDB
 logger = logging.getLogger("analyzer")
 def make_batches(items: list, batch_size: int) -> list[list]:
    """Split a flat list into sublists of at most batch_size."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]
 async def classify_posts(
    classifier: ToxicityClassifier,
    db: AnalyzerDB,
    posts: list[dict],
    config: AnalyzerConfig,
 ) -> tuple[int, int, float]:
    """Classify posts in batches, with concurrency control.
    Returns (scored_count, error_count, cost_usd).
    """
    semaphore = asyncio.Semaphore(config.concurrency)
    scored = 0
    errors = 0
    total_input_tokens = 0
    total_output_tokens = 0
    batches = make_batches(posts, config.batch_size)
    logger.info("  Split %d posts into %d batches of ≤%d",
                len(posts), len(batches), config.batch_size)
    async def process_batch(batch: list[dict]) -> None:
        nonlocal scored, errors, total_input_tokens, total_output_tokens
        async with semaphore:
            texts = [p["text"] for p in batch]
            try:
                results = await classifier.classify_batch(texts)
                for post, scores in zip(batch, results):
                    try:
                        score_dict = scores.to_dict()
                        flagged = scores.is_flagged(config.flag_threshold)
                        await db.store_post_score(
                            uri=post["uri"],
                            scores=score_dict,
                            flagged=flagged,
                            model=config.model,
                        )
                        total_input_tokens += scores.input_tokens
                        total_output_tokens += scores.output_tokens
                        scored += 1
                    except Exception:
                        errors += 1
                        logger.exception(
                            "Failed to store score for post %s", post["uri"][:80]
                        )
                if scored % 100 < config.batch_size:
                    logger.info("  Posts scored: %d / %d", scored, len(posts))
            except Exception:
                # Whole batch failed (API error after retries) — count all as errors
                errors += len(batch)
                logger.exception(
                    "Failed to classify batch of %d posts", len(batch)
                )
    tasks = [process_batch(b) for b in batches]
    await asyncio.gather(*tasks)
    cost = (
        total_input_tokens * config.input_cost_per_m / 1_000_000
        + total_output_tokens * config.output_cost_per_m / 1_000_000
    )
    return scored, errors, cost
 async def classify_mentions(
    classifier: ToxicityClassifier,
    db: AnalyzerDB,
    mentions: list[dict],
    config: AnalyzerConfig,
 ) -> tuple[int, int, float]:
    """Classify mentions in batches, with concurrency control.
    Returns (scored_count, error_count, cost_usd).
    """
    semaphore = asyncio.Semaphore(config.concurrency)
    scored = 0
    errors = 0
    total_input_tokens = 0
    total_output_tokens = 0
    batches = make_batches(mentions, config.batch_size)
    logger.info("  Split %d mentions into %d batches of ≤%d",
                len(mentions), len(batches), config.batch_size)
    async def process_batch(batch: list[dict]) -> None:
        nonlocal scored, errors, total_input_tokens, total_output_tokens
        async with semaphore:
            texts = [m["post_text"] for m in batch]
            try:
                results = await classifier.classify_batch(texts)
                for mention, scores in zip(batch, results):
                    try:
                        score_dict = scores.to_dict()
                        flagged = scores.is_flagged(config.flag_threshold)
                        await db.store_mention_score(
                            mention_id=mention["id"],
                            scores=score_dict,
                            flagged=flagged,
                            model=config.model,
                        )
                        total_input_tokens += scores.input_tokens
                        total_output_tokens += scores.output_tokens
                        scored += 1
                    except Exception:
                        errors += 1
                        logger.exception(
                            "Failed to store score for mention %d", mention["id"]
                        )
                if scored % 100 < config.batch_size:
                    logger.info("  Mentions scored: %d / %d", scored, len(mentions))
            except Exception:
                errors += len(batch)
                logger.exception(
                    "Failed to classify batch of %d mentions", len(batch)
                )
    tasks = [process_batch(b) for b in batches]
    await asyncio.gather(*tasks)
    cost = (
        total_input_tokens * config.input_cost_per_m / 1_000_000
        + total_output_tokens * config.output_cost_per_m / 1_000_000
    )
    return scored, errors, cost
 async def run() -> None:
    config = AnalyzerConfig.from_env()
    logging.basicConfig(
        level=getattr(logging, config.log_level.upper(), logging.INFO),
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
        handlers=[logging.StreamHandler(sys.stdout)],
    )
    # Also log to file
    log_dir = "/app/logs"
    if os.path.isdir(log_dir):
        fh = logging.FileHandler(os.path.join(log_dir, "analyzer.log"))
        fh.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
        )
        logging.getLogger().addHandler(fh)
    logger.info("=" * 60)
    logger.info("Toxicity Analyzer starting (model: %s, concurrency: %d, batch_size: %d)",
                config.model, config.concurrency, config.batch_size)
    db = AnalyzerDB(config.database_url)
    classifier = ToxicityClassifier(
        api_key=config.openai_api_key,
        model=config.model,
    )
    try:
        await db.connect()
        await db.apply_migration()
        # Start analysis run
        run_id = await db.start_analysis_run(model=config.model)
        start_time = time.time()
        # Fetch unscored items (a limit of 0 or less means "no limit")
        limit = max(config.limit, 0)
        posts = await db.get_unscored_posts(limit=limit)
        mentions = await db.get_unscored_mentions(limit=limit)
        logger.info("Found %d unscored posts, %d unscored mentions",
                     len(posts), len(mentions))
        if not posts and not mentions:
            logger.info("Nothing to score — exiting.")
            await db.finish_analysis_run(
                run_id, status="completed",
                posts_scored=0, mentions_scored=0, errors=0, cost_usd=0.0,
            )
            return
        total_cost = 0.0
        total_errors = 0
        # Phase 1: Classify posts
        if posts:
            logger.info("Phase 1: Classifying %d posts in batches of %d...",
                        len(posts), config.batch_size)
            p_scored, p_errors, p_cost = await classify_posts(
                classifier, db, posts, config,
            )
            logger.info("  Posts done: %d scored, %d errors, $%.4f",
                         p_scored, p_errors, p_cost)
            total_cost += p_cost
            total_errors += p_errors
        else:
            p_scored = 0
        # Phase 2: Classify mentions
        if mentions:
            logger.info("Phase 2: Classifying %d mentions in batches of %d...",
                        len(mentions), config.batch_size)
            m_scored, m_errors, m_cost = await classify_mentions(
                classifier, db, mentions, config,
            )
            logger.info("  Mentions done: %d scored, %d errors, $%.4f",
                         m_scored, m_errors, m_cost)
            total_cost += m_cost
            total_errors += m_errors
        else:
            m_scored = 0
        # Finalize run
        duration = time.time() - start_time
        status = "completed" if total_errors == 0 else "partial"
        await db.finish_analysis_run(
            run_id,
            status=status,
            posts_scored=p_scored,
            mentions_scored=m_scored,
            errors=total_errors,
            cost_usd=total_cost,
        )
        logger.info("=" * 60)
        logger.info("Analysis complete — status: %s", status)
        logger.info("  Posts scored: %d, Mentions scored: %d, Errors: %d",
                     p_scored, m_scored, total_errors)
        logger.info("  Estimated cost: $%.4f", total_cost)
        logger.info("  Duration: %.1f seconds", duration)
    except Exception:
        logger.exception("Analyzer crashed")
        raise
    finally:
        await classifier.close()
        await db.close()
 def main() -> None:
    asyncio.run(run())
 if __name__ == "__main__":
    main()
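The concurrency pattern shared by classify_posts and classify_mentions can be isolated in a small sketch: split the items into batches, start one task per batch, and let an asyncio.Semaphore cap how many batches are in flight. The names fake_classify and run_bounded are hypothetical stand-ins introduced for illustration; they are not part of this commit, and the real classifier call is replaced by a short sleep.

```python
import asyncio


def make_batches(items: list, batch_size: int) -> list[list]:
    """Split a flat list into sublists of at most batch_size."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


async def fake_classify(texts: list[str]) -> list[float]:
    """Hypothetical stand-in for ToxicityClassifier.classify_batch."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [0.0 for _ in texts]


async def run_bounded(
    texts: list[str], batch_size: int, concurrency: int
) -> tuple[int, int]:
    """Score all texts; return (items_scored, peak_batches_in_flight)."""
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0
    scored = 0

    async def process(batch: list[str]) -> None:
        nonlocal in_flight, peak, scored
        async with semaphore:  # at most `concurrency` batches pass this point
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                results = await fake_classify(batch)
                scored += len(results)
            finally:
                in_flight -= 1

    # One task per batch; the semaphore, not gather, limits parallelism.
    await asyncio.gather(*(process(b) for b in make_batches(texts, batch_size)))
    return scored, peak


if __name__ == "__main__":
    posts = [f"post {i}" for i in range(25)]
    print(asyncio.run(run_bounded(posts, batch_size=10, concurrency=2)))
```

Because every task is created up front, memory use grows with the number of batches even though only a few run at once. For very large backlogs, a worker-pool over a queue would be an alternative design.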
--- a/src/analyzer/classifier.py
+++ b/src/analyzer/classifier.py
@ -0,0 +1,287 @@
 """OpenAI-powered toxicity classifier.
 Sends batches of posts to an OpenAI chat model (default: gpt-4.1-nano) for
 multi-category toxicity scoring. Returns one ToxicityScores per post, with
 each category scored from 0.0 to 1.0.
 """
 from __future__ import annotations
 import asyncio
 import json
 import logging
 import random
 from dataclasses import dataclass
 from openai import AsyncOpenAI, RateLimitError, APITimeoutError, APIConnectionError
 logger = logging.getLogger("analyzer.classifier")
 # The 12 toxicity categories we score
 CATEGORIES = [
    "toxic",
    "threat",
    "hate_speech",
    "racism",
    "antisemitism",
    "islamophobia",
    "sexism",
    "homophobia",
    "insult",
    "dehumanization",
    "extremism",
    "ableism",
 ]
 # System prompt for batch classification — multiple numbered posts per call
 SYSTEM_PROMPT = """\
 You are a toxicity classifier for Dutch and English social media posts about politics.
 You will receive one or more numbered posts. Score EACH post on every category from 0.0 (none) to 1.0 (extreme).
 Respond ONLY with a JSON object mapping post numbers (as strings) to their scores. No other text.
 Categories:
 - toxic: rude, disrespectful, or aggressive language
 - threat: threats of violence, harm, intimidation, or calls to action against a person
 - hate_speech: targeting people based on any protected characteristic (race, religion, gender, sexual orientation, disability, nationality)
 - racism: specifically targeting race or ethnicity (e.g. anti-Black, anti-Asian, anti-Moroccan sentiment, "Zwarte Piet" debates when derogatory)
 - antisemitism: targeting Jewish people, Holocaust denial or minimization, Jewish conspiracy theories, coded language like "globalists", "Rothschilds", triple parentheses
 - islamophobia: anti-Muslim hate, mosque opposition framed as hate, "Islam is not a religion" rhetoric, "takeover/omvolking" narratives, halal/hijab targeting
 - sexism: gender-based discrimination, harassment, misogyny, or misandry
 - homophobia: targeting sexual orientation or gender identity, anti-LGBTQ+ rhetoric
 - insult: personal attacks, name-calling, belittling
 - dehumanization: comparing people to animals, vermin, disease, parasites, or other dehumanizing language
 - extremism: far-right or far-left extremist rhetoric, Nazi symbolism or glorification, white supremacist language, Great Replacement theory ("omvolkingstheorie"), calls for political violence, fascist/authoritarian glorification
 - ableism: targeting people with disabilities, using mental health conditions as insults (e.g. "gestoord", "autist" as slur, "mongool")
 Important context:
 - Many posts are in Dutch. Handle Dutch slang, insults, and coded political language.
 - Dutch-specific coded terms: "gelukszoekers", "kutmarokkanen", "omvolking", "landverrader", "volksverrader", "linkse ratten", "wappie", "tokkie" — score appropriately based on context.
 - Political disagreement and criticism are NOT toxic — only score actual hostility, hate, or threats.
 - Satire and parody accounts may use irony — consider context but still score the literal content.
 - A score of 0.0 means the category is completely absent. A score of 1.0 means extreme/explicit.
 - Most posts will score 0.0 on most categories. Only flag genuine toxicity.
 Example for 2 posts:
 {"1":{"toxic":0.0,"threat":0.0,"hate_speech":0.0,"racism":0.0,"antisemitism":0.0,"islamophobia":0.0,"sexism":0.0,"homophobia":0.0,"insult":0.0,"dehumanization":0.0,"extremism":0.0,"ableism":0.0},"2":{"toxic":0.3,"threat":0.0,"hate_speech":0.0,"racism":0.0,"antisemitism":0.0,"islamophobia":0.0,"sexism":0.0,"homophobia":0.0,"insult":0.2,"dehumanization":0.0,"extremism":0.0,"ableism":0.0}}"""
 @dataclass
 class ToxicityScores:
    """Classification result for a single post."""
    toxic: float = 0.0
    threat: float = 0.0
    hate_speech: float = 0.0
    racism: float = 0.0
    antisemitism: float = 0.0
    islamophobia: float = 0.0
    sexism: float = 0.0
    homophobia: float = 0.0
    insult: float = 0.0
    dehumanization: float = 0.0
    extremism: float = 0.0
    ableism: float = 0.0
    @property
    def overall(self) -> float:
        """Overall toxicity = max of all categories."""
        return max(
            self.toxic,
            self.threat,
            self.hate_speech,
            self.racism,
            self.antisemitism,
            self.islamophobia,
            self.sexism,
            self.homophobia,
            self.insult,
            self.dehumanization,
            self.extremism,
            self.ableism,
        )
    def is_flagged(self, threshold: float = 0.5) -> bool:
        return self.overall >= threshold
    def to_dict(self) -> dict:
        return {
            "toxic": self.toxic,
            "threat": self.threat,
            "hate_speech": self.hate_speech,
            "racism": self.racism,
            "antisemitism": self.antisemitism,
            "islamophobia": self.islamophobia,
            "sexism": self.sexism,
            "homophobia": self.homophobia,
            "insult": self.insult,
            "dehumanization": self.dehumanization,
            "extremism": self.extremism,
            "ableism": self.ableism,
            "overall": self.overall,
        }
    # Approximate token counts for cost tracking
    input_tokens: int = 0
    output_tokens: int = 0
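 def parse_scores(raw: str | dict) -> ToxicityScores:
    """Parse one post's scores (a JSON string or an already-decoded dict)."""
    try:
        data = json.loads(raw) if isinstance(raw, str) else raw
    except json.JSONDecodeError:
        logger.warning("Failed to parse JSON response: %s", str(raw)[:200])
        return ToxicityScores()
    if not isinstance(data, dict):
        # Valid JSON but not an object (e.g. a list or a number)
        logger.warning("Expected a JSON object of scores, got: %s", str(data)[:200])
        return ToxicityScores()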
    def clamp(val) -> float:
        try:
            f = float(val)
            return max(0.0, min(1.0, f))
        except (TypeError, ValueError):
            return 0.0
    return ToxicityScores(
        toxic=clamp(data.get("toxic")),
        threat=clamp(data.get("threat")),
        hate_speech=clamp(data.get("hate_speech")),
        racism=clamp(data.get("racism")),
        antisemitism=clamp(data.get("antisemitism")),
        islamophobia=clamp(data.get("islamophobia")),
        sexism=clamp(data.get("sexism")),
        homophobia=clamp(data.get("homophobia")),
        insult=clamp(data.get("insult")),
        dehumanization=clamp(data.get("dehumanization")),
        extremism=clamp(data.get("extremism")),
        ableism=clamp(data.get("ableism")),
    )
 def parse_batch_response(raw: str, batch_size: int) -> list[ToxicityScores]:
    """Parse a batched JSON response into a list of ToxicityScores.
    Expected format: {"1": {...scores...}, "2": {...scores...}, ...}
    Returns a list of ToxicityScores in the same order as the input batch.
    """
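    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        logger.warning("Failed to parse batch JSON: %s", raw[:300])
        return [ToxicityScores() for _ in range(batch_size)]
    if not isinstance(data, dict):
        # Valid JSON but not the expected {"1": {...}, ...} object
        logger.warning("Batch response is not a JSON object: %s", raw[:300])
        return [ToxicityScores() for _ in range(batch_size)]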
    results = []
    for i in range(1, batch_size + 1):
        key = str(i)
        if key in data and isinstance(data[key], dict):
            results.append(parse_scores(data[key]))
        else:
            logger.warning("Missing scores for post %d in batch response", i)
            results.append(ToxicityScores())
    return results
 class ToxicityClassifier:
    """Async OpenAI-based toxicity classifier with batch support."""
    def __init__(self, api_key: str, model: str = "gpt-4.1-nano"):
        self.client = AsyncOpenAI(api_key=api_key)
        self.model = model
    async def classify_batch(
        self, texts: list[str], max_retries: int = 5
    ) -> list[ToxicityScores]:
        """Classify multiple posts in a single API call.
        Args:
            texts: List of post texts to classify (1–batch_size items).
            max_retries: Maximum number of attempts (including the first)
                on rate-limit or transient network errors.
        Returns:
            List of ToxicityScores, one per input text, in the same order.
        """
        if not texts:
            return []
        # Number of posts in this call; drives the token budget and parsing
        batch_size = len(texts)
        # Build the numbered user message
        parts = []
        for i, text in enumerate(texts, 1):
            # Truncate very long posts
            t = text.strip() if text else ""
            if len(t) > 2000:
                t = t[:2000]
            if not t:
                t = "(empty)"
            parts.append(f"[{i}] {t}")
        user_message = "\n\n".join(parts)
        # Scale max_tokens by batch size.
        # Each post's JSON scores ≈ 60 tokens compact, but the model often
        # outputs formatted JSON (whitespace/newlines) which can double that.
        # Use a generous budget to avoid truncation.
        max_tokens = max(300, batch_size * 200)
        last_err = None
        for attempt in range(max_retries):
            try:
                response = await self.client.chat.completions.create(
                    model=self.model,
                    temperature=0,
                    max_tokens=max_tokens,
                    response_format={"type": "json_object"},
                    messages=[
                        {"role": "system", "content": SYSTEM_PROMPT},
                        {"role": "user", "content": user_message},
                    ],
                )
                content = response.choices[0].message.content or "{}"
                results = parse_batch_response(content, batch_size)
                # Split token usage evenly across posts for cost tracking
                # (integer division, so the summed total is approximate)
                if response.usage:
                    per_post_input = response.usage.prompt_tokens // batch_size
                    per_post_output = response.usage.completion_tokens // batch_size
                    for scores in results:
                        scores.input_tokens = per_post_input
                        scores.output_tokens = per_post_output
                return results
            except RateLimitError as e:
                last_err = e
                wait = min(2 ** attempt + random.uniform(0.5, 1.5), 30)
                logger.debug(
                    "Rate limited (attempt %d/%d), waiting %.1fs",
                    attempt + 1, max_retries, wait,
                )
                await asyncio.sleep(wait)
            except (APITimeoutError, APIConnectionError) as e:
                last_err = e
                wait = 2 ** attempt + random.uniform(0, 1)
                logger.debug(
                    "Transient error (attempt %d/%d), retrying in %.1fs: %s",
                    attempt + 1, max_retries, wait, e,
                )
                await asyncio.sleep(wait)
            except Exception:
                logger.exception(
                    "Batch classification API call failed (%d posts)", batch_size
                )
                raise
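        # All attempts exhausted (rate limits or transient network errors)
        logger.error(
            "Retries exhausted for batch of %d posts after %d attempts",
            batch_size, max_retries,
        )
        if last_err is None:
            raise ValueError("max_retries must be at least 1")
        raise last_err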
    async def classify(self, text: str, max_retries: int = 5) -> ToxicityScores:
        """Classify a single post (convenience wrapper around classify_batch)."""
        results = await self.classify_batch([text], max_retries=max_retries)
        return results[0]
    async def close(self):
        await self.client.close()
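The retry loop in classify_batch follows a capped exponential backoff with random jitter. A minimal sketch of that technique, separated from the OpenAI client, could look like the following. TransientError and call_with_backoff are hypothetical names introduced here, and the injectable sleep parameter is an assumption added so the schedule can be inspected without waiting. Unlike the loop above, this sketch skips the sleep after the final failed attempt.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for RateLimitError / APITimeoutError."""


async def call_with_backoff(func, max_attempts=5, cap=30.0, sleep=asyncio.sleep):
    """Await func() up to max_attempts times, backing off between failures."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_err = None
    for attempt in range(max_attempts):
        try:
            return await func()
        except TransientError as e:
            last_err = e
            if attempt == max_attempts - 1:
                break  # no further attempt follows, so do not sleep
            # 2**attempt seconds plus jitter, never more than `cap`
            wait = min(2 ** attempt + random.uniform(0.5, 1.5), cap)
            await sleep(wait)
    raise last_err


if __name__ == "__main__":
    state = {"calls": 0}

    async def flaky():
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransientError("try again")
        return "ok"

    async def no_wait(seconds):
        print(f"would sleep {seconds:.2f}s")

    print(asyncio.run(call_with_backoff(flaky, sleep=no_wait)))
```

The jitter spreads out retries from concurrent batches so they do not all hit the API again at the same instant.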
--- a/src/analyzer/config.py
+++ b/src/analyzer/config.py
@ -0,0 +1,44 @@
 """Analyzer configuration loaded from environment variables."""
 from __future__ import annotations
 import os
 from dataclasses import dataclass
 @dataclass
 class AnalyzerConfig:
    database_url: str
    openai_api_key: str
    model: str = "gpt-4.1-nano"
    concurrency: int = 3          # concurrent API calls (batches in flight)
    batch_size: int = 10           # posts per API call
    limit: int = 0                 # 0 = no limit (process all unscored)
    flag_threshold: float = 0.5
    log_level: str = "INFO"
    # Cost tracking (per 1M tokens)
    input_cost_per_m: float = 0.10
    output_cost_per_m: float = 0.40
    @classmethod
    def from_env(cls) -> AnalyzerConfig:
        api_key = os.environ.get("OPENAI_API_KEY", "")
        if not api_key:
            raise ValueError(
                "OPENAI_API_KEY environment variable is required. "
                "Get one at https://platform.openai.com/api-keys"
            )
        return cls(
            database_url=os.environ.get(
                "DATABASE_URL",
                "postgresql://bluesky:changeme@db:5432/bluesky",
            ),
            openai_api_key=api_key,
            model=os.environ.get("ANALYZER_MODEL", "gpt-4.1-nano"),
            concurrency=int(os.environ.get("ANALYZER_CONCURRENCY", "3")),
            batch_size=int(os.environ.get("ANALYZER_BATCH_SIZE", "10")),
            limit=int(os.environ.get("ANALYZER_LIMIT", "0")),
            flag_threshold=float(os.environ.get("ANALYZER_FLAG_THRESHOLD", "0.5")),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
        )
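The two per-million-token rates above feed the cost estimate computed in analyzer.py. As a worked example of that arithmetic, the sketch below uses the default rates from AnalyzerConfig; the function name estimate_cost_usd and the token counts are illustrative assumptions, not measurements from this project.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_m: float = 0.10,   # default from AnalyzerConfig
    output_cost_per_m: float = 0.40,  # default from AnalyzerConfig
) -> float:
    """Convert token counts into dollars using per-1M-token rates."""
    return (
        input_tokens * input_cost_per_m / 1_000_000
        + output_tokens * output_cost_per_m / 1_000_000
    )


if __name__ == "__main__":
    # Illustrative only: 1M input tokens and 500k output tokens.
    # 1_000_000 * 0.10 / 1e6 = 0.10; 500_000 * 0.40 / 1e6 = 0.20
    print(f"${estimate_cost_usd(1_000_000, 500_000):.4f}")
```

Since the rates are hard-coded defaults, not environment variables, the estimate only matches the real bill while ANALYZER_MODEL points at a model with those prices.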
--- a/src/analyzer/db.py
+++ b/src/analyzer/db.py
@ -0,0 +1,201 @@
 """Async database layer for the toxicity analyzer.
 Handles fetching unscored posts/mentions and storing classification results.
 """
 from __future__ import annotations
 import logging
 from pathlib import Path
 import asyncpg
 logger = logging.getLogger("analyzer.db")
 MIGRATION_FILE = Path(__file__).parent.parent.parent / "scripts" / "02-toxicity.sql"
 class AnalyzerDB:
    """Async PostgreSQL operations for the analyzer."""
    def __init__(self, dsn: str):
        self._dsn = dsn
        self._pool: asyncpg.Pool | None = None
    async def connect(self) -> None:
        self._pool = await asyncpg.create_pool(self._dsn, min_size=2, max_size=10)
        logger.info("Database connected")
    async def close(self) -> None:
        if self._pool:
            await self._pool.close()
    async def apply_migration(self) -> None:
        """Apply the toxicity schema migration if tables don't exist."""
        async with self._pool.acquire() as conn:
            # Check if toxicity_scores table exists
            exists = await conn.fetchval("""
                SELECT EXISTS (
                    SELECT FROM information_schema.tables
                    WHERE table_name = 'toxicity_scores'
                )
            """)
            if not exists:
                logger.info("Applying toxicity schema migration...")
                sql = MIGRATION_FILE.read_text()
                await conn.execute(sql)
                logger.info("Migration applied successfully")
            else:
                logger.debug("Toxicity tables already exist")
    # ── Fetch unscored items ─────────────────────────────────────────────
    async def get_unscored_posts(self, limit: int = 0) -> list[dict]:
        """Get posts that haven't been scored yet.
        Skips reposts (no text) and posts with empty text.
        """
        query = """
            SELECT p.uri, p.text, p.post_type, p.author_did
            FROM posts p
            LEFT JOIN toxicity_scores ts ON ts.uri = p.uri
            WHERE ts.uri IS NULL
              AND p.post_type != 'repost'
              AND p.text IS NOT NULL
              AND p.text != ''
            ORDER BY p.created_at DESC
        """
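        # Bind the limit as a query parameter; don't format it into the SQL
        args: list = []
        if limit > 0:
            query += " LIMIT $1"
            args.append(limit)
        async with self._pool.acquire() as conn:
            rows = await conn.fetch(query, *args)
            return [dict(r) for r in rows]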
    async def get_unscored_mentions(self, limit: int = 0) -> list[dict]:
        """Get mentions that haven't been scored yet."""
        query = """
            SELECT m.id, m.post_text, m.mentioned_did, m.mentioning_did
            FROM mentions m
            LEFT JOIN mention_toxicity_scores mts ON mts.mention_id = m.id
            WHERE mts.mention_id IS NULL
              AND m.post_text IS NOT NULL
              AND m.post_text != ''
            ORDER BY m.post_created_at DESC
        """
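        # Bind the limit as a query parameter; don't format it into the SQL
        args: list = []
        if limit > 0:
            query += " LIMIT $1"
            args.append(limit)
        async with self._pool.acquire() as conn:
            rows = await conn.fetch(query, *args)
            return [dict(r) for r in rows]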
    # ── Store scores ─────────────────────────────────────────────────────
    async def store_post_score(
        self,
        uri: str,
        scores: dict,
        flagged: bool,
        model: str,
    ) -> None:
        """Insert a toxicity score for a post."""
        async with self._pool.acquire() as conn:
            await conn.execute("""
                INSERT INTO toxicity_scores
                    (uri, overall, toxic, threat, hate_speech, racism,
                     antisemitism, islamophobia,
                     sexism, homophobia, insult, dehumanization,
                     extremism, ableism, flagged, model)
                VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)
                ON CONFLICT (uri) DO NOTHING
            """,
                uri,
                scores["overall"],
                scores["toxic"],
                scores["threat"],
                scores["hate_speech"],
                scores["racism"],
                scores["antisemitism"],
                scores["islamophobia"],
                scores["sexism"],
                scores["homophobia"],
                scores["insult"],
                scores["dehumanization"],
                scores["extremism"],
                scores["ableism"],
                flagged,
                model,
            )
    async def store_mention_score(
        self,
        mention_id: int,
        scores: dict,
        flagged: bool,
        model: str,
    ) -> None:
        """Insert a toxicity score for a mention."""
        async with self._pool.acquire() as conn:
            await conn.execute("""
                INSERT INTO mention_toxicity_scores
                    (mention_id, overall, toxic, threat, hate_speech, racism,
                     antisemitism, islamophobia,
                     sexism, homophobia, insult, dehumanization,
                     extremism, ableism, flagged, model)
                VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)
                ON CONFLICT (mention_id) DO NOTHING
            """,
                mention_id,
                scores["overall"],
                scores["toxic"],
                scores["threat"],
                scores["hate_speech"],
                scores["racism"],
                scores["antisemitism"],
                scores["islamophobia"],
                scores["sexism"],
                scores["homophobia"],
                scores["insult"],
                scores["dehumanization"],
                scores["extremism"],
                scores["ableism"],
                flagged,
                model,
            )
    # ── Analysis run tracking ────────────────────────────────────────────
    async def start_analysis_run(self, model: str) -> int:
        """Create a new analysis run record. Returns run ID."""
        async with self._pool.acquire() as conn:
            return await conn.fetchval("""
                INSERT INTO analysis_runs (model) VALUES ($1)
                RETURNING id
            """, model)
    async def finish_analysis_run(
        self,
        run_id: int,
        status: str,
        posts_scored: int,
        mentions_scored: int,
        errors: int,
        cost_usd: float,
    ) -> None:
        """Finalize an analysis run with results."""
        async with self._pool.acquire() as conn:
            await conn.execute("""
                UPDATE analysis_runs
                SET finished_at = now(),
                    status = $2,
                    posts_scored = $3,
                    mentions_scored = $4,
                    errors = $5,
                    cost_usd = $6,
                    duration_secs = EXTRACT(EPOCH FROM (now() - started_at))
                WHERE id = $1
            """,
                run_id, status, posts_scored, mentions_scored, errors, cost_usd,
            )
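Both store_post_score and store_mention_score rely on ON CONFLICT ... DO NOTHING, which makes repeated runs idempotent: a row that already has a score is left untouched. The sketch below demonstrates that behavior using the standard-library sqlite3 module as a stand-in for PostgreSQL. The three-column table is a hypothetical simplification (the real schema lives in scripts/02-toxicity.sql, which is not shown here), and the example assumes an SQLite build that supports upsert syntax (3.24 or newer).

```python
import sqlite3

# Hypothetical, simplified stand-in for the toxicity_scores table.
SCHEMA = """
CREATE TABLE toxicity_scores (
    uri     TEXT PRIMARY KEY,
    overall REAL NOT NULL,
    model   TEXT NOT NULL
)
"""


def store_score(conn: sqlite3.Connection, uri: str, overall: float, model: str) -> None:
    """Insert a score; silently skip if the uri is already scored."""
    conn.execute(
        "INSERT INTO toxicity_scores (uri, overall, model) VALUES (?, ?, ?) "
        "ON CONFLICT (uri) DO NOTHING",
        (uri, overall, model),
    )


def demo() -> list[tuple]:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    store_score(conn, "at://example/post/1", 0.2, "model-a")
    # Second write for the same uri is ignored, so the first score survives.
    store_score(conn, "at://example/post/1", 0.9, "model-b")
    return conn.execute("SELECT uri, overall, model FROM toxicity_scores").fetchall()


if __name__ == "__main__":
    print(demo())
```

One consequence of this choice is that switching ANALYZER_MODEL does not rescore existing rows. Rescoring would need either a delete first or an ON CONFLICT ... DO UPDATE clause.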
--- a/src/bluesky_client.py
+++ b/src/bluesky_client.py
@ -0,0 +1,301 @@
 """Bluesky AT Protocol API client with rate limiting and retry logic.
 Uses httpx directly against the public API for feeds (no auth needed),
 and an authenticated session via a PDS for searchPosts (which requires auth).
 """
 from __future__ import annotations
 import asyncio
 import logging
 import time
 from datetime import datetime
 import httpx
 from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
 )
 from .models import Mention, Post
 logger = logging.getLogger(__name__)
 # Bluesky embed type identifiers used to detect media embeds
 _IMAGE_TYPES = {"app.bsky.embed.images#view", "app.bsky.embed.images"}
 _VIDEO_TYPES = {"app.bsky.embed.video#view", "app.bsky.embed.video"}
 _MEDIA_TYPES = _IMAGE_TYPES | _VIDEO_TYPES
 class RateLimiter:
    """Tracks rate limit state from API response headers."""
    def __init__(self):
        self._remaining: int = 3000
        self._reset_at: float = 0.0
        self._lock = asyncio.Lock()
    def update(self, headers: httpx.Headers) -> None:
        remaining = headers.get("ratelimit-remaining")
        reset = headers.get("ratelimit-reset")
        if remaining is not None:
            self._remaining = int(remaining)
        if reset is not None:
            self._reset_at = float(reset)
    async def wait_if_needed(self) -> None:
        async with self._lock:
            if self._remaining <= 20:
                sleep_for = max(0, self._reset_at - time.time()) + 1.0
                logger.warning(
                    "Rate limit nearly exhausted (%d remaining). "
                    "Sleeping %.1f seconds until reset.",
                    self._remaining,
                    sleep_for,
                )
                await asyncio.sleep(sleep_for)
 class BlueskyClient:
    """Async HTTP client for the Bluesky API.
    Uses the public AppView for feeds and an authenticated PDS session
    for searchPosts (which returns 403 on the public API).
    """
    def __init__(self, base_url: str = "https://public.api.bsky.app"):
        self._base = base_url.rstrip("/")
        self._http = httpx.AsyncClient(timeout=30.0)
        self._rate = RateLimiter()
        # Authenticated session state (for searchPosts)
        self._auth_token: str | None = None
        self._auth_pds: str | None = None  # e.g. https://bsky.social
    async def login(self, handle: str, app_password: str) -> None:
        """Create an authenticated session for search requests."""
        resp = await self._http.post(
            "https://bsky.social/xrpc/com.atproto.server.createSession",
            json={"identifier": handle, "password": app_password},
        )
        resp.raise_for_status()
        data = resp.json()
        self._auth_token = data["accessJwt"]
        # Use the PDS from the DID doc if available, otherwise default.
        # Guard against a missing/null didDoc and an empty service list.
        services = (data.get("didDoc") or {}).get("service") or [{}]
        self._auth_pds = services[0].get("serviceEndpoint") or "https://bsky.social"
        logger.info("Authenticated as %s (PDS: %s)", handle, self._auth_pds)
    async def close(self) -> None:
        await self._http.aclose()
    # ── Low-level request helpers ───────────────────────────────────────
    @retry(
        stop=stop_after_attempt(4),
        wait=wait_exponential(multiplier=1, min=2, max=30),
        retry=retry_if_exception_type(
            (httpx.ConnectError, httpx.ReadTimeout, httpx.ConnectTimeout)
        ),
        reraise=True,
    )
    async def _get(self, endpoint: str, params: dict) -> dict:
        """Make an unauthenticated GET request (public API)."""
        await self._rate.wait_if_needed()
        url = f"{self._base}/xrpc/{endpoint}"
        resp = await self._http.get(url, params={k: v for k, v in params.items() if v is not None})
        self._rate.update(resp.headers)
        if resp.status_code == 429:
            reset = resp.headers.get("ratelimit-reset")
            sleep_for = max(0, float(reset) - time.time()) + 1.0 if reset else 30.0
            logger.warning("HTTP 429 — sleeping %.1f seconds", sleep_for)
            await asyncio.sleep(sleep_for)
            raise httpx.ReadTimeout("Rate limited, retrying")
        resp.raise_for_status()
        return resp.json()
    @retry(
        stop=stop_after_attempt(4),
        wait=wait_exponential(multiplier=1, min=2, max=30),
        retry=retry_if_exception_type(
            (httpx.ConnectError, httpx.ReadTimeout, httpx.ConnectTimeout)
        ),
        reraise=True,
    )
    async def _get_auth(self, endpoint: str, params: dict) -> dict:
        """Make an authenticated GET request via the user's PDS."""
        await self._rate.wait_if_needed()
        base = self._auth_pds or self._base
        url = f"{base}/xrpc/{endpoint}"
        headers = {"Authorization": f"Bearer {self._auth_token}"} if self._auth_token else {}
        resp = await self._http.get(
            url,
            params={k: v for k, v in params.items() if v is not None},
            headers=headers,
        )
        self._rate.update(resp.headers)
        if resp.status_code == 429:
            reset = resp.headers.get("ratelimit-reset")
            sleep_for = max(0, float(reset) - time.time()) + 1.0 if reset else 30.0
            logger.warning("HTTP 429 (auth) — sleeping %.1f seconds", sleep_for)
            await asyncio.sleep(sleep_for)
            raise httpx.ReadTimeout("Rate limited, retrying")
        resp.raise_for_status()
        return resp.json()
    # ── Handle resolution ───────────────────────────────────────────────
    async def resolve_handle(self, handle: str) -> str | None:
        """Resolve a Bluesky handle to a DID. Returns None on failure."""
        try:
            data = await self._get(
                "com.atproto.identity.resolveHandle", {"handle": handle}
            )
            did = data.get("did")
            logger.debug("Resolved %s -> %s", handle, did)
            return did
        except Exception:
            logger.exception("Failed to resolve handle: %s", handle)
            return None
    # ── Author feed ─────────────────────────────────────────────────────
    async def get_author_feed_page(
        self,
        actor: str,
        cursor: str | None = None,
        limit: int = 100,
        filter_type: str = "posts_with_replies",
    ) -> tuple[list[dict], str | None]:
        """Fetch one page of an author's feed.
        Returns (list_of_raw_feed_items, next_cursor).
        """
        data = await self._get(
            "app.bsky.feed.getAuthorFeed",
            {
                "actor": actor,
                "cursor": cursor,
                "limit": limit,
                "filter": filter_type,
            },
        )
        return data.get("feed", []), data.get("cursor")
    # ── Mention search ──────────────────────────────────────────────────
    async def search_mentions_page(
        self,
        handle: str,
        since: str | None = None,
        cursor: str | None = None,
        limit: int = 100,
    ) -> tuple[list[dict], str | None]:
        """Search for posts mentioning a handle.
        Returns (list_of_raw_post_objects, next_cursor).
        Uses authenticated PDS endpoint if available (public API 403s on search).
        """
        getter = self._get_auth if self._auth_token else self._get
        search_params = {
            "q": "*",
            "mentions": handle,
            "since": since,
            "sort": "latest",
            "cursor": cursor,
            "limit": limit,
        }
        try:
            data = await getter("app.bsky.feed.searchPosts", search_params)
            return data.get("posts", []), data.get("cursor")
        except httpx.HTTPStatusError as e:
            if e.response.status_code not in (400, 403):
                raise
            logger.warning(
                "Mention search failed for %s (HTTP %d) — skipping",
                handle, e.response.status_code,
            )
            return [], None
 # ── Mapping helpers ─────────────────────────────────────────────────────
 def _parse_dt(s: str | None) -> datetime | None:
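    """Parse an ISO 8601 datetime string into a timezone-aware datetime.
    Returns None for empty or unparsable input; offset-less values are assumed UTC.
    """
    if not s:
        return None
    try:
        # Handle the Z suffix and various ISO formats
        parsed = datetime.fromisoformat(s.replace("Z", "+00:00"))
    except (ValueError, TypeError, AttributeError):
        return None
    if parsed.tzinfo is None:
        # Naive values cannot be compared with the aware cutoff loaded from the DB
        from datetime import timezone
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed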
 def _detect_embed_type(embed: dict | None) -> tuple[bool, bool]:
    """Return (has_media, has_embed) from an embed object."""
    if not embed:
        return False, False
    etype = embed.get("$type", "")
    has_media = etype in _MEDIA_TYPES
    has_embed = bool(embed)  # any embed counts
    return has_media, has_embed
 def map_feed_item_to_post(item: dict) -> Post:
    """Map a raw getAuthorFeed item to a Post model."""
    post_view = item.get("post", {})
    record = post_view.get("record", {})
    reply_ref = record.get("reply")
    reason = item.get("reason")
    # Determine post type
    if reason and reason.get("$type") == "app.bsky.feed.defs#reasonRepost":
        post_type = "repost"
    elif reply_ref is not None:
        post_type = "reply"
    else:
        post_type = "post"
    has_media, has_embed = _detect_embed_type(post_view.get("embed"))
    return Post(
        uri=post_view.get("uri", ""),
        cid=post_view.get("cid", ""),
        author_did=post_view.get("author", {}).get("did", ""),
        text=record.get("text"),
        created_at=_parse_dt(record.get("createdAt")),
        indexed_at=_parse_dt(post_view.get("indexedAt")),
        reply_parent=reply_ref.get("parent", {}).get("uri") if reply_ref else None,
        reply_root=reply_ref.get("root", {}).get("uri") if reply_ref else None,
        post_type=post_type,
        has_media=has_media,
        has_embed=has_embed,
        like_count=post_view.get("likeCount", 0) or 0,
        reply_count=post_view.get("replyCount", 0) or 0,
        repost_count=post_view.get("repostCount", 0) or 0,
        quote_count=post_view.get("quoteCount", 0) or 0,
        langs=record.get("langs"),
        raw_json=item,
    )
 def map_search_post_to_mention(post_data: dict, mentioned_did: str) -> Mention:
    """Map a raw searchPosts result to a Mention model."""
    record = post_data.get("record", {})
    return Mention(
        post_uri=post_data.get("uri", ""),
        mentioned_did=mentioned_did,
        mentioning_did=post_data.get("author", {}).get("did"),
        post_text=record.get("text"),
        post_created_at=_parse_dt(record.get("createdAt")),
        raw_json=post_data,
    )
--- a/src/collector.py
+++ b/src/collector.py
@ -0,0 +1,293 @@
 """Main collector orchestrator.
 Runs as a one-shot process: resolves accounts, collects feeds and mentions,
 then exits. Designed to be triggered on a schedule by ofelia or cron.
 Usage:
    python -m src.collector
 """
 from __future__ import annotations
 import asyncio
 import logging
 import os
 import sys
 from datetime import datetime, timedelta, timezone
 from .bluesky_client import (
    BlueskyClient,
    map_feed_item_to_post,
    map_search_post_to_mention,
 )
 from .config import CollectorConfig, load_accounts
 from .db import Database
 from .models import Account
 logger = logging.getLogger("collector")
 # ── Feed collection ─────────────────────────────────────────────────────
 async def collect_feed(
    client: BlueskyClient,
    db: Database,
    account: Account,
    max_pages: int,
 ) -> int:
    """Collect posts from an account's feed. Returns number of posts stored."""
    state = await db.get_collection_state(account.did, "feed")
    cutoff = state.last_post_at if state else None
    all_posts = []
    cursor = None
    pages = 0
    while pages < max_pages:
        items, next_cursor = await client.get_author_feed_page(
            actor=account.did, cursor=cursor, limit=100
        )
        if not items:
            break
        posts = [map_feed_item_to_post(item) for item in items]
        # Check if we've reached posts older than our cutoff.
        # We still upsert everything (to refresh engagement counts),
        # but stop paginating once we pass the cutoff.
        hit_old = False
        if cutoff:
            for p in posts:
                if p.created_at and p.created_at <= cutoff:
                    hit_old = True
                    break
        await db.upsert_posts(posts)
        all_posts.extend(posts)
        # Count the page as soon as it is stored so the log below is exact
        pages += 1
        if hit_old or not next_cursor:
            break
        cursor = next_cursor
    # Save the newest timestamp for next incremental run
    dated = [p.created_at for p in all_posts if p.created_at]
    if dated:
        newest = max(dated)
        await db.save_collection_state(account.did, "feed", last_post_at=newest)
    await db.update_account_last_feed(account.did)
    logger.info(
        "  Feed: %d posts collected (%d pages) for %s",
        len(all_posts),
        pages,
        account.handle,
    )
    return len(all_posts)
 # ── Mention collection ──────────────────────────────────────────────────
 async def collect_mentions(
    client: BlueskyClient,
    db: Database,
    account: Account,
    max_pages: int,
    lookback_hours: int,
 ) -> int:
    """Search for posts mentioning this account.
    Returns the number of mentions found, including ones already stored.
    """
    state = await db.get_collection_state(account.did, "mentions")
    if state and state.last_post_at:
        since = state.last_post_at.isoformat()
    else:
        since = (datetime.now(timezone.utc) - timedelta(hours=lookback_hours)).isoformat()
    all_mentions = []
    cursor = None
    pages = 0
    while pages < max_pages:
        posts, next_cursor = await client.search_mentions_page(
            handle=account.handle, since=since, cursor=cursor, limit=100
        )
        if not posts:
            break
        mentions = [map_search_post_to_mention(p, account.did) for p in posts]
        await db.upsert_mentions(mentions)
        all_mentions.extend(mentions)
        # Count the page as soon as it is stored so the log below is exact
        pages += 1
        if not next_cursor:
            break
        cursor = next_cursor
    # Save newest mention timestamp
    dated = [m.post_created_at for m in all_mentions if m.post_created_at]
    if dated:
        newest = max(dated)
        await db.save_collection_state(account.did, "mentions", last_post_at=newest)
    await db.update_account_last_mention(account.did)
    logger.info(
        "  Mentions: %d found (%d pages) for %s",
        len(all_mentions),
        pages,
        account.handle,
    )
    return len(all_mentions)
 # ── Main orchestrator ───────────────────────────────────────────────────
 async def run() -> None:
    config = CollectorConfig.from_env()
    logging.basicConfig(
        level=getattr(logging, config.log_level.upper(), logging.INFO),
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
        handlers=[
            logging.StreamHandler(sys.stdout),
        ],
    )
    # Also log to file if /app/logs exists
    log_dir = "/app/logs"
    if os.path.isdir(log_dir):
        fh = logging.FileHandler(os.path.join(log_dir, "collector.log"))
        fh.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s"))
        logging.getLogger().addHandler(fh)
    logger.info("=" * 60)
    logger.info("Bluesky Collector starting")
    # Load handles from YAML
    handles = load_accounts(config.accounts_file)
    if not handles:
        logger.error("No accounts found in %s — nothing to do.", config.accounts_file)
        return
    logger.info("Loaded %d handles from config", len(handles))
    db = Database(config.database_url)
    client = BlueskyClient(config.bsky_api_base)
    try:
        await db.connect()
        # Authenticate if credentials are provided (needed for searchPosts)
        if config.bsky_handle and config.bsky_app_password:
            try:
                await client.login(config.bsky_handle, config.bsky_app_password)
            except Exception:
                logger.exception(
                    "Authentication failed — mention search will be limited"
                )
        else:
            logger.info(
                "No BSKY_HANDLE/BSKY_APP_PASSWORD set — "
                "mention search may be limited (403 on public API)"
            )
        # Phase 1: Resolve handles and sync to DB
        logger.info("Phase 1: Resolving handles...")
        accounts: list[Account] = []
        for handle in handles:
            did = await client.resolve_handle(handle)
            if did:
                acct = Account(did=did, handle=handle)
                await db.upsert_account(acct)
                accounts.append(acct)
            else:
                logger.warning("Skipping unresolvable handle: %s", handle)
        if not accounts:
            logger.error("No accounts could be resolved — aborting.")
            return
        # Mark accounts removed from config as inactive
        await db.deactivate_removed_accounts({a.did for a in accounts})
        # Start collection run
        run_id = await db.start_run(accounts_total=len(accounts))
        total_posts = 0
        total_mentions = 0
        accounts_done = 0
        errors: list[dict] = []
        # Phase 2 & 3: Collect feed + mentions for each account
        logger.info("Collecting feeds and mentions for %d accounts...", len(accounts))
        for acct in accounts:
            # Feed
            try:
                n = await collect_feed(client, db, acct, config.max_pages_per_account)
                total_posts += n
            except Exception as e:
                logger.exception("Feed collection failed for %s", acct.handle)
                errors.append({"account": acct.handle, "phase": "feed", "error": str(e)})
            # Mentions
            try:
                n = await collect_mentions(
                    client,
                    db,
                    acct,
                    config.max_pages_per_account,
                    config.mention_lookback_hours,
                )
                total_mentions += n
            except Exception as e:
                logger.exception("Mention collection failed for %s", acct.handle)
                errors.append({"account": acct.handle, "phase": "mentions", "error": str(e)})
            accounts_done += 1
        # Update run record
        await db.update_run_progress(
            run_id,
            accounts_done=accounts_done,
            posts_collected=total_posts,
            mentions_collected=total_mentions,
        )
        status = "completed" if not errors else "partial"
        await db.finish_run(run_id, status=status, errors=errors)
        # Summary
        stats = await db.get_stats()
        logger.info("=" * 60)
        logger.info("Run complete — status: %s", status)
        logger.info(
            "  This run: %d posts, %d mentions, %d errors",
            total_posts,
            total_mentions,
            len(errors),
        )
        logger.info(
            "  Database totals: %d accounts, %d posts, %d mentions, %d runs",
            stats["accounts"],
            stats["posts"],
            stats["mentions"],
            stats["runs"],
        )
    except Exception:
        logger.exception("Collector crashed")
        raise
    finally:
        await client.close()
        await db.close()
 def main() -> None:
    asyncio.run(run())
 if __name__ == "__main__":
    main()
--- a/src/config.py
+++ b/src/config.py
@ -0,0 +1,46 @@
 """Configuration loader: reads environment variables and accounts YAML."""
 from __future__ import annotations
 import os
 from dataclasses import dataclass, field
 from pathlib import Path
 import yaml
@dataclass
 class CollectorConfig:
    database_url: str
    bsky_api_base: str
    accounts_file: str
    log_level: str = "INFO"
    max_pages_per_account: int = 50
    mention_lookback_hours: int = 12
    feed_page_limit: int = 100  # Bluesky API max per page
    bsky_handle: str | None = None       # for authenticated search
    bsky_app_password: str | None = None  # for authenticated search
    @classmethod
    def from_env(cls) -> CollectorConfig:
        return cls(
            database_url=os.environ["DATABASE_URL"],
            bsky_api_base=os.getenv("BSKY_PUBLIC_API", "https://public.api.bsky.app"),
            accounts_file=os.getenv("ACCOUNTS_FILE", "/app/config/accounts.yml"),
            log_level=os.getenv("LOG_LEVEL", "INFO"),
            max_pages_per_account=int(os.getenv("MAX_PAGES_PER_ACCOUNT", "50")),
            mention_lookback_hours=int(os.getenv("MENTION_LOOKBACK_HOURS", "12")),
            bsky_handle=os.getenv("BSKY_HANDLE"),
            bsky_app_password=os.getenv("BSKY_APP_PASSWORD"),
        )
 def load_accounts(path: str) -> list[str]:
    """Load the list of Bluesky handles from a YAML file.
    Returns a list of handle strings (e.g. ['alice.bsky.social', 'bob.bsky.social']).
    """
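    data = yaml.safe_load(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return []
    # Skip entries that are not mappings or have an empty handle
    return [
        entry["handle"]
        for entry in data.get("accounts") or []
        if isinstance(entry, dict) and entry.get("handle")
    ]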
--- a/src/db.py
+++ b/src/db.py
@ -0,0 +1,265 @@
 """Async PostgreSQL database layer using asyncpg."""
 from __future__ import annotations
 import json
 import logging
 from datetime import datetime, timezone
 from typing import Any
 import asyncpg
 from .models import Account, CollectionState, Mention, Post
 logger = logging.getLogger(__name__)
 class Database:
    def __init__(self, dsn: str):
        self._dsn = dsn
        self._pool: asyncpg.Pool | None = None
    async def connect(self) -> None:
        self._pool = await asyncpg.create_pool(self._dsn, min_size=2, max_size=5)
        logger.info("Database connection pool created")
    async def close(self) -> None:
        if self._pool:
            await self._pool.close()
            logger.info("Database connection pool closed")
    # ── Account operations ──────────────────────────────────────────────
    async def upsert_account(self, account: Account) -> None:
        """Insert or refresh an account, reactivating it if it was deactivated."""
        await self._pool.execute(
            """
            INSERT INTO accounts (did, handle, display_name)
            VALUES ($1, $2, $3)
            ON CONFLICT (did) DO UPDATE SET
                handle       = EXCLUDED.handle,
                display_name = COALESCE(EXCLUDED.display_name, accounts.display_name),
                active       = true
            """,
            account.did,
            account.handle,
            account.display_name,
        )
    async def deactivate_removed_accounts(self, active_dids: set[str]) -> None:
        """Set active=false for accounts no longer in the config."""
        if not active_dids:
            return
        await self._pool.execute(
            """
            UPDATE accounts SET active = false
            WHERE did != ALL($1::text[]) AND active = true
            """,
            list(active_dids),
        )
    async def update_account_last_feed(self, did: str) -> None:
        await self._pool.execute(
            "UPDATE accounts SET last_feed_collected = now() WHERE did = $1", did
        )
    async def update_account_last_mention(self, did: str) -> None:
        await self._pool.execute(
            "UPDATE accounts SET last_mention_collected = now() WHERE did = $1", did
        )
    # ── Post operations ─────────────────────────────────────────────────
    async def upsert_posts(self, posts: list[Post]) -> int:
        """Batch upsert posts. Returns the number of rows affected."""
        if not posts:
            return 0
        count = 0
        async with self._pool.acquire() as conn:
            async with conn.transaction():
                for p in posts:
                    result = await conn.execute(
                        """
                        INSERT INTO posts (
                            uri, cid, author_did, text, created_at, indexed_at,
                            reply_parent, reply_root, post_type,
                            has_media, has_embed,
                            like_count, reply_count, repost_count, quote_count,
                            langs, raw_json
                        ) VALUES (
                            $1, $2, $3, $4, $5, $6,
                            $7, $8, $9,
                            $10, $11,
                            $12, $13, $14, $15,
                            $16, $17
                        )
                        ON CONFLICT (uri) DO UPDATE SET
                            cid          = EXCLUDED.cid,
                            like_count   = EXCLUDED.like_count,
                            reply_count  = EXCLUDED.reply_count,
                            repost_count = EXCLUDED.repost_count,
                            quote_count  = EXCLUDED.quote_count,
                            collected_at = now()
                        """,
                        p.uri,
                        p.cid,
                        p.author_did,
                        p.text,
                        p.created_at,
                        p.indexed_at,
                        p.reply_parent,
                        p.reply_root,
                        p.post_type,
                        p.has_media,
                        p.has_embed,
                        p.like_count,
                        p.reply_count,
                        p.repost_count,
                        p.quote_count,
                        p.langs,
                        json.dumps(p.raw_json),
                    )
                    # Every row is either inserted or updated, so each one counts
                    count += 1
        return count
    # ── Mention operations ──────────────────────────────────────────────
    async def upsert_mentions(self, mentions: list[Mention]) -> int:
        if not mentions:
            return 0
        count = 0
        async with self._pool.acquire() as conn:
            async with conn.transaction():
                for m in mentions:
                    result = await conn.execute(
                        """
                        INSERT INTO mentions (
                            post_uri, mentioned_did, mentioning_did,
                            post_text, post_created_at, raw_json
                        ) VALUES ($1, $2, $3, $4, $5, $6)
                        ON CONFLICT (post_uri, mentioned_did) DO NOTHING
                        """,
                        m.post_uri,
                        m.mentioned_did,
                        m.mentioning_did,
                        m.post_text,
                        m.post_created_at,
                        json.dumps(m.raw_json),
                    )
                    # "INSERT 0 1" means a new row; "INSERT 0 0" means the conflict was skipped
                    if result == "INSERT 0 1":
                        count += 1
        return count
    # ── Collection state ────────────────────────────────────────────────
    async def get_collection_state(
        self, account_did: str, collection_type: str
    ) -> CollectionState | None:
        row = await self._pool.fetchrow(
            """
            SELECT account_did, collection_type, last_post_at
            FROM collection_state
            WHERE account_did = $1 AND collection_type = $2
            """,
            account_did,
            collection_type,
        )
        if not row:
            return None
        return CollectionState(
            account_did=row["account_did"],
            collection_type=row["collection_type"],
            last_post_at=row["last_post_at"],
        )
    async def save_collection_state(
        self, account_did: str, collection_type: str, last_post_at: datetime | None
    ) -> None:
        await self._pool.execute(
            """
            INSERT INTO collection_state (account_did, collection_type, last_post_at, updated_at)
            VALUES ($1, $2, $3, now())
            ON CONFLICT (account_did, collection_type) DO UPDATE SET
                last_post_at = EXCLUDED.last_post_at,
                updated_at   = now()
            """,
            account_did,
            collection_type,
            last_post_at,
        )
    # ── Collection run tracking ─────────────────────────────────────────
    async def start_run(self, accounts_total: int) -> int:
        row = await self._pool.fetchrow(
            """
            INSERT INTO collection_runs (accounts_total)
            VALUES ($1)
            RETURNING id
            """,
            accounts_total,
        )
        return row["id"]
    async def update_run_progress(
        self,
        run_id: int,
        *,
        accounts_done: int | None = None,
        posts_collected: int | None = None,
        mentions_collected: int | None = None,
    ) -> None:
        parts = []
        args: list[Any] = []
        idx = 1
        if accounts_done is not None:
            idx += 1
            parts.append(f"accounts_done = ${idx}")
            args.append(accounts_done)
        if posts_collected is not None:
            idx += 1
            parts.append(f"posts_collected = ${idx}")
            args.append(posts_collected)
        if mentions_collected is not None:
            idx += 1
            parts.append(f"mentions_collected = ${idx}")
            args.append(mentions_collected)
        if not parts:
            return
        sql = f"UPDATE collection_runs SET {', '.join(parts)} WHERE id = $1"
        await self._pool.execute(sql, run_id, *args)
    async def finish_run(
        self, run_id: int, status: str, errors: list[dict] | None = None
    ) -> None:
        await self._pool.execute(
            """
            UPDATE collection_runs SET
                finished_at   = now(),
                status        = $2,
                errors        = $3,
                duration_secs = EXTRACT(EPOCH FROM (now() - started_at))
            WHERE id = $1
            """,
            run_id,
            status,
            json.dumps(errors or []),
        )
    # ── Stats (useful for verification) ─────────────────────────────────
    async def get_stats(self) -> dict[str, int]:
        row = await self._pool.fetchrow(
            """
            SELECT
                (SELECT count(*) FROM accounts WHERE active)  AS accounts,
                (SELECT count(*) FROM posts)                   AS posts,
                (SELECT count(*) FROM mentions)                AS mentions,
                (SELECT count(*) FROM collection_runs)         AS runs
            """
        )
        return dict(row)
--- a/src/models.py
+++ b/src/models.py
@ -0,0 +1,56 @@
 """Data models mirroring the PostgreSQL schema."""
 from __future__ import annotations
 from dataclasses import dataclass, field
 from datetime import datetime
 from typing import Any
@dataclass
 class Account:
    did: str
    handle: str
    display_name: str | None = None
    added_at: datetime | None = None
    last_feed_collected: datetime | None = None
    last_mention_collected: datetime | None = None
    active: bool = True
@dataclass
 class Post:
    uri: str
    cid: str
    author_did: str
    text: str | None
    created_at: datetime | None
    indexed_at: datetime | None
    reply_parent: str | None
    reply_root: str | None
    post_type: str  # "post", "reply", "repost"
    has_media: bool
    has_embed: bool
    like_count: int
    reply_count: int
    repost_count: int
    quote_count: int
    langs: list[str] | None
    raw_json: dict[str, Any] = field(default_factory=dict)
@dataclass
 class Mention:
    post_uri: str
    mentioned_did: str
    mentioning_did: str | None
    post_text: str | None
    post_created_at: datetime | None
    raw_json: dict[str, Any] = field(default_factory=dict)
@dataclass
 class CollectionState:
    account_did: str
    collection_type: str  # "feed" or "mentions"
    last_post_at: datetime | None = None
--- a/src/web/__init__.py
+++ b/src/web/__init__.py
--- a/src/web/app.py
+++ b/src/web/app.py
@ -0,0 +1,67 @@
 """Flask application factory for the Bluesky Collector web UI."""
 from __future__ import annotations
 import os
 from flask import Flask
 from . import db as webdb
 from .helpers import (
    bsky_post_url,
    encode_uri,
    format_dt,
    format_number,
    time_ago,
    truncate,
 )
 def create_app() -> Flask:
    app = Flask(
        __name__,
        template_folder="templates",
    )
    # Dev-only fallback: set SECRET_KEY in the environment for any real deployment
    app.secret_key = os.environ.get("SECRET_KEY", "bluesky-collector-dev-key")
    app.config["DATABASE_URL"] = os.environ.get(
        "DATABASE_URL",
        "postgresql://bluesky:changeme@db:5432/bluesky",
    )
    # Initialize database pool
    webdb.init_pool(app.config["DATABASE_URL"])
    # Register Jinja2 globals/filters
    app.jinja_env.filters["format_dt"] = format_dt
    app.jinja_env.filters["time_ago"] = time_ago
    app.jinja_env.filters["truncate_text"] = truncate
    app.jinja_env.filters["format_number"] = format_number
    app.jinja_env.globals["encode_uri"] = encode_uri
    app.jinja_env.globals["bsky_post_url"] = bsky_post_url
    # Register blueprints
    from .routes.dashboard import bp as dashboard_bp
    from .routes.accounts import bp as accounts_bp
    from .routes.statuses import bp as statuses_bp
    from .routes.mentions import bp as mentions_bp
    from .routes.export import bp as export_bp
    from .routes.analysis import bp as analysis_bp
    app.register_blueprint(dashboard_bp)
    app.register_blueprint(accounts_bp)
    app.register_blueprint(statuses_bp)
    app.register_blueprint(mentions_bp)
    app.register_blueprint(export_bp)
    app.register_blueprint(analysis_bp)
    # Teardown
    @app.teardown_appcontext
    def close_db(exc):
        pass  # Pool is long-lived, closed at shutdown
    import atexit
    atexit.register(webdb.close_pool)
    return app
--- a/src/web/db.py
+++ b/src/web/db.py
@ -0,0 +1,669 @@
 """Synchronous PostgreSQL query layer for the Flask web UI.
 Uses psycopg2 with a simple connection pool. All functions return
 dicts or lists of dicts for easy template rendering.
 """
 from __future__ import annotations
 import os
 from contextlib import contextmanager
 import psycopg2
 import psycopg2.extras
 import psycopg2.pool
 _pool: psycopg2.pool.ThreadedConnectionPool | None = None
 def init_pool(dsn: str | None = None, minconn: int = 1, maxconn: int = 5) -> None:
    """Initialize the connection pool. Called once at app startup."""
    global _pool
    dsn = dsn or os.environ["DATABASE_URL"]
    _pool = psycopg2.pool.ThreadedConnectionPool(minconn, maxconn, dsn)
 def close_pool() -> None:
    global _pool
    if _pool:
        _pool.closeall()
        _pool = None
 @contextmanager
 def get_cursor():
    """Yield a dict cursor from the pool, auto-returning the connection."""
    if _pool is None:
        raise RuntimeError("Connection pool is not initialized; call init_pool() first")
    conn = _pool.getconn()
    try:
        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
            yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        _pool.putconn(conn)
 # ── Dashboard ────────────────────────────────────────────────────────────
 def get_dashboard_stats() -> dict:
    with get_cursor() as cur:
        cur.execute("""
            SELECT
                (SELECT count(*) FROM accounts WHERE active) AS accounts,
                (SELECT count(*) FROM posts) AS posts,
                (SELECT count(*) FROM mentions) AS mentions,
                (SELECT count(*) FROM collection_runs) AS runs
        """)
        return dict(cur.fetchone())
 def get_recent_runs(limit: int = 10) -> list[dict]:
    with get_cursor() as cur:
        cur.execute("""
            SELECT id, started_at, finished_at, status,
                   accounts_total, accounts_done,
                   posts_collected, mentions_collected,
                   errors, duration_secs
            FROM collection_runs
            ORDER BY started_at DESC
            LIMIT %s
        """, (limit,))
        return [dict(r) for r in cur.fetchall()]
 # ── Accounts ─────────────────────────────────────────────────────────────
 ACCOUNT_SORT_COLS = {
    "handle": "a.handle",
    "posts": "post_count",
    "mentions": "mention_count",
    "last_feed": "a.last_feed_collected",
    "last_mention": "a.last_mention_collected",
 }
 def get_accounts(
    search: str | None = None,
    sort: str = "handle",
    direction: str = "asc",
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[dict], int]:
    """Return (accounts_list, total_count)."""
    sort_col = ACCOUNT_SORT_COLS.get(sort, "a.handle")
    dir_sql = "DESC" if direction == "desc" else "ASC"
    where = "WHERE a.active = true"
    params: list = []
    if search:
        where += " AND a.handle ILIKE %s"
        params.append(f"%{search}%")
    with get_cursor() as cur:
        # Total count
        cur.execute(f"SELECT count(*) AS cnt FROM accounts a {where}", params)
        total = cur.fetchone()["cnt"]
        # Paginated results with counts
        cur.execute(f"""
            SELECT a.did, a.handle, a.display_name, a.active,
                   a.last_feed_collected, a.last_mention_collected, a.added_at,
                   (SELECT count(*) FROM posts p WHERE p.author_did = a.did) AS post_count,
                   (SELECT count(*) FROM mentions m WHERE m.mentioned_did = a.did) AS mention_count
            FROM accounts a
            {where}
            ORDER BY {sort_col} {dir_sql} NULLS LAST
            LIMIT %s OFFSET %s
        """, params + [limit, offset])
        rows = [dict(r) for r in cur.fetchall()]
    return rows, total
 def get_account_by_did(did: str) -> dict | None:
    with get_cursor() as cur:
        cur.execute("""
            SELECT a.did, a.handle, a.display_name, a.active,
                   a.last_feed_collected, a.last_mention_collected, a.added_at,
                   (SELECT count(*) FROM posts p WHERE p.author_did = a.did) AS post_count,
                   (SELECT count(*) FROM mentions m WHERE m.mentioned_did = a.did) AS mention_count
            FROM accounts a
            WHERE a.did = %s
        """, (did,))
        row = cur.fetchone()
        return dict(row) if row else None
 # ── Posts / Statuses ─────────────────────────────────────────────────────
 POST_SORT_COLS = {
    "created": "p.created_at",
    "likes": "p.like_count",
    "replies": "p.reply_count",
    "reposts": "p.repost_count",
 }
 def get_posts(
    account_did: str | None = None,
    post_type: str | None = None,
    search: str | None = None,
    sort: str = "created",
    direction: str = "desc",
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[dict], int]:
    """Return (posts_list, total_count)."""
    sort_col = POST_SORT_COLS.get(sort, "p.created_at")
    dir_sql = "DESC" if direction == "desc" else "ASC"
    conditions = []
    params: list = []
    if account_did:
        conditions.append("p.author_did = %s")
        params.append(account_did)
    if post_type:
        conditions.append("p.post_type = %s")
        params.append(post_type)
    if search:
        conditions.append("p.text ILIKE %s")
        params.append(f"%{search}%")
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    with get_cursor() as cur:
        cur.execute(f"SELECT count(*) AS cnt FROM posts p {where}", params)
        total = cur.fetchone()["cnt"]
        cur.execute(f"""
            SELECT p.uri, p.cid, p.author_did, p.text, p.post_type,
                   p.created_at, p.indexed_at, p.collected_at,
                   p.reply_parent, p.reply_root,
                   p.has_media, p.has_embed,
                   p.like_count, p.reply_count, p.repost_count, p.quote_count,
                   p.langs,
                   a.handle AS author_handle
            FROM posts p
            LEFT JOIN accounts a ON a.did = p.author_did
            {where}
            ORDER BY {sort_col} {dir_sql} NULLS LAST
            LIMIT %s OFFSET %s
        """, params + [limit, offset])
        rows = [dict(r) for r in cur.fetchall()]
    return rows, total
 def get_post_by_uri(uri: str) -> dict | None:
    with get_cursor() as cur:
        cur.execute("""
            SELECT p.uri, p.cid, p.author_did, p.text, p.post_type,
                   p.created_at, p.indexed_at, p.collected_at,
                   p.reply_parent, p.reply_root,
                   p.has_media, p.has_embed,
                   p.like_count, p.reply_count, p.repost_count, p.quote_count,
                   p.langs, p.raw_json,
                   a.handle AS author_handle
            FROM posts p
            LEFT JOIN accounts a ON a.did = p.author_did
            WHERE p.uri = %s
        """, (uri,))
        row = cur.fetchone()
        return dict(row) if row else None
 def get_replies_to(uri: str, limit: int = 50) -> list[dict]:
    """Get posts that are replies to the given URI."""
    with get_cursor() as cur:
        cur.execute("""
            SELECT p.uri, p.author_did, p.text, p.post_type,
                   p.created_at, p.like_count, p.reply_count, p.repost_count,
                   a.handle AS author_handle
            FROM posts p
            LEFT JOIN accounts a ON a.did = p.author_did
            WHERE p.reply_parent = %s
            ORDER BY p.created_at ASC
            LIMIT %s
        """, (uri, limit))
        return [dict(r) for r in cur.fetchall()]
 # ── Mentions ─────────────────────────────────────────────────────────────
 def get_mentions(
    mentioned_did: str | None = None,
    search: str | None = None,
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[dict], int]:
    conditions = []
    params: list = []
    if mentioned_did:
        conditions.append("m.mentioned_did = %s")
        params.append(mentioned_did)
    if search:
        conditions.append("m.post_text ILIKE %s")
        params.append(f"%{search}%")
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    with get_cursor() as cur:
        cur.execute(f"SELECT count(*) AS cnt FROM mentions m {where}", params)
        total = cur.fetchone()["cnt"]
        cur.execute(f"""
            SELECT m.id, m.post_uri, m.mentioned_did, m.mentioning_did,
                   m.post_text, m.post_created_at, m.collected_at,
                   a.handle AS mentioned_handle
            FROM mentions m
            LEFT JOIN accounts a ON a.did = m.mentioned_did
            {where}
            ORDER BY m.post_created_at DESC NULLS LAST
            LIMIT %s OFFSET %s
        """, params + [limit, offset])
        rows = [dict(r) for r in cur.fetchall()]
    return rows, total
 # ── Export helpers ────────────────────────────────────────────────────────
 def iter_posts_csv(
    account_did: str | None = None,
    since: str | None = None,
    until: str | None = None,
 ):
    """Generator yielding post rows as dicts for CSV export."""
    conditions = []
    params: list = []
    if account_did:
        conditions.append("p.author_did = %s")
        params.append(account_did)
    if since:
        conditions.append("p.created_at >= %s")
        params.append(since)
    if until:
        conditions.append("p.created_at <= %s")
        params.append(until)
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    with get_cursor() as cur:
        cur.execute(f"""
            SELECT p.uri, p.author_did, a.handle AS author_handle,
                   p.text, p.post_type, p.created_at,
                   p.like_count, p.reply_count, p.repost_count, p.quote_count,
                   p.has_media, p.has_embed, p.reply_parent, p.reply_root
            FROM posts p
            LEFT JOIN accounts a ON a.did = p.author_did
            {where}
            ORDER BY p.created_at DESC
        """, params)
        for row in cur:
            yield dict(row)
 def iter_mentions_csv(
    mentioned_did: str | None = None,
    since: str | None = None,
    until: str | None = None,
 ):
    """Generator yielding mention rows as dicts for CSV export."""
    conditions = []
    params: list = []
    if mentioned_did:
        conditions.append("m.mentioned_did = %s")
        params.append(mentioned_did)
    if since:
        conditions.append("m.post_created_at >= %s")
        params.append(since)
    if until:
        conditions.append("m.post_created_at <= %s")
        params.append(until)
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    with get_cursor() as cur:
        cur.execute(f"""
            SELECT m.post_uri, m.mentioned_did, a.handle AS mentioned_handle,
                   m.mentioning_did, m.post_text, m.post_created_at
            FROM mentions m
            LEFT JOIN accounts a ON a.did = m.mentioned_did
            {where}
            ORDER BY m.post_created_at DESC
        """, params)
        for row in cur:
            yield dict(row)
 def get_accounts_for_select() -> list[dict]:
    """Get a simple list of active accounts for dropdown selectors."""
    with get_cursor() as cur:
        cur.execute("""
            SELECT did, handle FROM accounts
            WHERE active = true
            ORDER BY handle
        """)
        return [dict(r) for r in cur.fetchall()]
 # ── Analysis queries ─────────────────────────────────────────────────────
 TOXICITY_CATEGORIES = [
    "toxic", "threat", "hate_speech", "racism",
    "antisemitism", "islamophobia", "sexism", "homophobia",
    "insult", "dehumanization", "extremism", "ableism",
 ]
 def _check_toxicity_tables() -> bool:
    """Check if the toxicity tables exist (migration applied)."""
    with get_cursor() as cur:
        cur.execute("""
            SELECT EXISTS (
                SELECT FROM information_schema.tables
                WHERE table_name = 'toxicity_scores'
            )
        """)
        return cur.fetchone()["exists"]
 def get_analysis_stats() -> dict:
    """Get overview stats for the analysis dashboard."""
    if not _check_toxicity_tables():
        return {
            "total_scored_posts": 0, "total_scored_mentions": 0,
            "flagged_posts": 0, "flagged_mentions": 0,
            "avg_toxicity_posts": 0, "avg_toxicity_mentions": 0,
            "total_posts": 0, "total_mentions": 0,
        }
    with get_cursor() as cur:
        cur.execute("""
            SELECT
                (SELECT count(*) FROM toxicity_scores) AS total_scored_posts,
                (SELECT count(*) FROM mention_toxicity_scores) AS total_scored_mentions,
                (SELECT count(*) FROM toxicity_scores WHERE flagged) AS flagged_posts,
                (SELECT count(*) FROM mention_toxicity_scores WHERE flagged) AS flagged_mentions,
                (SELECT coalesce(avg(overall), 0) FROM toxicity_scores) AS avg_toxicity_posts,
                (SELECT coalesce(avg(overall), 0) FROM mention_toxicity_scores) AS avg_toxicity_mentions,
                (SELECT count(*) FROM posts WHERE post_type != 'repost' AND text IS NOT NULL AND text != '') AS total_posts,
                (SELECT count(*) FROM mentions WHERE post_text IS NOT NULL AND post_text != '') AS total_mentions
        """)
        return dict(cur.fetchone())
 def get_toxicity_trend(weeks: int = 12) -> list[dict]:
    """Get weekly average toxicity scores for trend chart.
    Returns rows with: week, avg_post_toxicity, avg_mention_toxicity,
    flagged_posts, flagged_mentions, post_count, mention_count.
    """
    if not _check_toxicity_tables():
        return []
    with get_cursor() as cur:
        cur.execute("""
            WITH weeks AS (
                SELECT generate_series(
                    date_trunc('week', now() - %s * interval '1 week'),
                    date_trunc('week', now()),
                    '1 week'::interval
                ) AS week_start
            ),
            post_stats AS (
                SELECT date_trunc('week', p.created_at) AS week_start,
                       avg(ts.overall) AS avg_tox,
                       count(*) FILTER (WHERE ts.flagged) AS flagged_count,
                       count(*) AS total
                FROM toxicity_scores ts
                JOIN posts p ON p.uri = ts.uri
                WHERE p.created_at >= now() - %s * interval '1 week'
                GROUP BY 1
            ),
            mention_stats AS (
                SELECT date_trunc('week', m.post_created_at) AS week_start,
                       avg(mts.overall) AS avg_tox,
                       count(*) FILTER (WHERE mts.flagged) AS flagged_count,
                       count(*) AS total
                FROM mention_toxicity_scores mts
                JOIN mentions m ON m.id = mts.mention_id
                WHERE m.post_created_at >= now() - %s * interval '1 week'
                GROUP BY 1
            )
            SELECT w.week_start AS week,
                   coalesce(ps.avg_tox, 0) AS avg_post_toxicity,
                   coalesce(ms.avg_tox, 0) AS avg_mention_toxicity,
                   coalesce(ps.flagged_count, 0) AS flagged_posts,
                   coalesce(ms.flagged_count, 0) AS flagged_mentions,
                   coalesce(ps.total, 0) AS post_count,
                   coalesce(ms.total, 0) AS mention_count
            FROM weeks w
            LEFT JOIN post_stats ps ON ps.week_start = w.week_start
            LEFT JOIN mention_stats ms ON ms.week_start = w.week_start
            ORDER BY w.week_start
        """, (weeks, weeks, weeks))
        return [dict(r) for r in cur.fetchall()]
 def get_category_averages() -> dict:
    """Get average score for each toxicity category across all scored items."""
    if not _check_toxicity_tables():
        return {cat: 0.0 for cat in TOXICITY_CATEGORIES}
    cols = ", ".join(f"coalesce(avg({cat}), 0) AS {cat}" for cat in TOXICITY_CATEGORIES)
    with get_cursor() as cur:
        cur.execute(f"""
            SELECT {cols}
            FROM (
                SELECT {', '.join(TOXICITY_CATEGORIES)} FROM toxicity_scores
                UNION ALL
                SELECT {', '.join(TOXICITY_CATEGORIES)} FROM mention_toxicity_scores
            ) combined
        """)
        return dict(cur.fetchone())
 def get_recent_analysis_runs(limit: int = 5) -> list[dict]:
    """Get the latest analysis runs."""
    if not _check_toxicity_tables():
        return []
    with get_cursor() as cur:
        cur.execute("""
            SELECT id, started_at, finished_at, status, model,
                   posts_scored, mentions_scored, errors,
                   cost_usd, duration_secs
            FROM analysis_runs
            ORDER BY started_at DESC
            LIMIT %s
        """, (limit,))
        return [dict(r) for r in cur.fetchall()]
 def get_flagged_content(
    content_type: str | None = None,
    category: str | None = None,
    account_did: str | None = None,
    threshold: float = 0.5,
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[dict], int]:
    """Get flagged posts and mentions combined.
    Returns (items, total_count). Each item has:
    item_type ('post', 'reply', or 'mention'), text, author info, scores.
    """
    if not _check_toxicity_tables():
        return [], 0
    # Build the UNION query
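    cat_filter = ""
    params_posts: list = [threshold]
    params_mentions: list = [threshold]
    if category and category in TOXICITY_CATEGORIES:
        # The column name comes from the allow-list; the threshold is bound
        # as a parameter instead of being formatted into the SQL text.
        cat_filter = f"AND {category} >= %s"
        params_posts.append(threshold)
        params_mentions.append(threshold)
    post_conditions = f"WHERE ts.overall >= %s {cat_filter}"
    mention_conditions = f"WHERE mts.overall >= %s {cat_filter}"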
    if account_did:
        post_conditions += " AND p.author_did = %s"
        params_posts.append(account_did)
        mention_conditions += " AND m.mentioned_did = %s"
        params_mentions.append(account_did)
    type_filter_post = ""
    type_filter_mention = ""
    if content_type == "mention":
        type_filter_post = "AND false"  # exclude posts
    elif content_type in ("post", "reply"):
        type_filter_mention = "AND false"  # exclude mentions
        if content_type == "post":
            post_conditions += " AND p.post_type = 'post'"
        elif content_type == "reply":
            post_conditions += " AND p.post_type = 'reply'"
    with get_cursor() as cur:
        # Count
        cur.execute(f"""
            SELECT count(*) AS cnt FROM (
                SELECT 1 FROM toxicity_scores ts
                JOIN posts p ON p.uri = ts.uri
                {post_conditions} {type_filter_post}
                UNION ALL
                SELECT 1 FROM mention_toxicity_scores mts
                JOIN mentions m ON m.id = mts.mention_id
                {mention_conditions} {type_filter_mention}
            ) sub
        """, params_posts + params_mentions)
        total = cur.fetchone()["cnt"]
        # Paginated results
        cur.execute(f"""
            SELECT * FROM (
                SELECT
                    'post' AS source_type,
                    p.post_type AS item_type,
                    p.uri AS item_id,
                    p.text,
                    p.author_did,
                    a.handle AS author_handle,
                    NULL::text AS mentioned_did,
                    NULL::text AS mentioned_handle,
                    p.created_at,
                    ts.overall, ts.toxic, ts.threat, ts.hate_speech,
                    ts.racism, ts.antisemitism, ts.islamophobia,
                    ts.sexism, ts.homophobia, ts.insult, ts.dehumanization,
                    ts.extremism, ts.ableism
                FROM toxicity_scores ts
                JOIN posts p ON p.uri = ts.uri
                LEFT JOIN accounts a ON a.did = p.author_did
                {post_conditions} {type_filter_post}
                UNION ALL
                SELECT
                    'mention' AS source_type,
                    'mention' AS item_type,
                    m.post_uri AS item_id,
                    m.post_text AS text,
                    m.mentioning_did AS author_did,
                    NULL AS author_handle,
                    m.mentioned_did,
                    ma.handle AS mentioned_handle,
                    m.post_created_at AS created_at,
                    mts.overall, mts.toxic, mts.threat, mts.hate_speech,
                    mts.racism, mts.antisemitism, mts.islamophobia,
                    mts.sexism, mts.homophobia, mts.insult, mts.dehumanization,
                    mts.extremism, mts.ableism
                FROM mention_toxicity_scores mts
                JOIN mentions m ON m.id = mts.mention_id
                LEFT JOIN accounts ma ON ma.did = m.mentioned_did
                {mention_conditions} {type_filter_mention}
            ) combined
            ORDER BY overall DESC, created_at DESC
            LIMIT %s OFFSET %s
        """, params_posts + params_mentions + [limit, offset])
        rows = [dict(r) for r in cur.fetchall()]
    # Determine top category for each row
    for row in rows:
        # Treat NULL scores as 0 so max() never compares None with a number
        top_cat = max(TOXICITY_CATEGORIES, key=lambda c: row.get(c) or 0)
        row["top_category"] = top_cat
        row["top_score"] = row.get(top_cat) or 0
    return rows, total
 def get_account_toxicity_summary(
    sort: str = "mention_tox",
    direction: str = "desc",
    limit: int = 50,
    offset: int = 0,
 ) -> tuple[list[dict], int]:
    """Get per-account toxicity summary.
    Returns (accounts, total).
    """
    if not _check_toxicity_tables():
        return [], 0
    sort_cols = {
        "handle": "a.handle",
        "post_tox": "avg_post_tox",
        "mention_tox": "avg_mention_tox",
        "flagged_posts": "flagged_posts",
        "flagged_mentions": "flagged_mentions",
    }
    sort_col = sort_cols.get(sort, "avg_mention_tox")
    dir_sql = "DESC" if direction == "desc" else "ASC"
    with get_cursor() as cur:
        cur.execute("SELECT count(*) AS cnt FROM accounts WHERE active")
        total = cur.fetchone()["cnt"]
        cur.execute(f"""
            SELECT
                a.did, a.handle, a.display_name,
                coalesce(post_agg.avg_tox, 0) AS avg_post_tox,
                coalesce(post_agg.flagged, 0) AS flagged_posts,
                coalesce(post_agg.total, 0) AS scored_posts,
                coalesce(mention_agg.avg_tox, 0) AS avg_mention_tox,
                coalesce(mention_agg.flagged, 0) AS flagged_mentions,
                coalesce(mention_agg.total, 0) AS scored_mentions
            FROM accounts a
            LEFT JOIN (
                SELECT p.author_did,
                       avg(ts.overall) AS avg_tox,
                       count(*) FILTER (WHERE ts.flagged) AS flagged,
                       count(*) AS total
                FROM toxicity_scores ts
                JOIN posts p ON p.uri = ts.uri
                GROUP BY p.author_did
            ) post_agg ON post_agg.author_did = a.did
            LEFT JOIN (
                SELECT m.mentioned_did,
                       avg(mts.overall) AS avg_tox,
                       count(*) FILTER (WHERE mts.flagged) AS flagged,
                       count(*) AS total
                FROM mention_toxicity_scores mts
                JOIN mentions m ON m.id = mts.mention_id
                GROUP BY m.mentioned_did
            ) mention_agg ON mention_agg.mentioned_did = a.did
            WHERE a.active = true
            ORDER BY {sort_col} {dir_sql} NULLS LAST
            LIMIT %s OFFSET %s
        """, (limit, offset))
        rows = [dict(r) for r in cur.fetchall()]
    return rows, total
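
Note on the query-building pattern used throughout db.py: identifiers (sort column, direction) are taken from an allow-list and formatted into the SQL text, while user-supplied values travel as bound parameters. The following is a minimal standalone sketch of that idea, not the module's actual code; `SORT_COLS` and `build_posts_query` are hypothetical names introduced only for illustration, and no database connection is needed to run it.

```python
from typing import Optional

# Hypothetical allow-list, shaped like POST_SORT_COLS in db.py.
SORT_COLS = {"created": "p.created_at", "likes": "p.like_count"}


def build_posts_query(
    sort: str = "created",
    direction: str = "desc",
    search: Optional[str] = None,
    limit: int = 50,
    offset: int = 0,
) -> tuple:
    """Return (sql, params); only allow-listed identifiers reach the SQL text."""
    sort_col = SORT_COLS.get(sort, "p.created_at")  # unknown keys fall back
    dir_sql = "DESC" if direction == "desc" else "ASC"
    conditions = []
    params = []
    if search:
        # The search term is bound as a parameter, never formatted in.
        conditions.append("p.text ILIKE %s")
        params.append(f"%{search}%")
    where = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    sql = (
        f"SELECT p.uri FROM posts p {where} "
        f"ORDER BY {sort_col} {dir_sql} NULLS LAST LIMIT %s OFFSET %s"
    )
    return sql, params + [limit, offset]


# A hostile sort key is ignored because it is not in the allow-list.
sql, params = build_posts_query(sort="likes; DROP TABLE posts", search="cats")
```

The resulting pair can be handed to `cursor.execute(sql, params)` unchanged, which is how the functions above pass their queries to psycopg2.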
--- a/src/web/helpers.py
+++ b/src/web/helpers.py
@ -0,0 +1,109 @@
 """Utility functions for the web UI: URI encoding, date formatting, link building."""
 from __future__ import annotations
 import base64
 import re
 from datetime import datetime, timezone
 # ── URI encoding for route parameters ────────────────────────────────────
 # AT URIs look like: at://did:plc:xxx/app.bsky.feed.post/rkey
 # They contain / and : which break URL routing, so we base64url-encode them.
 def encode_uri(uri: str) -> str:
    """Base64url-encode an AT URI for use in URL paths."""
    return base64.urlsafe_b64encode(uri.encode()).decode().rstrip("=")
 def decode_uri(encoded: str) -> str:
    """Decode a base64url-encoded AT URI."""
    # Add back padding
    padding = 4 - len(encoded) % 4
    if padding != 4:
        encoded += "=" * padding
    return base64.urlsafe_b64decode(encoded.encode()).decode()
 # ── Bluesky link construction ────────────────────────────────────────────
 _AT_URI_RE = re.compile(r"at://([^/]+)/app\.bsky\.feed\.post/(.+)")
 def bsky_post_url(uri: str, handle: str | None = None) -> str | None:
    """Build a bsky.app URL from an AT URI.
    If handle is provided, uses it for a nicer URL. Otherwise uses the DID.
    Returns None if the URI doesn't match expected format.
    """
    m = _AT_URI_RE.match(uri)
    if not m:
        return None
    did_or_handle = handle or m.group(1)
    rkey = m.group(2)
    return f"https://bsky.app/profile/{did_or_handle}/post/{rkey}"
 def extract_rkey(uri: str) -> str | None:
    """Extract the record key from an AT URI."""
    m = _AT_URI_RE.match(uri)
    return m.group(2) if m else None
 # ── Date/time formatting ─────────────────────────────────────────────────
 def format_dt(dt: datetime | None, fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Format a datetime for display, returns '—' if None."""
    if dt is None:
        return "\u2014"
    return dt.strftime(fmt)
 def time_ago(dt: datetime | None) -> str:
    """Return a human-readable 'time ago' string."""
    if dt is None:
        return "\u2014"
    now = datetime.now(timezone.utc)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    delta = now - dt
    seconds = int(delta.total_seconds())
    if seconds < 60:
        return "just now"
    elif seconds < 3600:
        m = seconds // 60
        return f"{m}m ago"
    elif seconds < 86400:
        h = seconds // 3600
        return f"{h}h ago"
    elif seconds < 604800:
        d = seconds // 86400
        return f"{d}d ago"
    else:
        return dt.strftime("%b %d, %Y")
 # ── Text helpers ─────────────────────────────────────────────────────────
 def truncate(text: str | None, length: int = 200) -> str:
    """Truncate text to a max length, adding ellipsis if needed."""
    if not text:
        return ""
    if len(text) <= length:
        return text
    return text[:length].rsplit(" ", 1)[0] + "\u2026"
 def format_number(n: int | None) -> str:
    """Format a number with K/M suffixes for display."""
    if n is None:
        return "0"
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}K"
    return str(n)
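
Note on the URI helpers: the base64url round trip is what lets an AT URI sit in a single URL path segment. A minimal standalone sketch follows, reproducing the two helpers so it runs on its own; the padding arithmetic is written in an equivalent shorter form, and the sample URI and DID are made up.

```python
import base64


def encode_uri(uri: str) -> str:
    """Base64url-encode an AT URI and drop the '=' padding."""
    return base64.urlsafe_b64encode(uri.encode()).decode().rstrip("=")


def decode_uri(encoded: str) -> str:
    """Restore the padding, then decode back to the AT URI."""
    encoded += "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded.encode()).decode()


# Made-up example URI; real DIDs and record keys will differ.
uri = "at://did:plc:example123/app.bsky.feed.post/3kabcxyz"
token = encode_uri(uri)
restored = decode_uri(token)
```

Because the base64url alphabet contains only letters, digits, `-` and `_`, the token is safe to embed in a route such as a hypothetical `/statuses/<token>`.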
--- a/src/web/routes/__init__.py
+++ b/src/web/routes/__init__.py
--- a/src/web/routes/accounts.py
+++ b/src/web/routes/accounts.py
@ -0,0 +1,53 @@
 """Flask blueprint for the accounts listing page."""
 from __future__ import annotations
 from flask import Blueprint, render_template, request
 from ..db import get_accounts
 bp = Blueprint("accounts", __name__, url_prefix="/accounts")
 @bp.route("/")
 def index():
    """List tracked accounts with search, sorting, and pagination."""
    # Query parameters
    search = request.args.get("search", "").strip() or None
    sort = request.args.get("sort", "handle")
    direction = request.args.get("dir", "asc")
    page = max(1, request.args.get("page", 1, type=int))
    per_page = 50
    # Validate sort column
    allowed_sorts = {"handle", "posts", "mentions", "last_feed", "last_mention"}
    if sort not in allowed_sorts:
        sort = "handle"
    # Validate direction
    if direction not in ("asc", "desc"):
        direction = "asc"
    accounts, total = get_accounts(
        search=search,
        sort=sort,
        direction=direction,
        limit=per_page,
        offset=(page - 1) * per_page,
    )
    total_pages = max(1, (total + per_page - 1) // per_page)
    # Clamp page to valid range
    if page > total_pages:
        page = total_pages
    return render_template(
        "accounts.html",
        accounts=accounts,
        total=total,
        page=page,
        total_pages=total_pages,
        search=search or "",
        sort=sort,
        direction=direction,
    )
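
Note on pagination: the route above clamps `page` only after the rows have been fetched, so an out-of-range page number renders an empty list under the last page's number. One possible alternative, sketched below, is to clamp before computing the offset. This assumes the total is known before the row query runs (for example from a separate count query), which is not how `get_accounts` is currently structured; `paginate` is a hypothetical helper.

```python
def paginate(total: int, page: int, per_page: int = 50) -> tuple:
    """Return (page, total_pages, offset) with page clamped into range."""
    # Ceiling division, with at least one page even when there are no rows.
    total_pages = max(1, (total + per_page - 1) // per_page)
    page = min(max(1, page), total_pages)
    return page, total_pages, (page - 1) * per_page


# 120 rows at 50 per page gives 3 pages; page 99 is clamped to the last one.
page, total_pages, offset = paginate(total=120, page=99)
```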
--- a/src/web/routes/analysis.py
+++ b/src/web/routes/analysis.py
@ -0,0 +1,134 @@
 """Analysis dashboard routes: toxicity overview, flagged content, account breakdown."""
 from __future__ import annotations
 import json
 from flask import Blueprint, render_template, request
 from ..db import (
    TOXICITY_CATEGORIES,
    get_account_toxicity_summary,
    get_accounts_for_select,
    get_analysis_stats,
    get_category_averages,
    get_flagged_content,
    get_recent_analysis_runs,
    get_toxicity_trend,
 )
 bp = Blueprint("analysis", __name__, url_prefix="/analysis")
 @bp.route("/")
 def index():
    stats = get_analysis_stats()
    trend = get_toxicity_trend(weeks=12)
    categories = get_category_averages()
    runs = get_recent_analysis_runs(limit=5)
    # Prepare chart data as JSON for Chart.js
    trend_json = json.dumps([
        {
            "week": r["week"].strftime("%Y-%m-%d") if r["week"] else "",
            "avg_post_toxicity": round(float(r["avg_post_toxicity"]), 4),
            "avg_mention_toxicity": round(float(r["avg_mention_toxicity"]), 4),
            "flagged_posts": int(r["flagged_posts"]),
            "flagged_mentions": int(r["flagged_mentions"]),
        }
        for r in trend
    ])
    categories_json = json.dumps({
        k: round(float(v), 4) for k, v in categories.items()
    })
    return render_template(
        "analysis.html",
        stats=stats,
        trend_json=trend_json,
        categories_json=categories_json,
        categories=TOXICITY_CATEGORIES,
        runs=runs,
    )
 @bp.route("/flagged")
 def flagged():
    content_type = request.args.get("type") or None
    category = request.args.get("category") or None
    account_did = request.args.get("account") or None
    threshold = request.args.get("threshold", 0.5, type=float)
    page = max(1, request.args.get("page", 1, type=int))
    per_page = 50
    items, total = get_flagged_content(
        content_type=content_type,
        category=category,
        account_did=account_did,
        threshold=threshold,
        limit=per_page,
        offset=(page - 1) * per_page,
    )
    total_pages = max(1, (total + per_page - 1) // per_page)
    accounts = get_accounts_for_select()
    return render_template(
        "flagged.html",
        items=items,
        total=total,
        page=page,
        total_pages=total_pages,
        accounts=accounts,
        categories=TOXICITY_CATEGORIES,
        content_type=content_type or "",
        category=category or "",
        account_did=account_did or "",
        threshold=threshold,
    )
 @bp.route("/accounts")
 def accounts():
    sort = request.args.get("sort", "mention_tox")
    direction = request.args.get("dir", "desc")
    page = max(1, request.args.get("page", 1, type=int))
    per_page = 50
    # Validate
    valid_sorts = {"handle", "post_tox", "mention_tox", "flagged_posts", "flagged_mentions"}
    if sort not in valid_sorts:
        sort = "mention_tox"
    if direction not in ("asc", "desc"):
        direction = "desc"
    rows, total = get_account_toxicity_summary(
        sort=sort, direction=direction,
        limit=per_page, offset=(page - 1) * per_page,
    )
    total_pages = max(1, (total + per_page - 1) // per_page)
    # Top 20 most-targeted for bar chart
    top_targeted, _ = get_account_toxicity_summary(
        sort="mention_tox", direction="desc", limit=20, offset=0,
    )
    top_targeted_json = json.dumps([
        {
            "handle": r["handle"],
            "avg_mention_tox": round(float(r["avg_mention_tox"]), 4),
            "flagged_mentions": int(r["flagged_mentions"]),
        }
        for r in top_targeted
        if float(r["avg_mention_tox"]) > 0
    ])
    return render_template(
        "account_toxicity.html",
        accounts=rows,
        total=total,
        page=page,
        total_pages=total_pages,
        sort=sort,
        direction=direction,
        top_targeted_json=top_targeted_json,
    )
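
Note on the chart payloads: the explicit `float()`, `int()` and `strftime()` conversions matter because `json.dumps` cannot serialize `Decimal` or `datetime` values directly. The sketch below shows the same conversion on a hand-written row. It assumes the driver returns `Decimal` for the averaged column, which depends on the column type; `trend_to_json` and the sample values are made up for illustration.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal

# Made-up row, shaped like one result of get_toxicity_trend().
rows = [
    {
        "week": datetime(2026, 2, 2, tzinfo=timezone.utc),
        "avg_post_toxicity": Decimal("0.123456"),
        "flagged_posts": 3,
    }
]


def trend_to_json(rows: list) -> str:
    """Convert DB-native types to JSON-safe ones before serializing."""
    return json.dumps([
        {
            "week": r["week"].strftime("%Y-%m-%d") if r["week"] else "",
            "avg_post_toxicity": round(float(r["avg_post_toxicity"]), 4),
            "flagged_posts": int(r["flagged_posts"]),
        }
        for r in rows
    ])


payload = trend_to_json(rows)
```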
--- a/src/web/routes/dashboard.py
+++ b/src/web/routes/dashboard.py
@ -0,0 +1,12 @@
 """Dashboard route: summary stats and recent runs."""
 from flask import Blueprint, render_template
 from ..db import get_dashboard_stats, get_recent_runs
 bp = Blueprint("dashboard", __name__)
@bp.route("/")
 def index():
    stats = get_dashboard_stats()
    runs = get_recent_runs(limit=10)
    return render_template("dashboard.html", stats=stats, runs=runs)
--- a/src/web/routes/export.py
+++ b/src/web/routes/export.py
@ -0,0 +1,91 @@
 """Export routes: CSV download for posts and mentions."""
 from __future__ import annotations
 import csv
 import io
 from datetime import datetime, timezone
 from flask import Blueprint, Response, render_template, request, stream_with_context
 from ..db import get_accounts_for_select, iter_mentions_csv, iter_posts_csv
 bp = Blueprint("export", __name__, url_prefix="/export")
@bp.route("/")
 def index():
    accounts = get_accounts_for_select()
    return render_template("export.html", accounts=accounts)
@bp.route("/posts.csv")
 def posts_csv():
    account_did = request.args.get("account") or None
    since = request.args.get("since") or None
    until = request.args.get("until") or None
    def generate():
        output = io.StringIO()
        writer = csv.writer(output)
        # Header row
        header = [
            "uri", "author_did", "author_handle", "text", "post_type",
            "created_at", "like_count", "reply_count", "repost_count",
            "quote_count", "has_media", "has_embed", "reply_parent", "reply_root",
        ]
        writer.writerow(header)
        yield output.getvalue()
        # Reset the buffer so each chunk carries only the rows written since the last yield
        output.seek(0)
        output.truncate(0)
        for row in iter_posts_csv(account_did=account_did, since=since, until=until):
            writer.writerow([row.get(col, "") for col in header])
            yield output.getvalue()
            output.seek(0)
            output.truncate(0)
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    filename = f"bluesky_posts_{timestamp}.csv"
    return Response(
        stream_with_context(generate()),
        mimetype="text/csv",
        headers={"Content-Disposition": f"attachment; filename={filename}"},
    )
@bp.route("/mentions.csv")
 def mentions_csv():
    mentioned_did = request.args.get("account") or None
    since = request.args.get("since") or None
    until = request.args.get("until") or None
    def generate():
        output = io.StringIO()
        writer = csv.writer(output)
        header = [
            "post_uri", "mentioned_did", "mentioned_handle",
            "mentioning_did", "post_text", "post_created_at",
        ]
        writer.writerow(header)
        yield output.getvalue()
        # Reset the buffer so each chunk carries only the rows written since the last yield
        output.seek(0)
        output.truncate(0)
        for row in iter_mentions_csv(mentioned_did=mentioned_did, since=since, until=until):
            writer.writerow([row.get(col, "") for col in header])
            yield output.getvalue()
            output.seek(0)
            output.truncate(0)
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    filename = f"bluesky_mentions_{timestamp}.csv"
    return Response(
        stream_with_context(generate()),
        mimetype="text/csv",
        headers={"Content-Disposition": f"attachment; filename={filename}"},
    )
--- a/src/web/routes/mentions.py
+++ b/src/web/routes/mentions.py
@ -0,0 +1,26 @@
 """Mention routes: paginated, searchable list of mentions."""
 from flask import Blueprint, render_template, request
 from ..db import get_mentions, get_accounts_for_select
 bp = Blueprint("mentions", __name__, url_prefix="/mentions")
@bp.route("/")
 def index():
    mentioned_did = request.args.get("account") or None
    search = request.args.get("search", "").strip() or None
    page = max(1, request.args.get("page", 1, type=int))
    per_page = 50
    mentions, total = get_mentions(
        mentioned_did=mentioned_did, search=search,
        limit=per_page, offset=(page - 1) * per_page,
    )
    total_pages = max(1, (total + per_page - 1) // per_page)
    accounts = get_accounts_for_select()
    return render_template(
        "mentions.html",
        mentions=mentions, total=total,
        page=page, total_pages=total_pages,
        accounts=accounts,
        mentioned_did=mentioned_did or "", search=search or "",
    )
--- a/src/web/routes/statuses.py
+++ b/src/web/routes/statuses.py
@ -0,0 +1,45 @@
 """Status routes: paginated post list and single-post detail view."""
 from flask import Blueprint, render_template, request, abort
 from ..db import get_posts, get_post_by_uri, get_replies_to, get_accounts_for_select
 from ..helpers import decode_uri
 bp = Blueprint("statuses", __name__, url_prefix="/statuses")
@bp.route("/")
 def index():
    account_did = request.args.get("account") or None
    post_type = request.args.get("type") or None
    search = request.args.get("search", "").strip() or None
    sort = request.args.get("sort", "created")
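    direction = request.args.get("dir", "desc")
    # Accept only known directions before passing the value to the query layer
    if direction not in ("asc", "desc"):
        direction = "desc"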
    page = max(1, request.args.get("page", 1, type=int))
    per_page = 50
    posts, total = get_posts(
        account_did=account_did, post_type=post_type, search=search,
        sort=sort, direction=direction,
        limit=per_page, offset=(page - 1) * per_page,
    )
    total_pages = max(1, (total + per_page - 1) // per_page)
    accounts = get_accounts_for_select()
    return render_template(
        "statuses.html",
        posts=posts, total=total,
        page=page, total_pages=total_pages,
        accounts=accounts,
        account_did=account_did or "", post_type=post_type or "",
        search=search or "", sort=sort, direction=direction,
    )
@bp.route("/<encoded_uri>")
 def detail(encoded_uri):
    uri = decode_uri(encoded_uri)
    post = get_post_by_uri(uri)
    if not post:
        abort(404)
    replies = get_replies_to(uri)
    # Get parent post if this is a reply
    parent = None
    if post.get("reply_parent"):
        parent = get_post_by_uri(post["reply_parent"])
    return render_template("status_detail.html", post=post, replies=replies, parent=parent)
--- a/src/web/templates/account_toxicity.html
+++ b/src/web/templates/account_toxicity.html
@ -0,0 +1,621 @@
 {% extends "base.html" %}
 {% block title %}Account Toxicity Analysis{% endblock %}
 {% macro sort_header(col, label) %}
  {% set new_dir = 'desc' if (sort == col and direction == 'asc') else 'asc' %}
  <a href="{{ url_for('analysis.accounts', sort=col, dir=new_dir, page=1) }}" class="sort-link{% if sort == col %} active{% endif %}">
    {{ label }}
    {% if sort == col %}
      <span class="sort-arrow">{% if direction == 'asc' %}&#9650;{% else %}&#9660;{% endif %}</span>
    {% endif %}
  </a>
 {% endmacro %}
 {% block content %}
 <div class="account-toxicity-container">
  <!-- Page Header -->
  <div class="page-header">
    <div>
      <h1>Account Toxicity Analysis</h1>
      <p class="page-subtitle">Toxicity metrics across monitored accounts</p>
    </div>
  </div>
  <!-- Chart Section -->
  <div class="chart-section">
    <div class="chart-card">
      <h2 class="chart-title">Most Targeted Accounts</h2>
      <p class="chart-subtitle">Average mention toxicity for top 20 most-targeted accounts</p>
      <div class="chart-container">
        <canvas id="toxicity-chart"></canvas>
      </div>
    </div>
  </div>
  <!-- Table Section -->
  <div class="table-section">
    <h2 class="section-title">Account Details</h2>
    {% if accounts %}
    <div class="table-wrapper">
      <table class="accounts-table">
        <thead>
          <tr>
            <th>{{ sort_header('handle', 'Account') }}</th>
            <th>{{ sort_header('post_tox', 'Avg Post Toxicity') }}</th>
            <th>{{ sort_header('flagged_posts', 'Flagged Posts') }}</th>
            <th>{{ sort_header('mention_tox', 'Avg Mention Toxicity') }}</th>
            <th>{{ sort_header('flagged_mentions', 'Flagged Mentions') }}</th>
          </tr>
        </thead>
        <tbody>
          {% for account in accounts %}
          <tr class="account-row">
            <!-- Account Name -->
            <td class="col-account">
              <div class="account-info">
                <a href="https://bsky.app/profile/{{ account.handle }}" target="_blank" rel="noopener" class="account-handle">
                  @{{ account.handle }}
                </a>
                {% if account.display_name %}
                <div class="account-display-name">{{ account.display_name }}</div>
                {% endif %}
              </div>
            </td>
            <!-- Avg Post Toxicity -->
            <td class="col-score">
              <div class="score-bar-container">
                {% set post_pct = (account.avg_post_tox * 100) | int %}
                {% if account.avg_post_tox < 0.3 %}
                  {% set bar_class = 'score-bar-low' %}
                {% elif account.avg_post_tox < 0.6 %}
                  {% set bar_class = 'score-bar-medium' %}
                {% else %}
                  {% set bar_class = 'score-bar-high' %}
                {% endif %}
                <div class="score-bar {{ bar_class }}" style="width: {{ post_pct }}%"></div>
                <span class="score-number">{{ "%.2f" | format(account.avg_post_tox) }}</span>
              </div>
            </td>
            <!-- Flagged Posts Count -->
            <td class="col-count">
              <span class="count-badge">
                {{ account.flagged_posts | format_number }}
                <span class="count-total">/ {{ account.scored_posts | format_number }}</span>
              </span>
            </td>
            <!-- Avg Mention Toxicity -->
            <td class="col-score">
              <div class="score-bar-container">
                {% set mention_pct = (account.avg_mention_tox * 100) | int %}
                {% if account.avg_mention_tox < 0.3 %}
                  {% set bar_class = 'score-bar-low' %}
                {% elif account.avg_mention_tox < 0.6 %}
                  {% set bar_class = 'score-bar-medium' %}
                {% else %}
                  {% set bar_class = 'score-bar-high' %}
                {% endif %}
                <div class="score-bar {{ bar_class }}" style="width: {{ mention_pct }}%"></div>
                <span class="score-number">{{ "%.2f" | format(account.avg_mention_tox) }}</span>
              </div>
            </td>
            <!-- Flagged Mentions Count -->
            <td class="col-count">
              <span class="count-badge">
                {{ account.flagged_mentions | format_number }}
                <span class="count-total">/ {{ account.scored_mentions | format_number }}</span>
              </span>
            </td>
          </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
    <!-- Pagination -->
    {% if total_pages > 1 %}
    <div class="pagination">
      {% if page > 1 %}
      <a href="{{ url_for('analysis.accounts', page=1, sort=sort, dir=direction) }}" class="btn-pagination">First</a>
      <a href="{{ url_for('analysis.accounts', page=page-1, sort=sort, dir=direction) }}" class="btn-pagination">Previous</a>
      {% endif %}
      <span class="pagination-info">Page {{ page }} of {{ total_pages }}</span>
      {% if page < total_pages %}
      <a href="{{ url_for('analysis.accounts', page=page+1, sort=sort, dir=direction) }}" class="btn-pagination">Next</a>
      <a href="{{ url_for('analysis.accounts', page=total_pages, sort=sort, dir=direction) }}" class="btn-pagination">Last</a>
      {% endif %}
    </div>
    {% endif %}
    {% else %}
    <!-- Empty State -->
    <div class="empty-state">
      <p class="empty-icon">∅</p>
      <p class="empty-text">No accounts found</p>
      <p class="empty-subtext">Start monitoring accounts to see toxicity analysis</p>
    </div>
    {% endif %}
  </div>
 </div>
 {% endblock %}
 {% block extra_css %}
 <style>
  :root {
    --dark-bg: #1a1a2e;
    --dark-card: #16213e;
    --dark-nav: #0f3460;
    --dark-text: #e0e0e0;
    --accent-primary: #00b4d8;
    --tox-low: #2ecc71;
    --tox-medium: #f39c12;
    --tox-high: #e74c3c;
  }
  .account-toxicity-container {
    padding: 2rem;
    max-width: 1400px;
    margin: 0 auto;
  }
  /* Page Header */
  .page-header {
    margin-bottom: 2.5rem;
  }
  .page-header h1 {
    font-size: 2rem;
    font-weight: 700;
    color: var(--dark-text);
    margin: 0 0 0.5rem 0;
  }
  .page-subtitle {
    font-size: 1rem;
    color: rgba(255, 255, 255, 0.6);
    margin: 0;
  }
  /* Chart Section */
  .chart-section {
    margin-bottom: 3rem;
  }
  .chart-card {
    background: var(--dark-card);
    border: 1px solid rgba(255, 255, 255, 0.1);
    border-radius: 0.5rem;
    padding: 2rem;
  }
  .chart-title {
    font-size: 1.3rem;
    font-weight: 600;
    color: var(--dark-text);
    margin: 0 0 0.5rem 0;
  }
  .chart-subtitle {
    font-size: 0.9rem;
    color: rgba(255, 255, 255, 0.5);
    margin: 0 0 1.5rem 0;
  }
  .chart-container {
    position: relative;
    height: 400px;
    width: 100%;
  }
  /* Table Section */
  .table-section {
    margin-top: 2rem;
  }
  .section-title {
    font-size: 1.3rem;
    font-weight: 600;
    color: var(--dark-text);
    margin: 0 0 1.5rem 0;
  }
  /* Table Wrapper */
  .table-wrapper {
    background: var(--dark-card);
    border: 1px solid rgba(255, 255, 255, 0.1);
    border-radius: 0.5rem;
    overflow-x: auto;
    margin-bottom: 2rem;
  }
  /* Table Styles */
  .accounts-table {
    width: 100%;
    border-collapse: collapse;
    font-size: 0.9rem;
  }
  .accounts-table thead {
    background: rgba(0, 0, 0, 0.3);
    border-bottom: 2px solid rgba(255, 255, 255, 0.1);
  }
  .accounts-table th {
    padding: 1rem;
    text-align: left;
    font-weight: 600;
    color: var(--dark-text);
    white-space: nowrap;
  }
  .accounts-table td {
    padding: 1rem;
    border-bottom: 1px solid rgba(255, 255, 255, 0.05);
    color: var(--dark-text);
  }
  .accounts-table tbody tr:hover {
    background: rgba(0, 180, 216, 0.05);
  }
  /* Column Styles */
  .col-account {
    min-width: 250px;
  }
  .col-score {
    width: 180px;
  }
  .col-count {
    width: 140px;
    text-align: center;
  }
  /* Account Info */
  .account-info {
    display: flex;
    flex-direction: column;
    gap: 0.25rem;
  }
  .account-handle {
    color: var(--accent-primary);
    text-decoration: none;
    font-weight: 600;
    transition: color 0.2s ease;
  }
  .account-handle:hover {
    color: rgba(0, 180, 216, 0.8);
    text-decoration: underline;
  }
  .account-display-name {
    font-size: 0.85rem;
    color: rgba(255, 255, 255, 0.6);
  }
  /* Sort Header */
  .sort-link {
    color: var(--dark-text);
    text-decoration: none;
    display: inline-flex;
    align-items: center;
    gap: 0.5rem;
    transition: color 0.2s ease;
    cursor: pointer;
  }
  .sort-link:hover {
    color: var(--accent-primary);
  }
  .sort-link.active {
    color: var(--accent-primary);
    font-weight: 700;
  }
  .sort-arrow {
    font-size: 0.75rem;
    display: inline-block;
  }
  /* Score Bar */
  .score-bar-container {
    position: relative;
    height: 30px;
    background: rgba(255, 255, 255, 0.05);
    border-radius: 0.25rem;
    overflow: hidden;
    display: flex;
    align-items: center;
    padding: 0 0.5rem;
  }
  .score-bar {
    position: absolute;
    height: 100%;
    left: 0;
    top: 0;
    transition: width 0.3s ease;
  }
  .score-bar-low {
    background: linear-gradient(90deg, rgba(46, 204, 113, 0.3), rgba(46, 204, 113, 0.5));
  }
  .score-bar-medium {
    background: linear-gradient(90deg, rgba(243, 156, 18, 0.3), rgba(243, 156, 18, 0.5));
  }
  .score-bar-high {
    background: linear-gradient(90deg, rgba(231, 76, 60, 0.3), rgba(231, 76, 60, 0.5));
  }
  .score-number {
    position: relative;
    z-index: 1;
    font-weight: 600;
    color: var(--dark-text);
    font-size: 0.85rem;
  }
  /* Count Badge */
  .count-badge {
    display: inline-flex;
    align-items: center;
    gap: 0.25rem;
    background: rgba(0, 180, 216, 0.1);
    color: var(--accent-primary);
    padding: 0.35rem 0.75rem;
    border-radius: 0.25rem;
    font-weight: 600;
    font-size: 0.85rem;
  }
  .count-total {
    color: rgba(255, 255, 255, 0.5);
    font-weight: 400;
  }
  /* Empty State */
  .empty-state {
    text-align: center;
    padding: 4rem 2rem;
    background: var(--dark-card);
    border: 2px dashed rgba(255, 255, 255, 0.2);
    border-radius: 0.5rem;
  }
  .empty-icon {
    font-size: 3rem;
    color: rgba(255, 255, 255, 0.2);
    margin: 0 0 1rem 0;
  }
  .empty-text {
    font-size: 1.2rem;
    font-weight: 600;
    color: var(--dark-text);
    margin: 0 0 0.5rem 0;
  }
  .empty-subtext {
    color: rgba(255, 255, 255, 0.5);
    margin: 0;
  }
  /* Pagination */
  .pagination {
    display: flex;
    justify-content: center;
    align-items: center;
    gap: 1rem;
    margin-top: 2rem;
  }
  .pagination-info {
    color: var(--dark-text);
    font-weight: 500;
    min-width: 150px;
    text-align: center;
  }
  .btn-pagination {
    background: var(--dark-card);
    color: var(--accent-primary);
    border: 1px solid var(--accent-primary);
    padding: 0.5rem 1rem;
    border-radius: 0.375rem;
    text-decoration: none;
    font-weight: 600;
    font-size: 0.9rem;
    transition: all 0.2s ease;
    cursor: pointer;
  }
  .btn-pagination:hover {
    background: var(--accent-primary);
    color: var(--dark-bg);
  }
  .btn-pagination:active {
    transform: scale(0.98);
  }
  /* Chart.js Custom Styling */
  canvas {
    max-height: 400px;
  }
  /* Responsive */
  @media (max-width: 1024px) {
    .col-account {
      min-width: 200px;
    }
    .col-score {
      width: 150px;
    }
    .col-count {
      width: 120px;
    }
  }
  @media (max-width: 768px) {
    .account-toxicity-container {
      padding: 1rem;
    }
    .page-header h1 {
      font-size: 1.5rem;
    }
    .chart-container {
      height: 300px;
    }
    .chart-card {
      padding: 1.5rem;
    }
    .accounts-table {
      font-size: 0.8rem;
    }
    .accounts-table th,
    .accounts-table td {
      padding: 0.75rem 0.5rem;
    }
    .col-account {
      min-width: 160px;
    }
    .col-score,
    .col-count {
      width: auto;
      min-width: 100px;
    }
    .section-title {
      font-size: 1.1rem;
    }
  }
  @media (max-width: 480px) {
    .page-header h1 {
      font-size: 1.2rem;
    }
    .accounts-table {
      font-size: 0.75rem;
    }
    .accounts-table th,
    .accounts-table td {
      padding: 0.5rem;
    }
    .pagination {
      flex-wrap: wrap;
      gap: 0.5rem;
    }
    .pagination-info {
      width: 100%;
      order: 3;
    }
  }
 </style>
 <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.7/chart.umd.min.js"></script>
 <script>
  document.addEventListener('DOMContentLoaded', function() {
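    // Emit the JSON as an HTML-safe JS string literal via tojson, then parse it,
    // instead of injecting the raw payload with "| safe"
    const chartData = JSON.parse({{ top_targeted_json | tojson }});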
    if (chartData && chartData.length > 0) {
      const labels = chartData.map(item => '@' + item.handle);
      const scores = chartData.map(item => item.avg_mention_tox);
      const flagged = chartData.map(item => item.flagged_mentions);
      const ctx = document.getElementById('toxicity-chart');
      if (ctx) {
        new Chart(ctx, {
          type: 'bar',
          data: {
            labels: labels,
            datasets: [{
              label: 'Avg Mention Toxicity',
              data: scores,
              backgroundColor: scores.map(score => {
                if (score < 0.3) return 'rgba(46, 204, 113, 0.7)';
                if (score < 0.6) return 'rgba(243, 156, 18, 0.7)';
                return 'rgba(231, 76, 60, 0.7)';
              }),
              borderColor: scores.map(score => {
                if (score < 0.3) return 'rgba(46, 204, 113, 1)';
                if (score < 0.6) return 'rgba(243, 156, 18, 1)';
                return 'rgba(231, 76, 60, 1)';
              }),
              borderWidth: 1,
              borderRadius: 4
            }]
          },
          options: {
            indexAxis: 'y',
            responsive: true,
            maintainAspectRatio: false,
            plugins: {
              legend: {
                display: false
              },
              tooltip: {
                callbacks: {
                  afterLabel: function(context) {
                    return 'Flagged: ' + flagged[context.dataIndex];
                  }
                }
              }
            },
            scales: {
              x: {
                beginAtZero: true,
                max: 1,
                ticks: {
                  color: 'rgba(224, 224, 224, 0.7)',
                  callback: function(value) {
                    return (value * 100).toFixed(0) + '%';
                  }
                },
                grid: {
                  color: 'rgba(255, 255, 255, 0.1)'
                },
                // Chart.js 4 renamed grid.drawBorder to border.display
                border: {
                  display: false
                }
              },
              y: {
                ticks: {
                  color: 'rgba(224, 224, 224, 0.9)'
                },
                grid: {
                  display: false
                },
                // Chart.js 4 renamed grid.drawBorder to border.display
                border: {
                  display: false
                }
              }
            }
          }
        });
      }
    }
  });
 </script>
 {% endblock %}
--- a/src/web/templates/accounts.html
+++ b/src/web/templates/accounts.html
@ -0,0 +1,349 @@
 {% extends "base.html" %}
 {% block title %}Accounts{% endblock %}
 {% block content %}
 <div class="page-header">
  <h1>Tracked Accounts</h1>
  <span class="badge">{{ total | format_number }} total</span>
 </div>
 {# ── Search bar ──────────────────────────────────────────────────────── #}
 <form method="get" action="{{ url_for('accounts.index') }}" class="search-form">
  {# Preserve current sort parameters #}
  <input type="hidden" name="sort" value="{{ sort }}">
  <input type="hidden" name="dir" value="{{ direction }}">
  <div class="search-row">
    <input
      type="text"
      name="search"
      value="{{ search }}"
      placeholder="Search by handle..."
      class="search-input"
      aria-label="Search accounts"
    >
    <button type="submit" class="btn btn-primary">Search</button>
    {% if search %}
      <a href="{{ url_for('accounts.index', sort=sort, dir=direction) }}" class="btn btn-secondary">Clear</a>
    {% endif %}
  </div>
 </form>
 {# ── Sortable column header macro ────────────────────────────────────── #}
 {% macro sort_header(col, label) %}
  {% set new_dir = 'desc' if (sort == col and direction == 'asc') else 'asc' %}
  <a href="{{ url_for('accounts.index', search=search, sort=col, dir=new_dir, page=1) }}" class="sort-link{% if sort == col %} active{% endif %}">
    {{ label }}
    {% if sort == col %}
      <span class="sort-arrow">{% if direction == 'asc' %}&#9650;{% else %}&#9660;{% endif %}</span>
    {% endif %}
  </a>
 {% endmacro %}
 {# ── Accounts table ──────────────────────────────────────────────────── #}
 {% if accounts %}
 <div class="table-wrap">
  <table class="data-table">
    <thead>
      <tr>
        <th>{{ sort_header('handle', 'Handle') }}</th>
        <th class="num">{{ sort_header('posts', 'Posts') }}</th>
        <th class="num">{{ sort_header('mentions', 'Mentions') }}</th>
        <th>{{ sort_header('last_feed', 'Last Feed') }}</th>
        <th>{{ sort_header('last_mention', 'Last Mention') }}</th>
      </tr>
    </thead>
    <tbody>
      {% for acct in accounts %}
      <tr>
        <td class="handle-cell">
          <a href="https://bsky.app/profile/{{ acct.handle }}" target="_blank" rel="noopener" class="handle-link">
            @{{ acct.handle }}
          </a>
          {% if acct.display_name %}
            <span class="display-name">{{ acct.display_name | truncate_text }}</span>
          {% endif %}
        </td>
        <td class="num">
          <a href="{{ url_for('statuses.index', account=acct.did) }}" class="count-link">
            {{ acct.post_count | format_number }}
          </a>
        </td>
        <td class="num">
          <a href="{{ url_for('mentions.index', account=acct.did) }}" class="count-link">
            {{ acct.mention_count | format_number }}
          </a>
        </td>
        <td title="{{ acct.last_feed_collected | format_dt }}">
          {{ acct.last_feed_collected | time_ago }}
        </td>
        <td title="{{ acct.last_mention_collected | format_dt }}">
          {{ acct.last_mention_collected | time_ago }}
        </td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
 </div>
 {% else %}
 <div class="empty-state">
  {% if search %}
    <p>No accounts found matching "{{ search }}".</p>
  {% else %}
    <p>No tracked accounts yet.</p>
  {% endif %}
 </div>
 {% endif %}
 {# ── Pagination ──────────────────────────────────────────────────────── #}
 {% if total_pages > 1 %}
 <nav class="pagination" aria-label="Page navigation">
  {# Previous button #}
  {% if page > 1 %}
    <a href="{{ url_for('accounts.index', search=search, sort=sort, dir=direction, page=page - 1) }}" class="page-link">&laquo; Previous</a>
  {% else %}
    <span class="page-link disabled">&laquo; Previous</span>
  {% endif %}
  {# Page numbers #}
  {% set start_page = [1, page - 2] | max %}
  {% set end_page = [total_pages, page + 2] | min %}
  {% if start_page > 1 %}
    <a href="{{ url_for('accounts.index', search=search, sort=sort, dir=direction, page=1) }}" class="page-link">1</a>
    {% if start_page > 2 %}
      <span class="page-ellipsis">&hellip;</span>
    {% endif %}
  {% endif %}
  {% for p in range(start_page, end_page + 1) %}
    {% if p == page %}
      <span class="page-link current">{{ p }}</span>
    {% else %}
      <a href="{{ url_for('accounts.index', search=search, sort=sort, dir=direction, page=p) }}" class="page-link">{{ p }}</a>
    {% endif %}
  {% endfor %}
  {% if end_page < total_pages %}
    {% if end_page < total_pages - 1 %}
      <span class="page-ellipsis">&hellip;</span>
    {% endif %}
    <a href="{{ url_for('accounts.index', search=search, sort=sort, dir=direction, page=total_pages) }}" class="page-link">{{ total_pages }}</a>
  {% endif %}
  {# Next button #}
  {% if page < total_pages %}
    <a href="{{ url_for('accounts.index', search=search, sort=sort, dir=direction, page=page + 1) }}" class="page-link">Next &raquo;</a>
  {% else %}
    <span class="page-link disabled">Next &raquo;</span>
  {% endif %}
 </nav>
 {% endif %}
 {% endblock %}
 {% block extra_css %}
 <style>
  .page-header {
    display: flex;
    align-items: center;
    gap: 1rem;
    margin-bottom: 1.5rem;
  }
  .page-header h1 {
    margin: 0;
    font-size: 1.5rem;
  }
  .badge {
    background: #0f3460;
    color: #00b4d8;
    padding: 0.25rem 0.75rem;
    border-radius: 1rem;
    font-size: 0.85rem;
  }
  /* Search */
  .search-form {
    margin-bottom: 1.5rem;
  }
  .search-row {
    display: flex;
    gap: 0.5rem;
    align-items: center;
  }
  .search-input {
    flex: 1;
    max-width: 400px;
    padding: 0.5rem 0.75rem;
    border: 1px solid #2a2a4a;
    border-radius: 0.375rem;
    background: #1a1a2e;
    color: #e0e0e0;
    font-size: 0.95rem;
  }
  .search-input::placeholder {
    color: #666;
  }
  .search-input:focus {
    outline: none;
    border-color: #00b4d8;
    box-shadow: 0 0 0 2px rgba(0, 180, 216, 0.2);
  }
  .btn {
    padding: 0.5rem 1rem;
    border: none;
    border-radius: 0.375rem;
    cursor: pointer;
    font-size: 0.9rem;
    text-decoration: none;
    display: inline-block;
  }
  .btn-primary {
    background: #00b4d8;
    color: #1a1a2e;
    font-weight: 600;
  }
  .btn-primary:hover {
    background: #0096b7;
  }
  .btn-secondary {
    background: #2a2a4a;
    color: #e0e0e0;
  }
  .btn-secondary:hover {
    background: #3a3a5a;
  }
  /* Table */
  .table-wrap {
    overflow-x: auto;
    border-radius: 0.5rem;
    background: #16213e;
    border: 1px solid #2a2a4a;
  }
  .data-table {
    width: 100%;
    border-collapse: collapse;
    font-size: 0.9rem;
  }
  .data-table thead {
    background: #0f3460;
  }
  .data-table th {
    padding: 0.75rem 1rem;
    text-align: left;
    font-weight: 600;
    white-space: nowrap;
    color: #e0e0e0;
  }
  .data-table th.num {
    text-align: right;
  }
  .data-table td {
    padding: 0.6rem 1rem;
    border-top: 1px solid #2a2a4a;
    color: #e0e0e0;
  }
  .data-table td.num {
    text-align: right;
  }
  .data-table tbody tr:hover {
    background: rgba(0, 180, 216, 0.05);
  }
  /* Sort links */
  .sort-link {
    color: #e0e0e0;
    text-decoration: none;
    white-space: nowrap;
  }
  .sort-link:hover {
    color: #00b4d8;
  }
  .sort-link.active {
    color: #00b4d8;
  }
  .sort-arrow {
    font-size: 0.7rem;
    margin-left: 0.25rem;
  }
  /* Handle cell */
  .handle-cell {
    display: flex;
    flex-direction: column;
    gap: 0.15rem;
  }
  .handle-link {
    color: #00b4d8;
    text-decoration: none;
    font-weight: 500;
  }
  .handle-link:hover {
    text-decoration: underline;
  }
  .display-name {
    font-size: 0.8rem;
    color: #888;
  }
  /* Count links */
  .count-link {
    color: #e0e0e0;
    text-decoration: none;
  }
  .count-link:hover {
    color: #00b4d8;
    text-decoration: underline;
  }
  /* Empty state */
  .empty-state {
    text-align: center;
    padding: 3rem 1rem;
    color: #888;
    background: #16213e;
    border-radius: 0.5rem;
    border: 1px solid #2a2a4a;
  }
  /* Pagination */
  .pagination {
    display: flex;
    justify-content: center;
    align-items: center;
    gap: 0.25rem;
    margin-top: 1.5rem;
    flex-wrap: wrap;
  }
  .page-link {
    padding: 0.4rem 0.75rem;
    border-radius: 0.375rem;
    background: #16213e;
    color: #e0e0e0;
    text-decoration: none;
    font-size: 0.9rem;
    border: 1px solid #2a2a4a;
    transition: background 0.15s, color 0.15s;
  }
  .page-link:hover:not(.disabled):not(.current) {
    background: #0f3460;
    color: #00b4d8;
    border-color: #00b4d8;
  }
  .page-link.current {
    background: #00b4d8;
    color: #1a1a2e;
    font-weight: 600;
    border-color: #00b4d8;
  }
  .page-link.disabled {
    color: #555;
    cursor: default;
    opacity: 0.5;
  }
  .page-ellipsis {
    padding: 0.4rem 0.25rem;
    color: #888;
  }
 </style>
 {% endblock %}
--- a/src/web/templates/analysis.html
+++ b/src/web/templates/analysis.html
@ -0,0 +1,704 @@
 {% extends "base.html" %}
 {% block title %}Toxicity Analysis Dashboard{% endblock %}
 {% block extra_css %}
 <style>
  /* Color scheme */
  :root {
    --bg-primary: #1a1a2e;
    --bg-secondary: #16213e;
    --nav-bg: #0f3460;
    --text-primary: #e0e0e0;
    --text-secondary: #b0b0b0;
    --accent: #00b4d8;
    --danger: #e74c3c;
    --warning: #f39c12;
    --success: #27ae60;
    --category-1: #00b4d8;
    --category-2: #e67e22;
    --category-3: #9b59b6;
    --category-4: #1abc9c;
    --category-5: #e74c3c;
    --category-6: #f39c12;
    --category-7: #3498db;
    --category-8: #2ecc71;
  }
  /* Layout */
  .page-header {
    margin-bottom: 2rem;
  }
  .page-header h1 {
    font-size: 2.5rem;
    font-weight: 700;
    color: var(--text-primary);
    margin-bottom: 0.5rem;
  }
  .page-header .subtitle {
    font-size: 1rem;
    color: var(--text-secondary);
  }
  /* Grid layout */
  .stats-grid {
    display: grid;
    grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
    gap: 1.5rem;
    margin-bottom: 2rem;
  }
  /* Stat cards */
  .stat-card {
    background-color: var(--bg-secondary);
    border-radius: 8px;
    padding: 1.5rem;
    border-left: 4px solid var(--accent);
    display: flex;
    flex-direction: column;
    justify-content: space-between;
  }
  .stat-card.danger {
    border-left-color: var(--danger);
  }
  .stat-card.warning {
    border-left-color: var(--warning);
  }
  .stat-card-label {
    font-size: 0.875rem;
    color: var(--text-secondary);
    text-transform: uppercase;
    letter-spacing: 0.5px;
    margin-bottom: 0.75rem;
  }
  .stat-card-value {
    font-size: 1.75rem;
    font-weight: 700;
    color: var(--text-primary);
    margin-bottom: 0.5rem;
  }
  .stat-card-detail {
    font-size: 0.875rem;
    color: var(--text-secondary);
  }
  /* Percentage bar */
  .percentage-bar {
    width: 100%;
    height: 8px;
    background-color: rgba(224, 224, 224, 0.1);
    border-radius: 4px;
    overflow: hidden;
    margin-top: 0.75rem;
  }
  .percentage-bar-fill {
    height: 100%;
    background-color: var(--accent);
    border-radius: 4px;
    transition: width 0.3s ease;
  }
  .stat-card.danger .percentage-bar-fill {
    background-color: var(--danger);
  }
  /* Cards */
  .card {
    background-color: var(--bg-secondary);
    border-radius: 8px;
    padding: 1.5rem;
    margin-bottom: 2rem;
    box-shadow: 0 2px 8px rgba(0, 0, 0, 0.3);
  }
  .card-title {
    font-size: 1.25rem;
    font-weight: 600;
    color: var(--text-primary);
    margin-bottom: 1.5rem;
    border-bottom: 2px solid rgba(0, 180, 216, 0.2);
    padding-bottom: 0.75rem;
  }
  /* Chart containers */
  .chart-container {
    position: relative;
    height: 400px;
    width: 100%;
  }
  .chart-container.horizontal {
    height: 300px;
  }
  /* Links section */
  .quick-links {
    display: flex;
    gap: 1rem;
    flex-wrap: wrap;
  }
  .quick-link {
    display: inline-flex;
    align-items: center;
    padding: 0.75rem 1.25rem;
    background-color: rgba(0, 180, 216, 0.1);
    border: 1px solid var(--accent);
    border-radius: 6px;
    color: var(--accent);
    text-decoration: none;
    font-weight: 500;
    transition: all 0.3s ease;
  }
  .quick-link:hover {
    background-color: var(--accent);
    color: var(--bg-primary);
  }
  .quick-link::after {
    content: " →";
    margin-left: 0.5rem;
  }
  /* Runs table */
  .runs-table {
    width: 100%;
    border-collapse: collapse;
  }
  .runs-table thead {
    background-color: rgba(0, 180, 216, 0.1);
    border-bottom: 2px solid var(--accent);
  }
  .runs-table th {
    padding: 1rem;
    text-align: left;
    font-weight: 600;
    color: var(--text-primary);
    font-size: 0.875rem;
    text-transform: uppercase;
    letter-spacing: 0.5px;
  }
  .runs-table td {
    padding: 0.75rem 1rem;
    border-bottom: 1px solid rgba(224, 224, 224, 0.1);
    color: var(--text-secondary);
    font-size: 0.875rem;
  }
  .runs-table tbody tr:hover {
    background-color: rgba(0, 180, 216, 0.05);
  }
  /* Status badge */
  .status-badge {
    display: inline-block;
    padding: 0.35rem 0.75rem;
    border-radius: 4px;
    font-size: 0.75rem;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.5px;
  }
  .status-badge.completed {
    background-color: rgba(39, 174, 96, 0.2);
    color: var(--success);
  }
  .status-badge.in-progress {
    background-color: rgba(243, 156, 18, 0.2);
    color: var(--warning);
  }
  .status-badge.failed {
    background-color: rgba(231, 76, 60, 0.2);
    color: var(--danger);
  }
  /* Responsive */
  @media (max-width: 768px) {
    .page-header h1 {
      font-size: 1.75rem;
    }
    .stats-grid {
      grid-template-columns: 1fr;
    }
    .chart-container {
      height: 300px;
    }
    .quick-links {
      flex-direction: column;
    }
    .quick-link {
      width: 100%;
      justify-content: center;
    }
    .runs-table th,
    .runs-table td {
      padding: 0.5rem 0.75rem;
      font-size: 0.75rem;
    }
  }
  /* Empty state */
  .empty-state {
    text-align: center;
    padding: 2rem;
    color: var(--text-secondary);
  }
  .empty-state p {
    margin: 0;
  }
  /* Number formatting */
  .number-highlight {
    color: var(--accent);
    font-weight: 600;
  }
  .percentage-highlight {
    color: var(--warning);
    font-weight: 600;
  }
  .percentage-highlight.high {
    color: var(--danger);
  }
 </style>
 {% endblock %}
 {% block content %}
 <div class="page-header">
  <h1>Toxicity Analysis</h1>
  <p class="subtitle">
    {{ stats.total_scored_posts | format_number }} / {{ stats.total_posts | format_number }} posts scored
    <span style="margin: 0 0.5rem;">•</span>
    {{ stats.total_scored_mentions | format_number }} / {{ stats.total_mentions | format_number }} mentions scored
  </p>
 </div>
 <!-- Stats Grid -->
 <div class="stats-grid">
  <!-- Total Scored Card -->
  <div class="stat-card">
    <div>
      <div class="stat-card-label">Total Scored</div>
      <div class="stat-card-value">{{ (stats.total_scored_posts + stats.total_scored_mentions) | format_number }}</div>
    </div>
    <div class="stat-card-detail">
      {{ stats.total_scored_posts | format_number }} posts
      <span style="color: var(--text-secondary); margin: 0 0.25rem;">+</span>
      {{ stats.total_scored_mentions | format_number }} mentions
    </div>
  </div>
  <!-- Flagged Posts Card -->
  <div class="stat-card {% if stats.flagged_posts > 0 and (stats.flagged_posts / (stats.total_scored_posts or 1)) > 0.05 %}danger{% endif %}">
    <div>
      <div class="stat-card-label">Flagged Posts</div>
      <div class="stat-card-value">{{ stats.flagged_posts | format_number }}</div>
    </div>
    <div class="stat-card-detail">
      <span class="{% if stats.flagged_posts > 0 and (stats.flagged_posts / (stats.total_scored_posts or 1)) > 0.05 %}percentage-highlight high{% else %}percentage-highlight{% endif %}">
        {{ "%.2f" | format(100.0 * stats.flagged_posts / (stats.total_scored_posts or 1)) }}%
      </span>
      of scored posts
    </div>
    <div class="percentage-bar">
      <div class="percentage-bar-fill" style="width: {{ 100.0 * stats.flagged_posts / (stats.total_scored_posts or 1) }}%"></div>
    </div>
  </div>
  <!-- Flagged Mentions Card -->
  <div class="stat-card">
    <div>
      <div class="stat-card-label">Flagged Mentions</div>
      <div class="stat-card-value">{{ stats.flagged_mentions | format_number }}</div>
    </div>
    <div class="stat-card-detail">
      <span class="percentage-highlight">
        {{ "%.2f" | format(100.0 * stats.flagged_mentions / (stats.total_scored_mentions or 1)) }}%
      </span>
      of scored mentions
    </div>
    <div class="percentage-bar">
      <div class="percentage-bar-fill" style="width: {{ 100.0 * stats.flagged_mentions / (stats.total_scored_mentions or 1) }}%"></div>
    </div>
  </div>
  <!-- Avg Toxicity Card -->
  <div class="stat-card">
    <div>
      <div class="stat-card-label">Average Toxicity</div>
      <div class="stat-card-value">{{ "%.1f" | format(100.0 * ((stats.avg_toxicity_posts + stats.avg_toxicity_mentions) / 2.0)) }}%</div>
    </div>
    <div class="stat-card-detail">
      Posts: {{ "%.2f" | format(100.0 * stats.avg_toxicity_posts) }}%
      <span style="color: var(--text-secondary); margin: 0 0.25rem;">•</span>
      Mentions: {{ "%.2f" | format(100.0 * stats.avg_toxicity_mentions) }}%
    </div>
    <div class="percentage-bar">
      <div class="percentage-bar-fill" style="width: {{ 100.0 * ((stats.avg_toxicity_posts + stats.avg_toxicity_mentions) / 2.0) }}%"></div>
    </div>
  </div>
 </div>
 <!-- Trend Chart -->
 <div class="card">
  <div class="card-title">Toxicity Trends Over Time</div>
  <div class="chart-container">
    <canvas id="trendChart"></canvas>
  </div>
 </div>
 <!-- Category Breakdown -->
 <div class="card">
  <div class="card-title">Toxicity by Category</div>
  <div class="chart-container horizontal">
    <canvas id="categoriesChart"></canvas>
  </div>
 </div>
 <!-- Recent Analysis Runs -->
 <div class="card">
  <div class="card-title">Recent Analysis Runs</div>
  {% if runs %}
    <div style="overflow-x: auto;">
      <table class="runs-table">
        <thead>
          <tr>
            <th>Started</th>
            <th>Duration</th>
            <th>Posts Scored</th>
            <th>Mentions Scored</th>
            <th>Errors</th>
            <th>Cost</th>
            <th>Status</th>
          </tr>
        </thead>
        <tbody>
          {% for run in runs[:5] %}
            <tr>
              <td>{{ run.started_at | time_ago }}</td>
              <td>{% if run.duration_secs is not none %}{{ "%.0f" | format(run.duration_secs | float) }}s{% else %}—{% endif %}</td>
              <td>{{ run.posts_scored | format_number }}</td>
              <td>{{ run.mentions_scored | format_number }}</td>
              <td>{{ run.errors }}</td>
              <td>${{ "%.4f" | format(run.cost_usd | default(0) | float) }}</td>
              <td>
                <span class="status-badge {% if run.status == 'completed' %}completed{% elif run.status == 'in_progress' %}in-progress{% elif run.status == 'failed' %}failed{% endif %}">
                  {{ run.status }}
                </span>
              </td>
            </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
  {% else %}
    <div class="empty-state">
      <p>No analysis runs yet. Start a new analysis to see results here.</p>
    </div>
  {% endif %}
 </div>
 <!-- Quick Links -->
 <div style="margin-top: 2rem;">
  <div class="quick-links">
    <a href="{{ url_for('analysis.flagged') }}" class="quick-link">View Flagged Content</a>
    <a href="{{ url_for('analysis.accounts') }}" class="quick-link">Account Breakdown</a>
  </div>
 </div>
 <!-- Chart.js Script -->
 <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.7/chart.umd.min.js"></script>
 <script>
  // Chart color scheme
  const chartColors = {
    accent: '#00b4d8',
    orange: '#e67e22',
    gridLine: '#2a2a3e',
    text: '#e0e0e0',
    categories: [
      '#00b4d8', // 1
      '#e67e22', // 2
      '#9b59b6', // 3
      '#1abc9c', // 4
      '#e74c3c', // 5
      '#f39c12', // 6
      '#3498db', // 7
      '#2ecc71'  // 8
    ]
  };
  // Trend Chart
  {% if trend_json %}
    const trendData = {{ trend_json | safe }};
    const trendCtx = document.getElementById('trendChart').getContext('2d');
    const trendChart = new Chart(trendCtx, {
      type: 'line',
      data: {
        labels: trendData.map(d => d.week),
        datasets: [
          {
            label: 'Avg Post Toxicity',
            data: trendData.map(d => d.avg_post_toxicity),
            borderColor: chartColors.accent,
            backgroundColor: 'rgba(0, 180, 216, 0.05)',
            borderWidth: 2,
            tension: 0.4,
            fill: true,
            yAxisID: 'y',
            pointBackgroundColor: chartColors.accent,
            pointBorderColor: '#1a1a2e',
            pointBorderWidth: 2,
            pointRadius: 5,
            pointHoverRadius: 7
          },
          {
            label: 'Avg Mention Toxicity',
            data: trendData.map(d => d.avg_mention_toxicity),
            borderColor: chartColors.orange,
            backgroundColor: 'rgba(230, 126, 34, 0.05)',
            borderWidth: 2,
            tension: 0.4,
            fill: true,
            yAxisID: 'y',
            pointBackgroundColor: chartColors.orange,
            pointBorderColor: '#1a1a2e',
            pointBorderWidth: 2,
            pointRadius: 5,
            pointHoverRadius: 7
          },
          {
            label: 'Flagged Posts',
            data: trendData.map(d => d.flagged_posts),
            type: 'bar',
            borderColor: 'rgba(231, 76, 60, 0.5)',
            backgroundColor: 'rgba(231, 76, 60, 0.2)',
            yAxisID: 'y1',
            borderWidth: 1,
            barThickness: 8,
            categoryPercentage: 0.8,
            maxBarThickness: 15
          },
          {
            label: 'Flagged Mentions',
            data: trendData.map(d => d.flagged_mentions),
            type: 'bar',
            borderColor: 'rgba(243, 156, 18, 0.5)',
            backgroundColor: 'rgba(243, 156, 18, 0.2)',
            yAxisID: 'y1',
            borderWidth: 1,
            barThickness: 8,
            categoryPercentage: 0.8,
            maxBarThickness: 15
          }
        ]
      },
      options: {
        responsive: true,
        maintainAspectRatio: false,
        interaction: {
          mode: 'index',
          intersect: false
        },
        plugins: {
          legend: {
            display: true,
            labels: {
              color: chartColors.text,
              usePointStyle: true,
              padding: 15,
              font: {
                size: 12
              }
            }
          },
          tooltip: {
            backgroundColor: '#0f3460',
            titleColor: chartColors.text,
            bodyColor: chartColors.text,
            borderColor: chartColors.accent,
            borderWidth: 1,
            padding: 10,
            displayColors: true
          }
        },
        scales: {
          x: {
            border: { display: false },
            grid: {
              color: chartColors.gridLine
            },
            ticks: {
              color: chartColors.text,
              font: {
                size: 11
              }
            }
          },
          y: {
            type: 'linear',
            display: true,
            position: 'left',
            min: 0,
            max: 1,
            ticks: {
              color: chartColors.text,
              font: {
                size: 11
              },
              callback: function(value) {
                return (value * 100).toFixed(0) + '%';
              }
            },
            border: { display: false },
            grid: {
              color: chartColors.gridLine
            },
            title: {
              display: true,
              text: 'Toxicity Score',
              color: chartColors.text,
              font: {
                size: 12,
                weight: 'bold'
              }
            }
          },
          y1: {
            type: 'linear',
            display: true,
            position: 'right',
            grid: {
              drawOnChartArea: false
            },
            ticks: {
              color: chartColors.text,
              font: {
                size: 11
              }
            },
            title: {
              display: true,
              text: 'Flagged Count',
              color: chartColors.text,
              font: {
                size: 12,
                weight: 'bold'
              }
            }
          }
        }
      }
    });
  {% endif %}
  // Category Chart
  {% if categories_json %}
    const categoriesData = {{ categories_json | safe }};
    const categoryNames = {{ categories | tojson }};
    const categoriesCtx = document.getElementById('categoriesChart').getContext('2d');
    const categoriesChart = new Chart(categoriesCtx, {
      type: 'bar',
      data: {
        labels: categoryNames,
        datasets: [
          {
            label: 'Average Toxicity Score',
            data: categoryNames.map(cat => categoriesData[cat] || 0),
            backgroundColor: chartColors.categories,
            borderColor: chartColors.categories,
            borderWidth: 1.5,
            borderRadius: 4
          }
        ]
      },
      options: {
        indexAxis: 'y',
        responsive: true,
        maintainAspectRatio: false,
        plugins: {
          legend: {
            display: false
          },
          tooltip: {
            backgroundColor: '#0f3460',
            titleColor: chartColors.text,
            bodyColor: chartColors.text,
            borderColor: chartColors.accent,
            borderWidth: 1,
            padding: 10,
            callbacks: {
              label: function(context) {
                return 'Score: ' + (context.parsed.x * 100).toFixed(2) + '%';
              }
            }
          }
        },
        scales: {
          x: {
            min: 0,
            max: 1,
            border: { display: false },
            grid: {
              color: chartColors.gridLine
            },
            ticks: {
              color: chartColors.text,
              font: {
                size: 11
              },
              callback: function(value) {
                return (value * 100).toFixed(0) + '%';
              }
            },
            title: {
              display: true,
              text: 'Average Toxicity',
              color: chartColors.text,
              font: {
                size: 12,
                weight: 'bold'
              }
            }
          },
          y: {
            border: { display: false },
            grid: {
              drawOnChartArea: false
            },
            ticks: {
              color: chartColors.text,
              font: {
                size: 11
              }
            }
          }
        }
      }
    });
  {% endif %}
 </script>
 {% endblock %}
--- a/src/web/templates/base.html
+++ b/src/web/templates/base.html
@ -0,0 +1,688 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>{% block title %}Dashboard{% endblock %} - Bluesky Collector</title>
  <style>
    /* ===== Reset & Base ===== */
    *, *::before, *::after {
      box-sizing: border-box;
      margin: 0;
      padding: 0;
    }
    html {
      font-size: 15px;
      -webkit-font-smoothing: antialiased;
      -moz-osx-font-smoothing: grayscale;
    }
    body {
      font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen,
        Ubuntu, Cantarell, "Helvetica Neue", Arial, sans-serif;
      background-color: #1a1a2e;
      color: #e0e0e0;
      line-height: 1.6;
      min-height: 100vh;
      display: flex;
      flex-direction: column;
    }
    a {
      color: #00b4d8;
      text-decoration: none;
      transition: color 0.2s ease;
    }
    a:hover {
      color: #48cae4;
    }
    /* ===== Navigation ===== */
    .navbar {
      background-color: #0f3460;
      padding: 0 2rem;
      display: flex;
      align-items: center;
      justify-content: space-between;
      height: 60px;
      box-shadow: 0 2px 8px rgba(0, 0, 0, 0.3);
      position: sticky;
      top: 0;
      z-index: 1000;
    }
    .navbar-brand {
      display: flex;
      align-items: center;
      gap: 0.5rem;
      font-size: 1.25rem;
      font-weight: 700;
      color: #ffffff;
      letter-spacing: 0.02em;
    }
    .navbar-brand .brand-icon {
      font-size: 1.4rem;
    }
    .navbar-nav {
      display: flex;
      list-style: none;
      gap: 0.25rem;
      align-items: center;
    }
    .navbar-nav a {
      display: block;
      padding: 0.5rem 1rem;
      color: #b0c4de;
      border-radius: 6px;
      font-size: 0.9rem;
      font-weight: 500;
      transition: background-color 0.2s ease, color 0.2s ease;
    }
    .navbar-nav a:hover {
      background-color: rgba(0, 180, 216, 0.15);
      color: #e0e0e0;
    }
    .navbar-nav a.active {
      background-color: rgba(0, 180, 216, 0.2);
      color: #00b4d8;
    }
    /* ===== Main Content ===== */
    .main-content {
      flex: 1;
      padding: 2rem;
      max-width: 1280px;
      width: 100%;
      margin: 0 auto;
    }
    .page-header {
      margin-bottom: 1.75rem;
    }
    .page-header h1 {
      font-size: 1.6rem;
      font-weight: 700;
      color: #ffffff;
    }
    .page-header p {
      color: #8899aa;
      margin-top: 0.25rem;
      font-size: 0.95rem;
    }
    /* ===== Cards ===== */
    .card {
      background-color: #16213e;
      border-radius: 10px;
      padding: 1.5rem;
      box-shadow: 0 2px 12px rgba(0, 0, 0, 0.25);
      border: 1px solid rgba(255, 255, 255, 0.04);
      transition: transform 0.15s ease, box-shadow 0.15s ease;
    }
    .card:hover {
      transform: translateY(-2px);
      box-shadow: 0 6px 20px rgba(0, 0, 0, 0.35);
    }
    .card-title {
      font-size: 0.85rem;
      font-weight: 600;
      text-transform: uppercase;
      letter-spacing: 0.06em;
      color: #8899aa;
      margin-bottom: 0.5rem;
    }
    .card-value {
      font-size: 2rem;
      font-weight: 700;
      color: #ffffff;
      line-height: 1.2;
    }
    .card-footer-text {
      font-size: 0.8rem;
      color: #667788;
      margin-top: 0.5rem;
    }
    /* Stat cards grid */
    .stats-grid {
      display: grid;
      grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
      gap: 1.25rem;
      margin-bottom: 2rem;
    }
    .stat-card {
      text-align: center;
      padding: 1.75rem 1.5rem;
    }
    .stat-card .card-value {
      font-size: 2.25rem;
    }
    /* ===== Tables ===== */
    .table-wrapper {
      background-color: #16213e;
      border-radius: 10px;
      overflow: hidden;
      box-shadow: 0 2px 12px rgba(0, 0, 0, 0.25);
      border: 1px solid rgba(255, 255, 255, 0.04);
    }
    .table-header {
      padding: 1.25rem 1.5rem;
      border-bottom: 1px solid rgba(255, 255, 255, 0.06);
    }
    .table-header h2 {
      font-size: 1.1rem;
      font-weight: 600;
      color: #ffffff;
    }
    table {
      width: 100%;
      border-collapse: collapse;
    }
    thead th {
      background-color: rgba(0, 0, 0, 0.15);
      padding: 0.75rem 1rem;
      text-align: left;
      font-size: 0.8rem;
      font-weight: 600;
      text-transform: uppercase;
      letter-spacing: 0.05em;
      color: #8899aa;
      border-bottom: 1px solid rgba(255, 255, 255, 0.06);
      white-space: nowrap;
    }
    tbody tr {
      border-bottom: 1px solid rgba(255, 255, 255, 0.03);
      transition: background-color 0.15s ease;
    }
    tbody tr:nth-child(even) {
      background-color: rgba(0, 0, 0, 0.08);
    }
    tbody tr:hover {
      background-color: rgba(0, 180, 216, 0.06);
    }
    tbody td {
      padding: 0.75rem 1rem;
      font-size: 0.9rem;
      color: #d0d0d0;
      vertical-align: top;
    }
    .table-empty {
      text-align: center;
      padding: 2.5rem 1rem;
      color: #667788;
      font-size: 0.95rem;
    }
    /* Alignment helpers */
    .text-right {
      text-align: right;
    }
    .text-center {
      text-align: center;
    }
    /* ===== Badges ===== */
    .badge {
      display: inline-block;
      padding: 0.2rem 0.65rem;
      border-radius: 50px;
      font-size: 0.75rem;
      font-weight: 600;
      letter-spacing: 0.03em;
      text-transform: capitalize;
      line-height: 1.4;
    }
    /* Post type badges */
    .badge-post {
      background-color: rgba(0, 180, 216, 0.15);
      color: #00b4d8;
    }
    .badge-reply {
      background-color: rgba(155, 89, 182, 0.15);
      color: #9b59b6;
    }
    .badge-repost {
      background-color: rgba(230, 126, 34, 0.15);
      color: #e67e22;
    }
    /* Status badges */
    .badge-completed {
      background-color: rgba(39, 174, 96, 0.15);
      color: #27ae60;
    }
    .badge-partial {
      background-color: rgba(243, 156, 18, 0.15);
      color: #f39c12;
    }
    .badge-running {
      background-color: rgba(52, 152, 219, 0.15);
      color: #3498db;
    }
    .badge-failed {
      background-color: rgba(231, 76, 60, 0.15);
      color: #e74c3c;
    }
    /* ===== Pagination ===== */
    .pagination {
      display: flex;
      justify-content: center;
      align-items: center;
      gap: 0.35rem;
      margin-top: 1.5rem;
      padding: 1rem 0;
      flex-wrap: wrap;
    }
    .pagination a,
    .pagination span {
      display: inline-flex;
      align-items: center;
      justify-content: center;
      min-width: 36px;
      height: 36px;
      padding: 0 0.5rem;
      border-radius: 8px;
      font-size: 0.85rem;
      font-weight: 500;
      color: #b0c4de;
      background-color: #16213e;
      border: 1px solid rgba(255, 255, 255, 0.06);
      transition: background-color 0.2s ease, color 0.2s ease, border-color 0.2s ease;
    }
    .pagination a:hover {
      background-color: rgba(0, 180, 216, 0.12);
      border-color: rgba(0, 180, 216, 0.3);
      color: #00b4d8;
    }
    .pagination .active {
      background-color: #00b4d8;
      color: #ffffff;
      border-color: #00b4d8;
      font-weight: 700;
    }
    .pagination .disabled {
      opacity: 0.35;
      pointer-events: none;
    }
    .pagination .ellipsis {
      background: none;
      border: none;
      color: #667788;
      cursor: default;
    }
    /* ===== Buttons ===== */
    .btn {
      display: inline-flex;
      align-items: center;
      gap: 0.4rem;
      padding: 0.55rem 1.15rem;
      border-radius: 8px;
      font-size: 0.85rem;
      font-weight: 600;
      border: none;
      cursor: pointer;
      text-decoration: none;
      transition: background-color 0.2s ease, transform 0.1s ease;
    }
    .btn:active {
      transform: scale(0.97);
    }
    .btn:hover {
      text-decoration: none;
    }
    .btn-primary {
      background-color: #00b4d8;
      color: #ffffff;
    }
    .btn-primary:hover {
      background-color: #0096b7;
      color: #ffffff;
    }
    .btn-secondary {
      background-color: rgba(255, 255, 255, 0.08);
      color: #e0e0e0;
    }
    .btn-secondary:hover {
      background-color: rgba(255, 255, 255, 0.14);
    }
    .btn-danger {
      background-color: rgba(231, 76, 60, 0.15);
      color: #e74c3c;
    }
    .btn-danger:hover {
      background-color: rgba(231, 76, 60, 0.25);
    }
    /* ===== Filter Bar ===== */
    .filter-bar {
      display: flex;
      flex-wrap: wrap;
      gap: 0.75rem;
      align-items: center;
      margin-bottom: 1.25rem;
      padding: 1rem 1.25rem;
      background-color: #16213e;
      border-radius: 10px;
      border: 1px solid rgba(255, 255, 255, 0.04);
    }
    .filter-bar select,
    .filter-bar input[type="text"] {
      background-color: #1a1a2e;
      color: #e0e0e0;
      border: 1px solid rgba(255, 255, 255, 0.12);
      border-radius: 6px;
      padding: 0.5rem 0.75rem;
      font-size: 0.85rem;
      transition: border-color 0.2s ease;
    }
    .filter-bar select:focus,
    .filter-bar input[type="text"]:focus {
      outline: none;
      border-color: #00b4d8;
      box-shadow: 0 0 0 2px rgba(0, 180, 216, 0.15);
    }
    /* ===== Utility ===== */
    .mono {
      font-family: "SF Mono", "Fira Code", "Fira Mono", Menlo, Consolas,
        "DejaVu Sans Mono", monospace;
      font-size: 0.85em;
    }
    .text-muted {
      color: #667788;
    }
    .text-accent {
      color: #00b4d8;
    }
    .mt-1 { margin-top: 0.5rem; }
    .mt-2 { margin-top: 1rem; }
    .mt-3 { margin-top: 1.5rem; }
    .mb-1 { margin-bottom: 0.5rem; }
    .mb-2 { margin-bottom: 1rem; }
    .mb-3 { margin-bottom: 1.5rem; }
    .text-preview {
      max-width: 400px;
      overflow: hidden;
      text-overflow: ellipsis;
    }
    .result-count {
      font-size: 0.85rem;
      color: #8899aa;
      margin-bottom: 0.75rem;
    }
    .back-link {
      display: inline-block;
      margin-bottom: 1rem;
      font-size: 0.9rem;
    }
    .post-text {
      white-space: pre-wrap;
      word-break: break-word;
      line-height: 1.7;
      margin: 1rem 0;
    }
    .reply-card {
      border-left: 3px solid #9b59b6;
      padding-left: 1rem;
      margin-bottom: 0.75rem;
    }
    h1, h2, h3 {
      color: #ffffff;
    }
    /* ===== Details / Collapsible ===== */
    details {
      margin-top: 1rem;
    }
    details summary {
      cursor: pointer;
      color: #00b4d8;
      font-size: 0.9rem;
      margin-bottom: 0.5rem;
    }
    details pre {
      background-color: #1a1a2e;
      padding: 1rem;
      border-radius: 8px;
      overflow-x: auto;
      font-size: 0.8rem;
      color: #c0c0c0;
      max-height: 500px;
      overflow-y: auto;
    }
    /* ===== Flash Messages ===== */
    .flash-messages {
      margin-bottom: 1.5rem;
    }
    .flash {
      padding: 0.85rem 1.25rem;
      border-radius: 8px;
      font-size: 0.9rem;
      margin-bottom: 0.5rem;
    }
    .flash-success {
      background-color: rgba(39, 174, 96, 0.12);
      color: #27ae60;
      border: 1px solid rgba(39, 174, 96, 0.2);
    }
    .flash-error {
      background-color: rgba(231, 76, 60, 0.12);
      color: #e74c3c;
      border: 1px solid rgba(231, 76, 60, 0.2);
    }
    .flash-info {
      background-color: rgba(0, 180, 216, 0.12);
      color: #00b4d8;
      border: 1px solid rgba(0, 180, 216, 0.2);
    }
    /* ===== Stats Row (inline) ===== */
    .stats-row {
      display: flex;
      gap: 1.5rem;
      flex-wrap: wrap;
      margin: 1rem 0;
    }
    .stat-item {
      display: flex;
      align-items: center;
      gap: 0.35rem;
      font-size: 0.9rem;
      color: #8899aa;
    }
    .stat-item strong {
      color: #e0e0e0;
    }
    /* ===== Footer ===== */
    .footer {
      text-align: center;
      padding: 1.5rem 2rem;
      color: #4a5568;
      font-size: 0.8rem;
      border-top: 1px solid rgba(255, 255, 255, 0.04);
      margin-top: auto;
    }
    .footer span {
      color: #667788;
    }
    /* ===== Responsive ===== */
    @media (max-width: 768px) {
      .navbar {
        padding: 0 1rem;
        flex-wrap: wrap;
        height: auto;
        padding-top: 0.75rem;
        padding-bottom: 0.75rem;
        gap: 0.5rem;
      }
      .navbar-nav {
        gap: 0.15rem;
        flex-wrap: wrap;
      }
      .navbar-nav a {
        padding: 0.4rem 0.7rem;
        font-size: 0.82rem;
      }
      .main-content {
        padding: 1.25rem;
      }
      .stats-grid {
        grid-template-columns: repeat(2, 1fr);
        gap: 0.75rem;
      }
      .stat-card .card-value {
        font-size: 1.6rem;
      }
      table {
        font-size: 0.82rem;
      }
      thead th,
      tbody td {
        padding: 0.55rem 0.65rem;
      }
    }
    @media (max-width: 480px) {
      .stats-grid {
        grid-template-columns: 1fr;
      }
    }
  </style>
  {% block extra_css %}{% endblock %}
 </head>
 <body>
  <nav class="navbar">
    <div class="navbar-brand">
      <span class="brand-icon">&#129419;</span>
      Bluesky Collector
    </div>
    <ul class="navbar-nav">
      <li>
        <a href="/" class="{% if request.path == '/' %}active{% endif %}">
          Dashboard
        </a>
      </li>
      <li>
        <a href="/accounts" class="{% if request.path.startswith('/accounts') %}active{% endif %}">
          Accounts
        </a>
      </li>
      <li>
        <a href="/statuses" class="{% if request.path.startswith('/statuses') %}active{% endif %}">
          Statuses
        </a>
      </li>
      <li>
        <a href="/mentions" class="{% if request.path.startswith('/mentions') %}active{% endif %}">
          Mentions
        </a>
      </li>
      <li>
        <a href="/analysis" class="{% if request.path.startswith('/analysis') %}active{% endif %}">
          Analysis
        </a>
      </li>
      <li>
        <a href="/export" class="{% if request.path.startswith('/export') %}active{% endif %}">
          Export
        </a>
      </li>
    </ul>
  </nav>
  <main class="main-content">
    {% with messages = get_flashed_messages(with_categories=true) %}
      {% if messages %}
        <div class="flash-messages">
          {% for category, message in messages %}
            <div class="flash flash-{{ category }}">{{ message }}</div>
          {% endfor %}
        </div>
      {% endif %}
    {% endwith %}
    {% block content %}{% endblock %}
  </main>
  <footer class="footer">
    <span>Bluesky Collector</span>
  </footer>
 </body>
 </html>
--- a/src/web/templates/dashboard.html
+++ b/src/web/templates/dashboard.html
@ -0,0 +1,93 @@
 {% extends "base.html" %}
 {% block title %}Dashboard{% endblock %}
 {% block content %}
 <div class="page-header">
  <h1>Dashboard</h1>
  <p>Overview of your Bluesky data collection</p>
 </div>
 <!-- Stat Cards -->
 <div class="stats-grid">
  <div class="card stat-card">
    <div class="card-title">Accounts</div>
    <div class="card-value">{{ stats.accounts | format_number }}</div>
    <div class="card-footer-text">Tracked accounts</div>
  </div>
  <div class="card stat-card">
    <div class="card-title">Posts</div>
    <div class="card-value">{{ stats.posts | format_number }}</div>
    <div class="card-footer-text">Collected posts</div>
  </div>
  <div class="card stat-card">
    <div class="card-title">Mentions</div>
    <div class="card-value">{{ stats.mentions | format_number }}</div>
    <div class="card-footer-text">Detected mentions</div>
  </div>
  <div class="card stat-card">
    <div class="card-title">Collection Runs</div>
    <div class="card-value">{{ stats.runs | format_number }}</div>
    <div class="card-footer-text">Total runs</div>
  </div>
 </div>
 <!-- Recent Collection Runs -->
 <div class="table-wrapper">
  <div class="table-header">
    <h2>Recent Collection Runs</h2>
  </div>
  <table>
    <thead>
      <tr>
        <th>Started</th>
        <th>Duration</th>
        <th>Status</th>
        <th class="text-right">Accounts</th>
        <th class="text-right">Posts</th>
        <th class="text-right">Mentions</th>
        <th class="text-right">Errors</th>
      </tr>
    </thead>
    <tbody>
      {% for run in runs %}
      <tr>
        <td class="mono">{{ run.started_at | format_dt }}</td>
        <td>
          {% if run.duration_secs is not none %}
            {% set minutes = (run.duration_secs // 60) | int %}
            {% set seconds = (run.duration_secs % 60) | int %}
            {% if minutes > 0 %}
              {{ minutes }}m {{ seconds }}s
            {% else %}
              {{ seconds }}s
            {% endif %}
          {% else %}
            &mdash;
          {% endif %}
        </td>
        <td>
          <span class="badge badge-{{ run.status }}">{{ run.status }}</span>
        </td>
        <td class="text-right">{{ run.accounts_done | format_number }}</td>
        <td class="text-right">{{ run.posts_collected | format_number }}</td>
        <td class="text-right">{{ run.mentions_collected | format_number }}</td>
        <td class="text-right">
          {% if run.errors %}
            <span class="text-accent">{{ run.errors | length }}</span>
          {% else %}
            0
          {% endif %}
        </td>
      </tr>
      {% else %}
      <tr>
        <td colspan="7" class="table-empty">
          No collection runs yet. Start a collection to see results here.
        </td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
 </div>
 {% endblock %}
--- a/src/web/templates/export.html
+++ b/src/web/templates/export.html
@ -0,0 +1,114 @@
 {% extends "base.html" %}
 {% block title %}Export{% endblock %}
 {% block content %}
 <div class="page-header">
    <h1>Export Data</h1>
    <p class="text-muted">Download posts and mentions as CSV files for analysis.</p>
 </div>
 <div class="export-grid">
    <!-- Posts Export -->
    <div class="card">
        <h2>Export Posts</h2>
        <p class="text-muted">Download all collected posts from tracked accounts.</p>
        <form action="{{ url_for('export.posts_csv') }}" method="get" class="export-form">
            <div class="form-group">
                <label for="post-account">Filter by Account</label>
                <select id="post-account" name="account" class="form-control">
                    <option value="">All accounts</option>
                    {% for a in accounts %}
                    <option value="{{ a.did }}">{{ a.handle }}</option>
                    {% endfor %}
                </select>
            </div>
            <div class="form-row">
                <div class="form-group">
                    <label for="post-since">From date</label>
                    <input type="date" id="post-since" name="since" class="form-control">
                </div>
                <div class="form-group">
                    <label for="post-until">To date</label>
                    <input type="date" id="post-until" name="until" class="form-control">
                </div>
            </div>
            <button type="submit" class="btn btn-primary">
                Download Posts CSV
            </button>
        </form>
    </div>
    <!-- Mentions Export -->
    <div class="card">
        <h2>Export Mentions</h2>
        <p class="text-muted">Download all collected mentions of tracked accounts.</p>
        <form action="{{ url_for('export.mentions_csv') }}" method="get" class="export-form">
            <div class="form-group">
                <label for="mention-account">Filter by Mentioned Account</label>
                <select id="mention-account" name="account" class="form-control">
                    <option value="">All accounts</option>
                    {% for a in accounts %}
                    <option value="{{ a.did }}">{{ a.handle }}</option>
                    {% endfor %}
                </select>
            </div>
            <div class="form-row">
                <div class="form-group">
                    <label for="mention-since">From date</label>
                    <input type="date" id="mention-since" name="since" class="form-control">
                </div>
                <div class="form-group">
                    <label for="mention-until">To date</label>
                    <input type="date" id="mention-until" name="until" class="form-control">
                </div>
            </div>
            <button type="submit" class="btn btn-primary">
                Download Mentions CSV
            </button>
        </form>
    </div>
 </div>
 <style>
    .export-grid {
        display: grid;
        grid-template-columns: repeat(auto-fit, minmax(400px, 1fr));
        gap: 1.5rem;
        margin-top: 1.5rem;
    }
    .export-form {
        margin-top: 1rem;
    }
    .form-group {
        margin-bottom: 1rem;
    }
    .form-group label {
        display: block;
        margin-bottom: 0.4rem;
        color: #b0b0b0;
        font-size: 0.9rem;
    }
    .form-control {
        width: 100%;
        padding: 0.5rem 0.75rem;
        background: #1a1a2e;
        border: 1px solid #2a2a4a;
        border-radius: 6px;
        color: #e0e0e0;
        font-size: 0.95rem;
    }
    .form-control:focus {
        border-color: #00b4d8;
        outline: none;
    }
    .form-row {
        display: grid;
        grid-template-columns: 1fr 1fr;
        gap: 1rem;
    }
    .export-form .btn {
        margin-top: 0.5rem;
        width: 100%;
    }
 </style>
 {% endblock %}
--- a/src/web/templates/flagged.html
+++ b/src/web/templates/flagged.html
@ -0,0 +1,594 @@
 {% extends "base.html" %}
 {% block title %}Flagged Content{% endblock %}
 {% block content %}
 <div class="flagged-container">
  <!-- Page Header -->
  <div class="page-header">
    <h1>Flagged Content</h1>
    <span class="total-badge">{{ total | format_number }}</span>
  </div>
  <!-- Filter Bar -->
  <div class="filter-bar">
    <form method="get" action="{{ url_for('analysis.flagged') }}" class="filter-form">
      <div class="filter-group">
        <label for="content-type">Type:</label>
        <select id="content-type" name="content_type" class="filter-select">
          <option value="">All Types</option>
          <option value="post" {% if content_type == 'post' %}selected{% endif %}>Post</option>
          <option value="reply" {% if content_type == 'reply' %}selected{% endif %}>Reply</option>
          <option value="mention" {% if content_type == 'mention' %}selected{% endif %}>Mention</option>
        </select>
      </div>
      <div class="filter-group">
        <label for="category">Category:</label>
        <select id="category" name="category" class="filter-select">
          <option value="">All Categories</option>
          {% for cat in categories %}
          <option value="{{ cat }}" {% if category == cat %}selected{% endif %}>{{ cat }}</option>
          {% endfor %}
        </select>
      </div>
      <div class="filter-group">
        <label for="account-did">Account:</label>
        <select id="account-did" name="account_did" class="filter-select">
          <option value="">All Accounts</option>
          {% for acc in accounts %}
          <option value="{{ acc.did }}" {% if account_did == acc.did %}selected{% endif %}>{{ acc.handle }}</option>
          {% endfor %}
        </select>
      </div>
      <div class="filter-group">
        <label for="threshold">Threshold:</label>
        <input type="number" id="threshold" name="threshold" min="0.0" max="1.0" step="0.1" value="{{ threshold if threshold is not none and threshold is defined else 0.5 }}" class="filter-input" placeholder="0.5">
      </div>
      <button type="submit" class="btn-apply">Apply Filters</button>
    </form>
  </div>
  <!-- Content Table -->
  {% if items %}
  <div class="table-wrapper">
    <table class="flagged-table">
      <thead>
        <tr>
          <th>Type</th>
          <th>Author</th>
          <th>Content</th>
          <th>Score</th>
          <th>Category</th>
          <th>Created</th>
        </tr>
      </thead>
      <tbody>
        {% for item in items %}
        <tr class="item-row">
          <!-- Type Badge -->
          <td class="col-type">
            <span class="badge badge-{{ item.item_type }}">
              {% if item.item_type == 'post' %}
                Post
              {% elif item.item_type == 'reply' %}
                Reply
              {% elif item.item_type == 'mention' %}
                Mention
              {% endif %}
            </span>
          </td>
          <!-- Author -->
          <td class="col-author">
            {% if item.author_handle %}
            <a href="https://bsky.app/profile/{{ item.author_handle }}" target="_blank" rel="noopener" class="author-link">
              @{{ item.author_handle }}
            </a>
            {% else %}
            <span class="author-did" title="{{ item.author_did }}">{{ item.author_did[:30] }}…</span>
            {% endif %}
            {% if item.item_type == 'mention' and item.mentioned_handle %}
            <span class="mention-arrow">→</span>
            <a href="https://bsky.app/profile/{{ item.mentioned_handle }}" target="_blank" rel="noopener" class="author-link">
              @{{ item.mentioned_handle }}
            </a>
            {% endif %}
          </td>
          <!-- Content Text -->
          <td class="col-text">
            {% if item.source_type == 'post' %}
            <a href="{{ url_for('statuses.detail', encoded_uri=encode_uri(item.item_id)) }}" class="content-link">
              {{ item.text | truncate_text(200) }}
            </a>
            {% else %}
            <span class="content-text">{{ item.text | truncate_text(200) }}</span>
            {% endif %}
          </td>
          <!-- Score with Bar -->
          <td class="col-score">
            <div class="score-bar-container">
              {% set score_pct = (item.overall * 100) | int %}
              {% if item.overall < 0.3 %}
                {% set bar_class = 'score-bar-low' %}
              {% elif item.overall < 0.6 %}
                {% set bar_class = 'score-bar-medium' %}
              {% else %}
                {% set bar_class = 'score-bar-high' %}
              {% endif %}
              <div class="score-bar {{ bar_class }}" style="width: {{ score_pct }}%"></div>
              <span class="score-number">{{ "%.2f" | format(item.overall) }}</span>
            </div>
          </td>
          <!-- Top Category -->
          <td class="col-category">
            {% if item.top_category %}
            <span class="badge badge-category">{{ item.top_category }}</span>
            {% else %}
            <span class="text-muted">—</span>
            {% endif %}
          </td>
          <!-- Created Time -->
          <td class="col-created">
            <span class="time-ago" title="{{ item.created_at | format_dt }}">
              {{ item.created_at | time_ago }}
            </span>
          </td>
        </tr>
        {% endfor %}
      </tbody>
    </table>
  </div>
  <!-- Pagination -->
  {% if total_pages > 1 %}
  <div class="pagination">
    {% if page > 1 %}
    <a href="{{ url_for('analysis.flagged', page=1, content_type=content_type, category=category, account_did=account_did, threshold=threshold) }}" class="btn-pagination">First</a>
    <a href="{{ url_for('analysis.flagged', page=page-1, content_type=content_type, category=category, account_did=account_did, threshold=threshold) }}" class="btn-pagination">Previous</a>
    {% endif %}
    <span class="pagination-info">Page {{ page }} of {{ total_pages }}</span>
    {% if page < total_pages %}
    <a href="{{ url_for('analysis.flagged', page=page+1, content_type=content_type, category=category, account_did=account_did, threshold=threshold) }}" class="btn-pagination">Next</a>
    <a href="{{ url_for('analysis.flagged', page=total_pages, content_type=content_type, category=category, account_did=account_did, threshold=threshold) }}" class="btn-pagination">Last</a>
    {% endif %}
  </div>
  {% endif %}
  {% else %}
  <!-- Empty State -->
  <div class="empty-state">
    <p class="empty-icon">∅</p>
    <p class="empty-text">No flagged content found</p>
    <p class="empty-subtext">Try adjusting your filters or threshold</p>
  </div>
  {% endif %}
 </div>
 {% endblock %}
 {% block extra_css %}
 <style>
  :root {
    --dark-bg: #1a1a2e;
    --dark-card: #16213e;
    --dark-nav: #0f3460;
    --dark-text: #e0e0e0;
    --accent-primary: #00b4d8;
    --badge-post: #00b4d8;
    --badge-reply: #9b59b6;
    --badge-mention: #2ecc71;
    --tox-low: #2ecc71;
    --tox-medium: #f39c12;
    --tox-high: #e74c3c;
  }
  .flagged-container {
    padding: 2rem;
    max-width: 1400px;
    margin: 0 auto;
  }
  /* Page Header */
  .page-header {
    display: flex;
    align-items: center;
    gap: 1rem;
    margin-bottom: 2rem;
  }
  .page-header h1 {
    font-size: 2rem;
    font-weight: 700;
    color: var(--dark-text);
    margin: 0;
  }
  .total-badge {
    background: var(--accent-primary);
    color: var(--dark-bg);
    font-weight: 600;
    padding: 0.5rem 1rem;
    border-radius: 2rem;
    font-size: 0.9rem;
  }
  /* Filter Bar */
  .filter-bar {
    background: var(--dark-card);
    border: 1px solid rgba(255, 255, 255, 0.1);
    border-radius: 0.5rem;
    padding: 1.5rem;
    margin-bottom: 2rem;
  }
  .filter-form {
    display: flex;
    flex-wrap: wrap;
    gap: 1rem;
    align-items: flex-end;
  }
  .filter-group {
    display: flex;
    flex-direction: column;
    gap: 0.5rem;
    flex: 1;
    min-width: 150px;
  }
  .filter-group label {
    font-size: 0.9rem;
    font-weight: 600;
    color: var(--dark-text);
  }
  .filter-select,
  .filter-input {
    background: var(--dark-bg);
    border: 1px solid rgba(255, 255, 255, 0.2);
    border-radius: 0.375rem;
    color: var(--dark-text);
    padding: 0.625rem;
    font-size: 0.9rem;
    font-family: inherit;
  }
  .filter-select:hover,
  .filter-input:hover {
    border-color: rgba(255, 255, 255, 0.3);
  }
  .filter-select:focus,
  .filter-input:focus {
    outline: none;
    border-color: var(--accent-primary);
    background: var(--dark-bg);
    color: var(--dark-text);
  }
  .btn-apply {
    background: var(--accent-primary);
    color: var(--dark-bg);
    border: none;
    border-radius: 0.375rem;
    padding: 0.625rem 1.25rem;
    font-weight: 600;
    cursor: pointer;
    font-size: 0.9rem;
    transition: all 0.2s ease;
  }
  .btn-apply:hover {
    opacity: 0.9;
    transform: translateY(-1px);
  }
  .btn-apply:active {
    transform: translateY(0);
  }
  /* Table Wrapper */
  .table-wrapper {
    background: var(--dark-card);
    border: 1px solid rgba(255, 255, 255, 0.1);
    border-radius: 0.5rem;
    overflow-x: auto;
    margin-bottom: 2rem;
  }
  /* Table Styles */
  .flagged-table {
    width: 100%;
    border-collapse: collapse;
    font-size: 0.9rem;
  }
  .flagged-table thead {
    background: rgba(0, 0, 0, 0.3);
    border-bottom: 2px solid rgba(255, 255, 255, 0.1);
  }
  .flagged-table th {
    padding: 1rem;
    text-align: left;
    font-weight: 600;
    color: var(--dark-text);
    white-space: nowrap;
  }
  .flagged-table td {
    padding: 1rem;
    border-bottom: 1px solid rgba(255, 255, 255, 0.05);
    color: var(--dark-text);
  }
  .flagged-table tbody tr:hover {
    background: rgba(0, 180, 216, 0.05);
  }
  /* Column Styles */
  .col-type {
    width: 90px;
  }
  .col-author {
    width: 200px;
  }
  .col-text {
    min-width: 300px;
  }
  .col-score {
    width: 150px;
  }
  .col-category {
    width: 140px;
  }
  .col-created {
    width: 120px;
  }
  /* Badges */
  .badge {
    display: inline-block;
    padding: 0.35rem 0.75rem;
    border-radius: 0.25rem;
    font-size: 0.8rem;
    font-weight: 600;
    text-transform: uppercase;
    letter-spacing: 0.5px;
  }
  .badge-post {
    background: rgba(0, 180, 216, 0.2);
    color: var(--badge-post);
  }
  .badge-reply {
    background: rgba(155, 89, 182, 0.2);
    color: var(--badge-reply);
  }
  .badge-mention {
    background: rgba(46, 204, 113, 0.2);
    color: var(--badge-mention);
  }
  .badge-category {
    background: rgba(0, 180, 216, 0.15);
    color: var(--accent-primary);
  }
  /* Author Links */
  .author-link {
    color: var(--accent-primary);
    text-decoration: none;
    transition: color 0.2s ease;
  }
  .author-link:hover {
    color: rgba(0, 180, 216, 0.8);
    text-decoration: underline;
  }
  .mention-arrow {
    color: rgba(255, 255, 255, 0.3);
    margin: 0 0.5rem;
  }
  /* Content Link */
  .content-link {
    color: var(--dark-text);
    text-decoration: none;
    transition: color 0.2s ease;
  }
  .content-link:hover {
    color: var(--accent-primary);
  }
  .content-text {
    color: rgba(255, 255, 255, 0.7);
  }
  /* Score Bar */
  .score-bar-container {
    position: relative;
    height: 30px;
    background: rgba(255, 255, 255, 0.05);
    border-radius: 0.25rem;
    overflow: hidden;
    display: flex;
    align-items: center;
    padding: 0 0.5rem;
  }
  .score-bar {
    position: absolute;
    height: 100%;
    left: 0;
    top: 0;
    transition: width 0.3s ease;
  }
  .score-bar-low {
    background: linear-gradient(90deg, rgba(46, 204, 113, 0.3), rgba(46, 204, 113, 0.5));
  }
  .score-bar-medium {
    background: linear-gradient(90deg, rgba(243, 156, 18, 0.3), rgba(243, 156, 18, 0.5));
  }
  .score-bar-high {
    background: linear-gradient(90deg, rgba(231, 76, 60, 0.3), rgba(231, 76, 60, 0.5));
  }
  .score-number {
    position: relative;
    z-index: 1;
    font-weight: 600;
    color: var(--dark-text);
    font-size: 0.85rem;
  }
  /* Time Ago */
  .time-ago {
    color: rgba(255, 255, 255, 0.6);
    font-size: 0.85rem;
    cursor: help;
  }
  .time-ago:hover {
    color: var(--dark-text);
  }
  .text-muted {
    color: rgba(255, 255, 255, 0.3);
  }
  /* Empty State */
  .empty-state {
    text-align: center;
    padding: 4rem 2rem;
    background: var(--dark-card);
    border: 2px dashed rgba(255, 255, 255, 0.2);
    border-radius: 0.5rem;
  }
  .empty-icon {
    font-size: 3rem;
    color: rgba(255, 255, 255, 0.2);
    margin: 0 0 1rem 0;
  }
  .empty-text {
    font-size: 1.2rem;
    font-weight: 600;
    color: var(--dark-text);
    margin: 0 0 0.5rem 0;
  }
  .empty-subtext {
    color: rgba(255, 255, 255, 0.5);
    margin: 0;
  }
  /* Pagination */
  .pagination {
    display: flex;
    justify-content: center;
    align-items: center;
    gap: 1rem;
    margin-top: 2rem;
  }
  .pagination-info {
    color: var(--dark-text);
    font-weight: 500;
    min-width: 150px;
    text-align: center;
  }
  .btn-pagination {
    background: var(--dark-card);
    color: var(--accent-primary);
    border: 1px solid var(--accent-primary);
    padding: 0.5rem 1rem;
    border-radius: 0.375rem;
    text-decoration: none;
    font-weight: 600;
    font-size: 0.9rem;
    transition: all 0.2s ease;
    cursor: pointer;
  }
  .btn-pagination:hover {
    background: var(--accent-primary);
    color: var(--dark-bg);
  }
  .btn-pagination:active {
    transform: scale(0.98);
  }
  /* Responsive */
  @media (max-width: 1024px) {
    .filter-form {
      flex-direction: column;
    }
    .filter-group {
      width: 100%;
    }
    .col-text {
      min-width: 250px;
    }
  }
  @media (max-width: 768px) {
    .flagged-container {
      padding: 1rem;
    }
    .page-header {
      flex-direction: column;
      align-items: flex-start;
    }
    .page-header h1 {
      font-size: 1.5rem;
    }
    .table-wrapper {
      font-size: 0.8rem;
    }
    .flagged-table th,
    .flagged-table td {
      padding: 0.75rem 0.5rem;
    }
    .col-author,
    .col-text {
      min-width: 180px;
    }
    .col-created {
      width: 100px;
    }
  }
 </style>
 {% endblock %}
--- a/src/web/templates/mentions.html
+++ b/src/web/templates/mentions.html
@ -0,0 +1,116 @@
 {% extends "base.html" %}
 {% block title %}Mentions{% endblock %}
 {% block content %}
 <div class="page-header">
  <h1>Mentions</h1>
  <p>Track when monitored accounts are mentioned by other users.</p>
 </div>
 {# ── Filter Bar ──────────────────────────────────────────────────── #}
 <form class="filter-bar" method="get" action="/mentions">
  <select name="account">
    <option value="">All Accounts</option>
    {% for acct in accounts %}
    <option value="{{ acct.did }}" {% if mentioned_did == acct.did %}selected{% endif %}>
      @{{ acct.handle }}
    </option>
    {% endfor %}
  </select>
  <input type="text" name="search" placeholder="Search text..." value="{{ search or '' }}" style="min-width: 200px;">
  <button type="submit" class="btn btn-primary">Apply</button>
 </form>
 <div class="result-count">{{ total | format_number }} mentions found</div>
 {# ── Mentions Table ──────────────────────────────────────────────── #}
 <div class="table-wrapper">
  <table>
    <thead>
      <tr>
        <th>Mentioned Account</th>
        <th>Mentioning User</th>
        <th>Text</th>
        <th>Created</th>
      </tr>
    </thead>
    <tbody>
      {% for mention in mentions %}
      <tr>
        <td>
          {% if mention.mentioned_handle %}
          <a href="https://bsky.app/profile/{{ mention.mentioned_handle }}" target="_blank" rel="noopener">
            @{{ mention.mentioned_handle }}
          </a>
          {% else %}
          <span class="text-muted mono">{{ mention.mentioned_did[:25] }}...</span>
          {% endif %}
        </td>
        <td>
          <span class="text-muted mono" title="{{ mention.mentioning_did }}">
            {{ mention.mentioning_did[:30] }}...
          </span>
        </td>
        <td class="text-preview">
          {% if mention.post_uri %}
          <a href="/statuses/{{ encode_uri(mention.post_uri) }}">
            {{ mention.post_text | truncate_text(200) }}
          </a>
          {% else %}
          {{ mention.post_text | truncate_text(200) }}
          {% endif %}
        </td>
        <td title="{{ mention.post_created_at | format_dt }}">
          {{ mention.post_created_at | time_ago }}
        </td>
      </tr>
      {% else %}
      <tr>
        <td colspan="4" class="table-empty">No mentions found.</td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
 </div>
 {# ── Pagination ──────────────────────────────────────────────────── #}
 {% if total_pages > 1 %}
 <div class="pagination">
  {% if page > 1 %}
    <a href="?account={{ (mentioned_did or '') | urlencode }}&amp;search={{ (search or '') | urlencode }}&amp;page={{ page - 1 }}">Prev</a>
  {% else %}
    <span class="disabled">Prev</span>
  {% endif %}
  {% set start_page = [1, page - 3] | max %}
  {% set end_page = [total_pages, page + 3] | min %}
  {% if start_page > 1 %}
    <a href="?account={{ (mentioned_did or '') | urlencode }}&amp;search={{ (search or '') | urlencode }}&amp;page=1">1</a>
    {% if start_page > 2 %}<span class="ellipsis">...</span>{% endif %}
  {% endif %}
  {% for p in range(start_page, end_page + 1) %}
    {% if p == page %}
      <span class="active">{{ p }}</span>
    {% else %}
      <a href="?account={{ (mentioned_did or '') | urlencode }}&amp;search={{ (search or '') | urlencode }}&amp;page={{ p }}">{{ p }}</a>
    {% endif %}
  {% endfor %}
  {% if end_page < total_pages %}
    {% if end_page < total_pages - 1 %}<span class="ellipsis">...</span>{% endif %}
    <a href="?account={{ (mentioned_did or '') | urlencode }}&amp;search={{ (search or '') | urlencode }}&amp;page={{ total_pages }}">{{ total_pages }}</a>
  {% endif %}
  {% if page < total_pages %}
    <a href="?account={{ (mentioned_did or '') | urlencode }}&amp;search={{ (search or '') | urlencode }}&amp;page={{ page + 1 }}">Next</a>
  {% else %}
    <span class="disabled">Next</span>
  {% endif %}
 </div>
 {% endif %}
 {% endblock %}
--- a/src/web/templates/status_detail.html
+++ b/src/web/templates/status_detail.html
@ -0,0 +1,134 @@
 {% extends "base.html" %}
 {% block title %}Status Detail{% endblock %}
 {% block content %}
 <a href="/statuses" class="back-link">&larr; Back to statuses</a>
 {# ── Main Post Card ──────────────────────────────────────────────── #}
 <div class="card mb-3">
  <div style="display: flex; align-items: center; gap: 0.75rem; margin-bottom: 0.75rem;">
    {% if post.author_handle %}
    <a href="https://bsky.app/profile/{{ post.author_handle }}" target="_blank" rel="noopener"
       style="font-weight: 600; font-size: 1.05rem;">
      @{{ post.author_handle }}
    </a>
    {% else %}
    <span class="text-muted mono">{{ post.author_did }}</span>
    {% endif %}
    {% if post.post_type == 'post' %}
      <span class="badge badge-post">Post</span>
    {% elif post.post_type == 'reply' %}
      <span class="badge badge-reply">Reply</span>
    {% elif post.post_type == 'repost' %}
      <span class="badge badge-repost">Repost</span>
    {% else %}
      <span class="badge">{{ post.post_type }}</span>
    {% endif %}
  </div>
  <div class="text-muted" style="font-size: 0.85rem;">
    {{ post.created_at | format_dt }} ({{ post.created_at | time_ago }})
  </div>
  {# ── Reply context ───────────────────────────────────────────── #}
  {% if post.reply_parent %}
  <div style="margin: 0.75rem 0; padding: 0.6rem 0.85rem; background: rgba(155,89,182,0.08);
              border-radius: 6px; border-left: 3px solid #9b59b6; font-size: 0.85rem;">
    In reply to:
    {% if parent %}
      <a href="/statuses/{{ encode_uri(parent.uri) }}">
        {% if parent.author_handle %}@{{ parent.author_handle }}{% else %}{{ parent.author_did[:30] }}...{% endif %}
        &mdash; {{ parent.text | truncate_text(120) }}
      </a>
    {% else %}
      <span class="text-muted mono">{{ post.reply_parent }}</span>
    {% endif %}
  </div>
  {% endif %}
  {# ── Full post text ──────────────────────────────────────────── #}
  <div class="post-text">{{ post.text }}</div>
  {# ── Engagement stats ────────────────────────────────────────── #}
  <div class="stats-row">
    <div class="stat-item">
      <span>Likes:</span>
      <strong>{{ post.like_count | format_number }}</strong>
    </div>
    <div class="stat-item">
      <span>Replies:</span>
      <strong>{{ post.reply_count | format_number }}</strong>
    </div>
    <div class="stat-item">
      <span>Reposts:</span>
      <strong>{{ post.repost_count | format_number }}</strong>
    </div>
    <div class="stat-item">
      <span>Quotes:</span>
      <strong>{{ post.quote_count | format_number }}</strong>
    </div>
  </div>
  {# ── Additional metadata ─────────────────────────────────────── #}
  <div class="text-muted mt-1" style="font-size: 0.85rem;">
    {% if post.has_media %}<span style="margin-right: 0.75rem;">Has media</span>{% endif %}
    {% if post.has_embed %}<span style="margin-right: 0.75rem;">Has embed</span>{% endif %}
    {% if post.langs %}<span style="margin-right: 0.75rem;">Language: {{ post.langs }}</span>{% endif %}
  </div>
  {# ── External link ───────────────────────────────────────────── #}
  {% set bsky_url = bsky_post_url(post.uri, post.author_handle) %}
  {% if bsky_url %}
  <div class="mt-2">
    <a href="{{ bsky_url }}" target="_blank" rel="noopener" class="btn btn-primary">
      View on Bluesky &rarr;
    </a>
  </div>
  {% endif %}
  <div class="text-muted mt-2" style="font-size: 0.8rem;">
    Indexed: {{ post.indexed_at | format_dt }} &middot;
    Collected: {{ post.collected_at | format_dt }}
  </div>
 </div>
 {# ── Replies Section ─────────────────────────────────────────────── #}
 {% if replies %}
 <h2 class="mb-2">Replies ({{ replies | length }})</h2>
 {% for reply in replies %}
 <div class="card reply-card">
  <div style="display: flex; align-items: center; gap: 0.5rem; margin-bottom: 0.4rem;">
    {% if reply.author_handle %}
    <a href="https://bsky.app/profile/{{ reply.author_handle }}" target="_blank" rel="noopener"
       style="font-weight: 600; font-size: 0.9rem;">
      @{{ reply.author_handle }}
    </a>
    {% else %}
    <span class="text-muted mono" style="font-size: 0.85rem;">{{ reply.author_did[:30] }}...</span>
    {% endif %}
    <span class="text-muted" style="font-size: 0.85rem;">{{ reply.created_at | time_ago }}</span>
  </div>
  <div style="margin-bottom: 0.35rem;">
    <a href="/statuses/{{ encode_uri(reply.uri) }}">
      {{ reply.text | truncate_text(300) }}
    </a>
  </div>
  <div class="text-muted" style="font-size: 0.8rem;">
    Likes: {{ reply.like_count | format_number }} &middot;
    Replies: {{ reply.reply_count | format_number }} &middot;
    Reposts: {{ reply.repost_count | format_number }}
  </div>
 </div>
 {% endfor %}
 {% endif %}
 {# ── Raw JSON ────────────────────────────────────────────────────── #}
 {% if post.raw_json %}
 <details>
  <summary>Raw JSON</summary>
  <pre>{{ post.raw_json | tojson(indent=2) }}</pre>
 </details>
 {% endif %}
 {% endblock %}
--- a/src/web/templates/statuses.html
+++ b/src/web/templates/statuses.html
@ -0,0 +1,143 @@
 {% extends "base.html" %}
 {% block title %}Statuses{% endblock %}
 {% block content %}
 <div class="page-header">
  <h1>Statuses</h1>
  <p>Browse and search collected posts, replies, and reposts.</p>
 </div>
 {# ── Filter Bar ──────────────────────────────────────────────────── #}
 <form class="filter-bar" method="get" action="/statuses">
  <select name="account">
    <option value="">All Accounts</option>
    {% for acct in accounts %}
    <option value="{{ acct.did }}" {% if account_did == acct.did %}selected{% endif %}>
      @{{ acct.handle }}
    </option>
    {% endfor %}
  </select>
  <select name="type">
    <option value="">All Types</option>
    <option value="post" {% if post_type == 'post' %}selected{% endif %}>Post</option>
    <option value="reply" {% if post_type == 'reply' %}selected{% endif %}>Reply</option>
    <option value="repost" {% if post_type == 'repost' %}selected{% endif %}>Repost</option>
  </select>
  <input type="text" name="search" placeholder="Search text..." value="{{ search or '' }}" style="min-width: 200px;">
  <select name="sort">
    <option value="created" {% if sort == 'created' %}selected{% endif %}>Created</option>
    <option value="likes" {% if sort == 'likes' %}selected{% endif %}>Likes</option>
    <option value="replies" {% if sort == 'replies' %}selected{% endif %}>Replies</option>
    <option value="reposts" {% if sort == 'reposts' %}selected{% endif %}>Reposts</option>
  </select>
  <select name="dir">
    <option value="desc" {% if direction == 'desc' %}selected{% endif %}>Desc</option>
    <option value="asc" {% if direction == 'asc' %}selected{% endif %}>Asc</option>
  </select>
  <button type="submit" class="btn btn-primary">Apply</button>
 </form>
 <div class="result-count">{{ total | format_number }} statuses found</div>
 {# ── Posts Table ─────────────────────────────────────────────────── #}
 <div class="table-wrapper">
  <table>
    <thead>
      <tr>
        <th>Author</th>
        <th>Text</th>
        <th>Type</th>
        <th>Created</th>
        <th class="text-right">Likes</th>
        <th class="text-right">Replies</th>
        <th class="text-right">Reposts</th>
      </tr>
    </thead>
    <tbody>
      {% for post in posts %}
      <tr>
        <td>
          {% if post.author_handle %}
          <a href="https://bsky.app/profile/{{ post.author_handle }}" target="_blank" rel="noopener">
            @{{ post.author_handle }}
          </a>
          {% else %}
          <span class="text-muted mono">{{ post.author_did[:20] }}...</span>
          {% endif %}
        </td>
        <td class="text-preview">
          <a href="/statuses/{{ encode_uri(post.uri) }}">
            {{ post.text | truncate_text(200) }}
          </a>
        </td>
        <td>
          {% if post.post_type == 'post' %}
            <span class="badge badge-post">Post</span>
          {% elif post.post_type == 'reply' %}
            <span class="badge badge-reply">Reply</span>
          {% elif post.post_type == 'repost' %}
            <span class="badge badge-repost">Repost</span>
          {% else %}
            <span class="badge">{{ post.post_type }}</span>
          {% endif %}
        </td>
        <td title="{{ post.created_at | format_dt }}">
          {{ post.created_at | time_ago }}
        </td>
        <td class="text-right">{{ post.like_count | format_number }}</td>
        <td class="text-right">{{ post.reply_count | format_number }}</td>
        <td class="text-right">{{ post.repost_count | format_number }}</td>
      </tr>
      {% else %}
      <tr>
        <td colspan="7" class="table-empty">No statuses found.</td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
 </div>
 {# ── Pagination ──────────────────────────────────────────────────── #}
 {% if total_pages > 1 %}
 <div class="pagination">
  {% if page > 1 %}
    <a href="?account={{ account_did }}&type={{ post_type }}&search={{ search | urlencode }}&sort={{ sort }}&dir={{ direction }}&page={{ page - 1 }}">Prev</a>
  {% else %}
    <span class="disabled">Prev</span>
  {% endif %}
  {% set start_page = [1, page - 3] | max %}
  {% set end_page = [total_pages, page + 3] | min %}
  {% if start_page > 1 %}
    <a href="?account={{ account_did }}&type={{ post_type }}&search={{ search | urlencode }}&sort={{ sort }}&dir={{ direction }}&page=1">1</a>
    {% if start_page > 2 %}<span class="ellipsis">...</span>{% endif %}
  {% endif %}
  {% for p in range(start_page, end_page + 1) %}
    {% if p == page %}
      <span class="active">{{ p }}</span>
    {% else %}
      <a href="?account={{ account_did }}&type={{ post_type }}&search={{ search | urlencode }}&sort={{ sort }}&dir={{ direction }}&page={{ p }}">{{ p }}</a>
    {% endif %}
  {% endfor %}
  {% if end_page < total_pages %}
    {% if end_page < total_pages - 1 %}<span class="ellipsis">...</span>{% endif %}
    <a href="?account={{ account_did }}&type={{ post_type }}&search={{ search | urlencode }}&sort={{ sort }}&dir={{ direction }}&page={{ total_pages }}">{{ total_pages }}</a>
  {% endif %}
  {% if page < total_pages %}
    <a href="?account={{ account_did }}&type={{ post_type }}&search={{ search | urlencode }}&sort={{ sort }}&dir={{ direction }}&page={{ page + 1 }}">Next</a>
  {% else %}
    <span class="disabled">Next</span>
  {% endif %}
 </div>
 {% endif %}
 {% endblock %}
--- a/web.Dockerfile
+++ b/web.Dockerfile
@ -0,0 +1,10 @@
 FROM python:3.12-slim
 WORKDIR /app
 COPY requirements-web.txt .
 RUN pip install --no-cache-dir -r requirements-web.txt
 COPY src/ ./src/
 CMD ["gunicorn", "-b", "0.0.0.0:5001", "-w", "2", "src.web.app:create_app()"]