From ac2b50751b6c6a861b04991b1b06cf2d2cb11f92 Mon Sep 17 00:00:00 2001
From: Pieter <pieter@kolabnow.com>
Date: Tue, 31 Mar 2026 09:25:18 +0200
Subject: [PATCH] Fix flagged page styling by adding CSS/JS blocks to base
 template
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add {% block extra_css %} to base.html head section
- Add {% block extra_js %} to base.html before closing body tag
- Add encode_uri template filter for URL encoding

Flagged page CSS and JavaScript now load correctly, fixing:
- Filter bar styling
- Table formatting
- Review button styles and functionality

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
---
 TOXICITY_ANALYSIS.md    | 242 ----------------------------------------
 app/templates/base.html |   2 +
 app/web.py              |   7 ++
 3 files changed, 9 insertions(+), 242 deletions(-)
 delete mode 100644 TOXICITY_ANALYSIS.md

diff --git a/TOXICITY_ANALYSIS.md b/TOXICITY_ANALYSIS.md
deleted file mode 100644
index 8241a3a..0000000
--- a/TOXICITY_ANALYSIS.md
+++ /dev/null
@@ -1,242 +0,0 @@
-# Toxicity Analysis System
-
-This document describes the toxicity analysis system for the Mastodon collector, adapted from the Bluesky collector implementation.
-
-## Overview
-
-The toxicity analysis system uses OpenAI's GPT-4o-mini to classify Mastodon posts across 12 toxicity categories:
-
-- **toxic**: rude, disrespectful, or aggressive language
-- **threat**: threats of violence, harm, or intimidation
-- **hate_speech**: targeting based on protected characteristics
-- **racism**: race/ethnicity-based targeting
-- **antisemitism**: anti-Jewish content
-- **islamophobia**: anti-Muslim content
-- **sexism**: gender-based discrimination
-- **homophobia**: anti-LGBTQ+ content
-- **insult**: personal attacks and name-calling
-- **dehumanization**: comparing people to animals/vermin
-- **extremism**: far-right/left extremist rhetoric
-- **ableism**: targeting people with disabilities
-
-## Architecture
-
-The system consists of:
-
-1. **Analyzer Module** (`app/analyzer/`) - Async batch processor for classification
-2. **Database Schema** (`scripts/02-toxicity.sql`) - Toxicity scores and analysis runs
-3. **Web Interface** - Dashboard and flagged content review
-4. **API Endpoints** - For manual review of flagged content
-
-## Setup
-
-### 1. Environment Variables
-
-Add to your `.env` file:
-
-```bash
-# OpenAI API key for toxicity analysis
-OPENAI_API_KEY=sk-...
-
-# Analyzer configuration (optional)
-ANALYZER_MODEL=gpt-4o-mini
-ANALYZER_BATCH_SIZE=10
-ANALYZER_CONCURRENCY=5
-ANALYZER_FLAG_THRESHOLD=0.5
-ANALYZER_LIMIT=0  # 0 = no limit, or set to test on limited number
-```
-
-### 2. Database Migration
-
-The toxicity schema is applied automatically when the analyzer runs for the first time. It creates:
-
-- `toxicity_scores` table - stores scores for each status
-- `analysis_runs` table - audit trail of analysis runs
-
-To manually apply the migration:
-
-```bash
-docker exec -i mastodon-collector-db-1 psql -U collector -d mastodon_collector < scripts/02-toxicity.sql
-```
-
-### 3. Install Dependencies
-
-Dependencies are already added to `requirements.txt`:
-- `openai==1.58.1` - OpenAI API client
-- `asyncpg==0.30.0` - Async PostgreSQL driver
-
-Rebuild the Docker containers to install:
-
-```bash
-docker-compose build
-docker-compose up -d
-```
-
-## Running the Analyzer
-
-### One-Time Analysis
-
-Run the analyzer manually to score all unscored statuses:
-
-```bash
-docker exec mastodon-collector-collector-1 python -m app.analyzer
-```
-
-### Test on Limited Sample
-
-To test on 100 statuses first:
-
-```bash
-docker exec mastodon-collector-collector-1 bash -c "ANALYZER_LIMIT=100 python -m app.analyzer"
-```
-
-### Automated Analysis (Future)
-
-You can schedule the analyzer to run periodically using cron or a scheduler service. For example, add to your `docker-compose.yml`:
-
-```yaml
-  analyzer:
-    build: .
-    command: python -m app.analyzer
-    environment:
-      - DATABASE_URL=postgresql://collector:${POSTGRES_PASSWORD}@db:5432/mastodon_collector
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
-      - ANALYZER_LIMIT=${ANALYZER_LIMIT:-0}
-    depends_on:
-      - db
-    restart: "no"  # Run once, don't restart
-```
-
-Then trigger manually:
-```bash
-docker-compose run --rm analyzer
-```
-
-## Web Interface
-
-### Analysis Dashboard
-
-Visit http://localhost:8585/analysis to see:
-
-- Overall statistics (total scored, flagged count, averages)
-- Toxicity trends over time
-- Category breakdown chart
-- Recent analysis runs
-
-### Flagged Content Review
-
-Visit http://localhost:8585/analysis/flagged to:
-
-- Browse flagged content (threshold >= 0.5 by default)
-- Filter by category, account, date range, review status
-- Sort by overall toxicity or specific categories
-- Manually review and mark items as:
-  - ✓ Correct (correctly flagged)
-  - ✗ Incorrect (false positive)
-  - ? Unsure
-
-### Review Workflow
-
-1. Click on flagged items to review
-2. Use the review buttons (✓, ✗, ?) to mark your assessment
-3. Filter by `review_status=unreviewed` to focus on items needing review
-4. Use reviewed data to improve the classifier or adjust thresholds
-
-## Cost Estimation
-
-Based on GPT-4o-mini pricing (as of Jan 2025):
-- Input: $0.150 per 1M tokens
-- Output: $0.600 per 1M tokens
-
-Typical costs:
-- ~1,000 statuses = $0.05-0.15
-- ~10,000 statuses = $0.50-1.50
-
-The analyzer logs estimated costs after each run.
-
-## Architecture Details
-
-### Batch Processing
-
-The analyzer processes statuses in batches (default: 10 per API call) with concurrency control (default: 5 simultaneous batches). This optimizes for:
-
-- Cost efficiency (batch API calls)
-- Rate limit compliance
-- Parallel processing speed
-
-### Scoring Logic
-
-Each status receives:
-- 12 category scores (0.0 - 1.0)
-- Overall score = max of all categories
-- Flagged if overall >= threshold (default 0.5)
-
-### Human Review
-
-Manual reviews help:
-- Validate AI classifications
-- Identify patterns of false positives
-- Build training data for future improvements
-- Adjust thresholds per category if needed
-
-## Dutch Language Support
-
-The classifier is specifically trained to handle Dutch political content, including:
-
-- Dutch slang and coded terms ("gelukszoekers", "omvolking", "wappie", etc.)
-- Political context and satire
-- Zwarte Piet debates
-- Dutch far-right rhetoric
-
-## Templates
-
-The Bluesky collector templates can be adapted for Mastodon. Key files to create:
-
-1. `app/templates/analysis.html` - Main dashboard
-2. `app/templates/flagged.html` - Flagged content browser
-
-These templates should include:
-- Chart.js for visualizations
-- Filter forms for exploration
-- Review buttons for manual validation
-
-## Troubleshooting
-
-### No statuses being scored
-
-- Check that statuses exist: `SELECT COUNT(*) FROM statuses WHERE content IS NOT NULL AND reblog_of_id IS NULL;`
-- Check migration applied: `\dt toxicity_scores` in psql
-- Check OPENAI_API_KEY is set
-
-### Rate limit errors
-
-- Reduce `ANALYZER_CONCURRENCY` (try 2-3)
-- Reduce `ANALYZER_BATCH_SIZE` (try 5)
-- The analyzer retries with exponential backoff automatically
-
-### High false positive rate
-
-- Increase `ANALYZER_FLAG_THRESHOLD` (try 0.6 or 0.7)
-- Review flagged items and look for patterns
-- Dutch political content can be intense but not necessarily toxic
-
-### Template errors
-
-- Ensure templates exist in `app/templates/`
-- Check that analysis helper functions are imported correctly
-- Verify template filters are defined (`format_number`, `time_ago`, etc.)
-
-## Next Steps
-
-1. Copy analysis templates from Bluesky collector to `app/templates/`
-2. Add navigation links to analysis dashboard in base template
-3. Run initial analysis on sample data
-4. Review flagged content and adjust thresholds
-5. Set up automated analysis runs (cron/scheduler)
-6. Monitor costs and performance
-
-## References
-
-- Bluesky collector: https://forgejo.postxsociety.cloud/pieter/bluesky-collector
-- OpenAI API: https://platform.openai.com/docs
-- asyncpg: https://magicstack.github.io/asyncpg/
diff --git a/app/templates/base.html b/app/templates/base.html
index 462be28..5ced6db 100644
--- a/app/templates/base.html
+++ b/app/templates/base.html
@@ -234,6 +234,7 @@
         .justify-between { justify-content: space-between; }
         .truncate { white-space: nowrap; overflow: hidden; text-overflow: ellipsis; max-width: 400px; }
     </style>
+    {% block extra_css %}{% endblock %}
 </head>
 <body>
     <nav>
@@ -259,5 +260,6 @@
             {% block content %}{% endblock %}
         </div>
     </main>
+    {% block extra_js %}{% endblock %}
 </body>
 </html>
diff --git a/app/web.py b/app/web.py
index 3b26a25..cebd4bb 100644
--- a/app/web.py
+++ b/app/web.py
@@ -75,6 +75,13 @@ def truncate_text(text, length=200):
     return text[:length] + "..."
 
 
+@app.template_filter('encode_uri')
+def encode_uri(uri):
+    """URL encode a URI for use in query parameters."""
+    from urllib.parse import quote
+    return quote(str(uri), safe='')
+
+
 # Initialize database on startup
 with app.app_context():
     init_db()