
Toxicity Analysis System

This document describes the toxicity analysis system for the Mastodon collector, adapted from the Bluesky collector implementation.

Overview

The toxicity analysis system uses OpenAI's GPT-4o-mini to classify Mastodon posts across 12 toxicity categories:

  • toxic: rude, disrespectful, or aggressive language
  • threat: threats of violence, harm, or intimidation
  • hate_speech: targeting based on protected characteristics
  • racism: race/ethnicity-based targeting
  • antisemitism: anti-Jewish content
  • islamophobia: anti-Muslim content
  • sexism: gender-based discrimination
  • homophobia: anti-LGBTQ+ content
  • insult: personal attacks and name-calling
  • dehumanization: comparing people to animals/vermin
  • extremism: far-right/left extremist rhetoric
  • ableism: targeting people with disabilities
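
As a rough illustration, the per-status result can be thought of as a map from category name to score. The helper below is an illustrative sketch, not the actual analyzer API; it checks that a model response covers all 12 categories with values in range:

```python
# Illustrative sketch -- the real analyzer module may use different names.
CATEGORIES = [
    "toxic", "threat", "hate_speech", "racism", "antisemitism",
    "islamophobia", "sexism", "homophobia", "insult",
    "dehumanization", "extremism", "ableism",
]

def validate_scores(scores: dict[str, float]) -> dict[str, float]:
    """Check that a response covers every category with a value in [0, 1]."""
    missing = set(CATEGORIES) - scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    for name in CATEGORIES:
        value = scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range: {value}")
    return scores
```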

Architecture

The system consists of:

  1. Analyzer Module (app/analyzer/) - Async batch processor for classification
  2. Database Schema (scripts/02-toxicity.sql) - Toxicity scores and analysis runs
  3. Web Interface - Dashboard and flagged content review
  4. API Endpoints - For manual review of flagged content

Setup

1. Environment Variables

Add to your .env file:

# OpenAI API key for toxicity analysis
OPENAI_API_KEY=sk-...

# Analyzer configuration (optional)
ANALYZER_MODEL=gpt-4o-mini
ANALYZER_BATCH_SIZE=10
ANALYZER_CONCURRENCY=5
ANALYZER_FLAG_THRESHOLD=0.5
ANALYZER_LIMIT=0  # 0 = no limit; set a positive number to test on a limited sample
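
A minimal sketch of how the analyzer might read these variables, with defaults matching the values above (the actual app.analyzer module may parse them differently):

```python
# Sketch only -- defaults mirror the .env example; not the module's real code.
import os
from dataclasses import dataclass

@dataclass
class AnalyzerConfig:
    model: str
    batch_size: int
    concurrency: int
    flag_threshold: float
    limit: int  # 0 means no limit

def load_config(env=os.environ) -> AnalyzerConfig:
    return AnalyzerConfig(
        model=env.get("ANALYZER_MODEL", "gpt-4o-mini"),
        batch_size=int(env.get("ANALYZER_BATCH_SIZE", "10")),
        concurrency=int(env.get("ANALYZER_CONCURRENCY", "5")),
        flag_threshold=float(env.get("ANALYZER_FLAG_THRESHOLD", "0.5")),
        limit=int(env.get("ANALYZER_LIMIT", "0")),
    )
```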

2. Database Migration

The toxicity schema is applied automatically when the analyzer runs for the first time. It creates:

  • toxicity_scores table - stores scores for each status
  • analysis_runs table - audit trail of analysis runs

To manually apply the migration:

docker exec -i mastodon-collector-db-1 psql -U collector -d mastodon_collector < scripts/02-toxicity.sql

3. Install Dependencies

Dependencies are already added to requirements.txt:

  • openai==1.58.1 - OpenAI API client
  • asyncpg==0.30.0 - Async PostgreSQL driver

Rebuild the Docker containers to install:

docker-compose build
docker-compose up -d

Running the Analyzer

One-Time Analysis

Run the analyzer manually to score all unscored statuses:

docker exec mastodon-collector-collector-1 python -m app.analyzer

Test on Limited Sample

To test on 100 statuses first:

docker exec mastodon-collector-collector-1 bash -c "ANALYZER_LIMIT=100 python -m app.analyzer"

Automated Analysis (Future)

You can schedule the analyzer to run periodically using cron or a scheduler service. For example, add to your docker-compose.yml:

  analyzer:
    build: .
    command: python -m app.analyzer
    environment:
      - DATABASE_URL=postgresql://collector:${POSTGRES_PASSWORD}@db:5432/mastodon_collector
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANALYZER_LIMIT=${ANALYZER_LIMIT:-0}
    depends_on:
      - db
    restart: "no"  # Run once, don't restart

Then trigger manually:

docker-compose run --rm analyzer

Web Interface

Analysis Dashboard

Visit http://localhost:8585/analysis to see:

  • Overall statistics (total scored, flagged count, averages)
  • Toxicity trends over time
  • Category breakdown chart
  • Recent analysis runs

Flagged Content Review

Visit http://localhost:8585/analysis/flagged to:

  • Browse flagged content (threshold >= 0.5 by default)
  • Filter by category, account, date range, review status
  • Sort by overall toxicity or specific categories
  • Manually review and mark items as:
    • ✓ Correct (correctly flagged)
    • ✗ Incorrect (false positive)
    • ? Unsure
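
The filtering and sorting behaviour of the flagged view can be sketched in Python (the real route does this in SQL; the field names here are illustrative):

```python
# Illustrative sketch of the flagged-content query logic, not the route's code.
def filter_flagged(items, category=None, min_score=0.5, review_status=None):
    """Return flagged items, optionally filtered, sorted by score descending."""
    key = category or "overall"
    selected = [
        it for it in items
        if it["scores"].get(key, 0.0) >= min_score
        and (review_status is None or it["review_status"] == review_status)
    ]
    return sorted(selected, key=lambda it: it["scores"][key], reverse=True)
```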

Review Workflow

  1. Click on flagged items to review
  2. Use the review buttons (✓, ✗, ?) to mark your assessment
  3. Filter by review_status=unreviewed to focus on items needing review
  4. Use reviewed data to improve the classifier or adjust thresholds

Cost Estimation

Based on GPT-4o-mini pricing (as of Jan 2025):

  • Input: $0.150 per 1M tokens
  • Output: $0.600 per 1M tokens

Typical costs:

  • ~1,000 statuses = $0.05-0.15
  • ~10,000 statuses = $0.50-1.50

The analyzer logs estimated costs after each run.
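
As a back-of-the-envelope check, the ranges above follow from simple token arithmetic. The per-status token counts below are assumptions for illustration, not measured figures:

```python
# Rough cost estimate for GPT-4o-mini at Jan 2025 prices.
# The per-status token counts are assumptions, not measurements.
INPUT_PRICE_PER_M = 0.150   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.600  # USD per 1M output tokens

def estimate_cost(n_statuses: int,
                  input_tokens_per_status: int = 400,
                  output_tokens_per_status: int = 100) -> float:
    input_cost = n_statuses * input_tokens_per_status / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = n_statuses * output_tokens_per_status / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost
```

Under these assumptions, 10,000 statuses cost about $1.20 and 1,000 statuses about $0.12, consistent with the ranges above.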

Architecture Details

Batch Processing

The analyzer processes statuses in batches (default: 10 per API call) with concurrency control (default: 5 simultaneous batches). This optimizes for:

  • Cost efficiency (batch API calls)
  • Rate limit compliance
  • Parallel processing speed
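
The batching-plus-concurrency pattern can be sketched with an asyncio.Semaphore. This is a simplified stand-in for the analyzer's internals; classify_batch represents the OpenAI call:

```python
import asyncio

async def run_batches(statuses, classify_batch, batch_size=10, concurrency=5):
    """Split statuses into batches and classify up to `concurrency` batches at once."""
    semaphore = asyncio.Semaphore(concurrency)
    batches = [statuses[i:i + batch_size] for i in range(0, len(statuses), batch_size)]

    async def guarded(batch):
        async with semaphore:
            return await classify_batch(batch)

    # gather preserves batch order, so results line up with the input statuses
    results = await asyncio.gather(*(guarded(b) for b in batches))
    return [score for batch_scores in results for score in batch_scores]
```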

Scoring Logic

Each status receives:

  • 12 category scores (0.0 - 1.0)
  • Overall score = max of all categories
  • Flagged if overall >= threshold (default 0.5)
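
In code, the scoring rule amounts to the following (a sketch; the actual analyzer may structure this differently):

```python
def score_status(category_scores: dict[str, float], threshold: float = 0.5):
    """Overall score is the maximum category score; flag when it meets the threshold."""
    overall = max(category_scores.values())
    return overall, overall >= threshold
```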

Human Review

Manual reviews help:

  • Validate AI classifications
  • Identify patterns of false positives
  • Build training data for future improvements
  • Adjust thresholds per category if needed

Dutch Language Support

The classification prompt is specifically tuned to handle Dutch political content, including:

  • Dutch slang and coded terms such as "gelukszoekers" ("fortune seekers", a derogatory term for asylum seekers), "omvolking" ("population replacement", a conspiracy term), and "wappie" (roughly "conspiracy crank")
  • Political context and satire
  • Zwarte Piet debates
  • Dutch far-right rhetoric

Templates

The Bluesky collector templates can be adapted for Mastodon. Key files to create:

  1. app/templates/analysis.html - Main dashboard
  2. app/templates/flagged.html - Flagged content browser

These templates should include:

  • Chart.js for visualizations
  • Filter forms for exploration
  • Review buttons for manual validation

Troubleshooting

No statuses being scored

  • Check that statuses exist: SELECT COUNT(*) FROM statuses WHERE content IS NOT NULL AND reblog_of_id IS NULL;
  • Check migration applied: \dt toxicity_scores in psql
  • Check OPENAI_API_KEY is set

Rate limit errors

  • Reduce ANALYZER_CONCURRENCY (try 2-3)
  • Reduce ANALYZER_BATCH_SIZE (try 5)
  • The analyzer retries with exponential backoff automatically
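
The retry behaviour can be sketched as a generic exponential-backoff helper (this is not the analyzer's actual implementation):

```python
import asyncio
import random

async def with_backoff(coro_factory, max_retries=5, base_delay=1.0,
                       retryable=(Exception,)):
    """Retry an async call with exponential backoff plus a little jitter.

    `coro_factory` is a zero-argument callable returning a fresh coroutine
    on each attempt (a coroutine object cannot be awaited twice)."""
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except retryable:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```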

High false positive rate

  • Increase ANALYZER_FLAG_THRESHOLD (try 0.6 or 0.7)
  • Review flagged items and look for patterns
  • Dutch political content can be intense but not necessarily toxic

Template errors

  • Ensure templates exist in app/templates/
  • Check that analysis helper functions are imported correctly
  • Verify template filters are defined (format_number, time_ago, etc.)

Next Steps

  1. Copy analysis templates from Bluesky collector to app/templates/
  2. Add navigation links to analysis dashboard in base template
  3. Run initial analysis on sample data
  4. Review flagged content and adjust thresholds
  5. Set up automated analysis runs (cron/scheduler)
  6. Monitor costs and performance
