
Toxicity Analysis System

This document describes the toxicity analysis system for the Mastodon collector, adapted from the Bluesky collector implementation.

Overview

The toxicity analysis system uses OpenAI's GPT-4o-mini to classify Mastodon posts across 12 toxicity categories:

  • toxic: rude, disrespectful, or aggressive language
  • threat: threats of violence, harm, or intimidation
  • hate_speech: targeting based on protected characteristics
  • racism: race/ethnicity-based targeting
  • antisemitism: anti-Jewish content
  • islamophobia: anti-Muslim content
  • sexism: gender-based discrimination
  • homophobia: anti-LGBTQ+ content
  • insult: personal attacks and name-calling
  • dehumanization: comparing people to animals/vermin
  • extremism: far-right/left extremist rhetoric
  • ableism: targeting people with disabilities
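
As a rough illustration, the per-status result can be thought of as a map from category name to score. The helper below is an illustrative sketch, not the actual analyzer API; it checks that a model response covers all 12 categories with values in range:

```python
# Illustrative sketch -- the real analyzer module may use different names.
CATEGORIES = [
    "toxic", "threat", "hate_speech", "racism", "antisemitism",
    "islamophobia", "sexism", "homophobia", "insult",
    "dehumanization", "extremism", "ableism",
]

def validate_scores(scores: dict[str, float]) -> dict[str, float]:
    """Check that a response covers every category with a value in [0, 1]."""
    missing = set(CATEGORIES) - scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    for name in CATEGORIES:
        value = scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range: {value}")
    return scores
```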

Architecture

The system consists of:

  1. Analyzer Module (app/analyzer/) - Async batch processor for classification
  2. Database Schema (scripts/02-toxicity.sql) - Toxicity scores and analysis runs
  3. Web Interface - Dashboard and flagged content review
  4. API Endpoints - For manual review of flagged content

Setup

1. Environment Variables

Add to your .env file:

# OpenAI API key for toxicity analysis
OPENAI_API_KEY=sk-...

# Analyzer configuration (optional)
ANALYZER_MODEL=gpt-4o-mini
ANALYZER_BATCH_SIZE=10
ANALYZER_CONCURRENCY=5
ANALYZER_FLAG_THRESHOLD=0.5
ANALYZER_LIMIT=0  # 0 = no limit; set a positive number to test on a limited sample
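
A minimal sketch of how the analyzer might read these variables, with defaults matching the values above (the actual app.analyzer module may parse them differently):

```python
# Sketch only -- defaults mirror the .env example; not the module's real code.
import os
from dataclasses import dataclass

@dataclass
class AnalyzerConfig:
    model: str
    batch_size: int
    concurrency: int
    flag_threshold: float
    limit: int  # 0 means no limit

def load_config(env=os.environ) -> AnalyzerConfig:
    return AnalyzerConfig(
        model=env.get("ANALYZER_MODEL", "gpt-4o-mini"),
        batch_size=int(env.get("ANALYZER_BATCH_SIZE", "10")),
        concurrency=int(env.get("ANALYZER_CONCURRENCY", "5")),
        flag_threshold=float(env.get("ANALYZER_FLAG_THRESHOLD", "0.5")),
        limit=int(env.get("ANALYZER_LIMIT", "0")),
    )
```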

2. Database Migration

The toxicity schema is applied automatically when the analyzer runs for the first time. It creates:

  • toxicity_scores table - stores scores for each status
  • analysis_runs table - audit trail of analysis runs

To manually apply the migration:

docker exec -i mastodon-collector-db-1 psql -U collector -d mastodon_collector < scripts/02-toxicity.sql

3. Install Dependencies

Dependencies are already added to requirements.txt:

  • openai==1.58.1 - OpenAI API client
  • asyncpg==0.30.0 - Async PostgreSQL driver

Rebuild the Docker containers to install:

docker-compose build
docker-compose up -d

Running the Analyzer

One-Time Analysis

Run the analyzer manually to score all unscored statuses:

docker exec mastodon-collector-collector-1 python -m app.analyzer

Test on Limited Sample

To test on 100 statuses first:

docker exec mastodon-collector-collector-1 bash -c "ANALYZER_LIMIT=100 python -m app.analyzer"

Automated Analysis (Future)

You can schedule the analyzer to run periodically using cron or a scheduler service. For example, add to your docker-compose.yml:

  analyzer:
    build: .
    command: python -m app.analyzer
    environment:
      - DATABASE_URL=postgresql://collector:${POSTGRES_PASSWORD}@db:5432/mastodon_collector
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANALYZER_LIMIT=${ANALYZER_LIMIT:-0}
    depends_on:
      - db
    restart: "no"  # Run once, don't restart

Then trigger manually:

docker-compose run --rm analyzer

Web Interface

Analysis Dashboard

Visit http://localhost:8585/analysis to see:

  • Overall statistics (total scored, flagged count, averages)
  • Toxicity trends over time
  • Category breakdown chart
  • Recent analysis runs

Flagged Content Review

Visit http://localhost:8585/analysis/flagged to:

  • Browse flagged content (threshold >= 0.5 by default)
  • Filter by category, account, date range, review status
  • Sort by overall toxicity or specific categories
  • Manually review and mark items as:
    • ✓ Correct (correctly flagged)
    • ✗ Incorrect (false positive)
    • ? Unsure
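
The filtering and sorting behaviour of the flagged view can be sketched in Python (the real route does this in SQL; the field names here are illustrative):

```python
# Illustrative sketch of the flagged-content query logic, not the route's code.
def filter_flagged(items, category=None, min_score=0.5, review_status=None):
    """Return flagged items, optionally filtered, sorted by score descending."""
    key = category or "overall"
    selected = [
        it for it in items
        if it["scores"].get(key, 0.0) >= min_score
        and (review_status is None or it["review_status"] == review_status)
    ]
    return sorted(selected, key=lambda it: it["scores"][key], reverse=True)
```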

Review Workflow

  1. Click on flagged items to review
  2. Use the review buttons (✓, ✗, ?) to mark your assessment
  3. Filter by review_status=unreviewed to focus on items needing review
  4. Use reviewed data to improve the classifier or adjust thresholds

Cost Estimation

Based on GPT-4o-mini pricing (as of Jan 2025):

  • Input: $0.150 per 1M tokens
  • Output: $0.600 per 1M tokens

Typical costs:

  • ~1,000 statuses = $0.05-0.15
  • ~10,000 statuses = $0.50-1.50

The analyzer logs estimated costs after each run.
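
As a back-of-the-envelope check, the ranges above follow from simple token arithmetic. The per-status token counts below are assumptions for illustration, not measured figures:

```python
# Rough cost estimate for GPT-4o-mini at Jan 2025 prices.
# The per-status token counts are assumptions, not measurements.
INPUT_PRICE_PER_M = 0.150   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.600  # USD per 1M output tokens

def estimate_cost(n_statuses: int,
                  input_tokens_per_status: int = 400,
                  output_tokens_per_status: int = 100) -> float:
    input_cost = n_statuses * input_tokens_per_status / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = n_statuses * output_tokens_per_status / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost
```

Under these assumptions, 10,000 statuses cost about $1.20 and 1,000 statuses about $0.12, consistent with the ranges above.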

Architecture Details

Batch Processing

The analyzer processes statuses in batches (default: 10 per API call) with concurrency control (default: 5 simultaneous batches). This optimizes for:

  • Cost efficiency (batch API calls)
  • Rate limit compliance
  • Parallel processing speed
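
The batching-plus-concurrency pattern can be sketched with an asyncio.Semaphore. This is a simplified stand-in for the analyzer's internals; classify_batch represents the OpenAI call:

```python
import asyncio

async def run_batches(statuses, classify_batch, batch_size=10, concurrency=5):
    """Split statuses into batches and classify up to `concurrency` batches at once."""
    semaphore = asyncio.Semaphore(concurrency)
    batches = [statuses[i:i + batch_size] for i in range(0, len(statuses), batch_size)]

    async def guarded(batch):
        async with semaphore:
            return await classify_batch(batch)

    # gather preserves batch order, so results line up with the input statuses
    results = await asyncio.gather(*(guarded(b) for b in batches))
    return [score for batch_scores in results for score in batch_scores]
```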

Scoring Logic

Each status receives:

  • 12 category scores (0.0 - 1.0)
  • Overall score = max of all categories
  • Flagged if overall >= threshold (default 0.5)
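
In code, the scoring rule amounts to the following (a sketch; the actual analyzer may structure this differently):

```python
def score_status(category_scores: dict[str, float], threshold: float = 0.5):
    """Overall score is the maximum category score; flag when it meets the threshold."""
    overall = max(category_scores.values())
    return overall, overall >= threshold
```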

Human Review

Manual reviews help:

  • Validate AI classifications
  • Identify patterns of false positives
  • Build training data for future improvements
  • Adjust thresholds per category if needed

Dutch Language Support

The classification prompt is specifically tuned to handle Dutch political content, including:

  • Dutch slang and coded terms such as "gelukszoekers" ("fortune seekers", a derogatory term for asylum seekers), "omvolking" ("population replacement", a conspiracy term), and "wappie" (roughly "conspiracy crank")
  • Political context and satire
  • Zwarte Piet debates
  • Dutch far-right rhetoric

Templates

The Bluesky collector templates can be adapted for Mastodon. Key files to create:

  1. app/templates/analysis.html - Main dashboard
  2. app/templates/flagged.html - Flagged content browser

These templates should include:

  • Chart.js for visualizations
  • Filter forms for exploration
  • Review buttons for manual validation

Troubleshooting

No statuses being scored

  • Check that statuses exist: SELECT COUNT(*) FROM statuses WHERE content IS NOT NULL AND reblog_of_id IS NULL;
  • Check migration applied: \dt toxicity_scores in psql
  • Check OPENAI_API_KEY is set

Rate limit errors

  • Reduce ANALYZER_CONCURRENCY (try 2-3)
  • Reduce ANALYZER_BATCH_SIZE (try 5)
  • The analyzer retries with exponential backoff automatically
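
The retry behaviour can be sketched as a generic exponential-backoff helper (this is not the analyzer's actual implementation):

```python
import asyncio
import random

async def with_backoff(coro_factory, max_retries=5, base_delay=1.0,
                       retryable=(Exception,)):
    """Retry an async call with exponential backoff plus a little jitter.

    `coro_factory` is a zero-argument callable returning a fresh coroutine
    on each attempt (a coroutine object cannot be awaited twice)."""
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except retryable:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```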

High false positive rate

  • Increase ANALYZER_FLAG_THRESHOLD (try 0.6 or 0.7)
  • Review flagged items and look for patterns
  • Dutch political content can be intense but not necessarily toxic

Template errors

  • Ensure templates exist in app/templates/
  • Check that analysis helper functions are imported correctly
  • Verify template filters are defined (format_number, time_ago, etc.)

Next Steps

  1. Copy analysis templates from Bluesky collector to app/templates/
  2. Add navigation links to analysis dashboard in base template
  3. Run initial analysis on sample data
  4. Review flagged content and adjust thresholds
  5. Set up automated analysis runs (cron/scheduler)
  6. Monitor costs and performance
