# Toxicity Analysis System

This document describes the toxicity analysis system for the Mastodon collector, adapted from the Bluesky collector implementation.

## Overview

The toxicity analysis system uses OpenAI's GPT-4o-mini to classify Mastodon posts across 12 toxicity categories:

- **toxic**: rude, disrespectful, or aggressive language
- **threat**: threats of violence, harm, or intimidation
- **hate_speech**: targeting based on protected characteristics
- **racism**: race/ethnicity-based targeting
- **antisemitism**: anti-Jewish content
- **islamophobia**: anti-Muslim content
- **sexism**: gender-based discrimination
- **homophobia**: anti-LGBTQ+ content
- **insult**: personal attacks and name-calling
- **dehumanization**: comparing people to animals/vermin
- **extremism**: far-right/far-left extremist rhetoric
- **ableism**: targeting people with disabilities
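
If it helps to keep prompts, database columns, and the dashboard in sync, the category set can live in a single constant. A minimal sketch — the name `TOXICITY_CATEGORIES` is an assumption for illustration, not necessarily the identifier used in `app/analyzer/`:

```python
# Hypothetical constant listing the 12 categories documented above;
# the actual name and location in app/analyzer/ may differ.
TOXICITY_CATEGORIES = [
    "toxic", "threat", "hate_speech", "racism",
    "antisemitism", "islamophobia", "sexism", "homophobia",
    "insult", "dehumanization", "extremism", "ableism",
]
```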

## Architecture

The system consists of:

1. **Analyzer Module** (`app/analyzer/`) - Async batch processor for classification
2. **Database Schema** (`scripts/02-toxicity.sql`) - Toxicity scores and analysis runs
3. **Web Interface** - Dashboard and flagged content review
4. **API Endpoints** - For manual review of flagged content

## Setup

### 1. Environment Variables

Add to your `.env` file:

```bash
# OpenAI API key for toxicity analysis
OPENAI_API_KEY=sk-...

# Analyzer configuration (optional)
ANALYZER_MODEL=gpt-4o-mini
ANALYZER_BATCH_SIZE=10
ANALYZER_CONCURRENCY=5
ANALYZER_FLAG_THRESHOLD=0.5
ANALYZER_LIMIT=0  # 0 = no limit; set a positive number to test on a sample
```
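
Inside the analyzer these settings are ordinary environment variables, so reading them with the documented defaults could look like this (a sketch; the real config handling in `app/analyzer` may be structured differently):

```python
import os

# Fall back to the documented defaults when a variable is unset.
MODEL = os.getenv("ANALYZER_MODEL", "gpt-4o-mini")
BATCH_SIZE = int(os.getenv("ANALYZER_BATCH_SIZE", "10"))
CONCURRENCY = int(os.getenv("ANALYZER_CONCURRENCY", "5"))
FLAG_THRESHOLD = float(os.getenv("ANALYZER_FLAG_THRESHOLD", "0.5"))
LIMIT = int(os.getenv("ANALYZER_LIMIT", "0"))  # 0 = no limit
```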

### 2. Database Migration

The toxicity schema is applied automatically when the analyzer runs for the first time. It creates:

- `toxicity_scores` table - stores scores for each status
- `analysis_runs` table - audit trail of analysis runs

To manually apply the migration:

```bash
docker exec -i mastodon-collector-db-1 psql -U collector -d mastodon_collector < scripts/02-toxicity.sql
```

### 3. Install Dependencies

Dependencies are already added to `requirements.txt`:

- `openai==1.58.1` - OpenAI API client
- `asyncpg==0.30.0` - Async PostgreSQL driver

Rebuild the Docker containers to install:

```bash
docker-compose build
docker-compose up -d
```

## Running the Analyzer

### One-Time Analysis

Run the analyzer manually to score all unscored statuses:

```bash
docker exec mastodon-collector-collector-1 python -m app.analyzer
```

### Test on a Limited Sample

To test on 100 statuses first:

```bash
docker exec mastodon-collector-collector-1 bash -c "ANALYZER_LIMIT=100 python -m app.analyzer"
```

### Automated Analysis (Future)

You can schedule the analyzer to run periodically using cron or a scheduler service. For example, add to your `docker-compose.yml`:

```yaml
analyzer:
  build: .
  command: python -m app.analyzer
  environment:
    - DATABASE_URL=postgresql://collector:${POSTGRES_PASSWORD}@db:5432/mastodon_collector
    - OPENAI_API_KEY=${OPENAI_API_KEY}
    - ANALYZER_LIMIT=${ANALYZER_LIMIT:-0}
  depends_on:
    - db
  restart: "no"  # Run once, don't restart
```

Then trigger manually:

```bash
docker-compose run --rm analyzer
```

## Web Interface

### Analysis Dashboard

Visit http://localhost:8585/analysis to see:

- Overall statistics (total scored, flagged count, averages)
- Toxicity trends over time
- Category breakdown chart
- Recent analysis runs

### Flagged Content Review

Visit http://localhost:8585/analysis/flagged to:

- Browse flagged content (threshold >= 0.5 by default)
- Filter by category, account, date range, review status
- Sort by overall toxicity or specific categories
- Manually review and mark items as:
  - ✓ Correct (correctly flagged)
  - ✗ Incorrect (false positive)
  - ? Unsure

### Review Workflow

1. Click on flagged items to review
2. Use the review buttons (✓, ✗, ?) to mark your assessment
3. Filter by `review_status=unreviewed` to focus on items needing review
4. Use reviewed data to improve the classifier or adjust thresholds

## Cost Estimation

Based on GPT-4o-mini pricing (as of Jan 2025):

- Input: $0.150 per 1M tokens
- Output: $0.600 per 1M tokens

Typical costs:

- ~1,000 statuses = $0.05-0.15
- ~10,000 statuses = $0.50-1.50

The analyzer logs estimated costs after each run.
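
The arithmetic behind these ranges is simple. Assuming roughly 300 input and 150 output tokens per status (illustrative guesses, not measured figures):

```python
# GPT-4o-mini prices as of Jan 2025, in dollars per token.
INPUT_PRICE = 0.150 / 1_000_000
OUTPUT_PRICE = 0.600 / 1_000_000

def estimated_cost(n_statuses: int,
                   input_tokens: int = 300,    # assumed tokens per status
                   output_tokens: int = 150) -> float:
    """Rough cost estimate in USD for scoring n_statuses."""
    return n_statuses * (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE)

print(f"${estimated_cost(1_000):.2f}")   # ≈ $0.14, inside the $0.05-0.15 range
```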

## Architecture Details

### Batch Processing

The analyzer processes statuses in batches (default: 10 per API call) with concurrency control (default: 5 simultaneous batches). This optimizes for:

- Cost efficiency (batch API calls)
- Rate limit compliance
- Parallel processing speed
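
The batching-and-concurrency pattern can be sketched with `asyncio.Semaphore`. This is a simplified model: `score_batch` here is a placeholder for the batched OpenAI call the real analyzer makes.

```python
import asyncio

BATCH_SIZE = 10    # statuses per API call
CONCURRENCY = 5    # simultaneous batches

async def score_batch(batch: list[str]) -> list[float]:
    """Placeholder for one batched classification API call."""
    await asyncio.sleep(0)  # stands in for network latency
    return [0.0] * len(batch)

async def analyze(statuses: list[str]) -> list[float]:
    sem = asyncio.Semaphore(CONCURRENCY)  # caps batches in flight

    async def limited(batch: list[str]) -> list[float]:
        async with sem:
            return await score_batch(batch)

    batches = [statuses[i:i + BATCH_SIZE]
               for i in range(0, len(statuses), BATCH_SIZE)]
    results = await asyncio.gather(*(limited(b) for b in batches))
    return [s for chunk in results for s in chunk]

scores = asyncio.run(analyze([f"status {i}" for i in range(23)]))  # 3 batches
```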

### Scoring Logic

Each status receives:

- 12 category scores (0.0 - 1.0)
- Overall score = max of all categories
- Flagged if overall >= threshold (default 0.5)
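
That aggregation is a one-liner over the category scores; a sketch (function and field names are illustrative, not the analyzer's actual identifiers):

```python
FLAG_THRESHOLD = 0.5  # default, configurable via ANALYZER_FLAG_THRESHOLD

def aggregate(category_scores: dict[str, float],
              threshold: float = FLAG_THRESHOLD) -> tuple[float, bool]:
    """Overall score is the max across categories; flag if it meets the threshold."""
    overall = max(category_scores.values())
    return overall, overall >= threshold

overall, flagged = aggregate({"toxic": 0.2, "insult": 0.7, "sexism": 0.1})
# → overall 0.7, flagged True
```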

### Human Review

Manual reviews help:

- Validate AI classifications
- Identify patterns of false positives
- Build training data for future improvements
- Adjust thresholds per category if needed

## Dutch Language Support

The classifier prompt is written specifically to handle Dutch political content, including:

- Dutch slang and coded terms ("gelukszoekers", "omvolking", "wappie", etc.)
- Political context and satire
- Zwarte Piet debates
- Dutch far-right rhetoric

## Templates

The Bluesky collector templates can be adapted for Mastodon. Key files to create:

1. `app/templates/analysis.html` - Main dashboard
2. `app/templates/flagged.html` - Flagged content browser

These templates should include:

- Chart.js for visualizations
- Filter forms for exploration
- Review buttons for manual validation

## Troubleshooting

### No statuses being scored

- Check that statuses exist: `SELECT COUNT(*) FROM statuses WHERE content IS NOT NULL AND reblog_of_id IS NULL;`
- Check that the migration was applied: `\dt toxicity_scores` in psql
- Check that `OPENAI_API_KEY` is set

### Rate limit errors

- Reduce `ANALYZER_CONCURRENCY` (try 2-3)
- Reduce `ANALYZER_BATCH_SIZE` (try 5)
- The analyzer retries with exponential backoff automatically
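
The retry behaviour follows the usual exponential-backoff shape. A simplified sketch of the idea — the actual delays, retry counts, and exception handling in `app/analyzer` may differ:

```python
import asyncio
import random

async def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry an async call, doubling the delay (plus jitter) after each failure."""
    for attempt in range(max_retries):
        try:
            return await call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```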

### High false positive rate

- Increase `ANALYZER_FLAG_THRESHOLD` (try 0.6 or 0.7)
- Review flagged items and look for patterns
- Remember that Dutch political content can be intense without necessarily being toxic

### Template errors

- Ensure templates exist in `app/templates/`
- Check that analysis helper functions are imported correctly
- Verify template filters are defined (`format_number`, `time_ago`, etc.)

## Next Steps

1. Copy analysis templates from the Bluesky collector to `app/templates/`
2. Add navigation links to the analysis dashboard in the base template
3. Run an initial analysis on sample data
4. Review flagged content and adjust thresholds
5. Set up automated analysis runs (cron/scheduler)
6. Monitor costs and performance

## References

- Bluesky collector: https://forgejo.postxsociety.cloud/pieter/bluesky-collector
- OpenAI API: https://platform.openai.com/docs
- asyncpg: https://magicstack.github.io/asyncpg/