Pieter 2172efa701 Fix toxicity analysis web interface
- Fix analysis_helpers stats to match template expectations (posts/replies/mentions breakdown)
- Fix SQL interval syntax in trend query
- Fix URL routing in templates (analysis_flagged, accounts_list)
- Add .claude/ to .gitignore

Analysis dashboard now accessible at http://localhost:8585/analysis

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-30 17:07:12 +02:00
.claude Fix toxicity analysis web interface 2026-03-30 17:07:12 +02:00
app Fix toxicity analysis web interface 2026-03-30 17:07:12 +02:00
scripts Add toxicity analysis system for Mastodon statuses 2026-03-30 14:43:35 +02:00
.dockerignore Initial commit: Mastodon collector application 2026-02-09 08:05:54 +01:00
.env.example Initial commit: Mastodon collector application 2026-02-09 08:05:54 +01:00
.gitignore Fix toxicity analysis web interface 2026-03-30 17:07:12 +02:00
accounts.txt Initial commit: Mastodon collector application 2026-02-09 08:05:54 +01:00
docker-compose.yml Complete toxicity analysis system setup and testing 2026-03-30 15:39:36 +02:00
Dockerfile Initial commit: Mastodon collector application 2026-02-09 08:05:54 +01:00
README.md Initial commit: Mastodon collector application 2026-02-09 08:05:54 +01:00
requirements.txt Add toxicity analysis system for Mastodon statuses 2026-03-30 14:43:35 +02:00
TOXICITY_ANALYSIS.md Add toxicity analysis system for Mastodon statuses 2026-03-30 14:43:35 +02:00

Mastodon Collector

Collects posts, replies, and mentions from a list of Mastodon accounts and stores them in PostgreSQL. Includes a web UI for account management and data browsing, plus JSON/CSV APIs for your analysis pipeline.

Quick Start

# 1. Add accounts to monitor
echo "@user@mastodon.social" >> accounts.txt

# 2. Start everything
docker compose up -d

# 3. Open the dashboard (open is macOS-only; on Linux use xdg-open)
open http://localhost:8585

Architecture

Service    Description                              Port
db         PostgreSQL 16                            5432
web        Flask dashboard (Gunicorn)               8585
collector  Background service, polls every 4 hours  -

Adding Accounts

Two methods:

  1. Text file — edit accounts.txt, one handle per line (@user@instance.social). Picked up on next collection cycle.
  2. Web UI — go to http://localhost:8585/accounts and use the form.
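
To check accounts.txt before the next collection cycle, the handle format above can be validated with a short script. This is a minimal sketch: the regular expression and the parse_handles helper are assumptions for illustration, and the collector's own parsing rules may differ.

```python
import re

# Assumed handle shape: @user@instance.domain. The collector's real
# validation is not documented here and may be stricter or looser.
HANDLE_RE = re.compile(r"^@?([A-Za-z0-9_]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})$")


def parse_handles(text):
    """Split accounts.txt-style text into (valid, invalid) handle lists."""
    valid, invalid = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        match = HANDLE_RE.match(line)
        if match is None:
            invalid.append(line)
            continue
        user, instance = match.groups()
        # Normalize the instance name to lowercase and drop duplicates.
        handle = f"@{user}@{instance.lower()}"
        if handle not in valid:
            valid.append(handle)
    return valid, invalid
```

Calling parse_handles(open("accounts.txt").read()) returns the normalized handles plus any lines that do not look like a handle, which makes typos easy to spot.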

Configuration

Edit .env to customize (if it does not exist yet, copy .env.example to .env first):

POSTGRES_PASSWORD=collector_secret      # Change for production
FLASK_SECRET_KEY=change-me-in-production
POLL_INTERVAL_SECONDS=14400             # Default: 4 hours (14400s)
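
As an illustration of how a service could consume POLL_INTERVAL_SECONDS, the sketch below reads the variable and falls back to the documented 4-hour default when it is missing or invalid. The function name and the fallback behavior are assumptions, not the collector's actual code.

```python
import os

DEFAULT_POLL_INTERVAL = 14400  # 4 hours, the default documented above


def poll_interval_seconds(env=None):
    """Return the poll interval in seconds, falling back to the default.

    Hypothetical helper: the real collector may handle bad values differently.
    """
    env = os.environ if env is None else env
    raw = env.get("POLL_INTERVAL_SECONDS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_POLL_INTERVAL  # unset or not a number
    # Reject zero and negative intervals rather than polling in a tight loop.
    return value if value > 0 else DEFAULT_POLL_INTERVAL
```

Passing a plain dict as env keeps the helper easy to test without touching the real environment.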

API Endpoints

For plugging into your analysis pipeline:

Endpoint           Description
GET /api/stats     Overview stats (counts by type)
GET /api/statuses  Paginated statuses as JSON
GET /export        Download all statuses as CSV

/api/statuses parameters

  • page — page number (default: 1)
  • per_page — results per page (default: 100, max: 500)
  • account_id — filter by internal account ID
  • type — filter by status type: post, reply, mention, reblog
  • since — ISO datetime, only return statuses after this time
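
The parameters above can be combined into a request URL as in the following sketch. The statuses_url helper is hypothetical, and the base URL assumes the default local port from Quick Start. Clamping per_page and rejecting unknown types on the client side is a convenience of this sketch, not documented server behavior.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8585"  # default port from Quick Start
STATUS_TYPES = {"post", "reply", "mention", "reblog"}
MAX_PER_PAGE = 500  # documented maximum for per_page


def statuses_url(page=1, per_page=100, account_id=None,
                 status_type=None, since=None, base_url=BASE_URL):
    """Build a /api/statuses URL from the documented query parameters."""
    if status_type is not None and status_type not in STATUS_TYPES:
        raise ValueError(f"unknown status type: {status_type!r}")
    params = {"page": page, "per_page": min(per_page, MAX_PER_PAGE)}
    # Optional filters are only sent when they are set.
    if account_id is not None:
        params["account_id"] = account_id
    if status_type is not None:
        params["type"] = status_type
    if since is not None:
        params["since"] = since  # ISO datetime string
    return f"{base_url}/api/statuses?{urlencode(params)}"
```

The resulting URL can be fetched with urllib.request.urlopen and decoded with json.load. The shape of the JSON response is not described in this README, so inspect one page before writing code that depends on specific keys.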

Database Schema

Main tables:

  • monitored_accounts — accounts being tracked
  • statuses — collected posts with plain text + HTML content
  • mentions — who was @-mentioned in each status
  • media_attachments — images/videos attached to statuses
  • tags — hashtags used
  • collection_logs — audit trail of each collection run

Each status stores raw_json with the full Mastodon API response for future analysis needs.
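
Because raw_json holds the full API response, later analysis can derive fields that have no column of their own. The sketch below uses fields from the Mastodon Status entity (reblog, in_reply_to_id, mentions, tags). The helper names and the classification rule are assumptions and do not necessarily match how the collector assigns types. In particular, the mention type depends on which account is being monitored, so it cannot be derived from a single status.

```python
import json


def _load(raw_json):
    """Accept either a JSON string or an already-decoded dict."""
    return json.loads(raw_json) if isinstance(raw_json, str) else raw_json


def classify_status(raw_json):
    """Classify a status as 'reblog', 'reply', or 'post' (assumed rule)."""
    status = _load(raw_json)
    if status.get("reblog"):
        return "reblog"  # a boost wraps the original status
    if status.get("in_reply_to_id"):
        return "reply"
    return "post"


def mentioned_accts(raw_json):
    """Return the acct handles of everyone @-mentioned in the status."""
    return [m["acct"] for m in _load(raw_json).get("mentions", [])]


def hashtags(raw_json):
    """Return the hashtag names used in the status."""
    return [t["name"] for t in _load(raw_json).get("tags", [])]
```

For a status whose in_reply_to_id is set and whose reblog field is null, classify_status returns "reply".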

Moving to a Server

# Copy the project
scp -r mastodon-collector/ user@server:~/

# On the server
cd mastodon-collector
# Edit .env with production secrets
docker compose up -d

Stopping

docker compose down          # Stop services, keep data
docker compose down -v       # Stop services AND delete database