# Bluesky Toxicity Analysis - Main Findings
## Study Overview

- Period: January 1 – March 30, 2026 (89 days)
- Monitored Accounts: 159 Dutch political accounts
- Total Posts Collected: 15,190 posts
## 1. Data Collection Summary

### Content Distribution

- Primary Content (by tracked accounts):
  - Original Posts: 3,032
  - Replies: 3,652
  - Total Primary: 6,684 posts
- Secondary Content (mentions of tracked accounts):
  - Unique Mention Posts: 8,506
  - Note: posts mentioning multiple tracked accounts are counted once
### Total Dataset

- Combined Content: 15,190 posts
- Collection Method: automated via the Bluesky public API (every 4 hours)
- Infrastructure: Docker containers with a PostgreSQL database
## 2. Toxicity Detection Results

### AI Model Performance

- Model Used: OpenAI GPT-4.1-nano
- Classification Categories: 12 toxicity dimensions
- Flagging Threshold: overall toxicity score ≥ 0.5 (50%)
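A minimal sketch of the flagging decision. The report does not specify how the 12 dimension scores combine into the overall score; taking the maximum is one common conservative choice and is an assumption here, as are the dimension names.

```python
FLAG_THRESHOLD = 0.5  # posts at or above this overall score are queued for human review

def should_flag(dimension_scores: dict[str, float],
                threshold: float = FLAG_THRESHOLD) -> bool:
    """Flag a post when its overall toxicity score reaches the threshold.

    Assumption: the overall score is the maximum across the per-dimension
    scores; the study may aggregate differently (e.g. a model-supplied
    overall score).
    """
    overall = max(dimension_scores.values())
    return overall >= threshold

scores = {"insult": 0.62, "threat": 0.10, "profanity": 0.33}  # illustrative values
assert should_flag(scores)           # 0.62 >= 0.5, so the post is flagged
assert not should_flag(scores, 0.7)  # a higher threshold clears the same post
```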
### Flagged Content

- Primary Content (posts/replies): 97 posts flagged
- Secondary Content (mentions): 413 unique posts flagged
- Total Flagged: 510 unique posts
### Distribution Insight

- 81% of flagged content came from mentions (external users → politicians)
- 19% of flagged content came from the politicians themselves
- External users directed significantly more toxic content toward politicians than politicians produced
## 3. Human Review Results

### Review Completion

- Total Items Reviewed: 510 posts (100% of flagged content)
- Review Period: January 1 – March 30, 2026
- Review Interface: custom web application with ✓/✗/? buttons
### Validation Results

#### Primary Content (Posts/Replies by Politicians)
| Status | Count | Percentage |
|---|---|---|
| ✓ Correctly Flagged | 32 | 33.0% |
| ✗ Incorrectly Flagged | 65 | 67.0% |
| ? Unsure | 0 | 0.0% |
| Total | 97 | 100% |
#### Secondary Content (Mentions of Politicians)
| Status | Count | Percentage |
|---|---|---|
| ✓ Correctly Flagged | 174 | 42.1% |
| ✗ Incorrectly Flagged | 239 | 57.9% |
| ? Unsure | 0 | 0.0% |
| Total | 413 | 100% |
#### Combined Results
| Status | Count | Percentage |
|---|---|---|
| ✓ Correctly Flagged | 206 | 40.4% |
| ✗ Incorrectly Flagged | 304 | 59.6% |
| ? Unsure | 0 | 0.0% |
| Total | 510 | 100% |
## 4. Key Findings

### 4.1 High False Positive Rate

- Overall false positive rate: 59.6%
- The AI model over-flagged content: nearly 6 out of 10 flagged items were false positives
- Primary content performed worse (67.0% false positives) than mentions (57.9%)
### 4.2 Model Limitations Identified
- Threshold Sensitivity: The 0.5 threshold appears too low for Dutch political discourse
- Context Misinterpretation: Strong policy language, political criticism, and satire frequently misclassified as toxic
- Cultural/Linguistic Gaps: Dutch political communication patterns may not align with model training data
- Nuance Detection: Difficulty distinguishing between heated but legitimate debate and actual toxicity
### 4.3 Directional Toxicity Pattern
- External mentions (8,506 posts) generated 413 flagged items (4.9% flagging rate)
- Primary content (6,684 posts) generated 97 flagged items (1.5% flagging rate)
- Politicians receive approximately 3× more toxic content than they produce (by flagging rate)
- However, after human review, both sources showed high false positive rates
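The rates above follow directly from the counts in Sections 1 and 2:

```python
mention_rate = 413 / 8_506   # flagged mentions per mention post collected
primary_rate = 97 / 6_684    # flagged posts per politician post collected

print(f"mentions: {mention_rate:.1%}, primary: {primary_rate:.1%}, "
      f"ratio: {mention_rate / primary_rate:.1f}x")
# mentions: 4.9%, primary: 1.5%, ratio: 3.3x
```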
### 4.4 Accuracy Comparison
- Mentions accuracy: 42.1%
- Primary content accuracy: 33.0%
- Neither content type achieved acceptable accuracy for automated moderation
- Possible explanation: Politicians' language more frequently uses strong policy terms that trigger false positives
## 5. Implications for Automated Moderation

### What This Study Reveals
- AI Cannot Replace Human Judgment: 59.6% false positive rate makes unsupervised automation dangerous
- Threshold Optimization Needed: Current 0.5 threshold too aggressive; may need 0.7+ for political content
- Domain-Specific Training Required: Political discourse needs specialized models or fine-tuning
- Human-in-the-Loop Essential: Automated flagging useful for triage, but human review mandatory
### Recommended Approach
- Use AI toxicity detection as first-pass screening only
- Require human review for all flagged content before action
- Consider higher thresholds (0.7–0.8) for political accounts
- Train domain-specific models on Dutch political discourse
- Implement appeals process for false positives
## 6. Technical Implementation Success

### What Worked Well
- Automated Collection: 4-hour collection cycles captured comprehensive dataset
- Human Review Interface: Web UI with ✓/✗/? buttons efficient for manual validation
- Date Filtering: Allowed focused analysis of specific time periods
- Engagement Metrics: Successfully captured likes, replies, reposts, quotes for mentions
- Deduplication Logic: Properly handled posts mentioning multiple tracked accounts
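The deduplication rule ("posts mentioning multiple tracked accounts counted once") amounts to keeping one row per post URI. A sketch, with illustrative field names (the actual schema is not shown in this report):

```python
def dedupe_mentions(mention_hits: list[dict]) -> list[dict]:
    """Collapse per-account mention hits so each post counts once.

    A post tagging several tracked politicians appears once per politician
    in the raw search results; the study's mention count (8,506) keeps one
    row per unique post URI. Field names here are assumptions.
    """
    seen: set[str] = set()
    unique = []
    for hit in mention_hits:
        if hit["uri"] not in seen:
            seen.add(hit["uri"])
            unique.append(hit)
    return unique

hits = [
    {"uri": "at://did:plc:a/app.bsky.feed.post/1", "mentions": "alice.example"},
    {"uri": "at://did:plc:a/app.bsky.feed.post/1", "mentions": "bob.example"},
    {"uri": "at://did:plc:b/app.bsky.feed.post/2", "mentions": "alice.example"},
]
assert len(dedupe_mentions(hits)) == 2  # the double-mention post counts once
```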
### Infrastructure Performance

- Uptime: 99%+ (only a brief scheduler issue on February 23–24)
- Data Integrity: PostgreSQL database handled 15K+ posts without issues
- Analysis Throughput: GPT-4.1-nano processed all content efficiently
- Web Interface: Responsive UI for 500+ manual reviews
## 7. Study Limitations
- Single Model Used: Only tested GPT-4.1-nano; ensemble approaches not evaluated
- No Inter-Rater Reliability: Single human reviewer; no validation of review consistency
- Limited Context: Dutch political context; findings may not generalize to other domains
- Arbitrary Threshold: 0.5 threshold not scientifically optimized
- Limited Time Period: 3-month window may not capture seasonal variations in discourse
- No Appeal Process: No mechanism for accounts to contest flagging decisions
## 8. Recommendations for Future Work

### Short-Term Improvements
- Threshold Optimization: Test 0.6, 0.7, 0.8 thresholds and measure precision/recall
- Category-Specific Tuning: Different thresholds for different toxicity categories
- Context Windows: Analyze conversation threads, not isolated posts
- Multi-Model Validation: Test other models (Perspective API, custom fine-tuned models)
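The threshold-optimization step could be sketched as below, given a sample of (model score, human label) pairs. Note a caveat: in this study only items scoring ≥ 0.5 were human-reviewed, so recall at lower thresholds cannot be measured from the study data alone; the sweep assumes a fully labeled sample. The example data is synthetic.

```python
def sweep_thresholds(items: list[tuple[float, bool]],
                     thresholds=(0.5, 0.6, 0.7, 0.8)) -> dict:
    """Compute (precision, recall) per threshold from (score, is_toxic) pairs."""
    results = {}
    for t in thresholds:
        tp = sum(1 for score, toxic in items if score >= t and toxic)
        fp = sum(1 for score, toxic in items if score >= t and not toxic)
        fn = sum(1 for score, toxic in items if score < t and toxic)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results[t] = (precision, recall)
    return results

# Synthetic scored-and-labeled items, not study data:
items = [(0.55, False), (0.65, True), (0.72, False), (0.9, True), (0.3, False)]
print(sweep_thresholds(items))
```

Raising the threshold trades recall for precision; the sweep makes that trade-off measurable instead of guessed.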
### Long-Term Research
- Dutch Political Corpus: Create labeled training dataset for Dutch political discourse
- Fine-Tune Models: Train specialized classifiers on validated Dutch political content
- Longitudinal Study: Track patterns over election cycles and major events
- Cross-Platform Analysis: Compare Bluesky toxicity patterns with Twitter/X, Mastodon
- Inter-Rater Reliability Study: Multiple reviewers to validate human judgment consistency
## 9. Data Access

### Database Content (as of March 30, 2026)
- Accounts Table: 159 tracked political accounts
- Posts Table: 6,684 posts and replies
- Mentions Table: 8,506 unique mention posts
- Toxicity Scores: 6,684 scored primary posts
- Mention Toxicity Scores: 8,506 scored mentions
- Human Reviews: 510 manual validations
### Exported Datasets Available
- Full post content with toxicity scores
- Human review decisions with timestamps
- Engagement metrics (likes, replies, reposts, quotes)
- Time-series data for trend analysis
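An export of this shape joins flagged posts with their review decisions. A minimal sketch using in-memory rows; the real export reads PostgreSQL, and every table and field name here is illustrative, not the study's schema.

```python
import csv
import io

def export_reviewed(flagged: list[dict], reviews: list[dict]) -> str:
    """Join flagged posts with human review decisions into CSV text.

    `flagged` and `reviews` stand in for the posts/mentions tables and the
    human-review table; field names are assumptions for illustration.
    """
    decisions = {r["post_uri"]: r for r in reviews}
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["uri", "toxicity", "decision", "reviewed_at"])
    writer.writeheader()
    for post in flagged:
        review = decisions.get(post["uri"], {})  # unreviewed posts get empty fields
        writer.writerow({
            "uri": post["uri"],
            "toxicity": post["toxicity"],
            "decision": review.get("decision", ""),
            "reviewed_at": review.get("reviewed_at", ""),
        })
    return out.getvalue()
```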
## 10. Conclusion
This study demonstrates that while AI-powered toxicity detection can identify potential concerns in large-scale social media content, it cannot reliably moderate without substantial human oversight. The 59.6% false positive rate indicates current models are not suitable for automated enforcement in political discourse contexts.
Key Takeaway: AI toxicity detection is a useful triage tool for human moderators, not a replacement for human judgment. Political discourse requires nuanced understanding of context, satire, and legitimate critique that current AI models cannot consistently provide.
Project Status: Data collection complete. Web interface remains available for analysis and reporting. Database preserved for future research.
- Generated: March 30, 2026
- Study Period: January 1 – March 30, 2026
- Monitored Platform: Bluesky Social Network
- Geographic Focus: Dutch Political Discourse