Slop or not – can you tell AI writing from human in everyday contexts?

by eigen-vector·Mar 12, 2026·19 points·20 comments

AI Analysis

●● Solid · Crowd Pleaser · Big Brain

Pre-2022 human posts vs AI models in a crowd-sourced detection benchmark.

Strengths
  • Pre-2022 human baseline ensures no AI contamination in ground truth data.
  • Platform-specific modes like HN-only add useful context to the benchmark.
  • Logs model, tier, and response time for detailed analysis later.
Weaknesses
  • AI detection games are common; novelty relies entirely on the dataset quality.
  • No immediate utility beyond entertainment and future research publication.
Target Audience

AI researchers, data scientists, HN users interested in AI capabilities

Similar To

Human or AI · AI or Not

Post Description

I’ve been building a crowd-sourced AI detection benchmark. Each round shows two responses to the same prompt: one from a real human (pre-2022, so provably written before AI slop became prevalent on the internet) and one generated by AI. You pick the slop. Three wrong and you’re out.
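To make the rules concrete, here is a minimal Python sketch of the round loop. It is not the site's actual code: `Pair`, `play`, and `guess_fn` are hypothetical names, and randomizing which side the AI text appears on is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class Pair:
    """One human response and one AI response to the same prompt."""
    human: str
    ai: str


def play(pairs, guess_fn, max_strikes=3, rng=None):
    """Play rounds until max_strikes wrong picks; return the number of correct picks.

    guess_fn(first, second) returns 0 or 1: the position the player thinks is AI.
    """
    rng = rng or random.Random()
    strikes = 0
    score = 0
    for pair in pairs:
        # Assumption: the AI text is placed first or second at random.
        ai_position = rng.randrange(2)
        shown = (pair.ai, pair.human) if ai_position == 0 else (pair.human, pair.ai)
        if guess_fn(*shown) == ai_position:
            score += 1
        else:
            strikes += 1
            if strikes >= max_strikes:
                break  # three wrong and you're out
    return score
```

A player who always picks the wrong side is cut off after exactly three rounds, however many pairs remain.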

The dataset: 16K human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across two providers (Anthropic and OpenAI) at three capability tiers. Each AI response uses the same prompt and is length-matched to the human post, with no adversarial coaching: just the model’s natural voice with platform context. Every vote is logged with model, tier, source, response time, and position.
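As a rough illustration of how logged votes could be analyzed later, here is a Python sketch of a vote record and a per-field accuracy summary. The field names, the reading of "position" as the slot where the AI text was shown, and the sample values are all assumptions for illustration, not the project's schema or results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    model: str        # hypothetical model label
    tier: str         # capability tier
    source: str       # platform the human post came from
    response_ms: int  # time the player took to answer
    ai_position: int  # assumed meaning of "position": 0 = AI first, 1 = AI second
    picked: int       # position the player flagged as AI

    @property
    def correct(self) -> bool:
        return self.picked == self.ai_position


def accuracy_by(votes, field):
    """Return the fraction of correct votes, grouped by one Vote field."""
    totals = defaultdict(lambda: [0, 0])  # key -> [correct, total]
    for v in votes:
        bucket = totals[getattr(v, field)]
        bucket[0] += int(v.correct)
        bucket[1] += 1
    return {key: c / n for key, (c, n) in totals.items()}


# Made-up sample votes, purely to show the call shape.
votes = [
    Vote("model-a", "small", "reddit", 4200, 0, 0),
    Vote("model-b", "large", "hn", 9100, 1, 0),
    Vote("model-b", "large", "hn", 7300, 1, 1),
]
by_source = accuracy_by(votes, "source")
```

Grouping by `"tier"` or `"model"` instead of `"source"` gives the other breakdowns the logged fields allow.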

Early findings from testing: Reddit posts are easy to spot (humans are too casual for AI to mimic), while HN is significantly harder.

I'll release the full dataset on Hugging Face, and I'll publish a paper if this crowdsourced study yields enough data.

If you play the HN-only mode, you’re helping calibrate how detectable AI is on here specifically.

Would love feedback on the pairs — are any trivially obvious? Are some genuinely hard?

Similar Projects