I built an SDK that scrambles HTML so scrapers get garbage

by larsmosr·Mar 12, 2026·16 points·38 comments

AI Analysis

Banger · Wizardry · Big Brain

CSS flex ordering makes textContent return garbage while visual rendering stays perfect.

Strengths
  • Genuinely clever CSS reordering technique that breaks textContent scraping.
  • Honest threat model admitting it doesn't stop headless browsers or OCR.
  • Multiple layers: honeypots, canvas images, clipboard interception, forensic breadcrumbs.
Weaknesses
  • Waitlist-only access limits immediate testing and adoption.
  • Fundamental limitation: any CSS-executing browser can still extract content.
Target Audience

Content creators and publishers protecting against AI crawlers

Similar To

CopyProtect · Digiprove

Post Description

Hey HN -- I'm a solo dev. Built this because I got tired of AI crawlers reading my HTML in plain text while robots.txt did nothing.

The core trick: shuffle characters and words in your HTML using a seed, then use CSS (flexbox order, direction: rtl, unicode-bidi) to put them back visually. Browser renders perfectly. textContent returns garbage.
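
A minimal sketch of the word-level version of this idea, not obscrd's actual code. The names `mulberry32`, `scramble`, and `toHtml` are hypothetical, and the sketch assumes inline `order` styles on flex items; the SDK may well use generated classes or other CSS properties instead.

```typescript
// Small deterministic PRNG (mulberry32): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Piece {
  word: string;
  order: number; // original position, later emitted as the CSS `order` value
}

// Seeded Fisher-Yates shuffle of the words; each piece remembers where it belongs.
function scramble(text: string, seed: number): Piece[] {
  const rand = mulberry32(seed);
  const pieces: Piece[] = text.split(" ").map((word, order) => ({ word, order }));
  for (let i = pieces.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = pieces[i]!;
    pieces[i] = pieces[j]!;
    pieces[j] = tmp;
  }
  return pieces;
}

function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// DOM order is shuffled; the flex `order` property restores the visual order.
function toHtml(pieces: Piece[]): string {
  const spans = pieces
    .map((p) => `<span style="order:${p.order}">${escapeHtml(p.word)}</span>`)
    .join("");
  return `<span style="display:inline-flex;flex-wrap:wrap;gap:0 0.25em">${spans}</span>`;
}

const pieces = scramble("the quick brown fox jumps", 42);
console.log(toHtml(pieces));
// What a textContent-style read would see: words in shuffled DOM order, no spaces.
console.log(pieces.map((p) => p.word).join(""));
```

The seed makes the output reproducible, so a server could render the same scrambled markup on every request or rotate the seed to change it.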

On top of that: email/phone RTL obfuscation with decoy characters, AI honeypots that inject prompt instructions into LLM scrapers, clipboard interception, canvas-based image rendering (no img src in DOM), robots.txt blocking 30+ AI crawlers, and forensic breadcrumbs to prove content theft.
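
The email obfuscation can be illustrated with the well-known reversed-text trick. This is a sketch under assumptions, not the SDK's implementation: `obfuscateEmail` and `naiveTextContent` are hypothetical names, and the decoy is assumed to be a `display:none` span dropped into the middle of the reversed string.

```typescript
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Store the address reversed in the DOM, then let CSS flip it back on screen.
// A hidden decoy span pollutes whatever a text extractor reads.
function obfuscateEmail(email: string, decoy: string = "x9"): string {
  const chars = Array.from(email).reverse();
  const mid = Math.floor(chars.length / 2);
  const head = escapeHtml(chars.slice(0, mid).join(""));
  const tail = escapeHtml(chars.slice(mid).join(""));
  return (
    `<span style="unicode-bidi:bidi-override;direction:rtl">` +
    `${head}<span style="display:none">${escapeHtml(decoy)}</span>${tail}` +
    `</span>`
  );
}

// Rough stand-in for a scraper that strips tags and keeps the text.
function naiveTextContent(html: string): string {
  return html.replace(/<[^>]*>/g, "");
}

const html = obfuscateEmail("me@example.com");
console.log(naiveTextContent(html)); // "moc.elpx9maxe@em"
```

A browser renders the outer span right to left and skips the hidden decoy, so a reader sees the real address while a tag-stripping scraper gets the reversed string with the decoy embedded.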

What it doesn't stop: headless browsers that execute CSS, screenshot+OCR, or anyone determined enough to reverse-engineer the ordering. I put this in the README's threat model because I'd rather say it myself than have someone else say it for me. The realistic goal is raising the cost of scraping -- most bots use simple HTTP requests, and we make that useless.
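
To make the last limitation concrete, here is a sketch of how little a targeted scraper needs once it knows the scheme. It assumes, purely for illustration, markup where each word sits in a span with an inline `order:N` style; `recoverText` is a hypothetical name, and real output with class-based rules would need the stylesheet parsed as well.

```typescript
// Reads the inline `order` values and sorts the words back into reading order.
function recoverText(html: string): string {
  const re = /<span style="order:(\d+)">([^<]*)<\/span>/g;
  const found: { order: number; word: string }[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    found.push({ order: Number(m[1]), word: m[2] ?? "" });
  }
  return found
    .sort((a, b) => a.order - b.order)
    .map((f) => f.word)
    .join(" ");
}

const scrambled =
  '<span style="display:inline-flex">' +
  '<span style="order:2">brown</span>' +
  '<span style="order:0">the</span>' +
  '<span style="order:1">quick</span>' +
  "</span>";

console.log(recoverText(scrambled)); // "the quick brown"
```

This costs a scraper a few lines of site-specific code, which fits the stated goal: it raises the cost for generic HTTP-only bots rather than stopping a determined one.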

TypeScript, Bun, tsup, React 18+. 162 tests. MIT licensed. Nothing to sell -- the SDK is free and complete.

Best way to understand it: open DevTools on the site and inspect the text.

GitHub: https://github.com/obscrd/obscrd