CastReader – Free TTS Extension That Reads Kindle Cloud Reader

Author: vinxu

by vinxu · Mar 12, 2026 · 2 points · 1 comment


AI Analysis

●● Solid · Niche Gem · Big Brain

Glyph decoding bypasses Kindle's font rendering to enable TTS where other extensions fail.

Strengths

• Intercepting KindleModuleManager and decoding binary font tables is genuinely clever technical work
• Word-level highlight sync from glyph decoding rather than OCR shows real attention to accuracy

Weaknesses

• Chrome Web Store shows the item as currently unavailable, raising deployment concerns
• Narrow use case limits the audience to Kindle Cloud Reader users specifically

Post Description

Every TTS browser extension fails on Kindle Cloud Reader. The reason: Amazon renders text using custom font subsets where glyph IDs don't map to standard Unicode. You select text, copy it, and get garbage. The DOM is useless.
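To make the problem concrete, here is a minimal sketch of why copied text comes out as garbage and what a glyph-to-Unicode table recovers. The `GlyphTable` shape and the sample glyph IDs are hypothetical; the post does not describe Kindle's actual data format.

```typescript
// Hypothetical shapes for illustration only; not Kindle's real format.
type GlyphId = number;

// Maps a font-subset glyph ID to the Unicode string that glyph draws.
type GlyphTable = Map<GlyphId, string>;

const REPLACEMENT = "\uFFFD"; // Unicode replacement character for unknown glyphs

// Decode a run of glyph IDs using a recovered mapping table.
function decodeGlyphs(glyphIds: GlyphId[], table: GlyphTable): string {
  return glyphIds.map((id) => table.get(id) ?? REPLACEMENT).join("");
}

// What a naive reader effectively does: treat glyph IDs as character codes.
function naiveDecode(glyphIds: GlyphId[]): string {
  return glyphIds.map((id) => String.fromCharCode(id)).join("");
}

// Made-up subset: the font only contains the six glyphs this page needs.
const table: GlyphTable = new Map([
  [1, "K"], [2, "i"], [3, "n"], [4, "d"], [5, "l"], [6, "e"],
]);
const tokens: GlyphId[] = [1, 2, 3, 4, 5, 6];

console.log(decodeGlyphs(tokens, table)); // prints "Kindle"
console.log(naiveDecode(tokens) === "Kindle"); // prints false: control characters, not text
```

The hard part in practice is building the table, which is what the font-table decoding and OCR calibration described in this post are for.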

CastReader solves this by intercepting KindleModuleManager to capture font and token data, decoding glyph mappings from the binary font tables, then running Tesseract.js OCR locally in an offscreen document to calibrate the decoder. The final text comes from glyph decoding (not OCR), so it's accurate enough for word-level highlight sync.

WeRead (the largest Chinese reading platform) has a similar problem: it renders everything on canvas. For WeRead, CastReader uses a main-world content script injected at document_start to intercept fetch responses containing chapter data before the page consumes them.
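The fetch interception idea can be sketched roughly as follows. This is not CastReader's code: `wrapFetch`, `isChapterUrl`, and `onCapture` are hypothetical names, and the fetch signature is simplified to a single URL argument so the sketch type-checks outside a browser.

```typescript
// Minimal structural types standing in for the browser's Response and fetch.
interface ResponseLike {
  clone(): ResponseLike;
  text(): Promise<string>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Returns a fetch replacement that copies matching response bodies to
// `onCapture` while handing the untouched response back to the page.
function wrapFetch(
  original: FetchLike,
  isChapterUrl: (url: string) => boolean,
  onCapture: (url: string, body: string) => void,
): FetchLike {
  return async (url) => {
    const response = await original(url);
    if (isChapterUrl(url)) {
      // Read a clone: a body can be consumed only once, and the page
      // still needs to read the original.
      response
        .clone()
        .text()
        .then((body) => onCapture(url, body))
        .catch(() => {
          // Capture is best effort; never break the page's own request.
        });
    }
    return response;
  };
}

// In a main-world script one would install it along these lines
// (assumption: adapted to the real two-argument fetch signature):
//   window.fetch = wrapFetch(window.fetch.bind(window), matcher, handler);
```

Running at document_start matters because the wrapper has to be in place before the page's own scripts take a reference to fetch.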

For normal websites, there's a 3-tier extraction pipeline: 15+ site-specific extractors (Notion, Google Docs, ChatGPT, Claude, arXiv, etc.), a learned CSS selector rule system, and a universal visible-text-block algorithm that fuses ideas from Readability.js, Boilerpipe, and JusText — container scoring with text density, link density scaling, stop-word classification, and progressive retry with flag degradation.
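A rough sketch of how a tiered pipeline with density-based scoring might fit together. Everything here is an assumption for illustration: the `Block` shape, the tiny stop-word list, the scoring formula, and the thresholds are made up, and a real extractor would work on DOM nodes rather than plain objects.

```typescript
// Hypothetical summary of one candidate text container.
interface Block {
  text: string;
  linkChars: number; // characters of `text` that sit inside links
}

const STOP_WORDS = new Set(["the", "a", "of", "and", "to", "in", "is", "that"]);

// Share of the block's text that is link text (1 means all links).
function linkDensity(b: Block): number {
  return b.text.length === 0 ? 1 : b.linkChars / b.text.length;
}

// Share of words that are common function words; running prose scores
// high, navigation menus and tag clouds score low.
function stopWordRatio(b: Block): number {
  const words = b.text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  if (words.length === 0) return 0;
  return words.filter((w) => STOP_WORDS.has(w)).length / words.length;
}

// Longer text scores higher, scaled down by link density and stop-word ratio.
function scoreBlock(b: Block): number {
  return b.text.length * (1 - linkDensity(b)) * stopWordRatio(b);
}

type Extractor = (blocks: Block[]) => string | null;

// Universal tier: keep every block that clears the threshold.
function universalExtractor(minScore: number): Extractor {
  return (blocks) => {
    const kept = blocks.filter((b) => scoreBlock(b) >= minScore);
    return kept.length > 0 ? kept.map((b) => b.text).join("\n") : null;
  };
}

// Try each tier in order; the first one that returns text wins.
function extract(blocks: Block[], tiers: Extractor[]): string | null {
  for (const tier of tiers) {
    const result = tier(blocks);
    if (result !== null) return result;
  }
  return null;
}
```

Under these assumptions, a caller would list the site-specific and learned-selector tiers first, then `universalExtractor(20)` followed by `universalExtractor(5)` as a relaxed retry, which is one simple way to read "progressive retry with flag degradation".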

TTS runs through Kokoro, an open model supporting 40+ languages. Audio plays directly in the content script, so highlight sync reads currentTime with no added latency: no message passing, no offscreen document relay.
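A minimal sketch of the highlight lookup, assuming per-word start times are available for each audio clip. The `WordTiming` shape and `activeWordIndex` are hypothetical; the post does not say how word timings are produced.

```typescript
// Hypothetical timing record for one spoken word, in seconds.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Binary search for the last word whose start time is <= currentTime.
// `timings` must be sorted by `start`. Returns -1 before the first word.
function activeWordIndex(timings: WordTiming[], currentTime: number): number {
  let lo = 0;
  let hi = timings.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (timings[mid].start <= currentTime) {
      found = mid; // candidate; look for a later word that has also started
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

// In the content script this could be driven straight from the audio element:
//   audio.addEventListener("timeupdate", () => {
//     highlight(activeWordIndex(timings, audio.currentTime));
//   });
// where `highlight` is a hypothetical function that marks the word in the page.
```

Because the audio element lives in the same script as the highlighter, the lookup is a plain property read plus a logarithmic search.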

Limitations I should be honest about: the voice library is small (Kokoro only, no premium neural voices), no mobile support, extraction still fails on some complex layouts (there's a manual content selector fallback), and the TTS server is something I run myself, so uptime isn't guaranteed.

Completely free. No signup, no usage limits, no premium tier. Chrome and Edge.