GitHub Repository

Snap any image, screenshot, or webpage into plaintext. No GPU. No cloud. One command.

100 stars · Python

CPU-only OCR for screenshots, images, and webpages

by mrkn1 · May 24, 2026 · 4 points · 9 comments

AI Analysis

●● Solid · Solve My Problem · Cozy

CPU-only VLM OCR beats Tesseract on layout without needing CUDA or cloud APIs.

Strengths
  • Quantized ONNX model runs on CPU without CUDA dependencies or cloud API keys.
  • Automatically extracts the main content image from webpages via readability parsing.
  • Single Python module with self-installing dependencies makes deployment trivial for automation.
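The "self-installing dependencies" point can be pictured with a minimal sketch like the one below. It is not taken from the repository: `ensure_module` is a hypothetical helper name, and the sketch assumes the approach is "try the import, fall back to pip on failure", which the listing does not confirm.

```python
import importlib
import subprocess
import sys


def ensure_module(module_name, pip_name=None):
    """Import module_name, installing it with pip first if it is missing.

    Hypothetical helper for illustration; not the project's actual code.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Install into the same interpreter that is running this script.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", pip_name or module_name]
        )
        # Make the freshly installed package visible to the import system.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


# A standard-library module is already importable, so pip is never invoked.
json_mod = ensure_module("json")
```

A script built this way might call something like `ensure_module("onnxruntime")` before loading its model. Which packages the project actually installs is not stated in this listing.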
Weaknesses
  • Webpage OCR targets only the main image and does not extract full-page text.
  • Zero stars and one commit suggest an early-stage project with unproven maintenance.
Category
Target Audience

Developers needing offline OCR, privacy-focused users, automation scripters

Similar To

EasyOCR · Tesseract · Umi-OCR

Similar Projects

Klovr – Convert any webpage to Markdown (Cloudflare covers only 5%)

Nice, focused product: site-specific extraction rules (CSS selectors/metadata overrides), edge-first delivery (<500ms p99) and SDKs for Node/Python make it quick to drop into an LLM pipeline and claim 40–60% token savings. That said, HTML→Markdown is a crowded niche (Pandoc, Jina, Firecrawl and dozens of scrapers already exist), so Klovr needs clearer differentiation — e.g. demonstrable extraction accuracy, enterprise-grade rule sharing, or unique model-aware trimming — to move beyond 'handy utility'.

Solve My Problem · Slick
vaibhavlodha98
213mo ago