GitHub Repository

Multi-service, multi-platform optical character recognition

251 stars · Python

Multi-platform, multi-service OCR daemon/texthooker (several engines reverse-engineered for it)

by AuroraWright · Mar 5, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Solve My Problem · Niche Gem

Multi-engine OCR texthooker with native macOS/Windows packages and Wayland support.

Strengths
  • Reverse-engineered undocumented APIs (Google Lens, Apple Vision, OneOCR) unlock powerful free OCR engines.
  • True cross-platform with feature parity on Windows/macOS/Linux including Wayland, with packaged binaries lowering friction.
  • Continuous clipboard monitoring and screen diffing are clever for real-time language study workflows.
Weaknesses
  • Niche appeal: primarily for Japanese learners; unclear how well it generalizes to other languages.
  • Multi-engine complexity may introduce fragility; no evidence of production stability or error handling depth.
Target Audience

Language learners (especially Japanese), power users, and developers needing cross-platform OCR with multiple backend support.

Similar To

Tesseract · PaddleOCR · Google Cloud Vision

Post Description

I have worked on this for several years as part of studying Japanese. It can scan images from the clipboard or directly, act as a "text hooker" by capturing screen portions or windows (and diffing them for changes over time), or receive images through WebSockets or a Unix socket. All the GUI portions (configuration, the log viewer that replaces the terminal in the packaged versions, coordinate selection) use tkinter. It can use several OCR engines, both local and online; some were reverse-engineered for it by me and a friend (Google Lens uses the Chrome API, and Apple Live Text uses a private Objective-C version of the API meant for WebKit). Other interesting ones: OneOCR is a local version of the Azure OCR model that is somehow shipped in the Windows 11 Snipping Tool, and Chrome Screen AI seems to be a local version of Lens that ships for offline PDF annotations in Chrome (both are extremely good). It has feature parity on Windows/macOS/Linux, including Wayland, through a shim that emulates mss (a Python screenshot library) with the ScreenCast API and an original implementation of ext_data_control_v1 (for those Wayland compositors that support it).
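The "capture a screen portion and diff for changes over time" idea can be illustrated with a minimal sketch. This is not the project's actual code: the names `frame_digest` and `changed_frames` are hypothetical, the capture step is passed in as a callable so the example runs without a display, and it assumes an exact byte-for-byte comparison (via a hash) is good enough. A real texthooker would likely need some tolerance for noise such as cursor blinks or video compression.

```python
import hashlib
import time
from typing import Callable, Iterator, Optional


def frame_digest(frame: bytes) -> str:
    """Return a short fingerprint of the raw pixel bytes of one capture."""
    return hashlib.blake2b(frame, digest_size=16).hexdigest()


def changed_frames(
    capture: Callable[[], bytes],
    interval: float = 0.5,
    max_polls: Optional[int] = None,
) -> Iterator[bytes]:
    """Poll `capture` and yield a frame only when it differs from the last one.

    `capture` is a hypothetical hook: any zero-argument callable returning
    the raw bytes of the watched region. `max_polls=None` polls forever.
    """
    last_digest = None
    polls = 0
    while max_polls is None or polls < max_polls:
        frame = capture()
        digest = frame_digest(frame)
        if digest != last_digest:
            # The region changed (or this is the first capture):
            # hand the frame to the caller, e.g. to run OCR on it.
            last_digest = digest
            yield frame
        polls += 1
        # Sleep between polls, but not after the final one.
        if interval > 0 and (max_polls is None or polls < max_polls):
            time.sleep(interval)
```

With the mss library installed, `capture` could be something like `lambda: sct.grab(region).rgb`, where `sct` is an `mss.mss()` instance and `region` is a dict with `left`, `top`, `width`, and `height` keys. Each yielded frame would then be passed to whichever OCR engine is configured, so unchanged screens never trigger a new OCR request.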

Similar Projects