GitHub Repository

Local-first document AI. Run 100% locally by default, with API, CLI, and Web UI.

8 stars · Python

ParseHawk – 100% Local Document AI with API, CLI, and Web UI

by francisrafal·Jun 25, 2026·4 points·0 comments

AI Analysis

●● Solid · Dark Horse · Big Brain

Local document extraction with JSON schema enforcement beats cloud APIs for sensitive data.

Strengths
  • Constrained decoding enforces JSON schema validation on extraction output
  • vLLM Metal support enables practical local inference on Apple Silicon Macs
  • Three interfaces: REST API for services, CLI for scripts, Web UI for humans
Weaknesses
  • No Windows support limits enterprise adoption significantly
  • Requires 16 GB+ of unified memory, which excludes many older MacBooks
Category
Target Audience

Developers working with sensitive documents, privacy-conscious teams

Similar To

LlamaParse · Azure Document Intelligence · AWS Textract

Post Description

I just released ParseHawk v0.1.0, an Apache-2.0 licensed, 100% local document AI platform that extracts JSON from PDFs, images, etc. It builds on NuMind's NuExtract3 and additionally enforces a provided JSON schema with constrained decoding. It runs on Apple Silicon with a pre-bundled vllm-metal, as well as on Linux + NVIDIA with vllm. Looking forward to your feedback!
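To make the schema guarantee concrete, here is a minimal sketch of the property that schema-constrained decoding is meant to ensure: every extraction result conforms to the provided JSON schema. It is not ParseHawk's code and does not perform constrained decoding itself; it only checks output after the fact, against a small subset of JSON Schema. The schema, the field names, and the `conforms` helper are all hypothetical, chosen for illustration.

```python
import json

# Hypothetical invoice schema; the field names are illustrative assumptions,
# not taken from ParseHawk.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "line_items": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["invoice_number", "total"],
}

# Mapping from JSON Schema scalar types to Python types.
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def conforms(value, schema):
    """Return True if value matches a small subset of JSON Schema.

    Supported keywords: type, properties, required, items.
    """
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return False
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Every declared property that is present must match its sub-schema.
        props = schema.get("properties", {})
        return all(conforms(value[k], sub) for k, sub in props.items() if k in value)
    if kind == "array":
        if not isinstance(value, list):
            return False
        return all(conforms(item, schema.get("items", {})) for item in value)
    if kind in _PY_TYPES:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if kind in ("number", "integer") and isinstance(value, bool):
            return False
        return isinstance(value, _PY_TYPES[kind])
    return True  # no type given, or a type this sketch does not cover


good = json.loads('{"invoice_number": "A-1", "total": 12.5}')
bad = json.loads('{"invoice_number": "A-1", "total": "12.5"}')
print(conforms(good, INVOICE_SCHEMA))
print(conforms(bad, INVOICE_SCHEMA))
```

With constrained decoding, the second case should not arise in the first place, because tokens that would break the schema are excluded during generation rather than rejected afterwards.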

Similar Projects