Parseflow, how to parse documents when you're broke

by bollethegoalie·May 21, 2026·2 points·0 comments

AI Analysis

Mid · Ship It · Bold Bet

Student-built extraction API competing directly with established players like LlamaParse.

Strengths
  • Transparent pricing model targets small teams priced out of enterprise document solutions.
  • Returns diagnostic metadata alongside chunks to help debug parsing failures.
  • Supports async jobs and batch processing for high-volume document ingestion.
Weaknesses
  • No open-source version to verify parsing logic or self-host for sensitive data.
  • Crowded market with LlamaIndex, Unstructured, and AWS Textract already solving this.
Category
Target Audience

Developers building RAG pipelines or document processing apps

Similar To

LlamaParse · Unstructured.io · Markitdown

Post Description

Hello HN, I built Parseflow, a simple, evidence-focused extraction API that takes PDF, DOCX and TXT files and extracts and chunks the info inside them to improve LLM context and reduce token usage. If you want to try out a demo, you can find it here: demo.parseflow.tech
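
For readers new to chunking, here is a minimal sketch of the general idea in Python. It is not Parseflow's code or API: the function name `chunk_text`, the parameters `max_words` and `overlap`, and the metadata fields are all hypothetical, chosen only for illustration. It splits already-extracted text into overlapping word windows and attaches a little metadata to each chunk, so that only the relevant pieces need to be sent to an LLM.

```python
from typing import Dict, List


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> List[Dict]:
    """Split text into overlapping word-window chunks with simple metadata.

    Hypothetical helper for illustration only; not part of any real API.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each new window advances
    chunks: List[Dict] = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append({
            "index": len(chunks),        # position of the chunk in the output
            "start_word": start,         # offset of the first word in the source
            "word_count": len(window),
            "text": " ".join(window),
        })
        # Stop once a window has reached the end of the text, so the
        # final overlap does not produce a redundant trailing chunk.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = "one two three four five six seven eight nine ten"
    for chunk in chunk_text(sample, max_words=4, overlap=1):
        print(chunk["index"], chunk["start_word"], chunk["text"])
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of a few repeated words. A real parser would likely split on headings or paragraphs rather than raw word counts.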

I am still a student dev, graduating high school this year, so I still have a lot to learn. I am building this project to help pay for tuition this year, but also to help me learn. Any feedback, advice, or questions are super appreciated; I will try to respond to the comments, or you can email me at [email protected]

Thanks, bollethegoalie

Similar Projects

AI/ML · Solid

ProofPudding – Document Extraction API with Citations (PDF/Docx)

ProofPudding returns extraction results with explicit links back to the exact page and source text, supports native and scanned PDFs plus DOCX/images, and ships Python/TypeScript SDKs — handy for agents that need auditable facts. It’s a pragmatic product (per-extraction pricing and confidence scores are nice), but the market is crowded; I want clarity on underlying models, real-world accuracy numbers, and how it compares to Document AI/Textract in edge cases.

Solve My Problem · Slick
garai
103mo ago