Extend UI – open-source UI kit for modern document apps


by kbyatnal · Jun 10, 2026 · 251 points · 81 comments


AI Analysis

●● Solid · Solve My Problem · Cozy

Bounding box citations for AI workflows are a clever differentiator for document viewers.

Strengths

• Bounding box citations let AI agents reference specific document regions
• MIT-licensed components, battle-tested in production at Extend itself
• Covers edge cases for PDF, DOCX, and XLSX viewers that other libraries miss

Weaknesses

• Component libraries are inherently incremental rather than groundbreaking
• The document viewer space already has solutions, though none is complete

Post Description

We're open-sourcing 14 components & examples today for PDF, DOCX, and XLSX viewers, plus bounding box citations, file upload, e-signature, and more. It's MIT licensed and fully customizable.

Demo video here: https://share.extend.ai/kRmSGKRF

When we started, we tried every file viewer and document component library we could find. Unfortunately, none of them had all the functionality (and polish) that we wanted, so we ended up building our own for https://extend.ai/. It was only ever meant to be internal, but enough customers kept asking for it that we decided to open source it.

It's useful for building document processing agents, real-time, user-facing document intake flows, or all kinds of internal tooling.

We naively thought this would be a solved problem. Turns out, making PDF/XLSX/DOCX viewers that work at scale is not trivial. We use and maintain it for Extend ourselves, so we've fixed a lot of edge cases that came up while running millions of pages per day through our own system. Our hope is that with our resources + community support, it'll keep getting better over time.
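
For readers unfamiliar with the idea, a bounding box citation can be pictured as a page number plus a rectangle that a viewer draws over the rendered page. The sketch below is only an illustration of that idea, not the Extend UI API: the `BoundingBoxCitation` shape, the normalized 0-to-1 coordinate convention, and the `toPixelRect` helper are all assumptions made for this example.

```typescript
// Hypothetical citation shape (an assumption, not taken from Extend UI).
interface BoundingBoxCitation {
  page: number;   // 1-based page number
  left: number;   // normalized [0, 1], measured from the page's left edge
  top: number;    // normalized [0, 1], measured from the page's top edge
  width: number;  // normalized [0, 1]
  height: number; // normalized [0, 1]
  text: string;   // the snippet the agent is citing
}

interface PixelRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Scale a normalized box to the pixel size of a rendered page so it can be
// drawn as an absolutely positioned highlight over that page.
function toPixelRect(
  citation: BoundingBoxCitation,
  pageWidthPx: number,
  pageHeightPx: number
): PixelRect {
  return {
    x: citation.left * pageWidthPx,
    y: citation.top * pageHeightPx,
    width: citation.width * pageWidthPx,
    height: citation.height * pageHeightPx,
  };
}

// Example: a citation on page 2, rendered at 800 x 1000 pixels.
const citation: BoundingBoxCitation = {
  page: 2,
  left: 0.25,
  top: 0.5,
  width: 0.5,
  height: 0.125,
  text: "Total due: $1,200",
};

const rect = toPixelRect(citation, 800, 1000);
console.log(rect); // { x: 200, y: 500, width: 400, height: 125 }
```

Storing coordinates in normalized form, as assumed here, keeps a citation valid at any zoom level, since only the page's rendered size changes between renders.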

Similar Projects

AI/ML · ●● Solid

ProofPudding – Document Extraction API with Citations (PDF/Docx)

ProofPudding returns extraction results with explicit links back to the exact page and source text, supports native and scanned PDFs plus DOCX/images, and ships Python/TypeScript SDKs — handy for agents that need auditable facts. It’s a pragmatic product (per-extraction pricing and confidence scores are nice), but the market is crowded; I want clarity on underlying models, real-world accuracy numbers, and how it compares to Document AI/Textract in edge cases.

Solve My Problem · Slick

garai

104mo ago

Productivity · ●● Solid