I built a Harvey-style tabular review app, then open sourced the code

by afistfullof·Apr 9, 2026·4 points·0 comments

AI Analysis

●Mid · Big Brain · Niche Gem

Encoder-based extraction returns only text found in the source documents, so it avoids the hallucinations possible with Harvey's generative approach.

Strengths

•Architectural choice of encoder models eliminates generative hallucination risks entirely.
•Provides grounded extraction with confidence scores for every annotation.
•Detailed guide enables replication without relying on black-box generative APIs.

Weaknesses

•Requires Isaacus API keys, making it a lead gen tool rather than standalone OSS.
•Legal tabular review is a narrow workflow compared to general document analysis.

Post Description

I spent the past couple of weekends building an open-source alternative to Harvey/Legora's popular tabular review application for lawyers.

The project was sparked by a viral LinkedIn post from lawyer Joshua Upin, who described being shown a hallucinated citation by Harvey that was falsely attributed to one of Harvey’s competitors. Seeing such a basic failure emerge from their architecture made me ask a simple question: how much of their product could I recreate without using a single generative model, and in doing so make hallucinations architecturally impossible?

As it turns out, quite a lot.

In building the app, I did not use a single external or generative model. The entire system uses models my organisation trained and owns. More specifically, it uses a combination of Kanon 2 Enricher, Kanon 2 Embedder, and Kanon Answer Extractor. All three are encoder-based, and there are no generative models anywhere in the stack.

That means hallucinations are architecturally impossible. It also means the system can retrieve, classify, extract, and link information in a much more structured and interactive way than products that lean heavily on generation.
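
To illustrate why extraction is grounded by construction, here is a minimal Python sketch. It is not the Kanon API or this project's code: `Span`, `extract_best`, the threshold, and the scores are hypothetical stand-ins for whatever an encoder model would output. The only point it makes is that the answer is a slice of the source text, so it cannot contain words the document does not contain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """A character range in a source document plus a confidence score."""
    start: int
    end: int
    score: float  # hypothetical model confidence in [0, 1]


def extract_best(text: str, candidates: list[Span],
                 threshold: float = 0.5) -> Optional[tuple[str, Span]]:
    """Return (answer, span) for the highest-scoring candidate, or None.

    The answer is always text[start:end], never newly generated text.
    """
    # Discard spans that do not fall inside the document.
    valid = [s for s in candidates if 0 <= s.start < s.end <= len(text)]
    if not valid:
        return None
    best = max(valid, key=lambda s: s.score)
    if best.score < threshold:
        return None  # abstain rather than guess
    return text[best.start:best.end], best


contract = "This Agreement is made on 1 March 2024 between Acme Pty Ltd and Jane Doe."

# Offsets are computed from the text; the scores are made up for illustration.
date_start = contract.index("1 March 2024")
party_start = contract.index("Acme Pty Ltd")
candidates = [
    Span(date_start, date_start + len("1 March 2024"), 0.93),
    Span(party_start, party_start + len("Acme Pty Ltd"), 0.41),
]

result = extract_best(contract, candidates)
low_confidence = extract_best(contract, [Span(0, 4, 0.1)])
```

An extractor like this can still pick the wrong span or abstain, which is why a confidence score is attached to each answer, but it has no way to return text that is absent from the contract.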

At its core, the app turns contracts into a wiki-style, interconnected knowledge graph: a network of entities, annotations, spans, and relations that users can explore interactively. Key features like parties, locations, dates, signatures, and terms are extracted on the first pass. From there, users can define custom spans and relations, extending the graph as they go.
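
The graph described above could be modelled roughly as follows. This is a sketch under my own assumptions rather than the app's actual schema: the class names, the `counterparty_of` relation, and the `to_table` pivot are all hypothetical. It shows annotations stored as offsets into the contract text, relations linking annotations, and a tabular view derived from the graph.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    doc_id: str
    label: str    # e.g. "party", "date", "term"
    start: int    # character offsets into the document text
    end: int
    score: float  # hypothetical confidence score


@dataclass
class Relation:
    kind: str     # hypothetical relation name, e.g. "counterparty_of"
    source: int   # index into ContractGraph.annotations
    target: int


@dataclass
class ContractGraph:
    docs: dict[str, str] = field(default_factory=dict)
    annotations: list[Annotation] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

    def annotate(self, doc_id: str, label: str, needle: str, score: float) -> int:
        """Annotate the first occurrence of `needle` and return its index."""
        start = self.docs[doc_id].index(needle)  # ValueError if the text is absent
        self.annotations.append(
            Annotation(doc_id, label, start, start + len(needle), score)
        )
        return len(self.annotations) - 1

    def text_of(self, index: int) -> str:
        """Resolve an annotation back to the exact source text it covers."""
        a = self.annotations[index]
        return self.docs[a.doc_id][a.start:a.end]

    def to_table(self, columns: list[str]) -> dict[str, dict[str, list[str]]]:
        """Pivot annotations into one row per document, one column per label."""
        rows = {doc_id: {c: [] for c in columns} for doc_id in self.docs}
        for i, a in enumerate(self.annotations):
            if a.label in columns:
                rows[a.doc_id][a.label].append(self.text_of(i))
        return rows


graph = ContractGraph(
    docs={"nda-1": "Signed on 1 March 2024 by Acme Pty Ltd and Jane Doe."}
)
date = graph.annotate("nda-1", "date", "1 March 2024", 0.97)
acme = graph.annotate("nda-1", "party", "Acme Pty Ltd", 0.91)
jane = graph.annotate("nda-1", "party", "Jane Doe", 0.88)
graph.relations.append(Relation("counterparty_of", acme, jane))

table = graph.to_table(["party", "date"])
```

Keeping offsets rather than copied strings means every cell in the table can be traced back to, and highlighted in, the contract it came from.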

The end result is a tabular review system that matches the core experience offered by the market and, in several meaningful respects, goes beyond it.

I embedded a static version of the app at the top of the linked page so people can try it directly. The static version has real public contracts processed using the application. These contracts relate to public figures like Mark Zuckerberg, Elon Musk, and Jensen Huang, making it easy to verify the accuracy of the stack. The linked page also works as a step-by-step guide for anyone who wants to build something similar themselves.

Similar Projects

AI/ML●Mid

Open-source, local-first legal AI workspace for lawyers

Yet another legal AI wrapper when Harvey and Casetext already dominate this space.

Solve My Problem

rohasnagpal

20 · 1mo ago

Open Source●Mid

OpenRevise is the Harvey for all industries

The repo nails the governance bits: MECE decomposition, a strict source‑gate, and JSON patch specs so changes are only made when verifiable fulltext exists. It emits true DOCX tracked edits and a Q→source audit mapping — exactly the kind of deterministic audit trail regulated teams want — but the project is still early (few stars, light demos) and it’s unclear how it integrates with verification or LLM orchestration out of the box.

Niche Gem · Solve My Problem

alfredray

30 · 5mo ago

AI/ML●●Solid