GitHub Repository

Identity resolution as code - declarative engine for matching, merging, and mastering entity data

7 stars · Python

Kanoniv – Identity resolution in 170 lines of YAML, built in Rust

by dreynow·Feb 17, 2026·1 point·0 comments

AI Analysis

●● Solid · Niche Gem · Ship It · Solve My Problem

YAML-driven record linkage beats hand-rolled SQL, but Splink already solved this.

Strengths
  • Replaces 350+ lines of SQL with a declarative YAML spec; a published comparison repo resolves the same records with dbt, Splink, and Kanoniv.
  • Rust+PyO3 architecture runs the O(n²) pairwise record matching in Rust, avoiding Python interpreter overhead.
  • Well-scoped problem: identity resolution is painful, and the YAML abstraction is a genuine UX win.
Weaknesses
  • Directly competes with Splink (open-source, established, well-funded); no clear differentiation.
  • Limited to pip install; early ecosystem with no enterprise features or multi-language bindings.
Category
Target Audience

Data engineers, data quality teams, customer data platform builders.

Similar To

Splink · dbt

Post Description

I kept rebuilding the same identity resolution pipeline at every company I have been at: normalize emails, block on name+phone, score pairs, cluster, apply survivorship rules. It was 350+ lines of SQL each time, and it still missed half the matches.
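For readers who have not built one of these pipelines, here is a minimal Python sketch of the steps listed above. It is not Kanoniv's implementation or the author's SQL: the field names (`name`, `email`, `phone`, `updated_at`), the exact-email match rule that stands in for the scoring step, and the "newest non-empty value wins" survivorship rule are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

def normalize_email(email):
    """Trim and lowercase; empty values become None."""
    if not email:
        return None
    return email.strip().lower() or None

def normalize_phone(phone):
    """Keep digits only and compare on the last 10 of them."""
    if not phone:
        return None
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits[-10:] or None

def block_key(record):
    """Hypothetical blocking key: lowercased name plus normalized phone."""
    name = (record.get("name") or "").strip().lower()
    return (name, normalize_phone(record.get("phone")))

def candidate_pairs(records):
    """Only records that share a blocking key are ever compared."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[block_key(rec)].append(idx)
    for members in blocks.values():
        yield from combinations(members, 2)

def cluster(n, matched_pairs):
    """Union-find: matched pairs are merged transitively into entities."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in matched_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())

def survive(records, members, fields=("name", "email", "phone")):
    """Assumed survivorship rule: the newest non-empty value wins per field."""
    newest_first = sorted(
        (records[i] for i in members),
        key=lambda r: r["updated_at"],
        reverse=True,
    )
    return {
        f: next((r[f] for r in newest_first if r.get(f)), None)
        for f in fields
    }

records = [
    {"name": "Ann Lee", "email": " Ann.Lee@Example.com ",
     "phone": "(555) 010-1234", "updated_at": "2024-01-01"},
    {"name": "ann lee", "email": "ann.lee@example.com",
     "phone": "555-010-1234", "updated_at": "2025-03-01"},
    {"name": "Bob Ray", "email": "bob@example.com",
     "phone": "555-010-9999", "updated_at": "2024-06-01"},
]

# Placeholder for the scoring step: a pair matches if normalized emails agree.
matched = [
    (a, b) for a, b in candidate_pairs(records)
    if normalize_email(records[a]["email"]) == normalize_email(records[b]["email"])
]
clusters = cluster(len(records), matched)
golden = [survive(records, members) for members in clusters]
```

Even in this toy form the pipeline has five separate pieces of logic, which gives a sense of why the SQL version grows to hundreds of lines.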

Kanoniv is a declarative identity resolution engine. You write a YAML spec (sources, blocking keys, scoring weights, survivorship rules) and it runs a Fellegi-Sunter probabilistic matcher in Rust via PyO3.
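As background on the Fellegi-Sunter model mentioned above, the following Python sketch shows the textbook scoring rule: each compared field adds log2(m/u) to the match weight when the values agree and log2((1-m)/(1-u)) when they disagree, where m is the probability of agreement among true matches and u the probability of agreement among non-matches. The m/u values, the field names, and the choice to skip missing values are illustrative assumptions, not Kanoniv's defaults or its API.

```python
import math

# Illustrative m/u probabilities per field (assumed values, not Kanoniv's).
FIELD_PARAMS = {
    "email": {"m": 0.95, "u": 0.001},
    "phone": {"m": 0.90, "u": 0.005},
    "name": {"m": 0.85, "u": 0.02},
}

def field_weight(agrees, m, u):
    """Log2 likelihood ratio contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))

def match_weight(a, b, params=FIELD_PARAMS):
    """Sum of field weights, assuming fields are conditionally independent."""
    total = 0.0
    for field, p in params.items():
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # assumption: a missing value contributes no evidence
        total += field_weight(va == vb, p["m"], p["u"])
    return total

def match_probability(weight, prior=0.01):
    """Turn a total log2 weight into a posterior match probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)

same = {"email": "ann@example.com", "phone": "5550101234", "name": "ann lee"}
typo = {"email": "ann@example.com", "phone": "5550101234", "name": "anne lee"}
other = {"email": "bob@example.com", "phone": "5550109999", "name": "bob ray"}

w_same = match_weight(same, same)    # all three fields agree
w_typo = match_weight(same, typo)    # name disagrees, email and phone agree
w_other = match_weight(same, other)  # nothing agrees
```

A rare field such as email carries far more weight on agreement than a common one such as name, which is what lets a probabilistic matcher tolerate a typo in one field when the others agree.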

I published a comparison repo with the same 6,500 records resolved 3 ways: dbt SQL, Splink, and Kanoniv, so you can see the tradeoffs yourself: https://github.com/kanoniv/kanoniv/tree/main/examples/custom...

Free local SDK (pip install kanoniv).

Happy to answer questions about the matching algorithm or the Rust/PyO3 architecture.

Moving the heavy lifting to Rust via PyO3 is a smart move for performance, especially when dealing with the O(n²) nature of record linkage.
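To put a number on the quadratic growth: comparing every record against every other takes n(n-1)/2 comparisons, so the 6,500 records from the comparison repo would need about 21 million. The short Python sketch below counts that and contrasts it with blocking, where only records in the same block are compared. The block sizes are hypothetical and chosen only to show the order of magnitude; they are not measured from the comparison repo.

```python
from math import comb

def all_pairs(n):
    """Comparisons needed without blocking: n choose 2."""
    return comb(n, 2)

def blocked_pairs(block_sizes):
    """Comparisons needed when only same-block records are compared."""
    return sum(comb(size, 2) for size in block_sizes)

n = 6500
print(all_pairs(n))  # 21121750

# Hypothetical: 1,300 blocks of 5 records each (still 6,500 records total).
hypothetical_blocks = [5] * 1300
print(blocked_pairs(hypothetical_blocks))  # 13000
```

Blocking changes how many pairs are scored, while a compiled core changes how fast each pair is scored, so the two optimizations compound.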
