GitHub Repository

Privacy middleware for LLM & RAG pipelines - consistent pseudonymization, encrypted vault, SSE streaming rehydration.

30 stars · Rust

I built a proxy that keeps RAG working while hiding PII

by rohansx · Mar 12, 2026 · 4 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Consistent pseudonymization beats redaction when RAG embeddings must survive.

Strengths

• Reversible mapping preserves semantic meaning for vector search, unlike [REDACTED] placeholders.
• Single Rust binary with ONNX NER means no Python dependency hell or 200 ms latency tax.
• Encrypted vault and SSE streaming rehydration handle production concerns out of the box.

Weaknesses

• LLM privacy middleware is getting crowded; needs clear differentiation from Presidio wrappers.
• GLiNER2 ONNX model adds binary size; unclear if optional or mandatory for core functionality.

Post Description

Hey HN,

When you send real documents or customer data to LLMs, you face a painful tradeoff:

- Send raw text → privacy disaster
- Redact with [REDACTED] → embeddings break, RAG retrieval fails, multi-turn chats become useless, and the model often refuses to answer questions about the redacted entities.

The practical solution is consistent pseudonymization: the same real entity always maps to the same token (e.g. “Tata Motors” → ORG_7 everywhere). This preserves semantic meaning for vector search and reasoning. The provider only ever sees tokens, never actual names, numbers, or addresses, and you rehydrate the response on the way back.
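To make the mapping idea concrete, here is a minimal Python sketch of consistent, reversible pseudonymization. It is not Cloakpipe's code: the `Pseudonymizer` class, its method names, and the `KIND_N` token scheme are assumptions for illustration, entity detection is assumed to have already happened, and the mapping lives in a plain dict rather than an encrypted vault.

```python
import re


class Pseudonymizer:
    """Toy consistent pseudonymizer: the same entity always gets the same token."""

    TOKEN = re.compile(r"\b[A-Z]+_\d+\b")  # hypothetical token shape, e.g. ORG_1

    def __init__(self):
        self.forward = {}   # real value -> token
        self.reverse = {}   # token -> real value
        self.counters = {}  # entity kind -> last index handed out

    def token_for(self, value, kind):
        # Reuse the existing token if this value has been seen before.
        if value not in self.forward:
            n = self.counters.get(kind, 0) + 1
            self.counters[kind] = n
            token = f"{kind}_{n}"
            self.forward[value] = token
            self.reverse[token] = value
        return self.forward[value]

    def pseudonymize(self, text, entities):
        # entities: (value, kind) pairs, as a detector would produce them.
        # Longer values are replaced first so substrings do not clobber them.
        for value, kind in sorted(entities, key=lambda e: -len(e[0])):
            text = text.replace(value, self.token_for(value, kind))
        return text

    def rehydrate(self, text):
        # Unknown tokens are left untouched rather than guessed at.
        return self.TOKEN.sub(
            lambda m: self.reverse.get(m.group(0), m.group(0)), text
        )


p = Pseudonymizer()
entities = [("Tata Motors", "ORG")]
doc = p.pseudonymize("Tata Motors reported growth.", entities)
query = p.pseudonymize("What did Tata Motors report?", entities)
print(doc)                 # ORG_1 reported growth.
print(query)               # What did ORG_1 report?
print(p.rehydrate(doc))    # Tata Motors reported growth.
```

Because the indexed document and the later query both contain the same token, they still match at retrieval time, which is what a blanket [REDACTED] placeholder loses.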

I got fed up fighting this with Presidio + custom glue (truncated RAG chunks, declension in Indian languages, fuzzy merging for typos/siblings, LLM confusion, percentages breaking math). So I built Cloakpipe as a tiny single-binary Rust proxy.

It does:
• Multi-layer detection (regex + financial rules + optional GLiNER2 ONNX NER + custom TOML)
• Consistent reversible mapping in an AES-256-GCM encrypted vault (memory zeroized)
• Smart rehydration that survives truncated chunks like [[ADDRESS:A00
• Built-in fuzzy resolution for typos and similar names
• Numeric reasoning mode so percentages still work for calculations
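One way to picture rehydration over a stream is a small buffer that holds back any token cut in half at a chunk boundary. The Python sketch below is an illustration under stated assumptions, not Cloakpipe's implementation: the `[[KIND:ID]]` token shape is guessed from the truncated example in the list above, and `StreamRehydrator`, the `X42` identifier, and the dict standing in for the vault are all hypothetical.

```python
import re

# Assumed token shape: [[KIND:ID]]. The real format is not documented in the post.
TOKEN = re.compile(r"\[\[([A-Z]+:[A-Za-z0-9]+)\]\]")


class StreamRehydrator:
    """Rehydrates tokens in a chunked stream, buffering incomplete tokens."""

    def __init__(self, vault):
        self.vault = vault  # token body -> real value (plain dict stand-in)
        self.pending = ""   # tail of the stream that may be a partial token

    def feed(self, chunk):
        data = self.pending + chunk
        cut = self._safe_length(data)
        self.pending = data[cut:]
        return TOKEN.sub(self._lookup, data[:cut])

    def flush(self):
        # At end of stream, emit whatever is left as-is (it never completed).
        out, self.pending = self.pending, ""
        return out

    def _lookup(self, match):
        # Unknown tokens pass through unchanged.
        return self.vault.get(match.group(1), match.group(0))

    @staticmethod
    def _safe_length(data):
        # Hold back an opened-but-unclosed token ...
        start = data.rfind("[[")
        if start != -1 and "]]" not in data[start:]:
            return start
        # ... or a lone "[" that might be the first half of "[[".
        if data.endswith("["):
            return len(data) - 1
        return len(data)


vault = {"ADDRESS:X42": "221B Baker Street"}
r = StreamRehydrator(vault)
pieces = [r.feed(c) for c in ["Ship to [[ADDR", "ESS:X4", "2]] today"]]
pieces.append(r.flush())
print("".join(pieces))  # Ship to 221B Baker Street today
```

A production version would also cap how much text it holds back, since a literal "[[" that never closes would otherwise delay output until the stream ends.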

Fully open source (MIT), zero Python dependencies, <5 ms overhead.

Repo: https://github.com/rohansx/cloakpipe
Demo & quick start: https://app.cloakpipe.co/demo

Would love feedback from anyone who has audited their RAG data flow or is struggling with the redaction-vs-semantics problem — especially in legal, fintech, or non-English workflows.

What approaches have you landed on?