GitHub Repository

Privacy middleware for LLM & RAG pipelines - consistent pseudonymization, encrypted vault, SSE streaming rehydration.

30 stars · Rust

I built a proxy that keeps RAG working while hiding PII

by rohansx · Mar 12, 2026 · 4 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Consistent pseudonymization beats redaction when RAG embeddings must survive.

Strengths
  • Reversible mapping preserves semantic meaning for vector search, unlike [REDACTED] placeholders.
  • Single Rust binary with ONNX NER means no Python dependency hell or 200ms latency tax.
  • Encrypted vault and SSE streaming rehydration handle production concerns out of the box.
Weaknesses
  • LLM privacy middleware is getting crowded; needs clear differentiation from Presidio wrappers.
  • GLiNER2 ONNX model adds binary size; unclear if optional or mandatory for core functionality.
Category
Target Audience

Developers building RAG pipelines with privacy requirements

Similar To

Presidio · Microsoft Purview · Privacera

Post Description

Hey HN,

When you send real documents or customer data to LLMs, you face a painful tradeoff:

- Send raw text → privacy disaster
- Redact with [REDACTED] → embeddings break, RAG retrieval fails, multi-turn chats become useless, and the model often refuses to answer questions about the redacted entities.

The practical solution is consistent pseudonymization: the same real entity always maps to the same token (e.g. “Tata Motors” → ORG_7 everywhere). This preserves semantic meaning for vector search and reasoning. You then rehydrate the response on your side, so the provider never sees actual names, numbers, or addresses.
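To make the idea concrete, here is a minimal sketch of consistent pseudonymization and rehydration. It is not Cloakpipe's code: the `Vault` type, its method names, and the `CATEGORY_N` token shape are assumptions for illustration, the map is plain in-memory (not encrypted), and entity detection is assumed to have already happened.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory vault (illustration only, NOT encrypted):
/// maps each real entity to a stable token and back.
#[derive(Default)]
pub struct Vault {
    forward: HashMap<String, String>, // real value -> token
    reverse: HashMap<String, String>, // token -> real value
    counters: HashMap<String, usize>, // per-category counter, e.g. "ORG" -> 2
}

impl Vault {
    /// Return the existing token for `value`, or mint the next one in `category`.
    pub fn pseudonymize(&mut self, category: &str, value: &str) -> String {
        if let Some(token) = self.forward.get(value) {
            return token.clone();
        }
        let n = self.counters.entry(category.to_string()).or_insert(0);
        *n += 1;
        let token = format!("{}_{}", category, *n);
        self.forward.insert(value.to_string(), token.clone());
        self.reverse.insert(token.clone(), value.to_string());
        token
    }

    /// Replace already-detected `(category, value)` entities in `text` with tokens.
    pub fn mask(&mut self, text: &str, entities: &[(&str, &str)]) -> String {
        let mut masked = text.to_string();
        for &(category, value) in entities {
            let token = self.pseudonymize(category, value);
            masked = masked.replace(value, &token);
        }
        masked
    }

    /// Swap known tokens back to real values. Whole words are looked up,
    /// so an unknown `ORG_10` is never mistaken for `ORG_1` plus a `0`.
    pub fn rehydrate(&self, text: &str) -> String {
        let mut out = String::with_capacity(text.len());
        let mut word = String::new();
        for ch in text.chars() {
            if ch.is_ascii_alphanumeric() || ch == '_' {
                word.push(ch);
            } else {
                self.flush(&mut word, &mut out);
                out.push(ch);
            }
        }
        self.flush(&mut word, &mut out);
        out
    }

    /// Emit the buffered word, substituting the real value if it is a known token.
    fn flush(&self, word: &mut String, out: &mut String) {
        match self.reverse.get(word.as_str()) {
            Some(real) => out.push_str(real),
            None => out.push_str(word),
        }
        word.clear();
    }
}
```

Masking two documents with the same `Vault` yields the same token for “Tata Motors” in both, which is what keeps embeddings and multi-turn context aligned. Calling `rehydrate` on the model's reply restores the real names locally.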

I got fed up fighting this with Presidio + custom glue (truncated RAG chunks, declension in Indian languages, fuzzy merging for typos/siblings, LLM confusion, percentages breaking math). So I built Cloakpipe as a tiny single-binary Rust proxy.
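As an illustration of the typo-merging problem mentioned above, here is a rough sketch that resolves a misspelled name to an already-known entity using Levenshtein distance. This is a swapped-in textbook technique, not necessarily what Cloakpipe does; `edit_distance`, `resolve`, and the `max_dist` threshold are hypothetical names and parameters.

```rust
/// Levenshtein edit distance over Unicode scalar values (two-row DP).
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Return the known entity closest to `candidate`, if it is within
/// `max_dist` edits (case-insensitive). `max_dist` is an illustrative
/// threshold: too high and sibling names would merge into one entity.
pub fn resolve<'a>(candidate: &str, known: &'a [String], max_dist: usize) -> Option<&'a str> {
    let needle = candidate.to_lowercase();
    known
        .iter()
        .map(|k| (edit_distance(&needle, &k.to_lowercase()), k.as_str()))
        .filter(|(d, _)| *d <= max_dist)
        .min_by_key(|(d, _)| *d)
        .map(|(_, k)| k)
}
```

With a small threshold, a typo such as "Tata Motros" resolves to "Tata Motors" and reuses its token, while a sibling such as "Tata Steel" stays a separate entity.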

It does:
• Multi-layer detection (regex + financial rules + optional GLiNER2 ONNX NER + custom TOML)
• Consistent reversible mapping in an AES-256-GCM encrypted vault (memory zeroized)
• Smart rehydration that survives truncated chunks like [[ADDRESS:A00
• Built-in fuzzy resolution for typos and similar names
• Numeric reasoning mode so percentages still work for calculations
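The truncated-chunk case can be sketched as a small streaming rehydrator that holds back any token opened in one chunk until a later chunk closes it. This is a guess at the general approach, not Cloakpipe's implementation: the `[[CATEGORY:ID]]` token shape is inferred from the truncated example above and is an assumption, as are the `StreamRehydrator` type and its methods.

```rust
use std::collections::HashMap;

/// Rehydrates streamed text chunk by chunk. Assumes (hypothetically) that
/// tokens look like `[[CATEGORY:ID]]`; the map key is the part between brackets.
pub struct StreamRehydrator {
    vault: HashMap<String, String>, // token body, e.g. "ORG:7" -> real value
    pending: String,                // held-back text that may end mid-token
}

impl StreamRehydrator {
    pub fn new(vault: HashMap<String, String>) -> Self {
        Self { vault, pending: String::new() }
    }

    /// Feed one chunk; returns only the text that is safe to emit now.
    pub fn push(&mut self, chunk: &str) -> String {
        self.pending.push_str(chunk);
        let mut out = String::new();
        loop {
            let start = match self.pending.find("[[") {
                Some(start) => start,
                None => {
                    // No token start. Hold back a lone trailing '[' in case
                    // the next chunk turns it into "[[".
                    let keep = if self.pending.ends_with('[') { 1 } else { 0 };
                    let cut = self.pending.len() - keep;
                    out.push_str(&self.pending[..cut]);
                    self.pending.drain(..cut);
                    return out;
                }
            };
            out.push_str(&self.pending[..start]);
            match self.pending[start..].find("]]") {
                None => {
                    // Token is truncated: keep it for the next chunk.
                    self.pending.drain(..start);
                    return out;
                }
                Some(rel_end) => {
                    let end = start + rel_end;
                    match self.vault.get(&self.pending[start + 2..end]) {
                        Some(real) => out.push_str(real),
                        // Unknown token: pass it through unchanged.
                        None => out.push_str(&self.pending[start..end + 2]),
                    }
                    self.pending.drain(..end + 2);
                }
            }
        }
    }

    /// Call at end of stream to flush anything still held back.
    pub fn finish(&mut self) -> String {
        std::mem::take(&mut self.pending)
    }
}
```

Feeding the chunks `"Shares of [[OR"` and `"G:7]] rose."` emits `"Shares of "` first and the rest only once the token closes, so a client never sees half a placeholder. A production version would also cap how much text it is willing to hold back.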

Fully open source (MIT), zero Python dependencies, <5 ms overhead.

Repo: https://github.com/rohansx/cloakpipe
Demo & quick start: https://app.cloakpipe.co/demo

Would love feedback from anyone who has audited their RAG data flow or is struggling with the redaction-vs-semantics problem — especially in legal, fintech, or non-English workflows.

What approaches have you landed on?

Similar Projects

Security · ●● Solid

OneCLI – Vault for AI Agents in Rust

Agents never see real keys, but Vault already does secret injection.

Solve My Problem · Slick
guyb3
161523mo ago