VectorLens – See why your RAG hallucinates, no config

Author: gustav-proxi

by gustav-proxi · Mar 9, 2026 · 1 point · 0 comments


AI Analysis

●● Solid · Big Brain · Dark Horse

Zero-config RAG tracing for cases where LangSmith would need heavy instrumentation.

Strengths

• Monkey-patching common clients means zero code changes to existing pipelines
• Local hallucination detection with sentence-transformers keeps data private
• Auto-intercepts OpenAI, Anthropic, Gemini, ChromaDB, and FAISS without config

Weaknesses

• Monkey-patching is fragile across library version updates
• RAG observability space already has LangSmith, Arize, and Helicone

Post Description

I built VectorLens because I was tired of "log file archaeology" every time my RAG pipeline hallucinated. Usually, when an LLM gives a wrong answer, you're stuck guessing which retrieved chunk misled it—or why the right chunk was ignored.

Existing observability tools require a cloud signup, an enterprise contract, or heavy manual instrumentation of your code. I wanted something that stayed local and just worked.

The Solution: Three lines of code

Python:

    import vectorlens
    vectorlens.serve()  # Open http://127.0.0.1:7756
    # Your RAG code runs as-is (OpenAI, Anthropic, Gemini, ChromaDB, FAISS, etc. are auto-intercepted)

How it works technically:

Zero-Config Interception: It monkey-patches common LLM and Vector DB clients. You don't have to change your functions or wrap your calls; it intercepts the data flow automatically.
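To make the interception idea concrete, here is a minimal sketch of monkey-patching in general. It is not VectorLens's actual code: `FakeVectorClient`, `patch_method`, and `TRACE` are hypothetical names made up for illustration, and a real tool would patch methods on the real client libraries instead.

```python
import functools

# Hypothetical stand-in for a third-party vector DB client (not a real library class).
class FakeVectorClient:
    def query(self, text, k=2):
        corpus = [
            "Paris is the capital of France.",
            "Berlin is in Germany.",
            "Cats are mammals.",
        ]
        return corpus[:k]

TRACE = []  # captured calls; a real tool would forward these to its dashboard

def patch_method(cls, name):
    """Replace cls.<name> with a wrapper that records arguments and results."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        TRACE.append({
            "call": f"{cls.__name__}.{name}",
            "args": args,
            "kwargs": kwargs,
            "result": result,
        })
        return result

    setattr(cls, name, wrapper)
    return original  # returned so the patch can be undone later

original_query = patch_method(FakeVectorClient, "query")

# Unmodified application code: it has no knowledge of the tracing.
chunks = FakeVectorClient().query("capital of France", k=1)
```

Because the wrapper is installed on the class, every existing and future instance is traced. That is also why this approach can break when a library renames or restructures the patched method.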

Local Hallucination Detection: It uses sentence-transformers (a 22 MB model) to compare each sentence of the LLM's output against the retrieved context. If a sentence's similarity to the context is too low, that sentence is flagged as a hallucination.
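The sentence-versus-context check can be sketched roughly as follows. To stay dependency-free, this example swaps in a toy bag-of-words embedder where a real sentence-embedding model would go. The function names and the 0.5 threshold are assumptions for illustration, not values taken from VectorLens.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Bag-of-words vector; a stand-in for a real sentence-embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def flag_unsupported(answer, chunks, threshold=0.5, embed=toy_embed):
    """Return (sentence, best_similarity) for answer sentences below the threshold."""
    chunk_vecs = [embed(c) for c in chunks]
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        vec = embed(sentence)
        # A sentence counts as supported if any retrieved chunk is similar enough.
        best = max((cosine(vec, cv) for cv in chunk_vecs), default=0.0)
        if best < threshold:
            flagged.append((sentence, best))
    return flagged

chunks = ["The Eiffel Tower is in Paris.", "It was completed in 1889."]
answer = "The Eiffel Tower is in Paris. It was designed by Leonardo da Vinci."
flagged = flag_unsupported(answer, chunks)
```

Here only the second sentence is flagged, since no chunk mentions a designer. With a real embedding model, the `embed` argument would return dense vectors, `cosine` would need a dense-vector version, and the threshold would need tuning.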

Perturbation Attribution: To figure out "why," it measures how the output changes when specific chunks are removed or modified. This yields a per-chunk score showing which chunks actually drove the response.
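One simple form of perturbation attribution is leave-one-out ablation: regenerate the answer with each chunk removed and score the chunk by how much the output changes. The sketch below assumes that variant, which may differ from what VectorLens does. It uses a hypothetical deterministic `stub_llm` in place of a real model and `difflib` string similarity as the change measure.

```python
from difflib import SequenceMatcher

def stub_llm(question, chunks):
    """Hypothetical deterministic 'model': echoes the first chunk mentioning 'capital'."""
    for chunk in chunks:
        if "capital" in chunk.lower():
            return chunk
    return "I don't know."

def leave_one_out_attribution(question, chunks, generate):
    """Score each chunk by how much the output changes when that chunk is removed."""
    baseline = generate(question, chunks)
    scores = []
    for i in range(len(chunks)):
        ablated = chunks[:i] + chunks[i + 1:]
        output = generate(question, ablated)
        # 0.0 means the output is unchanged; values near 1.0 mean it changed completely.
        scores.append(1.0 - SequenceMatcher(None, baseline, output).ratio())
    return scores

chunks = [
    "Cats are mammals.",
    "Paris is the capital of France.",
    "Berlin is in Germany.",
]
scores = leave_one_out_attribution("What is the capital of France?", chunks, stub_llm)
```

Only the second chunk gets a nonzero score, because removing either of the others leaves the answer unchanged. Leave-one-out costs one extra generation per chunk, so the price of attribution grows with the number of retrieved chunks.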

Fully Local: No data leaves your machine. The dashboard is a local React app updated via WebSockets.

Why use this over other tools?

Privacy: No cloud uploads or API keys for the debugger itself.

No Vendor Lock-in: Works with local models (Ollama/Mistral) just as easily as it does with GPT-4.

Speed: It runs detection in a background thread, so it doesn't block your main application logic.
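The background-thread pattern can be illustrated with a queue and a worker thread. This is a generic sketch, not VectorLens's implementation: `on_llm_response` and `detector_worker` are hypothetical names, and the word count is a placeholder for the real analysis.

```python
import queue
import threading

events = queue.Queue()
results = []

def detector_worker():
    """Consume captured responses and analyze them off the main thread."""
    while True:
        item = events.get()
        if item is None:  # sentinel: stop the worker
            events.task_done()
            break
        # Placeholder for the real (slow) analysis, e.g. embedding comparison.
        results.append((item, len(item.split())))
        events.task_done()

worker = threading.Thread(target=detector_worker, daemon=True)
worker.start()

def on_llm_response(text):
    """Called from the application thread; enqueueing returns immediately."""
    events.put(text)

on_llm_response("Paris is the capital of France.")
on_llm_response("Berlin is in Germany.")

# Only needed in this demo so the results can be inspected deterministically.
events.put(None)
events.join()
worker.join()
```

The application thread only pays the cost of a queue `put`. In CPython, though, CPU-bound analysis in a thread still competes with the main thread for the interpreter lock, so a background thread does not make the work free.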

I'm looking for feedback on the attribution accuracy and on which vector DBs you'd like to see supported next.

GitHub: https://github.com/Gustav-Proxi/vectorlens

Similar Projects

AI/ML · ● Mid

RAG-LCC – config-driven RAG framework for fast experimentation

Focuses on pre-retrieval document classification to fix context quality, not just embedding search.

Niche Gem · Ship It

HarinezumIgel

201mo ago

Developer Tools · ●● Solid

Caliper – Auto Instrumented LLM Observability with Custom Metadata

Zero-code instrumentation via monkey-patching, but LangSmith, Helicone, and Arize already do this.

Solve My Problem · Ship It

OliverGuy

203mo ago

AI/ML · ●●● Banger

Legal RAG Bench

Legal RAG benchmark revealing embedding quality > LLM choice by 19-point margin.

Big Brain · Niche Gem · Solve My Problem

beowa

413mo ago

AI/ML · ● Mid

Vex – Runtime reliability layer that auto-corrects AI-agent hallucination

Hallucination guardrails middleware, but is it better than prompt engineering plus Claude?

Solve My Problem

vex-ai-dev

123mo ago

Gaming · ● Mid

Talisman – An Android instrument played with two thumbs

Thumb instrument auto-tunes so you cannot play wrong notes.

Cozy · Eye Candy

ycosynot

1231mo ago

Developer Tools · ●●● Banger

OTel native agent to instrument applications

AST parsing beats prompt-scoped AI by finding every DB call across framework boundaries.

Big Brain · Solve My Problem · Wizardry

tiwarinitish86

101mo ago