RAG Doctor – CLI tool to diagnose broken RAG pipelines

by anvarxadja · Mar 13, 2026 · 2 points · 2 comments

AI Analysis

●●● Banger · Solve My Problem · Big Brain

ESLint for RAG pipelines that avoids using AI to debug AI hallucinations.

Strengths

• Deterministic rule engine avoids the irony of using LLMs to debug LLM systems.
• CLI-first design fits naturally into CI pipelines and local development workflows.
• Root cause diagnosis maps findings to specific architectural failures like context overload.

Weaknesses

• Rule-based approach might miss nuanced semantic retrieval failures.
• Requires access to trace data which some managed vector DBs might obscure.

Post Description

Hi HN,

I’ve been working with a lot of Retrieval-Augmented Generation pipelines recently and kept running into the same debugging problem.

When a RAG system produces bad answers, people usually blame the LLM. But in many cases the issue is somewhere in the pipeline itself.

Things like:

• documents not chunked correctly
• embedding models mismatched
• retrieval not happening before generation
• context windows overflowing
• vector database configuration problems
• prompt injection exposure

These kinds of issues are surprisingly hard to detect in large codebases.
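To make two of the listed problems concrete (mismatched embedding models and overflowing context windows), here is a minimal sketch of how such checks could look when run against a pipeline's configuration. It is an illustration only, not RAG Doctor's code: the `PipelineConfig` fields, the model names, and the message format are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    # Hypothetical fields; a real pipeline spreads these across many files.
    index_embedding_model: str   # model used when documents were embedded
    query_embedding_model: str   # model used to embed the user query
    chunk_tokens: int            # maximum tokens per retrieved chunk
    top_k: int                   # number of chunks passed to the LLM
    prompt_overhead_tokens: int  # system prompt + question, roughly
    max_output_tokens: int       # tokens reserved for the answer
    context_window_tokens: int   # the LLM's context limit


def check_config(cfg: PipelineConfig) -> list[str]:
    """Return a list of human-readable problems found in the configuration."""
    problems = []

    # Vectors from different embedding models are not comparable,
    # so similarity search silently returns poor matches.
    if cfg.index_embedding_model != cfg.query_embedding_model:
        problems.append(
            "embedding-mismatch: index uses "
            f"{cfg.index_embedding_model!r}, queries use {cfg.query_embedding_model!r}"
        )

    # Worst case: every retrieved chunk is full size.
    needed = (
        cfg.chunk_tokens * cfg.top_k
        + cfg.prompt_overhead_tokens
        + cfg.max_output_tokens
    )
    if needed > cfg.context_window_tokens:
        problems.append(
            f"context-overflow: worst case needs {needed} tokens, "
            f"window is {cfg.context_window_tokens}"
        )

    return problems


# Placeholder values for illustration only.
example = PipelineConfig(
    index_embedding_model="model-a",
    query_embedding_model="model-b",
    chunk_tokens=512,
    top_k=20,
    prompt_overhead_tokens=500,
    max_output_tokens=1024,
    context_window_tokens=8192,
)

for problem in check_config(example):
    print(problem)
```

The hard part in a large codebase is usually not the comparison itself but locating these values, which is where static analysis of the project comes in.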

So I started building a small CLI tool called RAG Doctor that analyzes a project and tries to detect structural problems in RAG pipelines.

The idea is similar to ESLint, but for RAG architectures.

The tool parses the codebase, runs a rule engine, and reports potential issues in the pipeline.
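The parse, rule engine, report flow can be sketched in a few lines of Python using the standard `ast` module. This is a toy under stated assumptions, not the tool's implementation: the rule name, the `Finding` type, and the sets of "retrieval" and "generation" call names are hypothetical, and the analyzed project is assumed to be Python.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    line: int
    message: str


# Hypothetical call names; a real tool would have to recognise
# the APIs of each RAG framework it supports.
RETRIEVAL_CALLS = {"retrieve", "similarity_search"}
GENERATION_CALLS = {"generate", "complete"}


def call_name(node: ast.Call) -> str:
    """Return the bare function or method name of a call, or '' if unknown."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def rule_generation_without_retrieval(tree: ast.AST) -> list[Finding]:
    """Flag generation calls that appear before any retrieval call in a function."""
    findings = []
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # ast.walk is breadth-first, so sort the calls by source position.
        calls = sorted(
            (n for n in ast.walk(func) if isinstance(n, ast.Call)),
            key=lambda n: (n.lineno, n.col_offset),
        )
        seen_retrieval = False
        for call in calls:
            name = call_name(call)
            if name in RETRIEVAL_CALLS:
                seen_retrieval = True
            elif name in GENERATION_CALLS and not seen_retrieval:
                findings.append(Finding(
                    "no-retrieval-before-generation",
                    call.lineno,
                    f"'{name}' is called in '{func.name}' before any retrieval call",
                ))
    return findings


RULES = [rule_generation_without_retrieval]


def analyze(source: str) -> list[Finding]:
    """Parse the source once, run every rule, and return findings in a stable order."""
    tree = ast.parse(source)
    findings = [f for rule in RULES for f in rule(tree)]
    return sorted(findings, key=lambda f: (f.line, f.rule_id))


SAMPLE = '''
def answer(question, llm, store):
    return llm.generate(question)

def grounded_answer(question, llm, store):
    docs = store.similarity_search(question)
    return llm.generate(question, docs)
'''

for finding in analyze(SAMPLE):
    print(f"line {finding.line}: [{finding.rule_id}] {finding.message}")
```

The check is a source-order heuristic within a single function: it does not follow calls across functions or modules, and it treats nested calls by their textual position, so a real rule would need proper data-flow analysis.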

One design choice I made early on was to keep the analysis deterministic. AI is not used to generate findings, only to explain them in human language. This keeps the results reproducible and makes the tool usable in CI workflows.
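One way to get that reproducibility in practice is to sort findings before printing them and to derive the process exit code from severity, so the same input always produces byte-identical output and CI can gate on it. The sketch below assumes findings are plain dictionaries with `file`, `line`, `rule_id`, and `severity` keys; those field names and the file paths in the example are illustrative, not the tool's actual output format.

```python
import json


def render_report(findings: list[dict]) -> str:
    """Serialize findings so that identical inputs give byte-identical output."""
    ordered = sorted(findings, key=lambda f: (f["file"], f["line"], f["rule_id"]))
    # sort_keys removes any dependence on dictionary insertion order.
    return json.dumps(ordered, indent=2, sort_keys=True)


def exit_code(findings: list[dict], fail_on: tuple[str, ...] = ("error",)) -> int:
    """Return 1 if any finding has a failing severity, else 0."""
    return 1 if any(f["severity"] in fail_on for f in findings) else 0


# Hypothetical findings, as a rule engine might emit them in arbitrary order.
example_findings = [
    {"file": "rag/query.py", "line": 42, "rule_id": "context-overflow", "severity": "error"},
    {"file": "rag/index.py", "line": 10, "rule_id": "embedding-mismatch", "severity": "warning"},
]

print(render_report(example_findings))
print("exit code:", exit_code(example_findings))
```

A CI job would pass the result of `exit_code` to `sys.exit`, failing the build only on the severities listed in `fail_on`.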

It’s still early, but I’m curious whether others have run into similar debugging problems when building RAG systems.

If you’ve been working on RAG infrastructure, I’d love to hear what kinds of issues you see most often.

Repo: https://github.com/NeuroForgeLabs/rag-doctor

Any feedback would be appreciated.

Similar Projects

AI/ML · ●● Solid

Incremental RAG ingestion, only changed chunks get re-embedded

Chunk-level incremental sync saves 67% embedding calls on partial document edits.

Big Brain · Solve My Problem

shamikhan005

205d ago

AI/ML · ●● Solid

RAG chunking playground: visualize how your docs get split

Visual chunking comparison beats guessing — export production-ready code.

Solve My Problem · Niche Gem

Horatius77

101mo ago

AI/ML · ●● Solid

A tool to create and evaluate document processing pipelines for RAG

LLM-as-judge metrics beat guessing chunk sizes, but Ragas and LangSmith already exist.

Solve My Problem · Slick

martimchaves

202mo ago

Data · ●●● Banger

NRC nuclear licensing RAG pipeline and regulatory embeddings dataset

First public NRC regulatory embeddings dataset—37K chunks ready for ChromaDB and Pinecone.

Niche Gem · Solve My Problem

davenporten

202mo ago

Developer Tools · ●● Solid

Cursor-doctor – find out why Cursor ignores your rules

Fixes a real Cursor friction point—rules silently fail—but only useful if you're already using Cursor.

Solve My Problem · Niche Gem

nedcodes

113mo ago

Data · ● Mid

RAG-Ready Extractor – Structure-aware ingestion with semantic scoring

Noise-filtered PDF/web extraction for RAG, but already solved by Jina, Firecrawl.

Solve My Problem

cddIT

313mo ago