DuoRAG – A dual stack RAG that self-evolves

Author: cagz

by cagz · Mar 27, 2026 · 3 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Self-evolving schema fixes RAG's aggregation problem without predicting queries upfront.

Strengths

• Dual-store routing lets the LLM choose the vector or SQL backend based on question type.
• Schema evolution detects missing fields mid-session and backfills them without re-ingestion.
• Blocks incomplete semantic answers to aggregate questions instead of failing silently.
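
The routing and blocking behavior listed above can be pictured with a minimal sketch. DuoRAG reportedly delegates the routing decision to an LLM; the keyword heuristic below is only a stand-in for that call, and `AGGREGATE_CUES`, `route`, and `answer` are hypothetical names that do not come from the project.

```python
# Hypothetical cue list; a stand-in for the LLM-based router, not DuoRAG's logic.
AGGREGATE_CUES = ("how many", "list", "count", "before", "after")

def route(question):
    """Pick a backend: 'sql' for aggregate-style questions, else 'vector'."""
    q = question.lower()
    return "sql" if any(cue in q for cue in AGGREGATE_CUES) else "vector"

def answer(question, sql_can_answer):
    """Refuse explicitly rather than fall back to a partial semantic answer."""
    backend = route(question)
    if backend == "sql" and not sql_can_answer:
        return "blocked: the structured store lacks the fields for a complete answer"
    return f"answered via {backend}"

print(answer("How many are mathematicians?", sql_can_answer=True))
print(answer("What did Euler work on?", sql_can_answer=True))
print(answer("Who was born before 1800?", sql_can_answer=False))
```

The point of the third call is that an aggregate question the SQL side cannot serve produces an explicit refusal, not a plausible-looking answer built from a handful of retrieved chunks.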

Weaknesses

• Requires the OpenAI API; there is no local model support for the routing LLM.
• Well-funded competitors already offer RAG improvement tools.

Post Description

Imagine a corpus of documents with scientist biographies.

Traditional RAG works fine until you ask questions like:

- "Who was born before 1800?"
- "How many are mathematicians?"
- "List names and birthdays for mathematicians"

Each of these yields an incomplete answer, because retrieval returns only the top-k chunks, and nothing in the response signals that matches were left out.
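
A small sketch of the top-k failure, assuming a five-document toy corpus and word overlap as a stand-in for embedding similarity. The corpus layout and the `tokens` and `top_k` helpers are illustrative only.

```python
import re

# Toy corpus; in a real pipeline each entry would be an embedded chunk.
CORPUS = [
    {"name": "Leonhard Euler", "born": 1707, "text": "Leonhard Euler, mathematician, born 1707."},
    {"name": "Joseph-Louis Lagrange", "born": 1736, "text": "Joseph-Louis Lagrange, mathematician, born 1736."},
    {"name": "Carl Friedrich Gauss", "born": 1777, "text": "Carl Friedrich Gauss, mathematician, born 1777."},
    {"name": "Marie Curie", "born": 1867, "text": "Marie Curie, physicist and chemist, born 1867."},
    {"name": "Emmy Noether", "born": 1882, "text": "Emmy Noether, mathematician, born 1882."},
]

def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def top_k(question, corpus, k):
    """Rank by word overlap (assumed stand-in for vector similarity), keep k."""
    q = tokens(question)
    ranked = sorted(corpus, key=lambda doc: len(q & tokens(doc["text"])), reverse=True)
    return ranked[:k]

question = "Who was born before 1800?"
retrieved = top_k(question, CORPUS, k=2)

# What a generator could answer from the retrieved context alone.
from_context = [d["name"] for d in retrieved if d["born"] < 1800]
# What a full scan of the corpus would answer.
complete = [d["name"] for d in CORPUS if d["born"] < 1800]

print(from_context)
print(complete)
```

With `k=2`, the context holds two of the three scientists born before 1800. The generated answer would look well-formed either way, which is why the gap goes unnoticed.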

For an initial corpus, this problem can be mitigated by extracting metadata for a predetermined set of fields. This approach has two drawbacks:

- One has to predict upfront every question that can be asked against the corpus.
- One has to keep revising that prediction as the documents change, e.g. adding Nobel prizes later, or extending the document set to include artists.
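
The fixed-schema approach and its limit can be sketched with Python's built-in `sqlite3` module. The table name `scientists` and its columns are assumptions chosen for this example; in practice the rows would come from a metadata extraction step.

```python
import sqlite3

# Metadata for a predetermined set of fields (hypothetical schema).
ROWS = [
    ("Leonhard Euler", 1707, "mathematician"),
    ("Joseph-Louis Lagrange", 1736, "mathematician"),
    ("Carl Friedrich Gauss", 1777, "mathematician"),
    ("Marie Curie", 1867, "physicist"),
    ("Emmy Noether", 1882, "mathematician"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scientists (name TEXT, birth_year INTEGER, field TEXT)")
conn.executemany("INSERT INTO scientists VALUES (?, ?, ?)", ROWS)

# Aggregate questions over predicted fields are answered completely.
born_before_1800 = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM scientists WHERE birth_year < 1800 ORDER BY birth_year"
    )
]
mathematician_count = conn.execute(
    "SELECT COUNT(*) FROM scientists WHERE field = 'mathematician'"
).fetchone()[0]

# A question about a field nobody predicted cannot be expressed at all.
try:
    conn.execute("SELECT name FROM scientists WHERE nobel_prizes > 0")
    unpredicted_field_ok = True
except sqlite3.OperationalError:
    unpredicted_field_ok = False

print(born_before_1800, mathematician_count, unpredicted_field_ok)
```

The first two queries scan every row, so they do not suffer from top-k truncation. The third fails because `nobel_prizes` was never part of the predicted schema.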

DuoRAG aims to solve both problems by:

- Running an initial metadata (schema) discovery pass before the first ingestion.
- Updating the schema itself with candidate fields when it fails to answer a question.
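
The second step might look roughly like the sketch below. The post does not describe how DuoRAG detects a missing field or extracts its values, so this example makes two loud assumptions: a SQLite "no such column" error stands in for failure detection, and a regex stands in for LLM-based extraction. `EXTRACTORS`, `missing_column`, and `query_with_evolution` are hypothetical names.

```python
import re
import sqlite3

# Source text is kept next to the metadata so new fields can be backfilled
# from stored documents instead of re-ingesting the corpus (assumed design).
DOCS = [
    ("Marie Curie", "Marie Curie, physicist and chemist, born 1867. Nobel prizes: 2."),
    ("Emmy Noether", "Emmy Noether, mathematician, born 1882. Nobel prizes: 0."),
    ("Albert Einstein", "Albert Einstein, physicist, born 1879. Nobel prizes: 1."),
]

def extract_nobel_prizes(text):
    """Regex stand-in for an LLM extraction call."""
    match = re.search(r"Nobel prizes: (\d+)", text)
    return int(match.group(1)) if match else None

# Candidate fields the system knows how to extract (hypothetical registry).
EXTRACTORS = {"nobel_prizes": extract_nobel_prizes}

def missing_column(error):
    """Pull the column name out of SQLite's 'no such column' message."""
    match = re.search(r"no such column: (\w+)", str(error))
    return match.group(1) if match else None

def query_with_evolution(conn, sql):
    """Run a query; on a missing field, extend the schema, backfill, retry."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as err:
        column = missing_column(err)
        if column not in EXTRACTORS:
            raise
        conn.execute(f"ALTER TABLE scientists ADD COLUMN {column} INTEGER")
        stored = conn.execute("SELECT name, doc FROM scientists").fetchall()
        for name, text in stored:
            conn.execute(
                f"UPDATE scientists SET {column} = ? WHERE name = ?",
                (EXTRACTORS[column](text), name),
            )
        return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scientists (name TEXT, doc TEXT)")
conn.executemany("INSERT INTO scientists VALUES (?, ?)", DOCS)

laureates = query_with_evolution(
    conn, "SELECT name FROM scientists WHERE nobel_prizes > 0 ORDER BY name"
)
print(laureates)
```

The first attempt fails because `nobel_prizes` does not exist yet. The retry succeeds after the column is added and filled from the stored text, so nothing is ingested a second time.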