GitHub Repository
9 stars · Python

L88 – A Local RAG System on 8GB VRAM (Need Architecture Feedback)

by adithyadrdo·Feb 24, 2026·3 points·0 comments

AI Analysis

●● Solid · Ship It · Niche Gem

Agentic RAG with a self-evaluator loop, but the evaluator and generator share one model due to VRAM constraints.

Strengths
  • LangGraph self-correction loop (Router → Analyzer → Rewriter → Retriever → Generator → Self-Evaluator) shows thoughtful multi-step reasoning design.
  • Full-stack implementation: React frontend, FastAPI backend, Ollama integration, FAISS+BGE retrieval—actually shipped and running.
  • Honest about limitations: author (age 18) transparently flags architectural debt (evaluator/generator collapse, needs feedback).
Weaknesses
  • A shared evaluator/generator model defeats the purpose of self-correction, a critical architectural flaw the author acknowledges.
  • Local RAG is crowded (Jan 2025): Ollama + LangChain, LM Studio, LlamaIndex, Continue IDE. There is no clear differentiation in retrieval or inference.
Target Audience

Researchers, data scientists, developers building local LLM applications with privacy requirements

Similar To

LM Studio · Ollama + LangChain · LlamaIndex local

Post Description

Hey everyone,

I’ve been working on a project called L88 — a local RAG system that I initially focused on UI/UX for, so the retrieval and model architecture still need proper refinement.

Repo: https://github.com/Hundred-Trillion/L88-Full

I’m running this on 8GB VRAM and a strong CPU (128GB RAM). Embeddings and preprocessing run on CPU, and the main model runs on GPU. One limitation I ran into is that my evaluator and generator LLM ended up being the same model due to compute constraints, which defeats the purpose of evaluation.
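To make the evaluator/generator split concrete, here is a minimal sketch of a generate, evaluate, rewrite loop in plain Python. It is not taken from the L88 repo and does not use LangGraph. The function `self_correcting_answer`, the `Attempt` record, the `threshold` and `max_rounds` parameters, and the `demo_*` stubs are all hypothetical names chosen for illustration. The point it shows is that the evaluator is passed in as its own callable, so it can be swapped for a smaller model, a different prompt, or a non-LLM check without touching the generator.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Attempt:
    query: str      # the (possibly rewritten) retrieval query used this round
    answer: str     # the generator's answer
    score: float    # the evaluator's score, higher is better


def self_correcting_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str, List[str]], float],
    rewrite: Callable[[str, str], str],
    threshold: float = 0.7,   # assumed cutoff, tune for your evaluator
    max_rounds: int = 3,      # assumed retry budget
) -> Attempt:
    """Retry retrieval and generation until the evaluator is satisfied."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    query = question
    best: Optional[Attempt] = None
    for _ in range(max_rounds):
        passages = retrieve(query)
        answer = generate(question, passages)
        score = evaluate(question, answer, passages)
        attempt = Attempt(query, answer, score)
        # Keep the best attempt so far in case no round clears the threshold.
        if best is None or attempt.score > best.score:
            best = attempt
        if score >= threshold:
            break
        query = rewrite(question, answer)
    assert best is not None
    return best


# Toy stand-ins so the loop can run without any model (all hypothetical).
def demo_retrieve(query: str) -> List[str]:
    corpus = {"vram": ["L88 runs the main model on an 8GB GPU."]}
    return [p for key, ps in corpus.items() if key in query.lower() for p in ps]


def demo_generate(question: str, passages: List[str]) -> str:
    return passages[0] if passages else "I don't know."


def demo_evaluate(question: str, answer: str, passages: List[str]) -> float:
    # Groundedness check: the answer must appear in the retrieved passages.
    return 1.0 if answer in passages else 0.0


def demo_rewrite(question: str, answer: str) -> str:
    return question + " vram"


if __name__ == "__main__":
    result = self_correcting_answer(
        "How much GPU memory does L88 need?",
        demo_retrieve, demo_generate, demo_evaluate, demo_rewrite,
    )
    print(result)
```

With the stubs above, the first round retrieves nothing and scores 0.0, the query is rewritten, and the second round clears the threshold. In a real setup the `evaluate` callable is where a separate, cheaper judge could go, for example a small CPU-hosted model or a rule-based groundedness check, so the GPU model is not grading its own output.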

I’d really appreciate feedback on:

  • Better architecture ideas for small-VRAM RAG
  • Splitting evaluator/generator roles effectively
  • Improving the LangGraph pipeline
  • Any bugs or design smells you notice
  • Ways to optimize the system for local hardware

I’m 18 and still learning a lot about proper LLM architecture, so any technical critique or suggestions would help me grow as a developer. If you check out the repo or leave feedback, it would mean a lot — I’m trying to build a solid foundation and reputation through real projects.

Thanks!
