Breathe-Memory – Associative memory injection for LLMs (not RAG)
Graph-based context compression beats lossy summarization when tokens run out.
An associative memory you can run anywhere. Write facts in plain language, recall them by meaning. No tables, no schema, no embeddings, no model required.
Stores 130× more facts per GiB than vector embeddings, but offers no semantic similarity matching.
For developers building local-first AI applications that need lightweight memory.
Chroma · LanceDB · Mem0
Removes the LLM from the memory CRUD path: sub-10 ms reads beat the 200-500 ms of Mem0 and Zep by design.
Compresses 28M tokens into 100k queryable characters, entirely locally, but reproduces RAG's problems at a smaller scale.
Biologically-inspired memory consolidation that prunes unused facts and strengthens associations overnight.
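To make the consolidation idea concrete, here is a minimal sketch of a prune-and-strengthen pass. It is not the project's implementation: the `ToyMemory` class, its method names, and the `min_uses` and `boost` parameters are all hypothetical, and a real system would presumably use a gentler decay than dropping every fact that went unused for one cycle.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class ToyMemory:
    """Hypothetical toy store: facts are nodes, co-recalls strengthen edges."""
    uses: dict = field(default_factory=dict)     # fact -> recalls since last pass
    weights: dict = field(default_factory=dict)  # frozenset({a, b}) -> strength
    pending: dict = field(default_factory=dict)  # frozenset({a, b}) -> co-recalls since last pass

    def write(self, fact):
        # New facts start with zero recalls.
        self.uses.setdefault(fact, 0)

    def recall(self, *facts):
        # Count each recalled fact, and each pair recalled together.
        for fact in facts:
            self.uses[fact] = self.uses.get(fact, 0) + 1
        for a, b in combinations(sorted(set(facts)), 2):
            key = frozenset((a, b))
            self.pending[key] = self.pending.get(key, 0) + 1

    def consolidate(self, min_uses=1, boost=0.5):
        # Prune facts recalled fewer than min_uses times since the last pass.
        kept = {f for f, n in self.uses.items() if n >= min_uses}
        merged = {}
        for key in set(self.weights) | set(self.pending):
            if key <= kept:  # keep an edge only if both endpoints survive
                merged[key] = (self.weights.get(key, 0.0)
                               + boost * self.pending.get(key, 0))
        self.weights = merged
        self.uses = {f: 0 for f in kept}  # reset counters for the next cycle
        self.pending = {}
        return kept
```

Calling `consolidate()` on a schedule (for example nightly) would give the "overnight" behavior the blurb describes: facts that were never recalled disappear, and pairs that were recalled together gain weight.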
Single-file mmap storage plus an HNSW vector index and explicit graph edges is an elegant, practical combination: think "SQLite for agent memory" with CRC-32 crash recovery and zero-server convenience. The C++20 core with nanobind bindings gives zero-copy NumPy views and GIL-free searches, and the claimed FAISS-like throughput makes this genuinely interesting for local setups. The main caveats are build and toolchain friction, and how rich the surrounding ecosystem becomes.
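As an illustration of how CRC-32 crash recovery in a single-file store can work, here is a minimal sketch. The record layout (4-byte length, 4-byte CRC-32, then payload) and the names `encode_record` and `recover` are assumptions made for this example, not the project's actual on-disk format. The idea is that after a crash the reader scans from the start and stops at the first record that is truncated or fails its checksum.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a non-empty payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover(buf):
    """Return (valid_records, valid_length) for a possibly torn log.

    `buf` can be bytes or any buffer that supports slicing, such as an
    mmap.mmap object over the storage file.
    """
    records, offset = [], 0
    while offset + HEADER.size <= len(buf):
        length, crc = HEADER.unpack_from(buf, offset)
        if length == 0:
            break  # zero-filled tail of a preallocated file
        start = offset + HEADER.size
        end = start + length
        if end > len(buf):
            break  # torn write: payload was cut off by the crash
        payload = bytes(buf[start:end])
        if zlib.crc32(payload) != crc:
            break  # corrupted record: stop trusting the file here
        records.append(payload)
        offset = end
    return records, offset
```

On startup, a store built this way could truncate the file to the returned `valid_length` and resume appending from there, so a half-written record never becomes visible to readers.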
Vector DBs store memories; this one forgets, consolidates, and flags contradictions like human memory.