MemReader: From Passive to Active Extraction for Long-Term Agent Memory
Active memory extraction with GRPO beats passive transcription on LOCOMO benchmarks.

Entity-centric memory cuts context 90% while matching full-text performance on NovelQA.
Researchers, RAG builders, teams processing long documents for QA systems
Retrieval-Augmented Generation (RAG) · Graph-based knowledge indexing · LlamaIndex entity extractors
NERDs (Networked Entity Representation Documents) are Wikipedia-style entity pages that LLM agents build for themselves by reading a large corpus chunk-by-chunk. Instead of reprocessing the full text at query time, a downstream agent searches and reasons over these entity documents.
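To make the chunk-by-chunk idea concrete, here is a minimal sketch, not the authors' implementation. It assumes fixed-size word chunks, and it replaces the LLM agent with a toy extractor that matches known names. All function names (`chunk_text`, `build_entity_docs`, `search_entity_docs`, `toy_extract`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical type: an extractor maps a chunk to (entity_name, note) pairs.
# In a real pipeline this role would be played by an LLM agent.
Extractor = Callable[[str], List[Tuple[str, str]]]


def chunk_text(text: str, chunk_size: int) -> Iterable[str]:
    """Split text into fixed-size word chunks (an assumed chunking rule)."""
    words = text.split()
    for start in range(0, len(words), chunk_size):
        yield " ".join(words[start:start + chunk_size])


def build_entity_docs(text: str, extract: Extractor,
                      chunk_size: int = 200) -> Dict[str, List[str]]:
    """Read the corpus once, appending each extracted note to its entity's page."""
    docs: Dict[str, List[str]] = defaultdict(list)
    for chunk in chunk_text(text, chunk_size):
        for entity, note in extract(chunk):
            docs[entity].append(note)
    return dict(docs)


def search_entity_docs(docs: Dict[str, List[str]], query: str) -> List[str]:
    """Return the names of entity pages mentioned in the query (naive matching)."""
    q = query.lower()
    return [name for name in docs if name.lower() in q]


def toy_extract(chunk: str) -> List[Tuple[str, str]]:
    """Stand-in for the LLM: treat a fixed set of names as entities."""
    known = ("Alice", "Bob")
    return [(name, chunk) for name in known if name in chunk]


if __name__ == "__main__":
    text = "Alice met Bob in Paris. Bob left for Rome."
    docs = build_entity_docs(text, toy_extract, chunk_size=5)
    for name in search_entity_docs(docs, "Where did Bob go?"):
        print(name, docs[name])
```

At query time only the matching entity pages are read, which is where the token savings would come from in a setup like this: the cost depends on the size of the pages retrieved, not on the length of the original text.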
The idea comes from a pattern that keeps showing up: brains, human cognition, knowledge bases, and transformer internals all organize complex information around entities and their relationships. NERDs apply that principle as a preprocessing step for long-context understanding.
We tested on NovelQA (86 novels, avg 200K+ tokens). On entity-tracking questions (characters, relationships, plot, settings), NERDs match full-context performance while using ~90% fewer tokens per question, and token usage stays flat regardless of document length. To highlight the method's limitations, we also tested it on counting tasks and on locating specific passages, which aren't entity-centered; it did not perform as well there.
nerdviewer.com lets you browse all the entity docs we generated across the 86 novels. Click through them like a fan-wiki. It's a good way to build intuition for what the agent produces.
Paper: https://www.techrxiv.org/users/1021468/articles/1381483-thin...
Bi-temporal validity and time-travel queries beat simple vector stores for agent memory.
Agent framework in a single C header file that actually runs local models offline.
Single-file SQLite memory for LLMs simplifies complex vector DBs for local workflows.
Hierarchical scopes let teams share code style rules across agents.
Persistent memory for coding agents when Cursor and Devin already dominate this space.