GitHub Repository

A structured belief revision backend for autonomous agents. It ingests natural language inputs, extracts and stores discrete beliefs, and manages them through a 15-phase agent pipeline covering confidence scoring, contradiction detection, salience decay, and evidence tracking.

4 stars · Python

ABES – a memory architecture for belief revision in AI agents

by bradkinnard · Mar 6, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Big Brain · Wizardry

15-phase belief scheduler with decay mechanics, but unproven in production agent pipelines.

Strengths
  • Explicit belief state model (confidence, salience, contradiction pressure, evidence) avoids RAG text-soup
  • 15-phase scheduler + reinforcement/decay mechanics show real design depth
  • 822 passing tests + 825/1000 cognitive eval suggest rigorous verification
Weaknesses
  • No evidence of deployment in real agent systems or commercial use
  • Contradiction handling and belief mutation mechanics lack published benchmarks against simpler baselines
Category
Target Audience

AI researchers, autonomous agent developers, LLM engineers building long-horizon systems

Similar To

LangChain memory systems · ChromaDB · Pinecone vector memory

Post Description

I’ve been building ABES (Adaptive Belief Ecology System), a memory architecture for AI agents based on the idea that memory should manage belief state over time, not only retrieve prior text.

The system models memory as structured beliefs with explicit state, including confidence, salience, contradiction pressure, lifecycle status, memory tier, evidence balance, lineage, user and session scope, and decay behavior. Beliefs can be reinforced, weakened, contested, updated, mutated, or deprecated as new evidence arrives.
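As an illustration only, the belief state described above could be represented roughly as follows. This is a minimal sketch, not the ABES schema: the class name `Belief`, the `Status` values, the field names, the step size, and the contest threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Hypothetical lifecycle states; the real project may use different ones.
    ACTIVE = "active"
    CONTESTED = "contested"
    DEPRECATED = "deprecated"


CONTEST_THRESHOLD = 0.5  # assumed value, chosen only for the example


@dataclass
class Belief:
    text: str
    user_id: str                      # user scope
    session_id: str                   # session scope
    confidence: float = 0.5           # kept within 0.0..1.0
    salience: float = 1.0
    contradiction_pressure: float = 0.0
    status: Status = Status.ACTIVE    # lifecycle status
    tier: str = "working"             # memory tier (label is hypothetical)
    evidence_for: int = 0
    evidence_against: int = 0
    parent_id: Optional[str] = None   # lineage: the belief this one derives from

    @property
    def evidence_balance(self) -> int:
        return self.evidence_for - self.evidence_against

    def add_evidence(self, supports: bool, step: float = 0.25) -> None:
        """Reinforce or weaken the belief as one piece of evidence arrives."""
        if supports:
            self.evidence_for += 1
            self.confidence = min(1.0, self.confidence + step)
        else:
            self.evidence_against += 1
            self.confidence = max(0.0, self.confidence - step)
            self.contradiction_pressure += step
        # Status follows the updated numbers.
        if self.confidence == 0.0:
            self.status = Status.DEPRECATED
        elif self.contradiction_pressure >= CONTEST_THRESHOLD:
            self.status = Status.CONTESTED
```

In this sketch a belief that receives two contradicting observations moves from active to contested, and one whose confidence reaches zero is deprecated; a plain retrieval store has no equivalent transition.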

The goal is to support longer-running agents that need to deal with stale information, conflicting information, confidence change, and belief revision, rather than only recalling similar prior content.

Current implementation includes a structured belief model, reinforcement and decay mechanics, contradiction handling, tiered memory behavior, session isolation, API support, Docker support, and testing/evaluation infrastructure.
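To make "reinforcement and decay mechanics" concrete, here is one common way such mechanics are written: exponential half-life decay plus a saturating boost. The post does not state which formulas ABES uses, so the function names, the half-life, and the boost factor below are assumptions for illustration.

```python
def decay_salience(salience: float, elapsed_s: float, half_life_s: float = 3600.0) -> float:
    """Halve salience every `half_life_s` seconds (assumed exponential decay)."""
    return salience * 0.5 ** (elapsed_s / half_life_s)


def reinforce(salience: float, boost: float = 0.5, cap: float = 1.0) -> float:
    """Move salience a fraction `boost` of the way toward `cap`.

    The boost shrinks as salience approaches the cap, so repeated
    reinforcement saturates instead of growing without bound.
    """
    return min(cap, salience + boost * (cap - salience))


if __name__ == "__main__":
    s = decay_salience(1.0, elapsed_s=7200.0)  # two half-lives
    print(s)              # 0.25
    print(reinforce(s))   # 0.625
```

With this shape, a belief that is never revisited fades smoothly, while one that keeps being confirmed stays near the cap.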

What has been verified so far in the project’s published tests and evals:

  • 822 passing tests
  • a 1,000-prompt evaluation with an overall score of 825/1000 (82.5%)
  • reported category scores of 96.8% episodic memory, 94.4% working memory, and 92.8% semantic memory
  • a 15-block side-by-side evaluation against a raw Ollama baseline, where ABES passed 14/15 blocks and the baseline passed 6/15
  • a 200-prompt cognitive stress test reported as 3 consecutive runs at 200/200

Two easy verification points:

  • run PYTHONPATH=$PWD pytest tests/ -q from the repo root
  • inspect results/side_by_side_eval.json for the block-level comparison output
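For the second verification point, a small helper like the one below can outline the JSON file before reading it in detail. The schema of results/side_by_side_eval.json is not described in the post, so this sketch deliberately assumes nothing about its keys and only prints the nesting and value types; the helper name `outline` is made up for the example.

```python
import json
from pathlib import Path


def outline(node, depth=0, max_depth=2):
    """Return indented lines describing dict keys and value types."""
    pad = "  " * depth
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            if depth < max_depth:
                lines.extend(outline(value, depth + 1, max_depth))
    elif isinstance(node, list) and node:
        # Describe only the first element; lists of results are usually uniform.
        lines.append(f"{pad}[{len(node)} items, first shown]")
        if depth < max_depth:
            lines.extend(outline(node[0], depth + 1, max_depth))
    return lines


if __name__ == "__main__":
    path = Path("results/side_by_side_eval.json")  # path taken from the post
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
        print("\n".join(outline(data)))
    else:
        print(f"{path} not found; run this from the repo root")
```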

I do not consider internal tests and project-published evals to be sufficient external validation. The next stages are stronger benchmarking, improved contradiction handling and belief revision, stronger temporal and relational structure, longer-horizon testing, multi-agent shared memory work, and better observability of belief transitions.

Similar Projects

AI/ML · ●● Solid

Membrane, revisable memory for long lived AI agents

Instead of an append-only log, this treats memory as typed records you can promote, supersede, fork, merge, or retract — with provenance and decay baked in. It exposes a gRPC API plus TypeScript and Python clients and adds trust/sensitivity filters, which makes the idea actually usable for persistent agents; I want to see more integration examples and scaling/eval numbers, but the core concept and implementation are smart and useful.

Big Brain · Niche Gem
GustyCube
103mo ago
AI/ML · ●● Solid

Experience-engine – reflection-based memory layer for local LLMs

Turns chat history into structured 'belief' and 'cognitive pattern' blocks you can inject into prompts, with simple APIs like run_reflection and run_synthesis that read like a research prototype. It's smart about separating V1 (domain beliefs) from V2 (transferable cognitive patterns), but it's clearly early-stage — tiny repo, Ollama-only workflow, and few commits mean you should treat it as an experimental MVP rather than a drop-in production memory system.

Big Brain · Niche Gem · Ship It
ashishluthara
313mo ago