HelixDB – A Graph Database built on Object-storage
Graph-vector-FTS in one database, but Weaviate and Neo4j already offer hybrid search.
Bundles vector, graph, and full-text search with local ML inference into one Go binary.
Backend engineers building RAG pipelines or search features
Weaviate · Elasticsearch · Qdrant
I built this to give developers a single-binary deployment with native ML inference (via a built-in service called Termite), meaning you don't need external API calls for vector search unless you want to use them.
Some things that might interest this crowd:
Capabilities: Multimodal indexing (images, audio, video), MongoDB-style in-place updates, and streaming RAG.
Distributed Systems: Multi-Raft setup built on etcd's library, backed by Pebble (CockroachDB's storage engine). Metadata and data shards get their own Raft groups.
Single Binary: antfly swarm gives you a single-process deployment with everything running. Good for local dev and small deployments. Scale out by adding nodes when you need to.
Ecosystem: Ships with a Kubernetes operator and an MCP server for LLM tool use.
Native ML inference: Antfly ships with Termite. Think of it like a built-in Ollama for non-generative models too (embeddings, reranking, chunking, text generation). No external API calls needed, but also supports them (OpenAI, Ollama, Bedrock, Gemini, etc.)
License: I went with Elastic License v2, not an OSI-approved license. I know that's a topic with strong feelings here. The practical upshot: you can use it, modify it, self-host it, build products on top of it, you just can't offer Antfly itself as a managed service. Felt like the right tradeoff for sustainability while still making the source available.
Happy to answer questions about the architecture, the Raft implementation, or anything else. Feedback welcome!
Graph-vector-FTS in one database, but Weaviate and Neo4j already offer hybrid search.
Single-file mmap storage plus an HNSW vector index and explicit graph edges is an elegant, practical combo — think "SQLite for agent memory" with CRC-32 crash recovery and zero-server convenience. The C++20 core + nanobind gives zero-copy NumPy views and GIL-free searches, and the claimed FAISS-like throughput makes this genuinely interesting for local setups; main caveat is build/toolchain friction and how rich the surrounding ecosystem becomes.
Replaces the pgvector plus AGE plus model server stack with a single binary and query.
Replaces PostgreSQL + Redis + Neo4j + ClickHouse with one Rust binary.
Graph RAG without Neo4j — pure vector search beats HippoRAG on multi-hop benchmarks.
ACT-R scoring and active forgetting beat standard vector similarity for agent context.