StreamHouse – S3-native Kafka alternative written in Rust
Removes broker disk complexity entirely—S3 as durable log cuts Kafka ops burden and cost dramatically.
Open-source event streaming platform built on S3. Kafka-compatible APIs, built-in SQL engine, schema registry: one Rust binary replaces Kafka + ZooKeeper + KSQL. Retention costs pennies, not thousands.
S3-native storage slashes Kafka costs from thousands to $23 per TB monthly.
Backend engineers, DevOps teams running event-driven architectures
Kafka · Redpanda · Pulsar
I built StreamHouse, an open-source streaming platform that replaces Kafka's broker-managed storage with direct S3 writes. The goal: same semantics, fraction of the cost.
How it works: Producers batch and compress records, a stateless server manages partition routing and metadata (SQLite for dev, PostgreSQL for prod), and segments land directly in S3. Consumers read from S3 with a local segment cache. No broker disks to manage, no replication factor to tune — S3 gives you 11 nines of durability out of the box.
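To make the write path above more concrete, here is a minimal sketch of the producer side: route a record to a partition, buffer records until a size threshold, then flush the batch as one segment under an S3-style object key. This is not StreamHouse's actual code. The `Record`, `Batcher`, `partition_for`, and `segment_key` names, the hash-based routing, and the `topic/partition/offset.seg` key layout are all assumptions made for illustration. Compression and the S3 upload itself are left out so the example stays standard-library only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical record type: an opaque key and value.
struct Record {
    key: Vec<u8>,
    value: Vec<u8>,
}

/// Route a record to a partition by hashing its key.
/// (Illustrative only; the real partitioner may differ.)
fn partition_for(key: &[u8], partitions: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % partitions as u64) as u32
}

/// Buffers records for one partition until `max_bytes` is reached.
struct Batcher {
    max_bytes: usize,
    buffered_bytes: usize,
    records: Vec<Record>,
    next_offset: u64,
}

impl Batcher {
    fn new(max_bytes: usize) -> Self {
        Batcher { max_bytes, buffered_bytes: 0, records: Vec::new(), next_offset: 0 }
    }

    /// Add a record; returns `Some((base_offset, batch))` once the
    /// buffered size reaches the threshold.
    fn push(&mut self, record: Record) -> Option<(u64, Vec<Record>)> {
        self.buffered_bytes += record.key.len() + record.value.len();
        self.records.push(record);
        if self.buffered_bytes >= self.max_bytes {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Hand back everything buffered so far and advance the offset counter.
    fn flush(&mut self) -> (u64, Vec<Record>) {
        let base_offset = self.next_offset;
        let batch = std::mem::take(&mut self.records);
        self.next_offset += batch.len() as u64;
        self.buffered_bytes = 0;
        (base_offset, batch)
    }
}

/// Assumed object-key layout: zero-padded base offsets sort lexicographically.
fn segment_key(topic: &str, partition: u32, base_offset: u64) -> String {
    format!("{}/{}/{:020}.seg", topic, partition, base_offset)
}

fn main() {
    let partitions = 4;
    let partition = partition_for(b"user-1", partitions);
    let mut batcher = Batcher::new(32);

    for i in 0..10 {
        let record = Record {
            key: b"user-1".to_vec(),
            value: format!("event-{}", i).into_bytes(),
        };
        if let Some((base, batch)) = batcher.push(record) {
            // A real producer would compress the batch and upload it here.
            println!("flush {} records -> {}", batch.len(), segment_key("orders", partition, base));
        }
    }
}
```

Zero-padding the base offset in the key means a plain prefix listing returns segments in offset order, which is one common way to let a consumer find the segment containing a given offset without a separate index.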
What's there today:
- Producer API with batching, LZ4 compression, and offset tracking (62K records/sec)
- Consumer API with consumer groups, auto-commit, and multi-partition fanout (30K+ records/sec)
- Kafka-compatible protocol (works with existing Kafka clients)
- REST API, gRPC API, CLI, and a web UI
- Docker Compose setup for trying it locally in 5 minutes
What's not there yet:
- Battle-tested production deployments (I'm the only user so far)
- Ready-made connectors to downstream systems (e.g., ClickHouse, Elasticsearch)
The cost model is what motivated this. Kafka's storage costs scale with replication factor × retention × volume. With S3 at $0.023/GB/month, storing a TB of events costs ~$23/month instead of hundreds of dollars on broker EBS volumes.

Written in Rust, 15 crates thus far. Apache 2.0 licensed.
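The arithmetic behind that comparison can be sketched as follows. The S3 price of $0.023/GB/month comes from the post. The replication factor of 3 and the block-storage price of $0.08/GB/month are assumptions chosen for illustration (a typical Kafka default and a common general-purpose SSD list price), and the sketch ignores request fees, disk headroom, and data transfer.

```rust
/// Monthly cost in USD of storing `tb` terabytes in object storage.
/// Uses 1 TB = 1000 GB, which matches the ~$23/TB figure.
fn s3_monthly_cost(tb: f64, price_per_gb: f64) -> f64 {
    tb * 1000.0 * price_per_gb
}

/// Monthly cost in USD of storing `tb` terabytes on broker disks,
/// where every byte is stored `replication_factor` times.
fn replicated_disk_monthly_cost(tb: f64, replication_factor: u32, price_per_gb: f64) -> f64 {
    tb * 1000.0 * replication_factor as f64 * price_per_gb
}

fn main() {
    let retained_tb = 1.0;

    // Price taken from the post.
    let s3 = s3_monthly_cost(retained_tb, 0.023);

    // Assumed values: replication factor 3, $0.08/GB/month block storage.
    let disks = replicated_disk_monthly_cost(retained_tb, 3, 0.08);

    println!("S3:               ${:.2}/month", s3);
    println!("Replicated disks: ${:.2}/month", disks);
    println!("Ratio:            {:.1}x", disks / s3);
}
```

Under these assumptions, 1 TB of retained events costs about $23/month in S3 and about $240/month on replicated disks, before accounting for the free space brokers usually need to keep.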
GitHub: https://github.com/gbram1/streamhouse
How it works (blog post on my main website): https://streamhouse.app/how-it-works
Happy to answer questions about the architecture, tradeoffs, or what I learned building this.
Stateless agents with S3 storage cut Kafka+Flink costs by 92%, but the confluent-kafka wire protocol still needs fixes.
S3-backed Kafka eliminates broker state management; inspired by Warpstream but open-source.
Wire-protocol parsing means zero Docker overhead for Kafka integration tests.