
What I learned running a crypto data pipeline at 120M messages/day

by Qalypto·Mar 4, 2026·4 points·1 comment


Been running this for 5 months now. 4 exchanges (Binance, Bybit, OKX, Bitget), 10 perpetuals, all on a single Hetzner box for 46 euro/month.

Stack: Python asyncio, Kafka in KRaft mode, ClickHouse, k3s. Cloudflare Tunnel handles ingress.

Some things that broke along the way:

ORDERBOOK GAPS Exchanges skip sequence numbers sometimes. Your local book drifts and you don't notice until something goes wrong. Had to build per-symbol gap detection with automatic snapshot recovery. Each exchange does sequencing differently, so that's four separate implementations.
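
A minimal sketch of per-symbol gap detection, not the actual implementation. It assumes updates carry a single integer sequence number that increases by exactly 1, which is a simplification: real exchanges each use their own scheme. GapDetector, Book and fetch_snapshot are hypothetical names; fetch_snapshot stands in for whatever REST snapshot call an exchange provides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Levels = Dict[float, float]  # price -> size; size 0 means "remove this level"
Snapshot = Tuple[int, Levels, Levels]  # (sequence number, bids, asks)


@dataclass
class Book:
    last_seq: Optional[int] = None  # None until the first snapshot is loaded
    bids: Levels = field(default_factory=dict)
    asks: Levels = field(default_factory=dict)


class GapDetector:
    """Tracks one book per symbol and resyncs from a snapshot on any gap."""

    def __init__(self, fetch_snapshot: Callable[[str], Snapshot]) -> None:
        self._fetch_snapshot = fetch_snapshot  # hypothetical snapshot source
        self._books: Dict[str, Book] = {}
        self.gaps = 0  # number of sequence gaps detected so far

    def book(self, symbol: str) -> Book:
        return self._books.setdefault(symbol, Book())

    def _load_snapshot(self, symbol: str, book: Book) -> None:
        seq, bids, asks = self._fetch_snapshot(symbol)
        book.last_seq, book.bids, book.asks = seq, dict(bids), dict(asks)

    def on_update(self, symbol: str, seq: int, bids: Levels, asks: Levels) -> str:
        book = self.book(symbol)
        if book.last_seq is None:
            self._load_snapshot(symbol, book)  # first message: bootstrap the book
        if seq <= book.last_seq:
            return "stale"  # already covered by the book, ignore
        if seq != book.last_seq + 1:
            # Gap: throw the local book away and rebuild it from a snapshot.
            # The update that exposed the gap is dropped; later updates are
            # checked against the snapshot's sequence number as usual.
            self.gaps += 1
            self._load_snapshot(symbol, book)
            return "resynced"
        for side, levels in ((book.bids, bids), (book.asks, asks)):
            for price, size in levels.items():
                if size == 0:
                    side.pop(price, None)
                else:
                    side[price] = size
        book.last_seq = seq
        return "applied"
```

Counting gaps per exchange and symbol is cheap and makes drift visible before something downstream goes wrong.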

CLICKHOUSE INSERTS Started with small batches, and ClickHouse was at 30% CPU just doing merges. Bumped the batch size to 5000 rows with 2-second intervals and CPU dropped to 8%. Also moved inserts to an async queue so the Kafka consumer never blocks.
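
Roughly what the batching could look like with asyncio, as a sketch under assumptions rather than the real code. The consumer side only puts rows on an asyncio.Queue; a writer task flushes when it has 5000 rows or when 2 seconds have passed since the first row of the batch. batch_writer and insert_batch are hypothetical names; insert_batch stands in for whatever async call actually writes to ClickHouse, and None is used here as a shutdown marker.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple

Row = Tuple  # placeholder row type for this sketch


async def batch_writer(
    queue: "asyncio.Queue[Optional[Row]]",
    insert_batch: Callable[[List[Row]], Awaitable[None]],  # hypothetical sink
    max_rows: int = 5000,
    max_wait: float = 2.0,
) -> None:
    """Drain the queue, flushing on size or age, until None is received."""
    loop = asyncio.get_running_loop()
    batch: List[Row] = []
    deadline = 0.0  # only meaningful while batch is non-empty
    while True:
        try:
            if batch:
                # Wait for more rows, but only until the batch gets too old.
                remaining = max(0.0, deadline - loop.time())
                row = await asyncio.wait_for(queue.get(), remaining)
            else:
                row = await queue.get()  # nothing pending, wait indefinitely
        except asyncio.TimeoutError:
            await insert_batch(batch)  # time-based flush
            batch = []
            continue
        if row is None:  # shutdown marker: flush what is left and stop
            if batch:
                await insert_batch(batch)
            return
        if not batch:
            deadline = loop.time() + max_wait  # clock starts at the first row
        batch.append(row)
        if len(batch) >= max_rows:
            await insert_batch(batch)  # size-based flush
            batch = []
```

Run it with asyncio.create_task(batch_writer(queue, insert_batch)) and have the consumer call queue.put_nowait(row). Giving the queue a maxsize would add backpressure if the sink stalls, at the cost of the consumer possibly having to wait or drop rows.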

LOGGING At 500 msg/s the logger was allocating thousands of strings per second. OOM killer got us twice before I figured it out. Set everything on the data path to WARNING and it went away.
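
A small sketch of the logging side, with one added assumption: that part of the allocation came from building log strings eagerly. The fix described above is only the level change; the lazy %-style formatting shown here is a common companion to it. The logger name and CountingPayload are made up for the demo.

```python
import logging

# Hypothetical logger name for the hot data path.
log = logging.getLogger("pipeline.datapath")
log.setLevel(logging.WARNING)  # DEBUG and INFO calls now return early


class CountingPayload:
    """Demo object that counts how often it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "payload"


def log_eager(payload: CountingPayload) -> None:
    # The f-string is built before log.debug() runs, so the string is
    # allocated even though the record is then discarded.
    log.debug(f"got message {payload}")


def log_lazy(payload: CountingPayload) -> None:
    # With %-style arguments the logger formats only if the level is enabled,
    # so at WARNING nothing is rendered for this call.
    log.debug("got message %s", payload)
```

For anything genuinely expensive to compute, wrapping the call in `if log.isEnabledFor(logging.DEBUG):` skips the work entirely.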

Current numbers:
- 120M+ messages/day
- Latency: P50 250ms, P95 400ms
- >99.8% data coverage
- 5 months, no major incidents

If anyone wants to poke around the data:

qalypto.com/data-lab

CSV samples, no signup needed.

Happy to answer questions.
