Agent Caching in Fiddler
Proxy-level LLM caching saves tokens during dev without instrumenting your code.

Wire-protocol transparent pooling and edge caching vs Hyperdrive's Cloudflare-only lock-in.
Backend engineers, SaaS platforms with global users, serverless applications
Hyperdrive · Prisma Accelerate · PgBouncer
- PgBouncer pools connections but doesn't cache and runs in a single region.
- Hyperdrive does both but only works from Cloudflare Workers.
- Prisma Accelerate requires the Prisma ORM.
PgBeam is a PostgreSQL proxy that speaks the wire protocol natively. You only change one environment variable:
Before: postgresql://user:[email protected]:5432/postgres
After: postgresql://user:[email protected]:5432/postgres
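The one-variable change can be sketched as a small helper that swaps only the host in a connection string and leaves the credentials, port, and database name alone. This is a minimal sketch: the function name `point_at_proxy` and both hostnames are hypothetical placeholders, not values from PgBeam.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_proxy(database_url: str, proxy_host: str) -> str:
    """Return database_url with its host replaced by proxy_host.

    Credentials, port, path (database name), and query parameters are kept verbatim.
    """
    parts = urlsplit(database_url)
    # Keep the "user:password" portion exactly as written, including any escaping.
    userinfo, at, _old_hostport = parts.netloc.rpartition("@")
    new_hostport = f"{proxy_host}:{parts.port}" if parts.port else proxy_host
    return urlunsplit(parts._replace(netloc=f"{userinfo}{at}{new_hostport}"))


# Hypothetical hosts, for illustration only.
direct = "postgresql://user:secret@db.example.com:5432/postgres"
proxied = point_at_proxy(direct, "proxy.example.net")
print(proxied)
```

In practice the result would be written back to whatever environment variable the application already reads (often `DATABASE_URL`, though that name is an assumption here).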
Three things happen:
1. Routing: GeoDNS points to the nearest proxy (6 regions today)
2. Connection pooling: Warm upstream connections, no TLS/auth cost per query
3. Query caching: SELECTs cached at the edge with stale-while-revalidate. Writes, transactions, and volatile functions like NOW() or RANDOM() are never cached.
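The caching rule in step 3 can be illustrated with a rough sketch. It assumes a regex-based eligibility check and an in-memory store; `is_cacheable`, `SwrCache`, the `ttl` and `stale_window` parameters, and the two-entry volatile-function list are all hypothetical and are not PgBeam's implementation. A real proxy would classify queries from the parsed wire-protocol messages and refresh stale entries in the background.

```python
import re
import time
from typing import Callable, Dict, Tuple

# Assumption: only the two volatile functions named in the text are listed here.
VOLATILE_CALL = re.compile(r"\b(now|random)\s*\(", re.IGNORECASE)
SELECT_PREFIX = re.compile(r"\s*select\b", re.IGNORECASE)


def is_cacheable(sql: str, in_transaction: bool = False) -> bool:
    """Rough check: a SELECT outside a transaction with no volatile call."""
    if in_transaction or not SELECT_PREFIX.match(sql):
        return False
    return VOLATILE_CALL.search(sql) is None


class SwrCache:
    """Tiny stale-while-revalidate cache keyed by SQL text."""

    def __init__(self, ttl: float, stale_window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.stale_window = stale_window
        self.clock = clock
        self.entries: Dict[str, Tuple[float, object]] = {}

    def get(self, sql: str, fetch: Callable[[str], object]):
        """Return (rows, state), where state is 'fresh', 'stale', or 'miss'."""
        now = self.clock()
        entry = self.entries.get(sql)
        if entry is not None:
            stored_at, rows = entry
            age = now - stored_at
            if age <= self.ttl:
                return rows, "fresh"
            if age <= self.ttl + self.stale_window:
                # Hand back the stale rows and refresh the entry.
                # The refresh runs inline here only to keep the sketch short.
                self.entries[sql] = (now, fetch(sql))
                return rows, "stale"
        rows = fetch(sql)
        self.entries[sql] = (now, rows)
        return rows, "miss"
```

A caller would consult `is_cacheable` first and send anything that fails the check straight upstream, so writes and transactional statements never touch the cache.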
A live benchmark at https://pgbeam.com/benchmark runs real TLS PostgreSQL connections from 20 global regions and compares direct connections against PgBeam, both cached and uncached. No synthetic data.
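For readers who want to run a similar comparison against their own database, a minimal timing harness might look like the sketch below. It is not the benchmark's code: `median_latency_ms` is a hypothetical helper, and the callable passed in is assumed to open a connection (direct or via the proxy) and run one query.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(run_query: Callable[[], object], runs: int = 20) -> float:
    """Call run_query `runs` times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without a database.
baseline = median_latency_ms(lambda: time.sleep(0.01), runs=5)
```

Running it once with a direct connection and once through the proxy, from the same machine, gives two medians that can be compared side by side.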
This is a technical preview, offered as a private beta to design partners and early customers before the infrastructure is scaled up. Feedback is welcome!
RDMA-backed distributed KV cache cuts prefill latency 3.1× where vLLM's built-in caching maxes out.
Zero-code sharding proxy with cross-shard aggregates, in production and serving millions of QPS today.
Browser-based latency tester, but speedtest.net and similar tools already exist.
Academic paper on TTFT optimization with no implementation to evaluate.
DDSketch learns latency thresholds automatically, no manual tuning like Finagle or Linkerd.