GitHub Repository

Transparent, transport-layer semantic cache for LLM API calls, powered by Redis 8 Vector Sets.

2 stars · Python

Khazad – Transparent Semantic Cache for LLM Calls on Redis Vector Sets

by guglielmoce·Jun 29, 2026·3 points·0 comments

AI Analysis

●● Solid · Big Brain · Slick

Transport-layer interception beats GPTCache with zero code changes required.

Strengths
  • Model-aware caching prevents gpt-4o answers from being served to gpt-4o-mini requests
  • Conversation-aware embedding uses full message history, not just last turn
  • Streaming support captures responses chunk by chunk for both sync and async clients
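
The first two strengths can be illustrated with a minimal sketch. This is not Khazad's actual code or API: `SemanticCache`, `embed`, and `conversation_text` are hypothetical names, the bag-of-words `embed` stands in for a real embedding model, and an in-memory dict stands in for Redis Vector Sets. It only shows the idea of keeping one partition per model and embedding the full message history.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def conversation_text(messages):
    """Flatten the whole history, not just the last turn, before embedding."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


class SemanticCache:
    """Hypothetical in-memory cache, partitioned by model name."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = {}  # model name -> list of (vector, response)

    def put(self, model, messages, response):
        vector = embed(conversation_text(messages))
        self.entries.setdefault(model, []).append((vector, response))

    def get(self, model, messages):
        query = embed(conversation_text(messages))
        best_score, best_response = 0.0, None
        # Only entries stored under the same model are candidates.
        for vector, response in self.entries.get(model, []):
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None
```

In this sketch, a response stored with `put("gpt-4o", ...)` is never returned by `get("gpt-4o-mini", ...)`, because the lookup only scans the partition for the requested model.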
Weaknesses
  • Semantic caching can serve wrong answers for nuanced queries that closely resemble a cached one
  • Redis 8 Vector Sets requirement limits deployment options
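
The first weakness comes down to the choice of similarity threshold. The sketch below is illustrative only: the bag-of-words `embed`, the `is_hit` helper, and the threshold values 0.85 and 0.95 are assumptions for the example, not Khazad's defaults. It shows two prompts with opposite intent that score as near-duplicates.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_hit(similarity, threshold):
    """A cached answer is served only when similarity reaches the threshold."""
    return similarity >= threshold


# Two prompts that differ by one word but ask for opposite things.
cached_prompt = "How do I increase the timeout in Redis?"
new_prompt = "How do I decrease the timeout in Redis?"
similarity = cosine(embed(cached_prompt), embed(new_prompt))

loose_hit = is_hit(similarity, 0.85)   # a loose threshold serves the cached answer
strict_hit = is_hit(similarity, 0.95)  # a strict threshold falls through to the LLM
```

The two prompts share seven of their eight tokens, so the toy similarity is about 0.875: the loose threshold returns the cached "increase" answer for the "decrease" question, while the strict one misses and lowers the hit rate.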
Target Audience

Developers building LLM-powered applications with high-volume repetitive queries

Similar To

GPTCache · CacheLLM · LLMCache
