LLMxRay: an open-source observability tool for LLMs
Multilingual tokenization comparison across Arabic, Chinese, and French, which LangSmith ignores.
The identity layer for AI agents. So your agent stays itself, forever.
Cross-model prompt calibration using actual research, not just API chaining.
Developers working with multiple LLM providers or migrating prompts across models
PromptPerfect · DSPy · LangChain prompt management
A transfer engine that learns a mapping between model behaviors using source/target prompt pairs.
A MAP-RPE evolutionary loop that iteratively improves candidates against a scoring function until behavioral parity is reached.
Works fully local via Ollama. Also supports OpenRouter for cross-hosted runs. No telemetry, no cloud dependency. Built with Python, Typer, Pydantic. Happy to go deep on the calibration algorithm or the tradeoffs in the scoring design.
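The evolutionary loop described above can be illustrated with a generic elitist search. This is only a sketch under stated assumptions, not the tool's MAP-RPE algorithm: `evolve`, `toy_score`, and `toy_mutate` are hypothetical names, and the scoring function here is a stand-in that counts required keywords, where a real calibration run would presumably compare target-model output against source-model output.

```python
import random
from typing import Callable, List, Tuple


def evolve(
    seeds: List[str],
    mutate: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    threshold: float = 1.0,
    generations: int = 50,
    population: int = 8,
    seed: int = 0,
) -> Tuple[str, float]:
    """Generic elitist loop: mutate candidates, keep the top scorers,
    and stop once the best score reaches the parity threshold."""
    rng = random.Random(seed)
    pool = list(seeds)
    best = max(pool, key=score)
    for _ in range(generations):
        if score(best) >= threshold:
            break
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # De-duplicate while preserving order, then keep the best candidates.
        merged = list(dict.fromkeys(pool + children))
        pool = sorted(merged, key=score, reverse=True)[:population]
        best = pool[0]
    return best, score(best)


# Toy stand-ins (assumptions): "parity" means the prompt mentions every trait.
REQUIRED = ("concise", "json", "cite")
VOCAB = REQUIRED + ("friendly", "verbose", "formal")


def toy_score(prompt: str) -> float:
    words = prompt.lower().split()
    return sum(trait in words for trait in REQUIRED) / len(REQUIRED)


def toy_mutate(prompt: str, rng: random.Random) -> str:
    return f"{prompt} {rng.choice(VOCAB)}"


start = "Answer the question."
best_prompt, best_score = evolve([start], toy_mutate, toy_score)
print(best_score, best_prompt)
```

Because the previous pool is always merged back in before selection, the best score never decreases from one generation to the next.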
Three-line wrapper cuts LLM costs 80%+ via prompt classification and same-provider routing.
Local semantic caching cuts LLM costs without changing your code.
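For readers unfamiliar with semantic caching, here is a minimal sketch of the idea, not this product's implementation. `SemanticCache`, `cached_call`, and `fake_llm` are hypothetical names, and the bag-of-words `embed` function is a deliberately crude stand-in for a real embedding model.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Tuple


def embed(text: str) -> Counter:
    # Stand-in embedding (assumption): word counts instead of a learned vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.entries: List[Tuple[Counter, str]] = []

    def get(self, prompt: str) -> Optional[str]:
        """Return the stored response of the most similar prompt, if close enough."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))


def cached_call(prompt: str, call_llm: Callable[[str], str], cache: SemanticCache) -> str:
    hit = cache.get(prompt)
    if hit is not None:
        return hit  # near-duplicate prompt: skip the paid API call
    response = call_llm(prompt)
    cache.put(prompt, response)
    return response


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = SemanticCache()
first = cached_call("What is the capital of France", fake_llm, cache)
second = cached_call("what is the capital of France please", fake_llm, cache)
third = cached_call("How do I sort a list in Python", fake_llm, cache)
```

In this run the second prompt is served from the cache and only the first and third reach `fake_llm`. The similarity threshold is the main tradeoff: set it too low and unrelated prompts receive stale answers, too high and the cache rarely hits.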
Prompt compression cuts token costs 40-60%, but it's lossless text optimization, not a novel insight.
If you're burning through Claude/OpenAI credits, this is a low-friction stopgap: it classifies prompts in ~10ms and routes trivial tasks to cheaper/local models while reserving premium APIs for complex work. The agentic-task detection, reasoning-aware routing, session pinning and context-window fallback are practical touches that avoid mid-thread model bouncing and 429 failures. It isn't reinventing the space (OpenRouter and others exist), but it's focused on real-world cost tradeoffs and drop-in compatibility.
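The routing behavior described above (classification, session pinning, and context-window fallback) can be sketched roughly as follows. Everything here is an assumption for illustration: the model names, the keyword heuristic in `classify`, and the four-characters-per-token estimate are hypothetical and are not taken from the tool.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Model:
    name: str
    context_window: int  # in tokens


# Hypothetical tiers; a real setup would list actual provider models.
CHEAP = Model("local-small", 8_000)
PREMIUM = Model("premium-large", 200_000)

COMPLEX_HINTS = ("prove", "refactor", "step by step", "debug", "design")


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about four characters per token.
    return max(1, len(text) // 4)


def classify(prompt: str) -> str:
    lowered = prompt.lower()
    if len(prompt) > 2000 or any(hint in lowered for hint in COMPLEX_HINTS):
        return "complex"
    return "trivial"


@dataclass
class Router:
    pinned: Dict[str, Model] = field(default_factory=dict)

    def route(self, session_id: str, prompt: str, history_tokens: int = 0) -> Model:
        # Session pinning: reuse the session's model to avoid mid-thread bouncing.
        model = self.pinned.get(session_id)
        if model is None:
            model = PREMIUM if classify(prompt) == "complex" else CHEAP
        # Context-window fallback: move up when the conversation no longer fits.
        if history_tokens + estimate_tokens(prompt) > model.context_window:
            model = PREMIUM
        self.pinned[session_id] = model
        return model


router = Router()
a = router.route("s1", "What is 2+2?")
b = router.route("s1", "Now refactor this module")
c = router.route("s2", "Debug this stack trace step by step")
d = router.route("s1", "hi", history_tokens=9_000)
```

Here the second call stays on the cheap model because session `s1` is already pinned, and the fourth call moves `s1` to the larger model only because the accumulated history no longer fits the smaller context window.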
Tackles persona collapse with architecture, but lacks proof-of-concept or working implementation.