GitHub Repository

LLM Evaluation for Phoenix Apps

34 stars · Elixir

Aludel – LLM eval workbench for Phoenix apps

by wood-archer·Mar 30, 2026·2 points·0 comments

AI Analysis

●● Solid · Niche Gem · Ship It

For Elixir teams, running LLM evals inside an embedded Phoenix LiveView beats switching to a separate tool such as LangSmith.

Strengths
  • LiveView embedding means zero context-switching for Phoenix devs
  • Immutable prompt versioning with performance tracking over time
  • Multi-provider cost and latency comparison in real time
Weaknesses
  • Elixir-only limits audience in a Python-dominated LLM tooling space
  • LLM eval is crowded with LangSmith, Arize, and Helicone already established
Target Audience

Elixir/Phoenix developers building LLM-powered applications

Similar To

LangSmith · Arize Phoenix · Helicone

Similar Projects

AI/ML · ●● Solid

Quoracle: Self-replicating multi-LLM-consensus agents (Elixir)

Quoracle actually does something interesting: it queries a pool of models and only executes actions they agree on, while letting agents spawn children and persist full state to Postgres — all visible in a LiveView dashboard. The per-model conversation history, recursive hierarchy, and explicit consensus pipeline are clever touches; it’s clearly aimed at experimentation rather than drop-in production use (the README even flags security and deployment caveats).

Wizardry · Niche Gem
shelvick
213mo ago
AI/ML · ●● Solid

Quoracle, a recursive consensus-based multi-agent orchestrator (Elixir)

Quoracle forces you to stop trusting one model and instead runs every decision through an explicit consensus pipeline, with per-model conversation history persisted to Postgres and a LiveView dashboard for real-time inspection. Agents can spawn children recursively and communicate via messages, which makes it a neat sandbox for studying emergent behaviors or building robust multi-model workflows. It is heavy, opinionated, and clearly aimed at people who want to experiment rather than ship a lightweight chatbot.

Wizardry · Niche Gem
shelvick
114mo ago
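
The consensus idea in the Quoracle descriptions above (query a pool of models and act only on what they agree on) can be illustrated with a minimal sketch. This is not Quoracle's implementation, and it is written in Python rather than Elixir purely for illustration. The `Model` type, the `consensus_action` function, and the `threshold` parameter are all hypothetical names; each "model" is assumed to be a plain callable that maps a prompt to a proposed action string.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for a model client: takes a prompt and
# returns the action that model proposes, as a string.
Model = Callable[[str], str]


def consensus_action(
    models: Sequence[Model],
    prompt: str,
    threshold: float = 1.0,
) -> Optional[str]:
    """Return the most-proposed action if its share of votes meets
    `threshold` (1.0 means unanimous agreement); otherwise None."""
    if not models:
        return None
    # Ask every model in the pool and tally the proposed actions.
    votes = Counter(model(prompt) for model in models)
    action, count = votes.most_common(1)[0]
    # Only act when enough of the pool agrees.
    return action if count / len(models) >= threshold else None


# Usage with stubbed models (real clients would call an LLM API).
pool = [lambda p: "reply", lambda p: "reply", lambda p: "escalate"]
unanimous = consensus_action(pool, "How should we respond?")
majority = consensus_action(pool, "How should we respond?", threshold=0.6)
```

With the stubbed pool, the unanimous call yields no action because one model disagrees, while the 0.6 threshold accepts the two-of-three majority. A real pipeline would also need to normalize model outputs before comparing them, which this sketch leaves out.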