Aludel – LLM eval workbench for Phoenix apps

by wood-archer·Mar 30, 2026·2 points·0 comments

AI Analysis

●● Solid · Niche Gem · Ship It

Phoenix LiveView embedding beats switching to LangSmith for Elixir teams.

Strengths

• LiveView embedding means zero context-switching for Phoenix devs
• Immutable prompt versioning with performance tracking over time
• Multi-provider cost and latency comparison in real time

Weaknesses

• Elixir-only limits audience in a Python-dominated LLM tooling space
• LLM eval is crowded with LangSmith, Arize, and Helicone already established

Similar Projects

Developer Tools · ●● Solid

A2UI for Elixir/Phoenix/LiveView

Renders AI agent JSONL as LiveView components, but the protocol is still v0.9.

Niche Gem · Ship It

maxekman

204mo ago

Developer Tools · ●●● Banger

Deterministic, Replayable Demos for Phoenix LiveView

The runtime performs the clicks for you instead of waiting for user input, as shepherd.js does.

Big Brain · Niche Gem

ralmidani

302mo ago

AI/ML · ●● Solid

Quoracle: Self-replicating multi-LLM-consensus agents (Elixir)

Quoracle actually does something interesting: it queries a pool of models and only executes actions they agree on, while letting agents spawn children and persist full state to Postgres — all visible in a LiveView dashboard. The per-model conversation history, recursive hierarchy, and explicit consensus pipeline are clever touches; it’s clearly aimed at experimentation rather than drop-in production use (the README even flags security and deployment caveats).
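The consensus idea described above can be illustrated with a minimal Python sketch. This is not Quoracle's code (the project is written in Elixir) and it does not reflect its actual API; `consensus_action`, the `Model` type, and the `quorum` parameter are hypothetical names chosen for illustration. It only shows the general pattern of asking several models for an action and acting only when enough of them agree.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical type: a "model" is any callable that maps a prompt
# to a proposed action name. Real model clients would wrap an API call.
Model = Callable[[str], str]


def consensus_action(
    models: Sequence[Model], prompt: str, quorum: float = 1.0
) -> Optional[str]:
    """Return the most common proposed action if its share of votes
    reaches `quorum` (1.0 means unanimous); otherwise return None."""
    if not models:
        return None
    # Ask every model in the pool for its proposed action.
    proposals = [model(prompt) for model in models]
    # Pick the most frequently proposed action and count its votes.
    action, votes = Counter(proposals).most_common(1)[0]
    # Only hand the action back when agreement is strong enough.
    return action if votes / len(proposals) >= quorum else None


if __name__ == "__main__":
    # Stand-in models that always propose a fixed action.
    pool = [lambda p: "read_file", lambda p: "read_file", lambda p: "delete_file"]
    print(consensus_action(pool, "What next?"))              # no unanimous agreement
    print(consensus_action(pool, "What next?", quorum=0.6))  # two of three agree
```

With the default `quorum` of 1.0 a single dissenting model blocks the action, which matches the "only executes actions they agree on" description in spirit; a lower threshold trades safety for progress.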

Wizardry · Niche Gem

shelvick

215mo ago

AI/ML · ●● Solid

Quoracle, a recursive consensus-based multi-agent orchestrator (Elixir)

Quoracle forces you to stop trusting one model and instead runs every decision through an explicit consensus pipeline, with per-model conversation history persisted to Postgres and a LiveView dashboard for realtime inspection. Agents can spawn children recursively and communicate via messages, which makes it a neat sandbox for studying emergent behaviors or building robust multi-model workflows — heavy, opinionated, and clearly aimed at folks who want to experiment rather than ship a lightweight chatbot.

Wizardry · Niche Gem

shelvick

115mo ago

AI/ML · ●● Solid

LLMadness – March Madness Model Evals

Claude Opus cost $59.55 versus $0.39 for MiMo-Flash for identical bracket predictions.

Dark Horse · Big Brain

rjkeck2

524mo ago

Developer Tools · ●● Solid

AI-Evals.io – Evaluate this site with the tools it reviews

LLM evaluation guide eats its own dogfood with eval-based site design.

Solve My Problem · Big Brain

alexhans

505mo ago