Back to browse
GitHub Repository

EmbedAudit CLI for auditing embedding spaces. Runs neighborhood consistency checks, drift detection, intrinsic dimensionality, and outlier analysis across sentence-transformers and OpenAI embeddings with HTML reports.

5 starsPython

CLI tool to analyze your Vector Embeddings!

by gauravvij137·Feb 20, 2026·2 points·1 comment

AI Analysis

MidShip It

Embedding auditor with 5 checks and pretty plots, but crowded niche with unclear novelty.

Strengths
  • Five semantic checks cover real failure modes: outliers, boundary ambiguity, polarity mismatches, global collapse
  • Multiple input formats and flexible clustering (HDBSCAN, KMeans) add utility for different workflows
  • Built by NEO agent: nice marketing angle, but the actual code is straightforward ML pipeline composition
Weaknesses
  • Orchestrates existing libraries (UMAP, HDBSCAN, HuggingFace) without novel technique; no architectural insight
  • Embedding analysis tools already exist (Nucleus, embedding comparison in LLMOps platforms); not clear why this one wins
Target Audience

ML engineers, NLP researchers, anyone validating embedding quality before production

Similar To

Nucleus Semantic Search · Embedding validators in Weights & Biases · Custom embedding audit scripts in pandas/scikit-learn

Similar Projects