SIB-ENGINE: Pre-emptive hallucination detection via geometric structure
Monitors internal latent collapse before tokens are sampled, not output semantics.
Real-time hallucination detection for LLMs via Geometric Drift Analysis in Hidden States.
Detects hallucinations from latent-space geometry rather than text analysis, but at a 54% detection rate it still misses close to half of them.
LLM researchers, inference engineers, and developers building safety-critical AI systems on consumer hardware.
Anthropic's Constitutional AI (safety monitoring) · Hugging Face SafetyChecker · Together AI's GuardRail
KEY RESULTS (Gemma-2B, N=1000):
• 54% hallucination detection with 7% false positive rate
• <1% computational overhead (runs on RTX 3050 with 4GB VRAM)
• ROC-AUC: 0.8995
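For anyone who wants to recompute numbers like these from their own logs, here is a minimal sketch of how a detection rate, a false positive rate, and a ROC-AUC can be derived from per-sample scores and ground-truth labels. The function names, the toy data, and the threshold are illustrative assumptions; none of them come from the SIB-ENGINE code or from raw_logs.csv.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (detection_rate, false_positive_rate) when flagging score >= threshold.

    labels: 1 = hallucination, 0 = correct output (hypothetical encoding).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg


def roc_auc(scores, labels):
    """ROC-AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration.
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1, 0.4, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(rates_at_threshold(toy_scores, toy_labels, threshold=0.75))
print(roc_auc(toy_scores, toy_labels))
```

The pairwise AUC is quadratic in the number of samples, which is fine at N=1000 but would be worth replacing with a rank-based version on much larger logs.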
WHY IT'S DIFFERENT: Traditional methods analyze the output text semantically. SIB-ENGINE instead monitors "geometric drift" in hidden states during generation, identifying the structural collapse of the latent space before the first incorrect token is sampled.
This approach offers unique advantages:
• Real-time intervention: Stop generation mid-stream
• Language-agnostic: No semantic analysis needed
• Privacy-preserving: Never reads the actual content
• Extremely lightweight: Works on consumer hardware
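To make "stop generation mid-stream" concrete, here is a rough sketch of what an intervention loop could look like. It assumes each decoding step produces a drift score before the token is emitted; the `guarded_stream` name, the `(token, score)` pairing, and the threshold value are all hypothetical and not part of the SIB-ENGINE API.

```python
def guarded_stream(steps, threshold):
    """Yield tokens until a step's drift score exceeds the threshold.

    steps: iterable of (token, drift_score) pairs, where the score is assumed
    to be available before the token is emitted (hypothetical interface).
    """
    for token, score in steps:
        if score > threshold:
            return  # halt before emitting the suspect token
        yield token


# Invented scores for illustration only.
steps = [("The", 0.1), ("capital", 0.2), ("is", 0.9), ("Paris", 0.1)]
print(list(guarded_stream(steps, threshold=0.5)))
```

A production version would probably want to report why it stopped, or fall back to a retry or a refusal, rather than silently truncating.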
HOW IT WORKS: SIB-ENGINE monitors the internal stability of the model's computation. The system combines multiple structural signals to detect instability; the two primary indicators are:
• Representation Stability: Tracking how the initial intent is preserved or distorted as it moves through the model's transformation space.
• Cross-Layer Alignment: Monitoring the consensus of information processing across different neural depths to identify early-stage divergence.
When these (and other proprietary structural signals) deviate from the expected stable manifold, the system flags a potential hallucination before it manifests in the output.
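Since the actual signals are proprietary, the following is only a toy illustration of the general idea, not the SIB-ENGINE method. It assumes you have one hidden-state vector per layer for the current token, and it uses cosine similarity as a stand-in for both indicators: first-to-last layer similarity as a proxy for representation stability, and mean adjacent-layer similarity as a proxy for cross-layer alignment. The function names and the equal weighting are arbitrary assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def representation_stability(layer_states):
    """Proxy: how similar the last layer's vector is to the first layer's."""
    return cosine(layer_states[0], layer_states[-1])


def cross_layer_alignment(layer_states):
    """Proxy: mean cosine similarity between each pair of adjacent layers."""
    sims = [cosine(a, b) for a, b in zip(layer_states, layer_states[1:])]
    return sum(sims) / len(sims)


def drift_score(layer_states):
    """Higher means more drift. Equal weighting is an arbitrary choice."""
    stability = representation_stability(layer_states)
    alignment = cross_layer_alignment(layer_states)
    return 1.0 - 0.5 * (stability + alignment)


# Two invented 2-D examples with three "layers" each.
stable = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
drifting = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(drift_score(stable), drift_score(drifting))
```

With Hugging Face transformers, per-layer hidden states can be requested by passing `output_hidden_states=True` to the model call; how SIB-ENGINE itself extracts and combines them is not described here.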
DEMO & CODE:
• Demo video: https://www.youtube.com/watch?v=H1_zDC0SXQ8
• GitHub: https://github.com/yubainu/sibainu-engine
• Raw data: raw_logs.csv (full transparency)
LIMITATIONS:
• Tested on Gemma-2B only (2.5B parameters)
• Designed to scale, but needs validation on larger models
• Catches "structurally unstable" hallucinations (about half)
• Best used as first-line defense in ensemble systems
TECHNICAL NOTES:
• No external models needed (unlike self-consistency methods)
• No knowledge bases required (unlike RAG approaches)
• Adds ~1% inference time vs. 300-500% for semantic methods
• Works by monitoring the process, not the product
I'd love feedback on:
• Validation on larger models (I'm seeking strategic partnerships and compute resources for large-scale validation)
• Integration patterns for production systems
• Comparison with other structural approaches
• Edge cases where geometric signals fail
This represents a fundamentally different paradigm: instead of asking "is this text correct?", we ask "was the generation process unstable?" The answer is surprisingly informative.
Happy to discuss technical details in the comments!