1k desktop beats vendor sparse library 474× on Mistral-7B

Author: heggenhougen

by heggenhougen·Mar 17, 2026·1 point·1 comment


AI Analysis

●●● Banger · Bold Bet · Wizardry

133.5× speedup with identical SHA-256 hash across NVIDIA, AMD, Intel, Apple Silicon.

Strengths

• Cryptographic output verification proves mathematical correctness across platforms
• University of Miami independent validation adds credibility to benchmarks
• No model retraining or hardware changes required for deployment

Weaknesses

• Closed-source with patents pending limits community verification
• Extraordinary claims need reproducible benchmarks beyond landing page

Post Description

I've been working on a sparse compute primitive for AI inference. The idea came on a bike ride — I'm a mathematician and serial entrepreneur, not a GPU engineer, and I wanted to see if the matrix math at the heart of every AI model could be restructured to skip unnecessary work while producing identical outputs. Results on my HP All-in-One (i7-1165G7, 4 cores, 64GB, Windows 11):

Mistral-7B, real weights, 0% sparsity (fully dense): 127× vs CPU dense, 474× vs CPU sparse, 0.7 ms first token vs 49.4 ms sparse, 99.2% less energy

Llama-2-7B, real weights, 0% sparsity: 59× vs dense, 228× vs sparse, 0.8 ms first token
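As a quick consistency check on the CPU figures above, the two speedups for each model imply how the sparse and dense baselines compare with each other, and the two first-token latencies imply their own ratio. The sketch below only rearranges numbers quoted in the post; the function and variable names are illustrative and not part of any published tool.

```python
# Speedups quoted in the post for the i7-1165G7 runs at 0% sparsity.
cpu_results = {
    "Mistral-7B": {"vs_dense": 127.0, "vs_sparse": 474.0},
    "Llama-2-7B": {"vs_dense": 59.0, "vs_sparse": 228.0},
}


def sparse_over_dense(vs_dense, vs_sparse):
    """Return how many times slower the sparse baseline is than the dense one.

    If a method is `vs_dense` times faster than the dense baseline and
    `vs_sparse` times faster than the sparse baseline, then
    t_sparse / t_dense = vs_sparse / vs_dense.
    """
    return vs_sparse / vs_dense


baseline_ratios = {
    name: round(sparse_over_dense(**figures), 2)
    for name, figures in cpu_results.items()
}

# First-token latency ratio for Mistral-7B: 49.4 ms (sparse baseline) vs 0.7 ms.
first_token_ratio = round(49.4 / 0.7, 1)

print(baseline_ratios)
print(first_token_ratio)
```

For both models the quoted figures put the sparse baseline a little under 4× slower than the dense baseline. That is plausible for a sparse format applied to a matrix with no zeros, where the index bookkeeping adds cost and no work is skipped.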

On NVIDIA B200 with real HuggingFace weights:

Llama-4 Maverick 400B: 133.5× faster, 99.9% less energy, 52× faster first token

DeepSeek-R1 (256 experts): 78.9× faster, 98.7% less energy
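The energy figures are quoted as percentages, while the speedups are quoted as multiples, so it can help to put both on the same scale. A reduction of P percent corresponds to a factor of 1 / (1 - P/100). The sketch below applies that conversion to the percentages in the post; the helper name `energy_factor` is an illustrative choice.

```python
def energy_factor(percent_less):
    """Convert 'P% less energy' into an 'N times less energy' factor."""
    return 1.0 / (1.0 - percent_less / 100.0)


# Energy reductions as quoted in the post.
quoted = {
    "Mistral-7B (CPU)": 99.2,
    "Llama-4 Maverick (B200)": 99.9,
    "DeepSeek-R1 (B200)": 98.7,
}

factors = {name: round(energy_factor(p), 1) for name, p in quoted.items()}
print(factors)
```

Because the percentages are rounded to one decimal place, the implied factor is only approximate near 100%: a figure reported as 99.9% could correspond to anything from roughly 667× to 2000×.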

The canonical SHA-256 hash appears identically across NVIDIA, AMD, Google TPU, Intel, and Apple Silicon — same math, different silicon, same answer. Independently validated by University of Miami Frost Institute. Open verifier at rolv.ai — runs on any hardware, generates your own baseline hash. No IP in the verifier. Happy to answer technical questions about how it works.
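To make the hash comparison concrete, here is a minimal sketch of how two implementations of the same matrix-vector product could be checked for bit-identical output. It is not the rolv.ai verifier and does not reflect the author's method: the float32 little-endian serialization, the toy weights, and the zero-skipping variant are all assumptions chosen for illustration.

```python
import hashlib
import struct


def canonical_hash(values):
    """SHA-256 over values serialized as little-endian IEEE-754 float32.

    The serialization format is an assumption for this sketch.
    """
    payload = struct.pack(f"<{len(values)}f", *values)
    return hashlib.sha256(payload).hexdigest()


def matvec(matrix, vector):
    """Plain dense matrix-vector product, used as the reference computation."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matvec_skip_zeros(matrix, vector):
    """Same product, skipping multiplications where the weight is zero."""
    return [sum(w * x for w, x in zip(row, vector) if w != 0.0) for row in matrix]


# Toy weights and input (hypothetical values, exactly representable in float32).
weights = [[1.0, 0.0, 2.0], [0.0, 0.0, 3.0], [4.0, 5.0, 0.0]]
x = [0.5, -1.0, 2.0]

reference = canonical_hash(matvec(weights, x))
candidate = canonical_hash(matvec_skip_zeros(weights, x))
print(reference == candidate)
```

A hash match of this kind requires the two computations to agree bit for bit, so anything that changes floating-point rounding, such as a different summation order, would produce a different digest even when the results are numerically close.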