TurboQuant-WASM – Google's vector quantization in the browser
Google's ICLR 2026 quantization paper running client-side with SIMD-accelerated dot products.
A vector index built on TurboQuant, written in Rust with Python bindings
Data-oblivious quantization beats Product Quantization on online updates.
ML engineers building vector search systems
FAISS · SPTAG · DiskANN
Google's ICLR 2026 quantization paper running client-side with SIMD-accelerated dot products.
Standalone KV cache compression script implementing TurboQuant with 1.55x ratio.
99.9% compression claims need peer review—zero stars, one commit, no standard benchmarks.
Article promising 2026 tech but just tells you to use standard Ollama.
Runs 405B model compression on a single 32GB GPU when others need enterprise clusters.
Interactive outlier slider proves why rotation beats naive quantization.