IgniteMS – batch text embeddings at 253K msg/s on 8x A100
Beats Hugging Face TEI by 3.6x using raw TensorRT, with zero Python runtime overhead.
Fast self-hosted embedding engine for search, RAG, and reindexing workloads on NVIDIA GPUs. Built in Rust + TensorRT for teams that care about scale, cost, and control.
3.6x faster than Hugging Face TEI on the same hardware, with zero Python overhead at runtime.
ML engineers running large-scale embedding pipelines on NVIDIA GPUs
Hugging Face TEI · FastEmbed · SentenceTransformers
Single Rust library replaces backend servers for LLM + speech in Unity and mobile apps.
This brings the Vercel AI SDK's ergonomics to Rust: a type-safe LanguageModelRequest builder, #[tool] macros for exposing callable tools, streaming text and structured JSON outputs, and compatibility with Vercel UI stacks. The provider count (70+) and ready-made agent tooling are compelling for Rust shops. Quality will hinge on per-provider coverage and runtime compatibility, but the docs, examples, and CI suggest serious follow-through.
Turning a single image into an explorable 3D scene is technically impressive, but this is mostly a novelty demo.
Vector search inside images beats caption/title matching for finding obscure public domain art.
A Rust LSM-tree engine, but RocksDB and redb already dominate this space.