Piqc – GPU waste scanner for LLM inference clusters
Read-only GPU waste scanner finds 20-40% cluster spend waste without agents or sidecars.
Kubernetes scanner that discovers LLMs running on vLLM and extracts their deployment and runtime facts.
One-command GPU waste scanner when Kubecost requires full Prometheus setup.
ML engineers and DevOps teams running LLM inference on Kubernetes
Kubecost · OpenCost · CloudHealth
Read-only GPU waste scanner finds 20-40% cluster spend waste without agents or sidecars.
Browser-based GPU cluster for LLM inference with HTTP API and SSE broker coordination.
94% GPU reduction claim needs verifiable benchmarks to stand out.
One-command benchmark suite comparing Ollama and XGBoost performance with a shared Streamlit dashboard.
Saves neoclouds months of engineering by turning bare metal racks into managed Kubernetes clusters.
Sequential-dispatch methodology corrects 20x overestimation in prior WebGPU benchmarks.