Tok/s on a 35B MoE model using a $100 AMD crypto APU and Vulkan
Clever hardware hack but this is a config guide, not a shipped tool.
AMD BC-250 (PS5 APU) setup guide — Ollama + Vulkan inference, poor man's AI assistant via Signal, stable-diffusion.cpp image generation
Kernel ttm.pages_limit workaround unlocks 16GB UMA for Vulkan inference on repurposed crypto hardware.
Budget-conscious developers wanting local AI inference hardware
llama.cpp · Ollama · stable-diffusion.cpp
Clever hardware hack but this is a config guide, not a shipped tool.
Runs 19.5GB Qwen3.5 on 12GB RAM iPhone via memory swapping.
LLMs trading crypto with 20x leverage while a HP system forces them to stay active.
450k context on 32GB VRAM using turboquant KV cache compression.
Orchestrates real-time skepticism between models to catch hallucinations before you see them.
Hardware-backed private inference, but requires trust in Onera's server infrastructure anyway.