I benchmarked Gemma 4 E2B – the 2B model beat the 12B on multi-turn
2B model beats 12B on some tasks, saving hardware costs for edge deployment.

50-token compact code output beats raw 5,000-token Excalidraw JSON: clever compression.
Frontend developers, ML engineers interested in edge inference
WebLLM · Transformers.js · MLC Chat
Runs Gemma 4 E2B and Kokoro TTS locally with barge-in and vision.
Local LLM agent with DOM tools running entirely in-browser via WebGPU.
Basic canvas demo when Chrome's own docs already cover this API.
Milkdrop running in the browser via WebGPU is pure nostalgia fuel.
Auto-animated Excalidraw diagrams from prompts beat manual canvas setup.