Lyria.run – Music generation powered by Google's Lyria 3
Lyria 3 wrapper with clean UI, but Suno and Udio already own this space.
Google's MRT2 → Core ML, full Apple Neural Engine execution, zero GPU. Runs locally on iPhone without burning your hand.
Splits a 230M-param model across ANE and CPU to avoid thermal throttling on iPhone.
Mobile developers, musicians, on-device AI engineers
Off Grid AI · MLX · Core ML tools
As a v̵i̵b̵e̵ ̵c̵o̵d̵i̵n̵g̵ ̵a̵d̵d̵i̵c̵t̵ agentic AI maxxi and person who has melted iPhones before (link at bottom), I took that as a personal challenge and made it my weekend project.
On Saturday, I got it to run for 10 minutes straight on an iPhone 12 Pro from 2020 without melting the phone or, shockingly, touching the GPU.
How? I chopped the model up into 5 pieces and set each piece to run on a particular part of Apple's system on a chip (SoC).
My past experience taught me that the iPhone's NPU, if you can actually leverage it, is incredibly powerful and power-efficient. If you're doing sustained real-time generation for long stretches on a device without a fan, you gotta use the Neural Engine or you will melt the device.
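A rough sketch of what a split like this can look like on the tooling side, using coremltools in Python. The piece names, file layout, and the placement of each piece are hypothetical (the post only says there are 5 pieces spread across the ANE and CPU), and the chaining assumes each piece's output names match the next piece's input names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Tensors = Dict[str, object]            # named inputs/outputs passed between pieces
StageFn = Callable[[Tensors], Tensors]


@dataclass(frozen=True)
class Piece:
    name: str           # hypothetical file stem, e.g. piece_1.mlpackage
    compute_units: str  # name of a coremltools.ComputeUnit member


# Hypothetical placement; the real assignment of pieces is not given in the post.
PLAN: List[Piece] = [
    Piece("piece_1", "CPU_ONLY"),
    Piece("piece_2", "CPU_AND_NE"),
    Piece("piece_3", "CPU_AND_NE"),
    Piece("piece_4", "CPU_AND_NE"),
    Piece("piece_5", "CPU_ONLY"),
]


def load_piece(piece: Piece, model_dir: str) -> StageFn:
    """Load one Core ML piece restricted to its compute units (needs coremltools on macOS)."""
    import coremltools as ct  # lazy import so the plan can be inspected without it

    model = ct.models.MLModel(
        f"{model_dir}/{piece.name}.mlpackage",
        compute_units=ct.ComputeUnit[piece.compute_units],
    )
    return model.predict


def run_pipeline(stages: List[StageFn], inputs: Tensors) -> Tensors:
    """Feed each piece's outputs into the next piece as its inputs."""
    tensors = inputs
    for stage in stages:
        tensors = stage(tensors)
    return tensors
```

Neither `CPU_ONLY` nor `CPU_AND_NE` lets Core ML schedule work on the GPU, which is what keeps the GPU out of the picture. In an iOS app the matching setting would be `MLModelConfiguration.computeUnits` with `.cpuAndNeuralEngine` or `.cpuOnly`.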
See: https://accelerateordie.com/p/we-melted-iphones-for-science
The Apple Neural Engine has a ton of constraints. The main ones are that it only accepts fixed-shape inputs and only supports some architectures, which is why I chopped the model up into pieces.
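To make the fixed-shape constraint concrete, here is a minimal sketch of one common workaround: pad or truncate every variable-length input to a single fixed length and carry a mask that marks the real entries. The post does not say how its inputs were made fixed-shape, so the function name, the length of 8, and the padding value are all assumptions for illustration.

```python
from typing import List, Tuple

FIXED_LEN = 8  # hypothetical fixed sequence length baked into the compiled model


def to_fixed_shape(
    tokens: List[int], fixed_len: int = FIXED_LEN, pad_id: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (padded_tokens, mask), both exactly fixed_len long."""
    if len(tokens) > fixed_len:
        tokens = tokens[-fixed_len:]  # keep only the most recent entries
    pad = fixed_len - len(tokens)
    padded = tokens + [pad_id] * pad
    mask = [1] * len(tokens) + [0] * pad  # 1 = real entry, 0 = padding
    return padded, mask


padded, mask = to_fixed_shape([5, 6, 7], fixed_len=5)
print(padded, mask)  # [5, 6, 7, 0, 0] [1, 1, 1, 0, 0]
```

The model then always sees the same shape, and the mask lets downstream layers ignore the padding.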
But it works! And I wrote zero lines of code by hand. Back when I was running VC-backed companies, I would have needed a small team of grumpy greybeard engineers to do this and it would have taken 2-6 weeks. Now I can feed my own nerd fetish and do this stuff myself.
Next up: I'm building an iPhone app that ties into your heart rate, movement data, location, etc. to generate a real-time soundtrack to your life.
What a time to be alive!