GitHub Repository

Google's MRT2 → Core ML, full Apple Neural Engine execution, zero GPU. Runs locally on iPhone without burning your hand.

21 stars · Python

Magenta Real-Time Music Generation on iPhone, Without the GPU

by MediaSquirrel·Jun 10, 2026·9 points·0 comments

AI Analysis

●●● Banger · Wizardry · Big Brain · Zero to One

Splits a 230M-param model across ANE and CPU to avoid thermal throttling on iPhone.

Strengths

• Model partitioning across silicon components based on hardware affinity is genuinely novel.
• Published rigorous metrics: 12-digit correlation, zero GPU time, 14ms p99 latency.
• Runs 10 minutes straight on iPhone 12 Pro without melting or thermal throttling.

Weaknesses

• iOS-only; no Android or cross-platform path demonstrated yet.
• Specific to Magenta RealTime 2; unclear if technique generalizes to other models.

Post Description

Last Thursday, DeepMind released Magenta RealTime 2, an open-source music generation model. They said it could run on Mac, but not iPhone.

As a v̵i̵b̵e̵ ̵c̵o̵d̵i̵n̵g̵ ̵a̵d̵d̵i̵c̵t̵ agentic AI maxxi and person who has melted iPhones before (link at bottom), I took that as a personal challenge and made it my weekend project.

On Saturday, I got it to run for 10 minutes straight on an iPhone 12 Pro from 2020 without melting the phone or - shockingly - touching the GPU.

How? I chopped the model up into 5 pieces and set them each to run on different parts of Apple's system on a chip (SoC).
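To make the "five pieces, each pinned to a part of the SoC" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the piece names, the toy math inside each piece, and which piece goes where are placeholders, since the post doesn't say where the cuts are. The `compute_unit` strings just mirror the member names of `coremltools.ComputeUnit` (`CPU_ONLY`, `CPU_AND_NE`); nothing here loads a real Core ML model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Piece:
    name: str           # hypothetical piece name, not from the repo
    compute_unit: str   # label mirroring a coremltools.ComputeUnit member name
    run: Callable[[List[float]], List[float]]  # stand-in for a model call


def run_pipeline(pieces: List[Piece], x: List[float]) -> Tuple[List[float], List[Tuple[str, str]]]:
    """Feed each piece's output into the next and record where it was meant to run."""
    trace = []
    for piece in pieces:
        x = piece.run(x)
        trace.append((piece.name, piece.compute_unit))
    return x, trace


# Placeholder assignment: five pieces, none of them allowed near the GPU.
PIECES = [
    Piece("piece_1", "CPU_ONLY", lambda v: [a + 1.0 for a in v]),
    Piece("piece_2", "CPU_AND_NE", lambda v: [a * 2.0 for a in v]),
    Piece("piece_3", "CPU_AND_NE", lambda v: [a - 3.0 for a in v]),
    Piece("piece_4", "CPU_AND_NE", lambda v: [a * a for a in v]),
    Piece("piece_5", "CPU_ONLY", lambda v: [a / 2.0 for a in v]),
]

out, trace = run_pipeline(PIECES, [1.0, 2.0])
for name, unit in trace:
    print(f"{name} -> {unit}")
```

In a real conversion, each piece would presumably be its own Core ML model loaded with its own compute-unit setting; the sketch only shows the chaining and the per-piece placement.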

My past experience taught me that if you can actually leverage it, the iPhone's NPU is incredibly powerful and power-efficient. If you're doing sustained real-time generation for long periods on a device without a fan, you gotta use the neural engine or else you will melt the device.

See: https://accelerateordie.com/p/we-melted-iphones-for-science

The Apple Neural Engine has a ton of constraints. The main ones: it only accepts fixed-shape inputs, and it only supports some architectures -- which is why I chopped the model up into pieces.
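For anyone wondering what "fixed-shape inputs" means in practice: one common workaround (a general technique, not necessarily what this project does) is to pad every input up to a single agreed length and pass a mask so the padding can be ignored. A minimal sketch, with `pad_to_fixed` and `FIXED_LEN` as made-up names:

```python
from typing import List, Tuple

FIXED_LEN = 8  # hypothetical fixed sequence length the model was exported with


def pad_to_fixed(tokens: List[int], fixed_len: int = FIXED_LEN,
                 pad_value: int = 0) -> Tuple[List[int], List[int]]:
    """Pad a variable-length sequence to fixed_len and return (padded, mask).

    mask is 1 for real entries and 0 for padding.
    """
    if len(tokens) > fixed_len:
        raise ValueError(f"input of length {len(tokens)} exceeds fixed length {fixed_len}")
    pad = fixed_len - len(tokens)
    padded = tokens + [pad_value] * pad
    mask = [1] * len(tokens) + [0] * pad
    return padded, mask


padded, mask = pad_to_fixed([5, 6, 7])
print(padded, mask)
```

The trade-off is wasted compute on the padding, so the fixed length has to be picked with the real workload in mind.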

But it works! And I wrote zero lines of code by hand. Back when I was running VC-backed companies, I would have needed a small team of grumpy greybeard engineers to do this and it would have taken 2-6 weeks. Now I can feed my own nerd fetish and do this stuff myself.

Next up: I'm building an iPhone app that ties into your heart rate, movement data, location, etc. to generate a real-time soundtrack to your life.

What a time to be alive!