NeuralForge – Fine-Tune LLMs on Your Mac Using Apple Neural Engine
Fine-tune LLMs on Apple Neural Engine using reverse-engineered private frameworks — genuinely novel approach.
Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders.
The only Apple Silicon toolkit that streams training data from Google Cloud Storage (GCS) during audio fine-tuning without running out of memory.
ML engineers with Apple Silicon Macs fine-tuning multimodal models
MLX-LM · Unsloth · axolotl
Gemma 3n came out, so I added that. Kinda went nuts, tbh.
Then I put it on the shelf.
When Gemma 4 came out a few days ago, I dusted it off, cleaned it up, broke out the Gemma part from the Whisper fine-tuning and added support for Gemma 4.
I'm presenting it for you here today to play with, fork and improve upon.
One thing I have learned so far: it's very easy to OOM when you fine-tune on longer sequences! My local Mac Studio has 64 GB of RAM, and I still run out of memory constantly.
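To put rough numbers on why longer sequences hurt, here is a back-of-the-envelope sketch. It assumes naive attention that materializes the full score matrix, which may not match what a given PyTorch backend actually does. The function name and the sizes below are made up for illustration and are not taken from this toolkit or from any Gemma config.

```python
def attention_scores_bytes(seq_len, num_heads, batch_size=1, bytes_per_elem=2):
    """Bytes needed to hold one layer's attention score matrix.

    Assumes naive attention that materializes a (seq_len x seq_len)
    matrix per head; fused or memory-efficient kernels avoid this.
    bytes_per_elem=2 corresponds to fp16/bf16.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the scaling.
    for seq_len in (1024, 4096, 16384):
        gib = attention_scores_bytes(seq_len, num_heads=8) / 2**30
        print(f"seq_len={seq_len:>6}: {gib:.2f} GiB per layer")
```

Under these assumptions the term grows with the square of the sequence length, so a 4x longer sequence costs 16x the memory for this one buffer, before counting weights, gradients, or optimizer state.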
Anywho, the real reason this exists (in addition to my personal interest) is how much interest there is in Gemma 4 and, frankly, the fact that you can't really do audio fine-tuning with MLX. I would have preferred to use MLX and not have had to make this, but here we are. Welcome to my little side quest.
And so I made this. I hope you have as much fun using it as I had fun making it.
-Matt
Wraps mlx-lm fine-tuning into a guided desktop UI, but local LLM tools are crowded.
Tauri GUI wrapper around mlx-lm—useful for Mac users, but local fine-tuning UIs already exist.
Cool demo, but there's no actual tool to use—just a video and writeup.
Unified memory trick lets a 2B model beat 12B; trains on MacBook with zero cloud costs.
2B model beats 12B on some tasks, saving hardware costs for edge deployment.