I fit a 9-agent LLM pipeline into 1.5GB of RAM on iOS

Author: TheCosmicStage

by TheCosmicStage · Mar 5, 2026 · 2 points · 0 comments


AI Analysis

●●● Banger · Wizardry · Big Brain · Ship It

ExecuTorch compilation plus speculative decoding fits a 9-agent LLM pipeline into 1.5GB of RAM on iOS.

Strengths

• The Blackboard pattern decouples multi-agent reasoning without sequential context degradation, solving a real architectural problem.
• Ahead-of-time compilation of PyTorch graphs to .pte binaries eliminates wrapper overhead; speculative decoding gives a rigorously measured 2.2-3.6x speedup.
• Tiered model strategy (1B/3B/11B) with an identical architecture across hardware: a thoughtful, constraint-driven design that balances capability against device limits.
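The Blackboard pattern named in the first strength can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the `Blackboard` class, the two toy agents, and `run_agents` are all hypothetical names, and each agent is a plain function standing in for a model call. The point it shows is that every agent reads one compact shared state instead of a transcript that grows with each agent that ran before it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Blackboard:
    """Shared state that every agent reads from and writes to (hypothetical)."""
    entry: str                                            # raw journal entry
    notes: Dict[str, str] = field(default_factory=dict)   # agent name -> finding


# Hypothetical agent signature: takes the blackboard, returns a short finding.
Agent = Callable[[Blackboard], str]


def mood_agent(board: Blackboard) -> str:
    # Stand-in for a model call; a real agent would run an LLM prompt here.
    return "negative" if "tired" in board.entry.lower() else "neutral"


def summary_agent(board: Blackboard) -> str:
    # Reads another agent's note from the board, not a chained transcript.
    mood = board.notes.get("mood", "unknown")
    return f"{len(board.entry.split())} words, mood={mood}"


def run_agents(board: Blackboard, agents: Dict[str, Agent]) -> Blackboard:
    # Every agent sees the same compact state, so its input does not grow
    # with the number of agents that ran before it.
    for name, agent in agents.items():
        board.notes[name] = agent(board)
    return board


board = run_agents(
    Blackboard(entry="Felt tired after work today"),
    {"mood": mood_agent, "summary": summary_agent},
)
print(board.notes)
```

Because agents exchange only short findings through `notes`, adding a tenth agent adds one dictionary entry rather than another full turn of context.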

Weaknesses

• Pre-release tech spec with no live demo, ship date, or user testing; the vaporware risk outweighs the architectural innovation.
• Whisper voice input and biometrics are promised but incomplete; the shipping timeline is unclear, and critical journaling features (export, sync, backup) are missing.

Post Description

"Hey HN. I've been building a completely offline AI journal. The biggest hurdle was the memory footprint of running multiple agent personas. I ended up bypassing standard wrappers and using Meta's ExecuTorch to compile the PyTorch graphs ahead-of-time for the Apple Neural Engine, plus 4-bit quantization. Happy to answer any questions about the CoreML backend or managing the 'Blackboard' state object for the agents without killing the battery."
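To put the 4-bit quantization claim in context, here is a rough back-of-the-envelope calculation of weight storage at the three tier sizes mentioned in the analysis (1B/3B/11B). It is only a sketch under stated assumptions: the helper `weight_footprint_gb` is hypothetical, it counts weights alone in decimal gigabytes, and it ignores the KV cache, activations, quantization scales, and runtime overhead, so it says nothing about how the author actually reached the 1.5GB figure.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width, with no
    extra space for quantization scales or zero points.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for size in (1, 3, 11):
    fp16 = weight_footprint_gb(size, 16)
    int4 = weight_footprint_gb(size, 4)
    print(f"{size}B params: fp16 ~ {fp16:.1f} GB, 4-bit ~ {int4:.1f} GB")
```

Going from 16-bit to 4-bit weights cuts storage by a factor of four at every tier, which is why quantization matters so much on a memory-constrained phone.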

Similar Projects

Health · ●● Solid

Odozi – open-source iOS journaling app

Correlates mood against Screen Time and HealthKit data automatically on device.

Cozy · Niche Gem

jlarks32

601mo ago

Developer Tools · ●● Solid

Composable middleware for LLM inference Optimization Passes

Tower-style middleware stacking for inference guardrails beats bolted-on if-statements.

Big Brain · Niche Gem · Ship It

human_hack3r

703mo ago

Health · ● Mid

SweatDiary – simple workout journal, native iOS and macOS app

Free native workout diary with iCloud sync, but Strava and Hevy already dominate.

Cozy · Eye Candy

frooto443

111mo ago

AI/ML · ●● Solid

LLM-use – cost-effective LLM orchestrator for agents

Smart local-first routing that escalates to expensive cloud planners only when necessary is the standout idea; combined with per-run cost accounting and full Ollama offline support, it solves a real operational itch. The repo is a pragmatic, CLI/TUI-focused toolkit (scraping + cache, MCP server mode) that feels useful for teams wanting a no-friction orchestrator, but it is playing in a crowded space of agent frameworks, so the novelty is incremental rather than revolutionary.

Niche Gem · Big Brain

justvugg

213mo ago

AI/ML · ●●● Banger

Whichllm – Find and run the best local LLM for your hardware

One command finds and runs the best local LLM for your exact hardware specs.

Solve My Problem · Big Brain · Niche Gem

andyyyy64

302mo ago

AI/ML · ●● Solid

Memex – A local-first AI journal that keeps everything as Markdown

Local-first AI journal with multi-agent architecture when most competitors store everything in the cloud.

Dark Horse · Solve My Problem

sparkleMing

102d ago