GitHub Repository

Claude Code for local LLMs. Unified backend, setup, and coding harness for your own models.

45 stars · Python

OpenJet – An offline agent harness for memory-constrained edge hardware

by lforster·Mar 15, 2026·1 point·0 comments

AI Analysis

●●● Banger · Niche Gem · Solve My Problem · Ship It

Context condensing under memory pressure solves the actual pain of edge AI agents.

Strengths
  • Automatic context condensing prevents OOM crashes when 9B models eat all available RAM.
  • Air-gapped mode with persistent memory files handles interrupted sessions gracefully.
  • Supports llama.cpp, SGLang, and TensorRT-LLM backends behind a unified TUI.
Weaknesses
  • Requires manual llama-server build — not as turnkey as cloud-hosted alternatives.
  • Jetson-specific optimization limits appeal to broader local-LLM audience.
Category
Target Audience

Edge AI developers, Jetson users, offline-first ML practitioners

Similar To

Off Grid · Ollama · LM Studio

Post Description

Hi HN,

I am building a terminal UI for self-hosted AI agents on Jetsons and other edge devices with unified memory.

I started it because most local agent harnesses seem aimed at machines with plenty of RAM and a stable, internet-connected developer environment. On Jetson-class hardware, the annoying problems are different: context growth eats memory, sessions break, models may fit but leave very little headroom, and a lot of tools assume cloud access.

Recent additions include:

- air-gapped mode
- automatic context condensing under memory pressure
- persistent memory files and /memory controls
- harness modes for chat/code/review/debug workflows
- replayable traces for evals/debugging
- multimodal local image input
- OpenTelemetry support
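To make "context condensing under memory pressure" concrete, here is a minimal sketch of the general idea. It is not OpenJet's actual implementation: the names `Message`, `read_available_mb`, and `condense`, the 512 MiB threshold, and the truncating summarizer are all assumptions made for illustration. A real harness would more likely ask the local model to write the summary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    role: str
    content: str


def read_available_mb(meminfo_path: str = "/proc/meminfo") -> float:
    """Return MemAvailable in MiB from a Linux /proc/meminfo-style file."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / 1024  # the field is reported in kB
    raise RuntimeError("MemAvailable not found in " + meminfo_path)


def truncate_summary(messages: List[Message], width: int = 80) -> str:
    """Placeholder summarizer (assumption): keep the first `width` chars of each message."""
    return "\n".join(f"{m.role}: {m.content[:width]}" for m in messages)


def condense(
    history: List[Message],
    available_mb: float,
    min_free_mb: float = 512.0,  # hypothetical pressure threshold
    keep_recent: int = 4,        # newest messages are always kept verbatim
    summarize: Optional[Callable[[List[Message]], str]] = None,
) -> List[Message]:
    """Collapse older messages into one summary message when free memory is low."""
    split = len(history) - keep_recent
    if available_mb >= min_free_mb or split <= 0:
        return history  # no pressure, or nothing old enough to condense
    summarize = summarize or truncate_summary
    old, recent = history[:split], history[split:]
    summary = Message("system", "Summary of earlier turns:\n" + summarize(old))
    return [summary] + recent
```

On the device, a harness along these lines could call `condense(history, read_available_mb())` before each model request, so the history only shrinks when headroom is actually low.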

I’d love for you to try it out. The code is up on GitHub, and contributions/roasts of my memory management are very welcome. On an 8GB device, I got the latest Qwen3.5-9B running (it just about fits in memory).

GitHub: https://github.com/L-Forster/open-jet

Similar Projects

AI/ML · ●● Solid

Hipocampus – Persistent memory harness for AI agents

Compaction tree cuts context from 100K tokens to 3K without losing memory.

Big Brain · Niche Gem
kevin-hs-sohn
223mo ago