1-800-CODER, macOS app where you call an AI developer to edit your page

by abi·May 14, 2026·1 point·0 comments

Visit Project View on HN

AI Analysis

●●●BangerBig BrainShip ItSlick

Point-and-talk UI beats prompt engineering by streaming screenshots on speech pauses.

Strengths

•Triggering image capture on speech_stopped events creates a natural conversational rhythm.
•Cursor-aware screenshots give the model spatial context that text prompts lack.
•Voice-first interaction mimics pair programming with a human freelancer.

Weaknesses

•Limited to GPT-4o level frontend skills; backend logic edits likely fail.
•Requires constant screenshot streaming which could raise privacy or latency concerns.

Post Description

Sharing a small Mac app I built around OpenAI’s gpt-realtime-2 model. You call up a voice coding agent and talk to it like you’d talk to a freelancer ("make the hero tighter, put a product image on the right, that one's too big"). You can even point at things on your webpage and say “remove this” or “make that bold”. Pointing feels like a killer feature. It pushes the conversation bandwidth way up, and feels just like working with a real person over a screen share.

- Video demo: https://youtu.be/RteRVM7BSps - Github URL: https://github.com/abi/1-800-CODER

How is pointing implemented? GPT-Realtime-2 only supports image inputs (unlike Gemini Live which also supports video inputs). So, the app sends a screenshot including the cursor when the model emits a speech_stopped event. That way, the agent always has a fresh visual before it replies.

Limitations:

- GPT-Realtime-2 is okay at front-end changes, probably at a GPT-4o level. Small modifications like copy changes, adding/removing elements, formatting updates work really well at low latency. In fact, for these types of changes, this might be my ideal interface. But if you wanted this app to be more useful for larger changes or generating UI from scratch, you’d want to hook up a subagent system that runs a smarter model like GPT 5.5 or Claude Opus. - GPT-Realtime-2 is expensive. The good news though is that bandwidth is really high here so you might save time with this interface.

Similar Projects

Productivity●●Solid

SharePad – share a USB iPad as a clean window in any call

Eliminates the QuickTime dance for sharing iPad screens in Zoom calls.

Solve My ProblemCozy

jonyardley

313d ago

AI/ML●Mid

Mega LLMs – Universal AI chat client for any OpenAI-compatible API

Plug any OpenAI-compatible provider into a single UI, switch models mid-session, and run side-by-side comparisons while tracking usage — everything you'd expect from a multi-model chat client. The design is eye-catching and the web/desktop split suggests a real app, but this is a crowded niche; the product will live or die on stability of provider integrations, context/memory handling, and clear privacy controls.

SlickSolve My Problem

p32929

204mo ago

AI/ML●●Solid

Modern AI client for Mac with agentic tools, clean UI, builtin privacy

Rich inline charts and maps beat Claude Desktop's text-only responses.

SlickNiche Gem

elvean

201mo ago

Developer Tools●●Solid

AppDesk – Native macOS Client for App Store Connect

Native macOS client replacing the sluggish App Store Connect web UI with local AI.

Solve My ProblemSlickNiche Gem

prasadrl

502mo ago

Productivity●●Solid

Editing 2000 photos made me build a macOS bulk photo editor

Native Mac batch editor that keeps 2000 wedding photos off the cloud.

Solve My ProblemCozy

om202

13252mo ago

Developer Tools●Mid

Give a Voice to Your AI Agent

The project is a pragmatic, no-friction way to route MCP client output to macOS TTS — you get a runnable speak_server.py, ready-made CLI snippets for Gemini and Claude, and persona profiles that alter spoken behavior. Small but thoughtful extras like dynamic AGENTS.MD and persona presets make it useful for prototyping voice-first agents. The downside is obvious: it’s macOS-only and targets a narrow audience, but for that audience it removes a lot of friction.

Niche GemShip It

pcbmaker20

104mo ago