Back to browse
1-800-CODER, macOS app where you call an AI developer to edit your page

1-800-CODER, macOS app where you call an AI developer to edit your page

by abi·May 14, 2026·1 point·0 comments

AI Analysis

●●●BangerBig BrainShip ItSlick

Point-and-talk UI beats prompt engineering by streaming screenshots on speech pauses.

Strengths
  • Triggering image capture on speech_stopped events creates a natural conversational rhythm.
  • Cursor-aware screenshots give the model spatial context that text prompts lack.
  • Voice-first interaction mimics pair programming with a human freelancer.
Weaknesses
  • Limited to GPT-4o level frontend skills; backend logic edits likely fail.
  • Requires constant screenshot streaming which could raise privacy or latency concerns.
Category
Target Audience

Frontend developers, no-code builders, rapid prototypers

Similar To

Cursor · Replit Agent · v0.dev

Post Description

Sharing a small Mac app I built around OpenAI’s gpt-realtime-2 model. You call up a voice coding agent and talk to it like you’d talk to a freelancer ("make the hero tighter, put a product image on the right, that one's too big"). You can even point at things on your webpage and say “remove this” or “make that bold”. Pointing feels like a killer feature. It pushes the conversation bandwidth way up, and feels just like working with a real person over a screen share.

- Video demo: https://youtu.be/RteRVM7BSps - Github URL: https://github.com/abi/1-800-CODER

How is pointing implemented? GPT-Realtime-2 only supports image inputs (unlike Gemini Live which also supports video inputs). So, the app sends a screenshot including the cursor when the model emits a speech_stopped event. That way, the agent always has a fresh visual before it replies.

Limitations:

- GPT-Realtime-2 is okay at front-end changes, probably at a GPT-4o level. Small modifications like copy changes, adding/removing elements, formatting updates work really well at low latency. In fact, for these types of changes, this might be my ideal interface. But if you wanted this app to be more useful for larger changes or generating UI from scratch, you’d want to hook up a subagent system that runs a smarter model like GPT 5.5 or Claude Opus. - GPT-Realtime-2 is expensive. The good news though is that bandwidth is really high here so you might save time with this interface.

Similar Projects

AI/MLMid

Mega LLMs – Universal AI chat client for any OpenAI-compatible API

Plug any OpenAI-compatible provider into a single UI, switch models mid-session, and run side-by-side comparisons while tracking usage — everything you'd expect from a multi-model chat client. The design is eye-catching and the web/desktop split suggests a real app, but this is a crowded niche; the product will live or die on stability of provider integrations, context/memory handling, and clear privacy controls.

SlickSolve My Problem
p32929
204mo ago

Give a Voice to Your AI Agent

The project is a pragmatic, no-friction way to route MCP client output to macOS TTS — you get a runnable speak_server.py, ready-made CLI snippets for Gemini and Claude, and persona profiles that alter spoken behavior. Small but thoughtful extras like dynamic AGENTS.MD and persona presets make it useful for prototyping voice-first agents. The downside is obvious: it’s macOS-only and targets a narrow audience, but for that audience it removes a lot of friction.

Niche GemShip It
pcbmaker20
104mo ago