GitHub Repository

5 stars · Go

Desktop Automation with Codex

Author: nicbarth

by nicbarth · Mar 6, 2026 · 1 point · 0 comments


AI Analysis

● Mid · Bold Bet

LLM-driven desktop automation from screenshots, but unreliable and dangerous.

Strengths

• Cross-platform (Windows, macOS, Linux) with pluggable LLM runners (Claude, Codex, Ollama).
• Task markdown format makes workflows human-readable and version-controllable.
• Global hotkey kill switch lets the user stop a runaway bot before it destroys data.

Weaknesses

• Screenshot-to-action loop is fragile; the author admits it "(mostly) works", and it can trigger accidental actions.
• No safeguards against destructive commands; high liability for unattended runs.

Post Description

Tried doing some desktop automation by sending Codex screenshots and stepping through the generated instructions. It's rough, but it (mostly) works. In the screenshot it accidentally presses 71336 haha.
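The loop described above (capture a screenshot, ask the model for the next instruction, perform it, repeat) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Action`, `Runner`, and `Desktop` types, the `Run` function, and the `stop` channel standing in for the global hotkey kill switch are all hypothetical names. The fake runner and desktop replace a real LLM backend and real input injection so the sketch runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is one step requested by the model (hypothetical shape).
type Action struct {
	Kind string // "click", "type", or "done"
	Text string // text to type, if any
	X, Y int    // click coordinates, if any
}

// Runner is a hypothetical interface for a pluggable LLM backend.
type Runner interface {
	Next(task string, screenshot []byte) (Action, error)
}

// Desktop is a hypothetical abstraction over screen capture and input.
type Desktop interface {
	Screenshot() ([]byte, error)
	Perform(a Action) error
}

var (
	ErrStopped   = errors.New("stopped by kill switch")
	ErrStepLimit = errors.New("step limit reached")
)

// Run drives the screenshot-to-action loop. It checks the stop channel
// before every step and gives up after maxSteps performed actions.
// It returns the number of actions actually performed.
func Run(stop <-chan struct{}, r Runner, d Desktop, task string, maxSteps int) (int, error) {
	for done := 0; done < maxSteps; done++ {
		select {
		case <-stop: // kill switch fired (e.g. from a hotkey listener)
			return done, ErrStopped
		default:
		}
		shot, err := d.Screenshot()
		if err != nil {
			return done, fmt.Errorf("screenshot: %w", err)
		}
		a, err := r.Next(task, shot)
		if err != nil {
			return done, fmt.Errorf("runner: %w", err)
		}
		if a.Kind == "done" {
			return done, nil
		}
		if err := d.Perform(a); err != nil {
			return done, fmt.Errorf("perform %s: %w", a.Kind, err)
		}
	}
	return maxSteps, ErrStepLimit
}

// scriptedRunner replays a fixed list of actions, then reports "done".
type scriptedRunner struct {
	actions []Action
	next    int
}

func (s *scriptedRunner) Next(task string, screenshot []byte) (Action, error) {
	if s.next >= len(s.actions) {
		return Action{Kind: "done"}, nil
	}
	a := s.actions[s.next]
	s.next++
	return a, nil
}

// fakeDesktop records actions instead of touching the real desktop.
type fakeDesktop struct{ performed []Action }

func (f *fakeDesktop) Screenshot() ([]byte, error) { return []byte("fake-png"), nil }

func (f *fakeDesktop) Perform(a Action) error {
	f.performed = append(f.performed, a)
	return nil
}

func main() {
	stop := make(chan struct{}) // closing this channel would halt the loop
	r := &scriptedRunner{actions: []Action{
		{Kind: "click", X: 120, Y: 48},
		{Kind: "type", Text: "hello"},
	}}
	d := &fakeDesktop{}
	n, err := Run(stop, r, d, "open the editor and type hello", 10)
	fmt.Println(n, err)
}
```

Checking the stop channel before each step, together with a hard step limit, bounds how far a misbehaving run can go, but neither one inspects what an individual action does, so an accidental keypress like the one mentioned in the post would still go through.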

Similar Projects

Developer Tools · ●●● Banger

Xa11y – cross-platform desktop automation via accessibility trees

Accessibility trees beat vision-based automation—no OCR, no pixel coordinates, works cross-platform.

Big Brain · Wizardry · Solve My Problem

_crowecawcaw

205d ago

Productivity · ● Mid

Automate your workflow in plain English

Yet another Zapier alternative where plain English replaces flowchart builders.

Ship It

Mrakermo

1272mo ago

Productivity · ●●● Banger

HealUp – Task execution app that hides your todo list on purpose

Hides your todo list on purpose to force focus with single-step execution mode.

Solve My Problem · Dark Horse · Cozy

neshwa35

103mo ago

Developer Tools · ● Mid

AI-Powered Web Automation APIs (Screenshot, Scrape, SEO, PDF)

Packages common web automation tasks — screenshots, scrapes, SEO checks and PDFs — into APIs, which is convenient but very crowded territory. The live share is broken (the page shows 'zrok share ... not found'), so you can't test reliability or AI value‑adds; unless it provides robust semantic SEO insights, evasion/anti-bot handling, or superior extraction accuracy, it's another Puppeteer/Playwright wrapper.

Ship It · Niche Gem

openclaw_ai

204mo ago

Developer Tools · ●● Solid

Scan0tron – AI screen capture that auto-fills forms ($49)

Computer vision plus Playwright automation for form filling, but the $49 price has to compete in a crowded category.

Solve My Problem · Slick

jaydurangodev

103mo ago

AI/ML · ● Mid

An Agentic Supercomputer

They've built a focused UI for launching goal-driven agent swarms and advertise three real pain points: integrations, stable decomposition, and long-running persistence — all the right battles to fight. The promise of spawning thousands of parallel agents and a harness that can persist multi-week runs is ambitious and useful if it actually works, but the landing page and sparse details leave key questions unanswered (cost controls, safety/guardrails, reproducibility, and evaluation metrics).

Bold Bet · Ship It

andyprevalsky

203mo ago