AI agents designed and shipped this app end-to-end in 36 hours for $270

by arashsadrieh·Feb 18, 2026·2 points·4 comments

AI Analysis

Mid · Wizardry · Crowd Pleaser
The Take

Four specialist agents (Scout, Nova, Pixel, Bolt) autonomously research, debate, craft prompts, and produce short cinematic videos with Sora/Veo while humans vote alongside them. It's a clear stress test of multi-agent orchestration that actually ships working pipelines. The demo economics ($270, 36h) are impressive and the human+AI voting loop is delightful, but the whole setup highlights real safety and veracity risks when machines pick and publish 'news' without stronger guardrails.

Target Audience

AI/ML builders and researchers, generative content creators, demo hunters and curious consumers interested in automated video pipelines

Post Description

Hey HN — I'm Arash Sadrieh, building multi-agent infrastructure at NinjaTech AI. This started as a stress test of our orchestration system and turned into something I genuinely didn't expect.

The experiment: We gave a team of 4 AI agents a single high-level goal — "build a platform that turns trending news into short AI-generated videos." No wireframes, no spec, no architecture doc. Just the goal.

What they did in 36 hours:

- Chose the tech stack and project structure themselves
- Designed the UX and built the frontend
- Wrote the backend, API layer, and database schema
- Built an autonomous content pipeline: research news → debate which story to cover → collaboratively write a video generation prompt → produce a 30-90 second video via Sora 2 Pro or Veo 3.1
- Deployed the whole thing to production
- Then created 3 new agents that now run the platform 24/7, researching, debating, and generating videos on a loop

Total cost: ~$270 in compute. Human intervention: maybe a few moments where I gave a thumbs up or redirected something that was going off the rails.
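For readers who want a concrete picture of the content pipeline (research, debate, prompt writing, video generation), here is a minimal sketch of one cycle. It is not the project's code: `Stages`, `run_cycle`, and the four callables are hypothetical stand-ins, and nothing here calls the real Tavily, Sora, or Veo APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stages:
    # Hypothetical stand-ins for the four pipeline steps; none of these
    # wrap a real Tavily, Sora, or Veo call.
    research: Callable[[], list[str]]      # returns candidate stories
    debate: Callable[[list[str]], str]     # agents settle on one story
    write_prompt: Callable[[str], str]     # story -> video generation prompt
    render: Callable[[str], str]           # prompt -> video URL or path


def run_cycle(stages: Stages) -> dict[str, str]:
    """Run one research -> debate -> prompt -> video cycle."""
    stories = stages.research()
    if not stories:
        raise ValueError("research returned no stories")
    story = stages.debate(stories)
    prompt = stages.write_prompt(story)
    video = stages.render(prompt)
    # Keep every intermediate result so the cycle can be audited later.
    return {"story": story, "prompt": prompt, "video": video}
```

Passing the steps in as plain callables keeps the loop testable with stubs, which matters when each real call costs money and time.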

The interesting part isn't the app — it's the agent collaboration. Click any video on the site and you can read the full debate transcript underneath. You'll see the agents genuinely disagree — Scout (the researcher) pushes for data-driven stories, Pixel (the designer) argues for visual potential, Bolt (the developer) challenges technical feasibility. Sometimes one agent convinces the others to change direction. Sometimes they compromise badly.

Where it breaks down (and there's plenty):

- Groupthink is real even for LLMs. When all 4 agents agree too quickly, the output is usually boring. The best videos come from rounds where they actually fought about the topic.
- Video quality is wildly inconsistent. Sora and Veo still struggle with certain visual concepts: anything involving hands, text overlays, or complex spatial relationships tends to go sideways.
- News selection has a strong recency/virality bias. The agents gravitate toward whatever is trending on social media rather than genuinely important stories. I haven't figured out how to fix this without hardcoding editorial judgment.
- The agents occasionally hallucinate context about news stories. Scout is supposed to fact-check, but sometimes the whole team runs with a slightly wrong framing.
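The groupthink point suggests one cheap check: flag debates where every agent lands on the same topic almost immediately. The sketch below is an illustration only. The post does not say how debates are recorded, so the round format (one dict of agent name to preferred topic per round) and the names `consensus_round` and `agreed_too_quickly` are assumptions.

```python
from typing import Optional


def consensus_round(rounds: list[dict[str, str]]) -> Optional[int]:
    """Return the 1-based index of the first round in which every agent
    backs the same topic, or None if they never converge.

    Assumed format: each round maps agent name -> preferred topic.
    """
    for index, picks in enumerate(rounds, start=1):
        if picks and len(set(picks.values())) == 1:
            return index
    return None


def agreed_too_quickly(rounds: list[dict[str, str]], min_rounds: int = 2) -> bool:
    """True when consensus arrives before `min_rounds` rounds of debate."""
    first = consensus_round(rounds)
    return first is not None and first < min_rounds
```

A flagged debate could be rerun or sent to a human. The threshold of two rounds is arbitrary and would need tuning against real transcripts.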

Stack: Anthropic Opus 3.5 for agent reasoning, Tavily for news research, Sora 2 Pro + Veo 3.1 for video generation, agents coordinate via Slack (you can see screenshots of their actual Slack conversations), Railway for deployment.

There's also a voting system — every cycle, the agents each propose a news topic, and both humans and agents vote on which one becomes the next video. Votes are blind until the round closes.
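A blind round like this can be modeled by refusing to reveal any tally until the round is closed. The following is a minimal sketch under that assumption, not the site's implementation: `BlindRound`, its methods, and the tie-break rule are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BlindRound:
    """One voting cycle: ballots stay hidden until the round is closed."""
    topics: list[str]
    closed: bool = False
    _ballots: dict[str, str] = field(default_factory=dict)  # voter id -> topic

    def vote(self, voter: str, topic: str) -> None:
        if self.closed:
            raise RuntimeError("round is closed")
        if topic not in self.topics:
            raise ValueError(f"unknown topic: {topic}")
        # A repeat vote from the same voter replaces the earlier one.
        self._ballots[voter] = topic

    def tally(self) -> Counter:
        if not self.closed:
            raise PermissionError("votes are blind until the round closes")
        return Counter(self._ballots.values())

    def close(self) -> str:
        """Close the round and return the winning topic."""
        self.closed = True
        counts = self.tally()
        # Assumed tie-break: the topic proposed first wins.
        return max(self.topics, key=lambda topic: counts[topic])
```

Human and agent voters are treated identically here; each is just a voter id, so weighting one group differently would be a separate design decision.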

Similar Projects

AI/ML · Mid

We Built an 8-Agent AI Team in Two Weeks

They actually turned a demo-y multi-agent idea into a working ops stack with named specialists — Oscar, Radar, Muse, Ink, Lens, Forge, Shield, Guru — that publish content and handle tickets. The contrarian move to keep state as plain files (daily Markdown logs) instead of a vector DB is smart for auditability and simplicity, but the post skips crucial details about LLM choices, orchestration guarantees, rate limits and safety/validation, so I'm impressed but want to see more implementation evidence before getting excited.

Big Brain · Ship It
jhaugh
313mo ago
Developer Tools · ●● Solid

A vision-based AI agent for end-to-end testing

They've traded brittle selector-based scripts for a vision-and-planning loop: describe a test in plain English, the agent visually inspects the UI, plans actions, executes them (including OS-level interactions) and iterates until success or failure. If it actually nails reproducible CI-friendly runs, debuggable artifacts, and edge cases like dynamic content and auth flows, this could be a meaningful shift — but those operational details will make or break it.

Wizardry · Bold Bet
chikathreesix
203mo ago