GitHub Repository

The open-source runtime for AI agents. Sandboxed execution with built-in tools, human-in-the-loop approvals, Slack integration, and durable workflows with automatic retries and prompt caching. You write the agent. Polos handles the infrastructure.

32 stars · TypeScript

Polos: Open-source runtime for AI agents with sandbox and durable exec

by ndeodhar·Feb 25, 2026·2 points·0 comments

AI Analysis

●●● Banger · Solve My Problem · Slick

Production-grade AI agent runtime with sandboxes and durable execution—ships today.

Strengths
  • Solves real operational pain: durability (auto-retry, resume-from-step), sandboxing, and observability at once.
  • Author's credibility: infrastructure engineer from Google with 99.999% uptime experience—not a hype hire.
  • Comprehensive: durable execution + prompt caching + Slack integration + observability dashboard—production-ready today.
Weaknesses
  • Early-stage (24 stars)—adoption risk; unclear if DevEx holds up at scale.
  • Docker/E2B sandboxing is table stakes; differentiation is orchestration quality, not a novel architecture.
Target Audience

Teams building production AI agents who need reliability, safety, and operational visibility.

Similar To

Temporal.io · n8n · Zapier Central

Post Description

Hi HN, I'm Neha. I spent years at Google building infrastructure that handled billions of events at 99.999% reliability. When I started building AI agents, I was surprised at how much production plumbing you're expected to own yourself.

The agent itself is the easy part. The hard part is everything around it: where does it execute safely? What happens when it fails midway through a workflow? How do you trigger it from your existing tools? How do you even know what it did?

I kept stitching together Docker, a workflow engine, a notification layer, and custom retry logic. Every team I talked to was doing the same thing. So I built Polos - an open-source runtime that handles the production layer so you just write the agent.

What it does:

- Sandboxed execution: agents run sensitive operations inside managed Docker containers with built-in tools for file I/O, bash, and web search. Polos manages the sandbox and its lifecycle, so you don't have to. Support for more sandboxes, such as E2B, is planned.

- Slack integration: @mention an agent in Slack, get responses in thread. Trigger workflows from Slack, receive notifications, collect input. Agents become part of your team's existing workflow.

- Durable workflows: if an agent fails mid-run, it resumes from the exact step that failed. Built-in prompt caching with 60-80% cost savings on retries.

- Observability: OpenTelemetry tracing for every step, tool call, and decision.

- LLM agnostic: works with OpenAI, Anthropic, Google, or any provider via Vercel AI SDK and LiteLLM.
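
To make the "resumes from the exact step that failed" idea concrete, here is a minimal sketch of step-level checkpointing in Python. It is not Polos's API or implementation: `DurableRun`, `run_workflow`, `demo`, and the JSON checkpoint file are all hypothetical names chosen for illustration, and the post does not describe how Polos actually persists step results. The sketch only shows the general technique: record each completed step's result, and on a retry replay the recorded results instead of re-running those steps.

```python
import json
import os
import tempfile
from typing import Any, Callable


class DurableRun:
    """Hypothetical step runner: records each completed step so a rerun skips it."""

    def __init__(self, checkpoint_path: str) -> None:
        self.checkpoint_path = checkpoint_path
        self.completed: dict[str, Any] = {}
        # Reload results from an earlier attempt, if one left a checkpoint behind.
        if os.path.exists(checkpoint_path):
            with open(checkpoint_path) as f:
                self.completed = json.load(f)

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # Replay the stored result if this step already finished earlier.
        if name in self.completed:
            return self.completed[name]
        result = fn()  # may raise; nothing is recorded for a failed step
        self.completed[name] = result
        with open(self.checkpoint_path, "w") as f:
            json.dump(self.completed, f)
        return result


def run_workflow(run: DurableRun, calls: list[str], fail_on_fix: bool) -> str:
    def plan() -> str:
        calls.append("plan")  # stands in for an expensive LLM call
        return "plan: patch parser"

    def fix() -> str:
        calls.append("fix")
        if fail_on_fix:
            raise RuntimeError("simulated crash")
        return "patched"

    plan_result = run.step("plan", plan)
    fix_result = run.step("fix", fix)
    return f"{plan_result} -> {fix_result}"


def demo() -> tuple[list[str], str]:
    calls: list[str] = []
    path = os.path.join(tempfile.mkdtemp(), "run.json")
    try:
        run_workflow(DurableRun(path), calls, fail_on_fix=True)
    except RuntimeError:
        pass  # first attempt dies midway through the workflow
    # Second attempt: a fresh DurableRun reloads the checkpoint from disk.
    result = run_workflow(DurableRun(path), calls, fail_on_fix=False)
    return calls, result
```

In `demo()`, the `plan` step runs only once across both attempts, while `fix` runs twice (once failing, once succeeding). Skipping an already-completed model call on retry is one way a runtime could avoid paying for it again; whether Polos's prompt caching works this way is not stated in the post.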

The stack is a Rust orchestrator (Axum + Tokio + PostgreSQL), Python and TypeScript SDKs, and a Vite UI. You can install and run a durable, sandboxed agent in under 5 minutes:

```
curl -fsSL https://install.polos.dev/install.sh | bash
npx create-polos
cd my-project && polos dev
```

Here's a 3-min demo of a coding agent that picks up a GitHub issue, fixes the code in a sandbox, and submits a PR: https://www.youtube.com/watch?v=KYVBpdZ_5eM

Happy to discuss technical decisions and more: why Rust for the orchestrator, how durable execution works without a DAG, and the sandbox lifecycle model.

GitHub: https://github.com/polos-dev/polos

Similar Projects