Fleet – Python supervisor for running coding agents in parallel
Centralized beads queue eliminates per-project init for 50+ parallel Claude sessions.
Python supervisor for running coding agents in parallel
Running 15 concurrent agents without burning through API limits faster than CrewAI or AutoGen.
Developers running multiple coding agents concurrently
CrewAI · AutoGen · LangGraph
- CLAUDE.md is a terrible abstraction. These files load unconditionally, they often contain descriptions irrelevant to the task at hand, and they stack from your working directory upward. The result is wasted tokens and confusion from injecting irrelevant instructions into the session.
- Skills are bad, but not as bad as CLAUDE.md. They use a progressive disclosure approach: only the skill description goes into the session, and Claude loads the full skill text with a tool when it's needed. That's one level better, but it still doesn't let you scale — you can't create 10K skills, as that would eat your entire usable context. Claude recently introduced a skills budget that silently drops less frequently used skills from the session entirely. You can still invoke them in an interactive session, but the model can't invoke them in a background session.
- Some plugins may be installed more than once. During cleanup I found that a few of mine were installed in multiple locations, consuming double the tokens on duplicated instructions.
- Attaching plugins to every session is a bad idea at scale. You want to be precise about which plugins are actually useful and attach them per task.
- Use a hierarchical knowledge base instead of CLAUDE.md / skills / plugins. It lets you benefit from real progressive disclosure: keep your instructions and tool descriptions in it and let Claude navigate through it quickly and cheaply.
- System tools consume ~15K tokens (7% of the session). You can't manage this — they're just attached, and disabling tools doesn't remove them from the context.
- AskUserQuestion isn't available in background sessions. You need to implement your own tool — MCP- or CLI-based — to give `claude -p` the ability to talk to you.
- You become selective about which model handles each task. Decompose work into harder and simpler subtasks so you can route the simpler ones to weaker, cheaper models and save tokens.
- Your context-switching skill improves over time.
Fleet repo: https://github.com/sermakarevich/fleet
Centralized beads queue eliminates per-project init for 50+ parallel Claude sessions.
Clever spinner-tip hijacking, but it's a curated list glued to someone else's product.
Mimir hooks into Claude Code lifecycle events so agents can 'mark' facts (e.g., "API uses snake_case") into a DuckDB-backed memory and RAG pipeline, then auto-injects that context as additionalContext for later agents. It's a pragmatic, well-scoped solution to the annoying problem of agent amnesia — very useful if you run agent swarms, but its impact is limited by Claude Code adoption and the need for the surrounding infra (BGE keys, hooks).
Automatic session memory for agents when Cline and Cursor already track context.
Budget control with auto-kill saves money when agents go rogue on long runs.
55 GPU lessons from hello-window to SSAO, each a standalone C program with Claude Code skills baked in.