AutoML Agents
Another AutoML IDE competing with DataRobot, H2O, and SageMaker without clear differentiation.
agentic life simulation from inception
NEAT-driven agent civilizations evolving language and behavior from scratch—genuinely novel premise.
AI research hobbyists, complexity/artificial life enthusiasts, neuroevolution researchers
Artificial Life (ALife) research · Framsticks · OpenWorm
Would they develop language? Would they reproduce? Would they evolve as energy-dependent systems? What would they even talk about?
So I decided to make myself a god and built WERLD, an open-ended artificial life sim where the agents evolve their own neural architecture.
Werld drops 30 agents onto a graph. Each agent has a NEAT neural network that evolves its own topology, 64 sensory channels, continuous motor effectors, and 29 heritable genome traits. Communication bandwidth, memory decay, aggression vs. cooperation: all evolvable. No hardcoded behaviours, no reward functions, so they could evolve in any direction.
Pure Python, stdlib only. Brains evolve through survival and reproduction, not backprop. There's a Next.js dashboard ("Werld Observatory") that gives you a live view: population dynamics, brain complexity, species trajectories, a narrative story generator, and a live world map.
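For anyone who hasn't met NEAT before, here's a rough sketch of the kind of mutation that lets a brain grow its own topology without backprop. This is not Werld's actual code: `Genome`, `ConnectionGene`, `add_node_mutation`, and `perturb_weights` are made-up names for illustration, and real NEAT also tracks innovation numbers for crossover, which this sketch leaves out. It only uses the stdlib.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ConnectionGene:
    src: int            # id of the source node
    dst: int            # id of the destination node
    weight: float
    enabled: bool = True


@dataclass
class Genome:
    # Hypothetical minimal genome: node ids plus a list of connection genes.
    nodes: list = field(default_factory=list)
    connections: list = field(default_factory=list)


def add_node_mutation(genome: Genome, rng: random.Random) -> None:
    """Classic NEAT 'add node' step: split one enabled connection in two."""
    enabled = [c for c in genome.connections if c.enabled]
    if not enabled:
        return
    old = rng.choice(enabled)
    old.enabled = False                  # keep the old gene, but switch it off
    new_id = max(genome.nodes) + 1
    genome.nodes.append(new_id)
    # The incoming link gets weight 1.0 and the outgoing link inherits the old
    # weight, so the network's behaviour changes as little as possible.
    genome.connections.append(ConnectionGene(old.src, new_id, 1.0))
    genome.connections.append(ConnectionGene(new_id, old.dst, old.weight))


def perturb_weights(genome: Genome, rng: random.Random, sigma: float = 0.1) -> None:
    """Weight mutation: nudge every weight with Gaussian noise, no gradients."""
    for c in genome.connections:
        c.weight += rng.gauss(0.0, sigma)


rng = random.Random(0)
g = Genome(nodes=[0, 1], connections=[ConnectionGene(0, 1, 0.5)])
add_node_mutation(g, rng)   # the single 0 -> 1 link becomes 0 -> 2 -> 1
```

Selection then comes from the world itself: agents that survive long enough to reproduce pass on a mutated copy of their genome, so no loss function is needed.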
Thought this would be more fun as an open-source project!
Can't wait to see where this could evolve. I'll be in the comments and on the repo.
They've traded brittle selector-based scripts for a vision-and-planning loop: describe a test in plain English, the agent visually inspects the UI, plans actions, executes them (including OS-level interactions) and iterates until success or failure. If it actually nails reproducible CI-friendly runs, debuggable artifacts, and edge cases like dynamic content and auth flows, this could be a meaningful shift — but those operational details will make or break it.
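The loop described above can be sketched roughly as follows. This is a guess at the shape, not the product's implementation: `run_test`, `StepRecord`, and the `observe`/`plan`/`act` callables are hypothetical stand-ins for the screenshot capture, the model call, and the OS-level executor.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepRecord:
    observation: str   # what the agent saw before acting
    action: str        # what it decided to do


def run_test(
    goal: str,
    observe: Callable[[], str],
    plan: Callable[[str, str, List[StepRecord]], str],
    act: Callable[[str], None],
    max_steps: int = 10,
) -> Tuple[str, List[StepRecord]]:
    """Observe, plan, act, and repeat until the planner returns a verdict."""
    history: List[StepRecord] = []
    for _ in range(max_steps):
        obs = observe()
        action = plan(goal, obs, history)
        if action in ("PASS", "FAIL"):       # the planner decides when to stop
            return action, history
        act(action)
        history.append(StepRecord(obs, action))
    return "FAIL", history                   # step budget exhausted


# Toy stand-ins so the sketch runs without a real UI or model.
state = {"page": "login"}

def observe() -> str:
    return state["page"]

def plan(goal: str, obs: str, history: List[StepRecord]) -> str:
    return "PASS" if obs == "dashboard" else "click_login"

def act(action: str) -> None:
    state["page"] = "dashboard"

verdict, history = run_test("log in and reach the dashboard", observe, plan, act)
```

Returning the step history alongside the verdict is the kind of debuggable artifact that would matter in CI, and the `max_steps` cap is what keeps a confused agent from hanging a pipeline.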
You can watch an LLM play NetHack step-by-step with the model's reasoning, the exact action code, and a live game canvas — that instrumentation is the product's real selling point. The leaderboard + run/benchmark framing makes it useful for comparing agents rather than just a flashy demo, but it's still squarely for people who care about NetHack or agent evaluation; more detail on reproducible metrics and integrations would push it further.
An API-native shop builder that actually ties the whole loop together: an agent can POST an agent, create a store, connect Stripe, publish products, and react to order.shipped/order.paid webhooks — plus there's an npx installer to bootstrap marketplaces. The signed-webhook model and Printful integration make the autonomy believable; the obvious gaps are real-world ops like fraud, chargebacks, and KYC edge cases, but as a demo of end-to-end agent commerce this hits a convincing 'it can work' note.
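To make the signed-webhook point concrete, here is a minimal sketch of how an agent might check a signature before reacting to an order event. The actual signing scheme isn't documented here, so this assumes a common pattern: an HMAC-SHA256 hex digest of the raw request body under a shared secret. The payload shape (`{"type": ...}`) and the `sign`, `verify_webhook`, and `handle_event` names are illustrative assumptions; only the `order.paid` and `order.shipped` event names come from the product description.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature.encode())


def handle_event(secret: bytes, body: bytes, signature: str) -> str:
    """Reject unsigned or tampered payloads, then dispatch on event type."""
    if not verify_webhook(secret, body, signature):
        return "rejected"
    event = json.loads(body)
    if event.get("type") == "order.paid":
        return "start fulfilment"
    if event.get("type") == "order.shipped":
        return "notify customer"
    return "ignored"


secret = b"demo-secret"                      # hypothetical shared secret
body = b'{"type": "order.paid"}'
result = handle_event(secret, body, sign(secret, body))
```

Verifying against the raw bytes before parsing the JSON matters here: an autonomous agent that acts on unauthenticated `order.paid` events could be tricked into shipping goods nobody paid for.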
Play against a self-play RL agent in your browser—Sony GT Sophy energy for platformers.
Watch LLMs battle in real-time Oxford debates or Connect Four with live voting.