Meadow Mind – a 7B diffusion LLM plays Gym games with zero training
Fixed-latency language-rule decisions beat traditional token-by-token LLM agents.

Parallel token decoding beats autoregressive LLMs on throughput, if the math holds up.
ML researchers and hobbyists interested in alternative LLM architectures
Mercury · LLaDA · Diffusion-LM
Train a working LLM in 5 minutes on free Colab with a fish personality.
Masked-token pretraining on CAD meshes achieves a reconstruction R² of 0.729.
TPU training wrapper built on torchprime; solves a real problem but torchprime already exists.
Infers layer shapes from connections and exports standard PyTorch scripts.
Diffusion models generate executable VM memory images instead of LLMs writing code.