Prime Intellect's Prime-RL Aims to Train Trillion-Parameter AI Models

Prime Intellect released Prime-RL 0.6.0, a free and open tool, on June 23, 2026. It can train AI models with up to a trillion parameters. The tool uses a teaching method called “reinforcement learning” and works on very large “Mixture-of-Experts” models. Because the recipe is open, small teams, startups, and researchers can now try work that once needed large budgets.

Prime-RL is an open framework for “asynchronous reinforcement learning.” In this setup, the different parts of the work do not wait for one another in a strict line, which keeps costly computers busy instead of idle.

A Mixture-of-Experts (MoE) model is made of many smaller “expert” networks. It switches on only the experts needed for each input, so the model can be very large yet still run at a fair cost.

Prime Intellect showed Prime-RL training a model called GLM-5 on software engineering tasks. These are “agentic” tasks, in which the AI works through many steps and uses tools. Some tasks ran for 100 or more turns per attempt, and the training handled a sequence length of 131,000 tokens. Each training step stayed under five minutes, using a batch of 256 rollouts on 28 H200 nodes.
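
To make the Mixture-of-Experts idea described above more concrete, here is a minimal sketch of “top-k routing,” one common way such a model decides which experts to switch on. It is an illustration only and is not taken from Prime-RL or GLM-5. The names `top_k_route` and `moe_forward`, the toy experts, and the router scores are all hypothetical, and a real model uses neural networks where this sketch uses simple functions.

```python
import math

def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    The weights are a softmax over the selected scores only, so they sum to 1.
    """
    if not 1 <= k <= len(router_scores):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest scores (ties keep the lower index first).
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    chosen = order[:k]
    # Subtract the largest chosen score before exponentiating, for stability.
    top = max(router_scores[i] for i in chosen)
    exps = [math.exp(router_scores[i] - top) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts on x and mix their outputs by weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_scores, k))

# Four toy "experts" (hypothetical stand-ins for expert networks).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
# Hypothetical router scores for one input: experts 1 and 3 score highest.
scores = [0.1, 2.0, -1.0, 2.0]

routes = top_k_route(scores, k=2)
output = moe_forward(3.0, experts, scores, k=2)
```

In this sketch only two of the four experts run for the input, so the cost per input grows with `k` and not with the total number of experts. That is the property that lets a very large model stay affordable to run.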