DeepReinforce Releases Ornith-1.0 for Self-Scaffolding Coding Agents

DeepReinforce released Ornith-1.0, an open-source coding agent model family under the MIT license on Hugging Face, with sizes from 9B to 397B post-trained on Gemma 4 and Qwen 3.5. Unlike most coding agents, Ornith-1.0 learns to write its own scaffold during reinforcement learning, jointly optimising both the harness and the solution. The flagship Ornith-1.0-397B scored 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified, outperforming Claude Opus 4.7 on Terminal-Bench 2.1 but trailing Claude Opus 4.8 and GLM-5.2-744B on some metrics. The 35B model achieved 64.2 on Terminal-Bench 2.1, surpassing Qwen 3.5-397B’s 53.5, and the 9B model recorded 43.1 on Terminal-Bench 2.1 and 69.4 on SWE-Bench Verified. Three defense layers safeguard against reward hacking: a fixed outer trust boundary, a deterministic monitor to flag banned actions, and a frozen LLM judge as a veto. The models are designed for terminal-native coding agents and repository-scale tasks, with the 9B model deployable on a single 80GB GPU using vLLM and exposing an OpenAI-compatible endpoint.