Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ON…

Liquid AI shipped LFM2.5-230M, its smallest model to date, targeting agentic tasks on phones, robots, and automation devices. The 230-million-parameter, text-only model is built on the LFM2 architecture with 14 layers, a context length of 32,768 tokens, and a vocabulary size of 65,536. It runs on-device at 213 tok/s on a Galaxy S25 Ultra and 42 tok/s on a Raspberry Pi 5, with a 293–375 MB footprint. Day-one support includes llama.cpp, MLX, vLLM, SGLang, and ONNX. Both base and instruction-tuned checkpoints are open-weight on Hugging Face under the lfm1.0 license. The model was pre-trained on 19 trillion tokens and post-trained via supervised fine-tuning with distillation from LFM2.5-350M, direct preference optimization, and multi-domain reinforcement learning. On benchmarks, LFM2.5-230M scores 71.71 on IFEval, 38.40 on IFBench, and 22.51 on CaseReportBench, beating Qwen3.5-0.8B and Gemma 3 1B IT on instruction following and data extraction. It trails on MMLU-Pro, scoring 20.25, and scores 5.26 on τ²-Bench Telecom. Liquid AI states the model is built for data extraction and tool use,