Wow!
Liquid AI just released LFM2.5-230M, their smallest and most efficient model yet. This 230-million-parameter powerhouse is built for speed and real-world use on phones, robots, Raspberry Pi devices, and other edge hardware.
We love how it pushes the boundaries of what tiny models can achieve.
The numbers speak for themselves. LFM2.5-230M delivers up to 213 tokens per second decode speed on a Galaxy S25 Ultra CPU. On a Raspberry Pi 5, it hits 42 tokens per second.
These speeds make it one of the fastest options in its class while using the smallest memory footprint.
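To put those decode rates in perspective, here is a rough back-of-envelope sketch. It uses only the two figures quoted above (213 and 42 tokens per second). The 256-token reply length is an arbitrary assumption, and the estimate ignores prompt processing (prefill) time, so treat it as a lower bound rather than a benchmark.

```python
# Decode rates quoted in the post (tokens per second).
DECODE_TOKENS_PER_SECOND = {
    "Galaxy S25 Ultra CPU": 213.0,
    "Raspberry Pi 5": 42.0,
}


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate (prefill ignored)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    reply_tokens = 256  # hypothetical reply length, chosen only for illustration
    for device, rate in DECODE_TOKENS_PER_SECOND.items():
        seconds = decode_seconds(reply_tokens, rate)
        print(f"{device}: {seconds:.1f} s for {reply_tokens} tokens")
```

Under these assumptions, a 256-token reply takes roughly 1.2 seconds on the phone and roughly 6.1 seconds on the Pi.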
It outperforms many models more than twice its size on key tasks like instruction following, data extraction, and tool use. This efficiency comes from its LFM2 architecture, pre-training on 19 trillion tokens, and distillation from the larger 350M model.
The result is a compact model with a 32K context window that feels much smarter than its size suggests.
We especially like this model for practical agentic applications. Liquid AI demonstrated it running entirely on-device on a Unitree G1 robot powered by an NVIDIA Jetson Orin.
The model takes natural language instructions and turns them into structured multi-step plans with tool calls. Imagine robots, home automation, or phone-based agents that work offline, privately, and instantly.
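As a rough illustration of what consuming a structured multi-step plan with tool calls could look like in application code, the sketch below parses a plan and dispatches each step to a local function. The JSON shape, the tool names (walk_to, pick_up), and the run_plan helper are all hypothetical. The post does not describe the actual format LFM2.5-230M emits, and the plan string here is hand-written rather than model output.

```python
import json
from typing import Any, Callable, Dict, List


# Hypothetical tools; a real robot or phone agent would wire these to hardware or APIs.
def walk_to(location: str) -> str:
    return f"walked to {location}"


def pick_up(item: str) -> str:
    return f"picked up {item}"


TOOLS: Dict[str, Callable[..., str]] = {"walk_to": walk_to, "pick_up": pick_up}


def run_plan(plan_json: str) -> List[str]:
    """Parse a JSON list of {"tool", "arguments"} steps and run them in order."""
    steps = json.loads(plan_json)
    if not isinstance(steps, list):
        raise ValueError("plan must be a JSON list of steps")
    results: List[str] = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {index}: expected a JSON object")
        name = step.get("tool")
        if name not in TOOLS:
            # Refuse unknown tools instead of guessing what the plan meant.
            raise ValueError(f"step {index}: unknown tool {name!r}")
        arguments: Dict[str, Any] = step.get("arguments", {})
        results.append(TOOLS[name](**arguments))
    return results


# Hand-written stand-in for model output (assumed format, not the model's real one).
EXAMPLE_PLAN = """
[
  {"tool": "walk_to", "arguments": {"location": "kitchen"}},
  {"tool": "pick_up", "arguments": {"item": "cup"}}
]
"""

if __name__ == "__main__":
    for line in run_plan(EXAMPLE_PLAN):
        print(line)
```

Rejecting unknown tool names up front is a deliberate choice in this sketch: for an offline agent driving real hardware, failing loudly on a malformed plan is usually safer than running part of it.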
It shines in large-scale data extraction pipelines and lightweight agentic tasks. Whether you need fast summarization, structured output, or reliable tool calling, LFM2.5-230M delivers without relying on the cloud.
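For the data extraction case, one common pattern is to validate each structured reply before it enters a pipeline. The sketch below checks a JSON reply against a small set of required fields. The schema, the validate_extraction helper, and the sample reply are hypothetical and hand-written; nothing here is output from the model or part of Liquid AI's tooling.

```python
import json
from typing import Any, Dict

# Hypothetical schema for an invoice-style record: field name -> expected type.
REQUIRED_FIELDS = {"name": str, "amount": float, "currency": str}


def validate_extraction(raw_reply: str) -> Dict[str, Any]:
    """Parse a JSON reply and keep only the required fields, type-checked."""
    record = json.loads(raw_reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(record, dict):
        raise ValueError("reply must be a JSON object")
    clean: Dict[str, Any] = {}
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        value = record[field]
        # Accept integers where a float is expected (JSON has a single number type).
        if expected_type is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
        clean[field] = value
    return clean


if __name__ == "__main__":
    # Hand-written stand-in for a model reply.
    sample_reply = '{"name": "Acme", "amount": 12, "currency": "USD", "note": "ignored"}'
    print(validate_extraction(sample_reply))
```

Extra keys such as "note" are dropped rather than rejected, which keeps the pipeline tolerant of small variations in the reply while still enforcing the fields it depends on.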
The model supports a wide range of frameworks right out of the box:
• llama.cpp (GGUF) for edge devices
• MLX for Apple Silicon
• vLLM and SGLang for GPU serving
• ONNX for cross-platform compatibility
Both the instruct version (LFM2.5-230M) and base version (LFM2.5-230M-Base) are available now.
Liquid AI reminds us that intelligence density matters most. A 230M model that runs this fast and performs this well opens the door to truly ubiquitous AI.
No more cloud dependency for many everyday tasks. Faster, cheaper, and more private experiences become possible on hardware people already own.
It is a big win for developers building on-device applications, robotics teams, and anyone who values speed and privacy.
The models are available on Hugging Face. Announcement: https://www.liquid.ai/blog/lfm2-5-230m