Longcat-2.0 is interesting. 1.6T model, roughly frontier quality though not near the top, but all trained on custom ASICs. Attempting to break the tyranny of training compute shortages.
Agility Robotics AI lead Chris Paxton details Longcat-2.0, a 1.6-trillion-parameter model trained entirely on custom ASICs
Story Overview
The headline frames Agility Robotics AI lead Chris Paxton as the source detailing a 1.6-trillion-parameter model trained solely on custom ASICs. Available evidence, however, attributes LongCat-2.0 to Meituan's team, with no documented connection to Paxton or Agility Robotics, so the origin implied by the headline is unsupported.
Who actually shipped the weights?
Meituan open-sourced the full 1.6T MoE model, which features a 1M-token context window, LongCat Sparse Attention, and N-gram embeddings. The model was pretrained on more than 50K domestic ASICs, and the weights were released on Hugging Face under the MIT license.
Benchmarks land near the frontier at flash-sale rates
LongCat-2.0 posts 59.5 on SWE-bench Pro and strong agentic scores while offering pay-as-you-go API pricing at $0.30 per million input tokens during the launch promo, with context caching free.
Users are optimistic about LongCat-2.0's frontier-scale training run on custom ASICs because it could ease compute shortages and help new entrants challenge established labs.
Most Activity
A frontier-scale, nearly frontier quality, training run that is entirely on custom ASICs would make it much easier for new entrants to challenge the big players, especially in an environment like coding where I think building a good harness + RL training can make a huge difference vs pure data scaling
@chris_j_paxton Interesting to see if this trend prevails
@chris_j_paxton i find this confusing. anthropic/google have been using TPUs for years? anthropic has been using trainiums at least for inference as well.