Cohere releases Command A+, its most powerful large language model optimized for minimal hardware, as open-source software under the Apache 2.0 license.
Quantized version now available on Hugging Face with reduced serving footprints.
Many users praised Cohere's open-source Command A+ LLM for running efficiently on minimal hardware while staying competitive, whereas some questioned its licensing or dismissed the model outright.
Most Activity
Command A+ from @cohere is out now :) it's our best model yet and it's open source, Apache 2.0
Introducing: Cohere Command A+
We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.
It's been *almost* a bit quiet around LLM architecture releases in the past two weeks 😅
Interesting tidbit is the parallel block design. Via the Cmd-A tech report: "equivalent performance but significant improvement in throughput compared to the vanilla transformer block."
Releasing open-source under the Apache 2.0 license. We want to give developers direct access to enterprise-grade agentic capabilities from experimentation to production.
Sovereign AI. For all.
Download Command A+: https://huggingface.co/CohereLabs/command-a-plus-05-2026-w4a4
Or learn more: http://cohere.com/blog/command-a-plus
Worldlier: native support for 48 world languages and improved efficiency in non-European languages.
interesting open model by cohere with lots of unusual architecture choices, here is a recap:
> parallel transformer, so MoE and attention are computed in parallel. likely doing some kind of MLP/attention disaggregation here?
> lots of query heads, query total dim is 4x hidden size
> big shared expert, 4x router size
> no scaling after normalization of the top k
> LayerNorm instead of RMS norm
> 32 layer only, no dense layer at the start
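To make the "parallel transformer" point in this recap concrete, here is a minimal sketch contrasting a sequential block with a parallel one. It is a toy illustration, not Cohere's implementation: the `attn` and `ffn` callables are hypothetical stand-ins for the attention and MoE sublayers, and the LayerNorm here has no learned scale or bias.

```python
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Plain LayerNorm (no learned parameters): center, then divide by std."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]


def add(*vectors: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def sequential_block(x: Vector, attn: Sublayer, ffn: Sublayer) -> Vector:
    """Vanilla pre-norm block: the FFN input depends on the attention output."""
    h = add(x, attn(layer_norm(x)))
    return add(h, ffn(layer_norm(h)))


def parallel_block(x: Vector, attn: Sublayer, ffn: Sublayer) -> Vector:
    """Parallel block: both sublayers read the same normalized input,
    so neither has to wait for the other."""
    normed = layer_norm(x)
    return add(x, attn(normed), ffn(normed))


# Hypothetical toy sublayers, used only to exercise the two block shapes.
toy_attn: Sublayer = lambda v: [0.5 * a for a in v]
toy_ffn: Sublayer = lambda v: [a + 1.0 for a in v]

x = [1.0, 2.0, 3.0]
print("sequential:", sequential_block(x, toy_attn, toy_ffn))
print("parallel:  ", parallel_block(x, toy_attn, toy_ffn))
```

In the parallel form the two sublayers have no data dependency on each other, which is one plausible reason such a layout can improve throughput.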
Cohere is on such a great open-source trajectory lately. Beautiful Apache 2.0 model! https://huggingface.co/CohereLabs/command-a-plus-05-2026-bf16
wait… did Cohere just release Command A+ models under Apache 2.0 for the first time ever?! 🙊
welcome to Europe! 🤗
Nick really championed us going Apache 2 for this release and for Cohere Transcribe. Not an obvious decision and one that required many discussions. Like Nick says, I hope the model is more useful and empowering as a result.
Command A+ is available on @huggingface with W4A4 quantization 🤗
Cut your serving footprint dramatically with virtually zero performance degradation.
Try it now: https://huggingface.co/CohereLabs/command-a-plus-05-2026-w4a4
Our first fully open source Apache 2 model :)
> be cohere
> join forces with some German companies
> immediately open source your best model
> life is good
Cohere has fallen, DS-MoE shape reigns supreme
Open source Command A+ model
This tech can go one of two ways. It can go the way the internet and mobile phones did - in which technological hegemony resulted in a mostly disempowering tech.
Or it can empower the people that use it.
We are working towards that second one.
Out today! Our most capable agentic model:
- Runs on one B200
- 48 languages (including العربية, 日本語, 한국어)
- Open source (Apache 2.0)
- Multimodal: text + images
- 218B Mixture-of-Experts model, 25B active parameters
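A rough back-of-the-envelope check of the "runs on one B200" claim, using only the 218B total and 25B active parameter counts quoted in this post. This is a sketch under stated assumptions: the 180 GB GPU memory figure is an assumption to verify against the actual hardware spec, the estimate treats every weight as 4-bit (one post in this feed says W4A4 applies to expert weights), and it ignores KV cache, activations, and quantization scales.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 218e9    # total parameters, as quoted in the post
ACTIVE_PARAMS = 25e9    # active parameters per token, as quoted in the post
ASSUMED_GPU_MEMORY_GB = 180  # assumption: approximate memory of a single B200

bf16_gb = weight_footprint_gb(TOTAL_PARAMS, 16)  # 436.0
w4_gb = weight_footprint_gb(TOTAL_PARAMS, 4)     # 109.0
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"bf16 weights:  {bf16_gb:.0f} GB")
print(f"4-bit weights: {w4_gb:.0f} GB")
print(f"fits in assumed GPU memory at 4 bits: {w4_gb < ASSUMED_GPU_MEMORY_GB}")
print(f"active parameter fraction: {active_fraction:.1%}")
```

Under these assumptions the 16-bit weights alone would exceed a single GPU, while a 4-bit version leaves headroom, which is consistent with the emphasis on the quantized checkpoint.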
Cohere dropped Command A+ 🔥
> 25B/219B MoE vision language model
> supports 48 languages with efficient tokenizer
> tool-calling/agentic + 128k context window
> transformers day-0 support 🤗 free license 💗
Extremely excited to present Command A+, our first sparse model!
I am very proud of the work we did to enable this model. We built our sparse training stack from the ground up over the past year, with a lot of custom kernels and performance engineering, so that we can train large sparse models with a very small compute footprint.
We presented our work on a fully dropless FP8 kernel stack for sparse models at Nvidia GTC earlier this year. We are extremely ambitious and are marching towards a state-of-the-art model training stack, come join us if you are interested in pushing the frontier!!
Apply here - https://jobs.ashbyhq.com/cohere/d42f5fd4-1ffc-45b9-957c-f09862db6af6
Our work here - https://www.nvidia.com/en-us/on-demand/session/gtc26-s82225/
4 shared experts with 8 routed experts active? so 12/132, that's crazy, i wonder why. most papers like Towards Greater Leverage would suggest 1 shared expert or minimal (i think we should decouple shared expert size anyway eventually)
also, 128 attention heads with GQA???
218B A22B
128 experts, 8 active: 7 routed + 1 shared (Not sure about this, the HF config says 4 shared?)
Max 128K context, kinda short
SWA + Full, 2:1
NVFP4, W4A4 for expert weights
Sigmoid then normalize token router
Extremely fast on HuggingChat (served by Cohere) https://huggingface.co/chat/models/CohereLabs/command-a-plus-05-2026-bf16
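The router described in these posts ("sigmoid then normalize", with "no scaling after normalization of the top k") can be sketched roughly as follows. This is an illustrative reading of those two remarks, not the model's actual code: the function name `route_token`, the toy logits, and the choice of `top_k` are all hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def route_token(logits: List[float], top_k: int) -> List[Tuple[int, float]]:
    """Return (expert_index, weight) pairs for a single token.

    Each expert is scored independently with a sigmoid (not a softmax
    over all experts), the top_k experts are kept, and their scores are
    renormalized to sum to 1. No extra scaling factor is applied after
    the normalization.
    """
    scores = [sigmoid(z) for z in logits]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in top)
    return [(i, scores[i] / total) for i in top]


# Toy example: 4 experts, 2 selected per token.
routing = route_token([2.0, -1.0, 0.5, 0.0], top_k=2)
for expert, weight in routing:
    print(f"expert {expert}: weight {weight:.3f}")
```

The routed experts' outputs would then be combined using these weights, with any shared experts applied to every token regardless of the router.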
We ❤️ Open Source
Check out our latest open-source model, built for efficiency, with a focus on business use-cases, available for all.