Tech · 11h ago

Cohere co-founder Nick Frosst says Cohere-Transcribe took first place on the new Far-Field Automatic Speech Recognition leaderboard

The Apache 2.0-licensed model generalized to the unseen benchmark.

Original post
Nick Frosst @nickfrosst · #658 in Tech

New leaderboard for audio transcription just launched and our Apache 2.0 Cohere-Transcribe is at the top. This eval didn't exist when we trained the model, so it's nice to see us do so well on it.

https://huggingface.co/spaces/treble-technologies/ffasr

4:11 AM · Jun 10, 2026 · 7.3K Views
Sentiment

Users are pleased that Cohere Transcribe topped the new far-field ASR leaderboard. Commenters say the result validates earlier work that was not chasing trends, sustains the company's momentum, and comes from a small open-source model.

Positive: 100.0% · Negative: 0.0% (4 comments with sentiment)
Cluster Engagement
Posts from X
Cohere @cohere

In March, Transcribe topped the OpenASR leaderboard for general-purpose speech recognition. Today, it leads a benchmark designed to go beyond and test robustness in real-world, far-field audio environments.

Give it a try and share back what you build: https://huggingface.co/CohereLabs/cohere-transcribe-03-2026

2h · Views 506 · Likes 10 · Bookmarks 3
Retweets: 8
Cohere @cohere

Cohere Transcribe, our open-source speech recognition model, is #1 on the new @huggingface Far-Field ASR benchmark.

2h · Views 8.1K · Likes 206 · Bookmarks 52
Replies: 1
Cohere @cohere

These tests measure performance in varying signal-to-noise conditions: the kinds of audio found in meeting rooms, contact centres, & phone calls.

In other words, environments where enterprise speech applications actually operate. Cohere Transcribe ranked #1 across every metric:

2h · Views 363 · Likes 3
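
As a rough illustration of what "varying signal-to-noise conditions" means, the sketch below mixes a noise signal into a clean signal at a chosen SNR in decibels. It is not how the benchmark builds its audio (the posts do not describe that, and far-field audio also involves reverberation and distance effects); `mix_at_snr` and `snr_db` are hypothetical helper names, and signals are plain lists of samples.

```python
import math
import random


def power(samples):
    """Mean power of a signal given as a list of float samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels."""
    return 10 * math.log10(power(signal) / power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so that speech + noise has the requested SNR (assumed setup)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech, p_noise = power(speech), power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both be non-silent")
    # Solve 10*log10(p_speech / (gain^2 * p_noise)) = target_snr_db for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals: a sine wave standing in for speech, uniform noise as interference.
rng = random.Random(0)
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
noisy = mix_at_snr(speech, noise, target_snr_db=5.0)
```

Lower SNR values mean the noise is louder relative to the speech, which is the regime where transcription typically gets harder.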
Cohere @cohere

Transcribe achieved a 17.9 WER - nearly 2 points ahead of IBM Granite Speech and 3.6 points ahead of NVIDIA’s Parakeet.

Still Apache 2.0 and runs on your laptop. Enterprise performance 🤝 developer ergonomics.

Full results: https://huggingface.co/spaces/treble-technologies/ffasr

2h · Views 422 · Likes 10 · Bookmarks 1
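
For readers unfamiliar with the metric in the post above: word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words, usually reported as a percentage. The sketch below is a minimal illustration, assuming lowercasing and whitespace tokenization. The leaderboard's own text normalization and scoring code are not described in these posts, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent, via word-level edit distance (assumed simple tokenization)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("the meeting starts at nine", "the meeting start at nine"))  # 20.0
```

Lower is better on this scale: a WER of 17.9 corresponds to roughly 18 word errors per 100 reference words. If the quoted gaps are absolute WER points, the 3.6-point lead over Parakeet would put that model at about 21.5.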
Cohere @cohere

Want to learn more about the Far-Field ASR benchmark?

Join Treble's webinar tomorrow, June 11th, with @shinjiw_at_cmu, Cohere's @Julianfmack, and other industry leaders discussing the future of far-field speech recognition.

Register here: https://www.treble.tech/insights/treble-hugging-face-ffasr-webinar

2h · Views 336 · Likes 3
Latent Local @latentlocal

@nickfrosst Nice, keeping that momentum after 30b.

8h · Views 39 · Likes 1
Furkan Gözükara @FurkanGozukara

@cohere @victormustar @huggingface Can it produce word-level timestamps or not? If not, what segment durations was it trained on? Does it generate sentence-length segments?

Block text is not useful for subtitles.

1h · Views 88
Isabelle Plante @Izzyplante

@nickfrosst Glad to see that Cohere is dropping small open-source models like this

7h · Views 37

@cohere @huggingface Might try this out to help a call center in Quebec, hope it will work well in French. What kind of hardware would be required to have near real-time translation for, let's say, 10 to 20 users? And if we use the API, can you guarantee the confidential handling of information?

1h · Views 31
The Weird Canadian @Weird_Canadian

@cohere @huggingface And how much did this cost Canadian taxpayers to build?

2h · Views 31
Guilherme O'Tina @guilhermeotina

The margin grows as conditions get worse. In low SNR, Cohere is ~4 WER points ahead of IBM, but in near field they're basically tied. The asymmetric encoder/decoder split makes sense for this: most compute goes to acoustic features, and the light decoder keeps latency down. Curious how this runs on device vs. cloud.

2h · Views 10
Rugbist @rugbist_

@nickfrosst kind of satisfying when the metric catches up to the work already done.

proof you built the right thing, not the trendy thing.

11h · Views 2
Alex YGift @Radipdegen

@nickfrosst undetected ecov did transcribe audio prior and the apache 2.0 is more stronger

11h