Cartesia releases Ink-2, a streaming speech-to-text model that tops the Artificial Analysis leaderboard for accuracy
Built-in semantic endpoints detect when a user finishes speaking
Our new model Ink-2 tops AA's leaderboard for streaming speech-to-text!
Ink-2 comes with plenty of features optimized for real-time voice agents. With top-class models for both TTS and STT, the team at @cartesia keeps pushing the frontier of models for interactive intelligence.
Cartesia Ink-2 debuts as #1 for accuracy on the brand-new streaming speech-to-text leaderboard from @ArtificialAnlys! We designed Ink-2 from the ground up for voice agents - with low latency, eager transcripts, and semantic endpointing.
Our new speech-to-text model Ink-2 is out and #1 on Artificial Analysis.
It’s built for streaming: low latency, a fast eager mode, and built-in semantic endpoints to detect when users are done talking.
New architectures & algorithms made this Pareto-dominance possible
Last week @cartesia topped the TTS leaderboard, and is now crushing both ends of the STT-TTS sandwich