
Cartesia releases Sonic-3.5 text-to-speech model that takes first place on Artificial Analysis’s Speech Arena leaderboard in both global and open-weights rankings

It follows the Sonic-1 launch from less than two years earlier.


Original post

Karan Goel

Sathwik Tejaswi@SathwikTejaswi

We dropped our latest TTS model - Sonic 3.5 today and it's #1 on @ArtificialAnlys's leaderboard

Try it out and let us know what you think

@cartesia @_albertgu @krandiash @jundesai @bclyang #ai #tts

10:46 AM · May 22, 2026 · 2.7K Views

Sentiment

Commenters are congratulating Cartesia on Sonic-3.5 topping the TTS leaderboard, describing the result as validation of the team's ambitious modeling bets and a major technical win.

Positive: 100.0% · Negative: 0.0% (4 comments with sentiment)


Posts from X

Most Activity

Views: 16.4K · Bookmarks: 28 · Likes: 165 · Retweets: 16 · Replies: 7

Albert Gu@_albertgu

Extremely proud of the team @cartesia for launching Sonic 3.5, which sets a new state of the art for TTS

I personally led the technical direction of this model; we built it ground up from first principles, and it contains multiple non-trivial ideas that differ substantially from anything we’ve seen in the literature. It’s been very gratifying to see research bets play out and the strong research team at Cartesia continue to grow!

Artificial Analysis@ArtificialAnlys

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS

Sonic-3.5 is the latest TTS model from @cartesia . It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.

Key takeaways:

➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209

➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters

➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS

See more details and listen to samples below 🧵
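For a rough sense of scale, the quoted figures can be turned into head-to-head preference odds, per-request cost, and generation time. The sketch below is illustrative only: it assumes the conventional logistic Elo formula with a 400-point scale, which may not be the exact rating method Artificial Analysis uses, and the function names and the `MODELS` table are hypothetical helpers filled in with the numbers quoted in the post.

```python
# Illustrative arithmetic on the figures quoted in the post.
# Assumption: standard logistic Elo with a 400-point scale; the leaderboard's
# actual rating method is not stated in the post and may differ.

MODELS = {
    "Sonic-3.5": {
        "elo": 1218, "usd_per_million_chars": 39.0, "chars_per_second": 105.5,
    },
    "Gemini 3.1 Flash TTS": {
        "elo": 1209, "usd_per_million_chars": 18.3, "chars_per_second": 26.3,
    },
    "Inworld Realtime TTS 1.5 Max": {
        "elo": 1194, "usd_per_million_chars": 35.0, "chars_per_second": 205.0,
    },
}


def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes won by A over B under logistic Elo."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def cost_usd(model: str, n_chars: int) -> float:
    """Cost of synthesizing n_chars at the quoted per-million-character price."""
    return MODELS[model]["usd_per_million_chars"] * n_chars / 1_000_000


def generation_seconds(model: str, n_chars: int) -> float:
    """Time to generate n_chars at the quoted characters-per-second rate."""
    return n_chars / MODELS[model]["chars_per_second"]


if __name__ == "__main__":
    n_chars = 10_000  # hypothetical request size
    top = MODELS["Sonic-3.5"]["elo"]
    for name, stats in MODELS.items():
        p = elo_expected_score(top, stats["elo"])
        print(
            f"{name}: Sonic-3.5 expected preference {p:.3f}, "
            f"cost ${cost_usd(name, n_chars):.3f}, "
            f"time {generation_seconds(name, n_chars):.1f}s"
        )
```

Under that assumption, a 9-point lead over Gemini 3.1 Flash TTS corresponds to being preferred in roughly 51% of pairings, and a 24-point lead over Inworld Realtime TTS 1.5 Max to roughly 53%. The 9-point gap also falls inside the +16/-16 interval quoted for Sonic-3.5, so the ordering at the top is close.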


Albert Gu@_albertgu

Extremely proud of the team @cartesia_ai for launching Sonic 3.5, which sets a new state of the art for TTS


Karan Goel@krandiash

Our new speech model Sonic-3.5 is now #1 on Artificial Analysis's leaderboard.

Less than 2 years ago, we released Sonic-1, the fastest speech model in the world.

Sonic-3.5 now brings the best speech model for conversation with the lowest latency in production.


Brandon Yang@bclyang

@_albertgu and yet he's still not happy


Timothy Luong (Chongz)@chongz

@_albertgu When I hear we've forced @_albertgu to lead technical direction


Zhiping Xiu@zhiping_xiu

@_albertgu Congrats!


Brandon Yang@bclyang

@elipughresearch :fat_yoshi:


Alexey Moiseenkov@Darkolorin

@krandiash Huge! Congrats!


Eli@elipughresearch

Insanely cool to see Sonic-3.5 outcompeting even the big slow models :)

The TTS team cooked 🙀
