Most multilingual text-to-speech API
By stated language coverage, OpenAI's gpt-4o-mini-tts is the broadest in this set at around 50 languages, ahead of Cartesia Sonic (42) and ElevenLabs (32), while Deepgram Aura-2 covers only seven. Language counts mix Tier-1 documentation with marketing claims and shift between model versions, so treat them as indicative rather than audited. The figures are a light estimate drawn from provider documentation as of June 2026.
Default pick: OpenAI gpt-4o-mini-tts. Most languages: OpenAI gpt-4o-mini-tts.
Provider offerings
| Offering | Provider | Price ($/1M chars) | ELO | TTFA | Languages | Capabilities |
|---|---|---|---|---|---|---|
| ElevenLabs Multilingual v2 / Eleven v3 | ElevenLabs | 100 | n/a | 264 ms | 32 | streaming, cloning |
| Cartesia Sonic 3 / Sonic 3.5 | Cartesia | 39* | 1203 | 188 ms | 42 | streaming, cloning |
| Google Gemini 3.1 Flash TTS | Google | 12* | 1214 | n/a | n/a | streaming |
| OpenAI gpt-4o-mini-tts | OpenAI | 15* | n/a | n/a | 50 | streaming |
| MiniMax Speech 2.5 Turbo | MiniMax | 78 | n/a | n/a | n/a | streaming, cloning |
| Deepgram Aura-2 | Deepgram | 30 | n/a | 313 ms | 7 | streaming |
* Token- or credit-priced: the headline figure understates the real per-unit cost, so these offerings are excluded from the cheapest ranking.
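The two rankings described here, most languages and cheapest among comparably priced offerings, can be reproduced from the table with a short script. This is a minimal sketch rather than the page's own method: the `Offering` class and the function names are hypothetical, the figures are copied from the table as printed, and OpenAI's "around 50" is stored as 50.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offering:
    name: str
    price_per_million_chars: float  # USD headline price from the table
    token_priced: bool              # True for rows marked with * in the table
    languages: Optional[int]        # None where the table gives no count


# Values copied from the table; OpenAI's count is "around 50" in the text.
OFFERINGS = [
    Offering("ElevenLabs Multilingual v2 / Eleven v3", 100, False, 32),
    Offering("Cartesia Sonic 3 / Sonic 3.5", 39, True, 42),
    Offering("Google Gemini 3.1 Flash TTS", 12, True, None),
    Offering("OpenAI gpt-4o-mini-tts", 15, True, 50),
    Offering("MiniMax Speech 2.5 Turbo", 78, False, None),
    Offering("Deepgram Aura-2", 30, False, 7),
]


def most_multilingual(offerings):
    """Return the offering with the highest stated language count.

    Offerings without a stated count are skipped, not treated as zero.
    """
    known = [o for o in offerings if o.languages is not None]
    return max(known, key=lambda o: o.languages)


def cheapest_comparable(offerings):
    """Return the cheapest offering among those priced per character.

    Token- or credit-priced rows are excluded because their headline
    price understates the real per-unit cost.
    """
    comparable = [o for o in offerings if not o.token_priced]
    return min(comparable, key=lambda o: o.price_per_million_chars)


def headline_cost(offering, characters):
    """Estimate cost in USD for a character volume at the headline price."""
    return offering.price_per_million_chars * characters / 1_000_000


print(most_multilingual(OFFERINGS).name)
print(cheapest_comparable(OFFERINGS).name)
print(headline_cost(cheapest_comparable(OFFERINGS), 250_000))
```

Skipping rows with no stated language count, instead of scoring them as zero, keeps a missing figure from reading as poor coverage. The cost helper is only meaningful for per-character pricing; for the starred rows it gives a lower bound at best.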
Sources
- Artificial Analysis: Text to Speech Leaderboard (Speech Arena, blind-vote ELO), 2026-06-22
- Coval: Best Text-to-Speech Providers in 2026 (independent TTFA/TTFB benchmark, captured 2026-05-04), 2026-06-01
- ElevenLabs: API Pricing, 2026-06-22
- Cartesia: Pricing, 2026-06-22
- Google: Gemini Developer API Pricing, 2026-06-22
- OpenAI: API Pricing, 2026-06-22
- MiniMax: Product Pricing (API docs), 2026-06-22
- Deepgram: Pricing (Aura-2 TTS), 2026-06-22