Light estimate · Last updated 2026-06-22

Most multilingual text-to-speech API

By stated language coverage, OpenAI's gpt-4o-mini-tts is the broadest in this set at around 50 languages, ahead of Cartesia Sonic (42) and ElevenLabs (32), while Deepgram Aura-2 covers only seven. Language counts mix Tier-1 documentation with marketing claims and shift between model versions, so treat them as indicative rather than audited. This is a light estimate based on provider documentation as of June 2026.

Default: OpenAI gpt-4o-mini-tts
Most languages: OpenAI gpt-4o-mini-tts
Provider offerings compared on Price, ELO, TTFA, Langs and capabilities
| Offering | Provider | Price ($/1M chars) | ELO | TTFA | Langs | Capabilities |
|---|---|---|---|---|---|---|
| ElevenLabs Multilingual v2 / Eleven v3 | ElevenLabs | 100 | | 264 ms | 32 | streaming, cloning |
| Cartesia Sonic 3 / Sonic 3.5 | Cartesia | 39* | 1203 | 188 ms | 42 | streaming, cloning |
| Google Gemini 3.1 Flash TTS | Google | 12* | 1214 | | | streaming |
| OpenAI gpt-4o-mini-tts | OpenAI | 15* | | | 50 | streaming |
| MiniMax Speech 2.5 Turbo | MiniMax | 78 | | | | streaming, cloning |
| Deepgram Aura-2 | Deepgram | 30 | | 313 ms | 7 | streaming |

* Token- or credit-priced: the headline figure understates real per-unit cost, so these offerings are excluded from the cheapest ranking.
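
The two rankings described on this page can be reproduced with a few lines of Python. This is a minimal sketch, not the page's actual ranking logic: the `Offering` type and the function names are hypothetical, the figures are copied from the table as read here, and reading the lone MiniMax value of 78 as a price is an assumption. Offerings with no stated language count are left out of the language ranking rather than treated as zero.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offering:
    name: str
    price_per_1m_chars: Optional[float]  # headline price in USD, None if not listed
    token_priced: bool                   # True for rows marked with * in the table
    langs: Optional[int]                 # stated language count, None if not listed


# Values transcribed from the table; the MiniMax price is an assumed reading.
OFFERINGS = [
    Offering("ElevenLabs Multilingual v2 / Eleven v3", 100, False, 32),
    Offering("Cartesia Sonic 3 / Sonic 3.5", 39, True, 42),
    Offering("Google Gemini 3.1 Flash TTS", 12, True, None),
    Offering("OpenAI gpt-4o-mini-tts", 15, True, 50),
    Offering("MiniMax Speech 2.5 Turbo", 78, False, None),
    Offering("Deepgram Aura-2", 30, False, 7),
]


def rank_by_languages(offerings: List[Offering]) -> List[Offering]:
    """Broadest stated coverage first; rows without a count are skipped."""
    known = [o for o in offerings if o.langs is not None]
    return sorted(known, key=lambda o: o.langs, reverse=True)


def rank_by_price(offerings: List[Offering]) -> List[Offering]:
    """Cheapest first, excluding token- or credit-priced rows per the footnote."""
    eligible = [
        o for o in offerings
        if o.price_per_1m_chars is not None and not o.token_priced
    ]
    return sorted(eligible, key=lambda o: o.price_per_1m_chars)


if __name__ == "__main__":
    for o in rank_by_languages(OFFERINGS):
        print(f"{o.langs:>3} languages  {o.name}")
    for o in rank_by_price(OFFERINGS):
        print(f"${o.price_per_1m_chars}/1M chars  {o.name}")
```

Filtering before sorting keeps the footnote's exclusion rule explicit: a starred row can still lead the language ranking while never appearing in the cheapest ranking.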