OpenAI TTS

High-quality TTS with 9 expressive voices

OpenAI's text-to-speech API offers natural-sounding voices optimized for various use cases. With two model tiers (tts-1 for speed and tts-1-hd for quality) and 9 distinct voices, it provides flexibility for real-time applications and pre-rendered content alike.

Voices: 9
Languages: 57
Models: 2
Starting price: $15 per 1M characters

Pros & Cons

Pros

  • Natural-sounding voices
  • Easy API integration
  • Consistent quality across voices
  • Fast response times with tts-1
  • Good documentation and support

Cons

  • Limited to 9 voices
  • No voice cloning
  • No SSML support
  • English-optimized (limited multilingual)
  • No emotion control

Who Should Use OpenAI TTS?

OpenAI TTS is ideal for teams that need reliable, natural-sounding voices with minimal setup.

Developers

A simple TTS API to integrate: one endpoint, 9 voices, real-time streaming, and 6 output formats. A single HTTP request is enough, with no complex SDKs or configuration needed.

Content Creators

Consistent quality across all 9 voices with adjustable speed control. Cost-effective at $15/1M characters for high-volume audio production.

Accessibility Teams

Clear pronunciation and predictable pacing make OpenAI TTS a reliable choice for screen readers, assistive tools, and accessible content.

How OpenAI TTS Works

Getting started takes minutes. Here's the typical workflow.

1. Get an API Key

Sign up at platform.openai.com and generate an API key. TTS is available on all paid plans.

2. Choose a Voice

Pick from 9 voices: Alloy, Ash, Coral, Echo, Fable, Nova, Onyx, Sage, or Shimmer.

3. Select Model & Speed

Use tts-1 for low latency or tts-1-hd for quality. Set speed from 0.25x to 4.0x.

4. Generate Audio

Send text via the API. Get back MP3, WAV, OPUS, AAC, FLAC, or PCM. Stream in real time or download.
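
The four steps above can be sketched with Python's standard library alone. This is a minimal sketch rather than an official client. It assumes the public `https://api.openai.com/v1/audio/speech` endpoint, the JSON fields `model`, `input`, `voice`, `speed`, and `response_format`, and an API key stored in the `OPENAI_API_KEY` environment variable. The helper names `build_speech_request` and `synthesize` are hypothetical and chosen only for illustration.

```python
import json
import os
import urllib.request

# Options listed in the workflow steps above.
VOICES = {"alloy", "ash", "coral", "echo", "fable", "nova", "onyx", "sage", "shimmer"}
MODELS = {"tts-1", "tts-1-hd"}
FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm"}

# Assumption: endpoint URL and field names are not stated in this page.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, voice="alloy", model="tts-1", speed=1.0,
                         response_format="mp3"):
    """Validate the options and return the JSON payload as a dict."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": response_format,
    }


def synthesize(text, out_path, api_key=None, **options):
    """Send one request and write the returned audio bytes to out_path."""
    api_key = api_key or os.environ["OPENAI_API_KEY"]
    payload = build_speech_request(text, **options)
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # Downloads the whole file before writing; no streaming in this sketch.
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

With a valid key, a call such as `synthesize("Hello, world.", "hello.mp3", voice="nova")` should write an MP3 file to disk. Validating the options locally keeps obvious mistakes, such as a speed of 5.0, from turning into a failed network request.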

Frequently Asked Questions

How many voices does OpenAI TTS offer?

OpenAI TTS includes 9 built-in voices: Alloy (neutral), Ash (warm), Coral (clear), Echo (deep), Fable (animated), Nova (bright), Onyx (bold), Sage (calm), and Shimmer (soft). Each voice has a distinct personality suited for different use cases.

What is the difference between tts-1 and tts-1-hd?

tts-1 is optimized for speed and low latency, ideal for real-time applications like chatbots. tts-1-hd produces higher fidelity audio with better clarity, best for pre-rendered content like audiobooks and videos. tts-1-hd costs $30/1M characters vs $15/1M for tts-1.

How much does OpenAI TTS cost?

OpenAI charges $15 per 1 million characters for tts-1 and $30 per 1 million characters for tts-1-hd. A typical 1,000-word blog post (about 5,000 characters) costs roughly $0.08 with tts-1 or $0.15 with tts-1-hd.
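
The arithmetic behind those figures is a single multiplication. As a small sketch, the per-million prices quoted above can be turned into an estimator. The function name `estimate_cost` is hypothetical, and the 5,000-character figure for a 1,000-word post is the approximation used in this answer.

```python
from decimal import Decimal

# Prices in USD per 1 million characters, as quoted above.
PRICE_PER_MILLION_USD = {
    "tts-1": Decimal("15"),
    "tts-1-hd": Decimal("30"),
}


def estimate_cost(char_count, model="tts-1"):
    """Return the estimated cost in USD for char_count characters."""
    return PRICE_PER_MILLION_USD[model] * char_count / Decimal(1_000_000)


# A 1,000-word post at about 5,000 characters:
print(estimate_cost(5_000, "tts-1"))     # 0.075
print(estimate_cost(5_000, "tts-1-hd"))  # 0.15
```

The tts-1 result of $0.075 is the value that rounds to the $0.08 quoted above.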

What audio formats does OpenAI TTS support?

The API supports 6 output formats: MP3, WAV, OPUS, AAC, FLAC, and PCM. MP3 is the default and most widely compatible. OPUS is ideal for low-latency streaming, while FLAC and WAV are best for lossless audio.

Does OpenAI TTS support voice cloning?

No. OpenAI TTS does not support voice cloning or custom voice creation. You are limited to the 9 built-in voices. If you need voice cloning, consider ElevenLabs or Chatterbox Turbo.

Can I control the speed of OpenAI TTS output?

Yes. You can set the speed parameter from 0.25x to 4.0x when generating audio. The speed is baked into the generated file. Lower speeds sound more deliberate, while higher speeds work for faster narration.

Does OpenAI TTS support real-time streaming?

Yes. The API supports real-time audio streaming using chunked transfer encoding. Playback can begin before the full file has been generated, making it suitable for chatbots and voice assistants.
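
On the client side, streaming comes down to consuming the response in pieces instead of waiting for the whole body. The sketch below shows that pattern with a hypothetical generator, `iter_audio_chunks`, which works on any object with a `read(n)` method. Python's standard HTTP client decodes chunked responses transparently, so the same loop should apply to a response returned by `urllib.request.urlopen`. An in-memory buffer stands in for the network response here, and the chunk size of 4096 bytes is an arbitrary choice.

```python
import io


def iter_audio_chunks(response, chunk_size=4096):
    """Yield successive byte chunks from a file-like response until it is empty."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            return
        yield chunk


# Stand-in for a streamed HTTP response: 10,000 bytes of placeholder data.
fake_response = io.BytesIO(b"\x00" * 10_000)

received = 0
for chunk in iter_audio_chunks(fake_response):
    # A real client would hand each chunk to an audio player or socket here.
    received += len(chunk)

print(received)  # 10000
```

In a voice assistant, each chunk would typically be passed to the playback layer as soon as it arrives, which is what lets audio start before generation finishes.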

What languages does OpenAI TTS support?

OpenAI TTS supports 57 languages, following the language coverage of the Whisper model. However, the voices are primarily optimized for English. Quality may vary for other languages, and there is no separate multilingual model to select, as some competitors offer.

Pricing

Pay per character

tts-1: $15.00 per 1M characters
  • Low latency
  • Good quality
  • Real-time streaming

tts-1-hd: $30.00 per 1M characters
  • High quality
  • Better clarity
  • Pre-rendered content