How much does OpenAI TTS cost?

OpenAI TTS costs $15 per million characters for tts-1 (standard quality, optimized for low latency) and $30 per million characters for tts-1-hd (higher quality). The newer gpt-4o-mini-tts model uses token-based pricing ($0.60/1M input tokens + $12/1M audio output tokens), which works out to approximately $15 per million characters. There is no monthly subscription; you pay only for what you generate.

OpenAI does not have a dedicated free TTS plan. However, new API accounts receive $5 in free credits that work across all OpenAI services including TTS. At tts-1 rates, $5 covers approximately 333,000 characters or 5.5 hours of audio. After credits expire, it is pay-as-you-go with no monthly minimum.

What is the difference between tts-1 and gpt-4o-mini-tts?

tts-1 converts text to speech with 9 fixed voices at $15/1M characters. gpt-4o-mini-tts adds steerable voice instructions — you can control tone, emotion, pacing, and accent via natural language prompts — with 13 voices at similar pricing. gpt-4o-mini-tts also has a 2,000 input token limit per request versus tts-1's 4,096 character limit.

Is OpenAI TTS cheaper than ElevenLabs?

Yes, significantly. OpenAI tts-1 costs $15 per million characters versus $60/1M for the ElevenLabs Flash API, so OpenAI costs a quarter as much. However, ElevenLabs offers superior voice quality (Arena #4, while OpenAI is not ranked), 1,000+ voices, voice cloning, and a studio editor. You trade quality and features for cost savings.

What are the OpenAI TTS API rate limits?

New paid OpenAI accounts start at 50 requests per minute for tts-1 and tts-1-hd. Each request is limited to 4,096 characters (about 5 minutes of audio). For gpt-4o-mini-tts, the limit is 2,000 input tokens per request (approximately 1,500 words). Rate limits increase with account age and usage history.

Does OpenAI TTS support voice cloning?

No. OpenAI TTS does not offer voice cloning. You are limited to the built-in voices: 9 for tts-1/tts-1-hd (alloy, ash, coral, echo, fable, nova, onyx, sage, shimmer) and 13 for gpt-4o-mini-tts (adds ballad, verse, marin, cedar). For voice cloning, consider ElevenLabs, Cartesia, or the free open-source Chatterbox.

OpenAI TTS Pricing 2026: tts-1 vs tts-1-hd vs gpt-4o-mini-tts Costs Compared

OpenAI TTS Pricing at a Glance (May 2026)

OpenAI requires no monthly subscription. You have three models to choose from: tts-1 at $15/1M characters for speed, tts-1-hd at $30/1M for higher fidelity, and the newer gpt-4o-mini-tts, which is billed per token at roughly $15/1M characters and adds steerable voice instructions. The first two are billed at a flat per-character rate. No credits, no tiers, no rollover headaches: just pay for what you generate.

Model	Price per 1M Chars	≈ Cost per Minute	Voices	Best For
tts-1	$15	~$0.015	9	Low-latency apps, chatbots
tts-1-hd	$30	~$0.030	9	Pre-rendered content, podcasts
gpt-4o-mini-tts	~$15*	~$0.015	13	Steerable voice, emotional control

*gpt-4o-mini-tts uses token-based pricing ($0.60/1M input tokens + $12/1M audio output tokens). The ~$15/1M character rate is an estimate based on typical text-to-token ratios. Actual cost varies with input length.

tts-1 vs tts-1-hd: Is Paying 2x Worth It?

I ran the same 500-word script through both models, and the honest answer is: for most use cases, tts-1 is good enough. The HD model produces cleaner consonants and slightly richer vocal texture, but you need decent headphones to hear the difference. In a car, on laptop speakers, or in a busy office? Indistinguishable.

Where tts-1-hd earns its 2x premium:

Audiobook narration — long-form listening where subtle artifacts compound
Brand voice recordings — IVR greetings, product demos where polish matters
Music/poetry — content where tonal precision affects interpretation

For chatbots, notifications, internal tools, and prototyping? Stick with tts-1 and save 50%. The latency is actually lower on tts-1, making it better for real-time applications anyway.

gpt-4o-mini-tts: The Model That Changes Everything

Released in March 2025, gpt-4o-mini-tts is OpenAI's biggest TTS upgrade since launch. The headline feature: you can tell it how to speak, not just what to say. Pass an instructions parameter like "Speak in a warm, reassuring tone with occasional pauses for emphasis" and the model adjusts delivery accordingly.

This is a genuine differentiator. With tts-1, you pick a voice and that's it — the emotional delivery is fixed. With gpt-4o-mini-tts, the same "Nova" voice can sound excited, somber, professional, or playful depending on your instructions. No other pay-per-character TTS API offers this level of control at this price point.
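To make the difference concrete, here is a minimal sketch of the request body for the speech endpoint. It only builds the JSON and makes no network call. The helper name build_speech_request and its defaults are illustrative, and the field names should be checked against the current API reference before you rely on them.

```python
import json

# Speech endpoint of the OpenAI API. Field names below follow the public
# API reference at the time of writing; verify them before production use.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, voice="nova", instructions=None,
                         model="gpt-4o-mini-tts", response_format="mp3"):
    """Return the JSON body for a speech request (hypothetical helper)."""
    body = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
    # Only include instructions when provided; this article describes
    # them as a gpt-4o-mini-tts feature.
    if instructions:
        body["instructions"] = instructions
    return body


line = "Your appointment is confirmed for Tuesday at 3 pm."
calm = build_speech_request(
    line,
    instructions="Speak in a warm, reassuring tone with occasional pauses for emphasis",
)
excited = build_speech_request(line, instructions="Sound upbeat and excited")

# Same voice, same text, different delivery.
print(json.dumps(calm, indent=2))
print(json.dumps(excited, indent=2))
```

Sending either body as a POST to SPEECH_URL with an Authorization: Bearer header returns the audio bytes in the requested format.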

Token Pricing Math

gpt-4o-mini-tts bills by tokens instead of characters. Text input costs $0.60/1M tokens, audio output costs $12/1M audio tokens. In practice, a 1,000-word blog post (about 5,000 characters) generates roughly 5 minutes of audio for about $0.075, essentially the same as tts-1 at $0.075 for the same text. Almost all of that is the audio output; the text input adds less than a tenth of a cent.

The instructions parameter counts toward your input tokens. A 50-word style prompt is roughly 65 to 70 tokens, which adds about $0.00004 per request at $0.60/1M input tokens: negligible.
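If you want to run the token math yourself, the sketch below splits an estimate into its input and audio parts. Two constants are assumptions derived from figures in this guide rather than published values: 0.75 words per token (from 2,000 tokens being about 1,500 words) and 1,250 audio tokens per minute (what a cost of about $0.015 per minute implies at $12/1M audio tokens).

```python
INPUT_PRICE_PER_TOKEN = 0.60 / 1_000_000    # USD per text input token
AUDIO_PRICE_PER_TOKEN = 12.00 / 1_000_000   # USD per audio output token

WORDS_PER_TOKEN = 0.75          # assumption: 2,000 tokens is about 1,500 words
AUDIO_TOKENS_PER_MINUTE = 1250  # assumption: implied by about $0.015 per minute


def estimate_cost(script_words, audio_minutes, instruction_words=0):
    """Return (input_cost, audio_cost) in USD for one gpt-4o-mini-tts job."""
    # The instructions prompt is billed as input tokens alongside the script.
    input_tokens = (script_words + instruction_words) / WORDS_PER_TOKEN
    audio_tokens = audio_minutes * AUDIO_TOKENS_PER_MINUTE
    return (input_tokens * INPUT_PRICE_PER_TOKEN,
            audio_tokens * AUDIO_PRICE_PER_TOKEN)


input_cost, audio_cost = estimate_cost(1000, 5, instruction_words=50)
total = input_cost + audio_cost
print(f"input ${input_cost:.5f} + audio ${audio_cost:.4f} = ${total:.4f}")
```

The audio side dominates the total, so trimming the script's spoken length matters far more than trimming the prompt.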

13 voices are available: Alloy, Ash, Ballad, Coral, Echo, Fable, Nova, Onyx, Sage, Shimmer, Verse, Marin, and Cedar. You can preview all of them on our OpenAI TTS voice samples page. The additional voices (Ballad, Verse, Marin, Cedar) are exclusive to gpt-4o-mini-tts and tend toward more conversational, natural delivery.

Real-World Cost Examples

Abstract per-character rates don't mean much until you map them to actual projects. Here's what OpenAI TTS costs for common use cases, compared against ElevenLabs and Amazon Polly:

Use Case	Characters	tts-1 Cost	tts-1-hd Cost	ElevenLabs (Flash)	Polly Neural
Blog post (1,000 words)	~5,000	$0.08	$0.15	$0.30	$0.08
E-learning module (30 min)	~45,000	$0.68	$1.35	$2.70	$0.72
Podcast (1 hour)	~90,000	$1.35	$2.70	$5.40	$1.44
Full audiobook (80K words)	~400,000	$6.00	$12.00	$24.00	$6.40
SaaS app (1M chars/month)	1,000,000	$15.00	$30.00	$60.00	$16.00

The takeaway: OpenAI tts-1 and Amazon Polly Neural are neck-and-neck on cost. ElevenLabs is 4x more expensive. But voice quality and features matter too — check our full pricing comparison for the complete picture. You can also estimate your own costs with our TTS cost calculator.
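For your own project sizes, the table rows reduce to one multiplication per service. This is a small sketch using the per-million rates quoted in this guide; the dictionary and function names are illustrative, and the rates will need updating if list prices change.

```python
# USD per 1M characters, as quoted in this guide.
RATES_PER_MILLION = {
    "tts-1": 15.00,
    "tts-1-hd": 30.00,
    "ElevenLabs Flash": 60.00,
    "Polly Neural": 16.00,
}


def cost_usd(characters, service):
    """Cost in USD to synthesize the given number of characters."""
    return characters * RATES_PER_MILLION[service] / 1_000_000


def cost_row(characters):
    """One table row: cost per service, rounded to cents."""
    return {name: round(cost_usd(characters, name), 2)
            for name in RATES_PER_MILLION}


for label, chars in [("Podcast (1 hour)", 90_000),
                     ("Full audiobook (80K words)", 400_000)]:
    print(label, cost_row(chars))
```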

Is There a Free Tier?

Sort of. OpenAI doesn't have a dedicated free plan for TTS. But new API accounts get $5 in free credits that work across all OpenAI services, including TTS. At tts-1 rates, $5 buys you about 333,000 characters — roughly 5.5 hours of audio. That's generous enough to build a prototype and test all 9 voices before spending a dime.

After the credits expire, it's pure pay-as-you-go. No monthly minimums, no commitments. You add a credit card and pay for exactly what you use. For developers testing TTS, this is actually a better deal than ElevenLabs' free tier (10,000 characters/month, about 10 minutes) — OpenAI gives you 33x more characters upfront, just not recurring.
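The credit figures above are simple to reproduce. In this sketch the speaking rate of 1,000 characters per minute is an assumption, chosen because it is what a cost of about $0.015 per minute implies at $15 per 1M characters; real speech rates vary by voice and content.

```python
CREDIT_USD = 5.00
TTS1_RATE_PER_MILLION = 15.00       # USD per 1M characters for tts-1
CHARS_PER_MINUTE = 1000             # assumption: implied by ~$0.015 per minute
ELEVENLABS_FREE_CHARS = 10_000      # monthly free allowance quoted above

# How many characters the one-time credit buys at tts-1 rates.
characters = CREDIT_USD / TTS1_RATE_PER_MILLION * 1_000_000
hours = characters / CHARS_PER_MINUTE / 60
ratio = characters / ELEVENLABS_FREE_CHARS

print(f"{characters:,.0f} characters, about {hours:.1f} hours of audio")
print(f"roughly {ratio:.0f}x the ElevenLabs monthly free allowance")
```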

For ongoing free TTS, consider Chatterbox Turbo (fully free, open-source) or Gemini Flash TTS (free quota in Google AI Studio).

API Limits and Hidden Gotchas

OpenAI's TTS pricing is refreshingly simple compared to ElevenLabs' credit system, but there are a few things that catch people off guard:

4,096 character limit per request — that's about 5 minutes of audio. For longer content, you need to split text and stitch audio files. This is the #1 complaint on the OpenAI developer forums.
gpt-4o-mini-tts has a 2,000 input token limit — approximately 1,500 English words per request. The instructions prompt counts toward this limit.
Rate limits start at 50 RPM — new paid accounts get 50 requests per minute for tts-1/tts-1-hd. This is enough for most applications, but voice agents handling concurrent calls may hit it.
No voice cloning — unlike ElevenLabs, Murf AI, or Cartesia, OpenAI doesn't offer voice cloning. You're limited to the built-in voices.
AI disclosure: OpenAI's usage policies require you to tell end users that the voice they hear is AI-generated. Note that SynthID is a Google DeepMind watermarking technology, so it does not apply to OpenAI TTS output.
No SSML support — you can't use Speech Synthesis Markup Language for fine-grained pronunciation control. The gpt-4o-mini-tts instructions parameter partially replaces this, but it's less precise than Amazon Polly's SSML.
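The 4,096 character limit is the gotcha most worth handling in code. Below is a minimal sketch of one way to split long text, preferring sentence boundaries so that stitched audio does not break mid-sentence. The function name split_for_tts is illustrative, and the sentence detection is a simple regex, not a full tokenizer.

```python
import re

MAX_CHARS = 4096  # per-request limit for tts-1 / tts-1-hd cited above


def split_for_tts(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped,
        # which may cut a word; rare in practice for 4,096 characters.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate      # sentence still fits in this chunk
        else:
            chunks.append(current)   # close the chunk, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks


sample = "First sentence here. Second one follows! Is this the third? Yes it is."
for chunk in split_for_tts(sample, max_chars=40):
    print(len(chunk), chunk)
```

Each chunk becomes one request, and the resulting audio files are concatenated in order. With the 50 RPM starting limit, a long document may also need its requests spaced out.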

Azure OpenAI TTS: Same Models, Different Pricing

Enterprise teams often use Azure OpenAI instead of the direct API. The models are identical — same voices, same quality, same character limits. But pricing runs about 2x higher on Azure (roughly $30/1M characters for tts-1 equivalent). The trade-off: Azure gives you enterprise SLAs, VNet integration, HIPAA/SOC2 compliance, and data residency controls. If your company already has an Azure Enterprise Agreement, the Azure path may actually be cheaper after discount.

For small teams and indie developers? Direct API wins on cost. For regulated industries (healthcare, finance, government)? Azure is often the only option that clears legal review.

5 Ways to Cut Your OpenAI TTS Bill

Use tts-1 unless you need HD — saves 50% with minimal quality difference for most applications.
Cache aggressively — if the same text gets converted repeatedly (IVR greetings, product names, standard responses), cache the audio output. OpenAI charges per generation, not per playback.
Strip unnecessary text — URLs, code blocks, markdown syntax, and boilerplate all consume characters without adding value. Clean your input before sending it.
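Batch requests efficiently: stay close to the 4,096 character limit per request rather than sending many small requests. The per-character charge is unchanged, but fewer requests means less HTTP overhead and more headroom under the rate limit.
Compare Polly for high volume: at 10M+ characters/month, Amazon Polly Neural at $16/1M is priced within about 7% of tts-1 (slightly higher at list price) and offers SSML. OpenAI wins on simplicity; Polly can win on cost at scale if one of its lower-cost engines is good enough.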
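The caching and input-cleaning tips above can be sketched in a few lines. The helper names are hypothetical, and the regexes are deliberately crude (for example, they also strip underscores inside words), so treat this as a starting point and not a complete Markdown parser.

```python
import hashlib
import re


def clean_for_tts(markdown_text):
    """Remove content that costs characters but adds nothing when spoken."""
    text = re.sub(r"`{3}.*?`{3}", " ", markdown_text, flags=re.DOTALL)  # fenced code
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)  # [label](url) -> label
    text = re.sub(r"https?://\S+", " ", text)              # bare URLs
    text = re.sub(r"[*_#`>]+", "", text)                   # markdown markers
    return re.sub(r"\s+", " ", text).strip()


def cache_key(text, model, voice, instructions=""):
    """Stable key so identical requests reuse the same stored audio."""
    payload = "\x1f".join([model, voice, instructions, text])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


fence = "`" * 3
raw = (f"## Pricing\nSee [the docs](https://example.com/docs) for **details**.\n"
       f"{fence}py\nprint(1)\n{fence}\nDone.")
cleaned = clean_for_tts(raw)
print(f"{len(raw)} -> {len(cleaned)} characters: {cleaned}")
print(cache_key(cleaned, "tts-1", "nova"))
```

Store the audio under the returned key (as a file name or object-store path) and check for it before calling the API. Including the model, voice, and instructions in the key avoids serving audio generated with different settings.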

OpenAI TTS vs 9 Competitors: Price per Million Characters

Here's how OpenAI stacks up against every major TTS provider. Prices are per 1M characters on each service's primary model (not budget or premium tiers):

Service	Cost/1M Chars	Free Tier	Voice Cloning	Arena Rank
Grok TTS (beta)	$4.20	None	No	Not ranked
Gemini Flash TTS	~$12	Quota-limited	No	#2 (ELO 1,211)
OpenAI tts-1	$15	$5 one-time credits	No	Not ranked
Amazon Polly Neural	$16	5M chars/12 mo	No	Not ranked
Inworld TTS Max	$10–$50	40 min trial	Yes	#1 (ELO 1,236)
Cartesia Sonic 3	~$33	20K chars	Yes (3s sample)	#10 (ELO 1,054)
ElevenLabs Flash	$60	10K chars/mo	Yes	#4 (ELO 1,179)
Murf AI Falcon	$10–$30	10 min total	Yes (Business plan)	Not ranked
Chatterbox Turbo	Free (self-host)	Unlimited	Yes (free)	Not ranked
Speechify Studio	$19–$49/mo flat	Limited	No	Not ranked

OpenAI sits in the middle of the pack on price. Cheaper than ElevenLabs and Cartesia. More expensive than Grok and Gemini Flash. Roughly tied with Polly. The real draw is API simplicity — OpenAI's TTS endpoint is the easiest to integrate if you're already using GPT models.

Who Should (and Shouldn't) Use OpenAI TTS

Best for

Developers already using the OpenAI API
Apps that need steerable voice (gpt-4o-mini-tts)
Moderate volume (under 5M chars/month)
Prototyping and MVPs (simple API, no SDK needed)
Multi-format output (MP3, WAV, FLAC, Opus, PCM)

Not ideal for

Voice cloning (try ElevenLabs or Cartesia)
Non-developers (no studio UI — try Murf)
Ultra-high volume (>10M chars/mo: Polly's cheaper engines start at $4/1M)
Top-tier voice quality (ElevenLabs or Inworld rank higher)
Real-time voice agents (<100ms — try Cartesia 40ms)

Related Pricing Guides

ElevenLabs Pricing 2026: $5–$990/mo across 6 plans
Speechify Pricing 2026: Reader vs Studio vs Audiobooks plans
Amazon Polly Pricing 2026: $4–$100/1M across 4 engines
Compare All TTS Pricing: 11 services from $0 to $100/1M chars

For more on OpenAI's TTS voice quality and capabilities, read our full OpenAI TTS review with voice samples or compare it directly with competitors using our TTS API comparison guide.

Interested in TTS for audiobooks? OpenAI tts-1 produces a full 80,000-word audiobook for $6 — but voice quality trade-offs matter for long-form listening.