Amazon Polly

AWS cloud TTS with multiple engine options

Amazon Polly is AWS's text-to-speech service. It offers four synthesis engines, ranging from the low-cost Standard engine to the newer Generative engine, along with SSML support and native integration with other AWS services. It suits enterprise applications that need scalability and reliability.

Voices: 60+
Languages: 33
Models: 4
Starting price: $4.80/1M chars

Pros & Cons

Pros

  • AWS integration
  • SSML support
  • Multiple engine options
  • Enterprise reliability
  • Cost-effective at scale
  • Good language coverage

Cons

  • Complex pricing structure
  • AWS account required
  • Variable voice quality across engines
  • Less natural than ElevenLabs
  • Dated interface

Who Should Use Amazon Polly?

Amazon Polly excels for teams already invested in the AWS ecosystem who need reliable, scalable TTS.

Enterprise Teams

Organizations running on AWS that need TTS integrated natively with Lambda, S3, and other AWS services. Polly scales to millions of requests and is backed by enterprise-grade SLAs.

IVR & Call Centers

Companies building phone systems that need SSML control over pronunciation, pauses, and emphasis. Polly's speech marks also supply timing data, which is useful for lip-syncing video avatars.

Cost-Sensitive Projects

Teams processing high volumes of text where cost matters. The Standard engine costs $4.80 per million characters, and a free tier covers 5 million Standard characters per month for the first 12 months.

Understanding Polly's 4 Engines

Amazon Polly offers four distinct synthesis engines, each with its own quality and cost trade-offs.

Standard

$4.80/1M chars

Concatenative synthesis. The most cost-effective option, best for high-volume applications where natural quality isn't critical.

Neural

$19.20/1M chars

Deep learning-based synthesis. Significant quality improvement over Standard, ideal for customer-facing applications.

Long-form

$100/1M chars

Optimized for articles and books. Maintains consistent quality across paragraphs with improved prosody for extended content.

Generative

$30/1M chars

Latest AI technology. The most expressive and natural-sounding engine, approaching ElevenLabs quality at a competitive price.
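
To make the trade-off concrete, here is a minimal sketch that estimates the cost of a job on each engine. The prices are the per-million-character figures quoted in this article; the function name `estimate_cost` and the 2.5M-character workload are illustrative assumptions, and current AWS pricing should be checked before budgeting.

```python
# Per-million-character prices in USD, as quoted in this article.
# These are assumptions for illustration; check the AWS pricing page.
PRICE_PER_MILLION = {
    "standard": 4.80,
    "neural": 19.20,
    "long-form": 100.00,
    "generative": 30.00,
}


def estimate_cost(chars: int, engine: str) -> float:
    """Return the estimated cost in USD for synthesizing `chars` characters."""
    if chars < 0:
        raise ValueError("chars must be non-negative")
    return round(chars / 1_000_000 * PRICE_PER_MILLION[engine], 2)


if __name__ == "__main__":
    # Compare the same hypothetical 2.5M-character workload on every engine.
    for engine in PRICE_PER_MILLION:
        print(f"{engine:>10}: ${estimate_cost(2_500_000, engine):,.2f}")
```

At these rates the same text costs roughly 20 times more on Long-form than on Standard, so the engine choice matters more than volume discounts for most workloads.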

Frequently Asked Questions

What are Amazon Polly's 4 synthesis engines?

Amazon Polly offers Standard (concatenative, $4.80/1M chars), Neural (deep learning, $19.20/1M chars), Long-form (optimized for articles, $100/1M chars), and Generative (latest AI, $30/1M chars). Each engine targets different quality and cost trade-offs.

Does Amazon Polly have a free tier?

Yes. AWS offers 5 million Standard characters and 1 million Neural characters per month free for the first 12 months. After that, you pay per character with no minimum commitment.
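
The effect of the free allowance on a monthly bill can be sketched as follows. The allowances and prices are the ones stated in this article; the helper `monthly_bill` and its `in_free_period` flag are hypothetical names introduced for the example.

```python
# Free monthly allowances and prices as stated in this article (assumptions).
FREE_CHARS_PER_MONTH = {"standard": 5_000_000, "neural": 1_000_000}
PRICE_PER_MILLION = {"standard": 4.80, "neural": 19.20}


def monthly_bill(chars: int, engine: str, in_free_period: bool = True) -> float:
    """Estimate one month's charge in USD after subtracting the free allowance.

    `in_free_period` should be False once the first 12 months have passed.
    """
    free = FREE_CHARS_PER_MONTH.get(engine, 0) if in_free_period else 0
    billable = max(0, chars - free)  # usage below the allowance costs nothing
    return round(billable / 1_000_000 * PRICE_PER_MILLION[engine], 2)
```

For example, 6 million Standard characters in a month would be billed for only 1 million characters during the free period, and for all 6 million afterwards.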

Does Amazon Polly support SSML?

Yes. Amazon Polly supports SSML (Speech Synthesis Markup Language) for controlling pronunciation, pauses, emphasis, speed, pitch, and volume. Tag support varies by engine, so confirm that the tags you rely on work with your chosen engine and voice.
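
As an illustration of pause and speed control, the sketch below builds a small SSML document using the `<break>` and `<prosody>` tags. The helper name `build_ssml` and its defaults are assumptions made for this example, not part of any AWS SDK.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(sentences, pause_ms=300, rate="medium"):
    """Wrap sentences in <speak>, set the speaking rate with <prosody>,
    and separate the sentences with <break> pauses."""
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape &, <, > so that user text cannot break the markup.
    body = pause.join(escape(s) for s in sentences)
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


if __name__ == "__main__":
    print(build_ssml(["Press 1 for sales & support.", "Press 2 for billing."],
                     pause_ms=500, rate="slow"))
```

The resulting string is passed to Polly as the text input with the text type set to `ssml` instead of plain text.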

How many voices does Amazon Polly offer?

Amazon Polly offers 60+ voices across 33 languages. Not all voices are available on every engine: the Neural and Generative engines offer a smaller selection of higher-quality voices.
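
One way to see which voices each engine supports is to group the voice list by engine. The sketch below assumes voice records shaped like the entries Polly's `DescribeVoices` operation returns (each with an `Id` and a `SupportedEngines` list); the voice IDs in the sample are made up for illustration.

```python
from collections import defaultdict


def voices_by_engine(voices):
    """Group voice IDs by every engine listed in their SupportedEngines."""
    grouped = defaultdict(list)
    for voice in voices:
        for engine in voice.get("SupportedEngines", []):
            grouped[engine].append(voice["Id"])
    return {engine: sorted(ids) for engine, ids in grouped.items()}


# Hypothetical sample records; real ones would come from something like
# boto3.client("polly").describe_voices()["Voices"] (pagination ignored).
SAMPLE_VOICES = [
    {"Id": "VoiceA", "LanguageCode": "en-US", "SupportedEngines": ["standard", "neural"]},
    {"Id": "VoiceB", "LanguageCode": "en-US", "SupportedEngines": ["neural", "generative"]},
    {"Id": "VoiceC", "LanguageCode": "de-DE", "SupportedEngines": ["standard"]},
]

if __name__ == "__main__":
    for engine, ids in voices_by_engine(SAMPLE_VOICES).items():
        print(engine, ids)
```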

What are Speech Marks in Amazon Polly?

Speech Marks provide metadata about the generated audio, including word timing, sentence boundaries, and viseme data for lip-sync. This is essential for video avatars, karaoke-style highlighting, and subtitle generation.
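
Speech marks arrive as newline-delimited JSON objects, each with a `time` offset in milliseconds and a `type` such as `sentence`, `word`, or `viseme`. The sketch below parses such a stream and derives word timings for karaoke-style highlighting. The sample data is illustrative, and treating the next word's start as the current word's end is a simplifying assumption of this example.

```python
import json


def parse_speech_marks(raw: str):
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_timings(marks):
    """Return (start_ms, end_ms, word) tuples for word marks.

    Assumption: a word ends when the next word starts. The last word gets
    end_ms=None because the stream carries no explicit end time.
    """
    words = [m for m in marks if m["type"] == "word"]
    timings = []
    for i, mark in enumerate(words):
        end = words[i + 1]["time"] if i + 1 < len(words) else None
        timings.append((mark["time"], end, mark["value"]))
    return timings


# Illustrative sample in the speech-mark format (not real service output).
SAMPLE = "\n".join([
    '{"time":0,"type":"sentence","start":0,"end":23,"value":"Mary had a little lamb."}',
    '{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}',
    '{"time":6,"type":"viseme","value":"p"}',
    '{"time":373,"type":"word","start":5,"end":8,"value":"had"}',
])

if __name__ == "__main__":
    print(word_timings(parse_speech_marks(SAMPLE)))
```

The `viseme` entries can be filtered the same way to drive mouth shapes in an avatar.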

Do I need an AWS account to use Amazon Polly?

Yes. Amazon Polly is an AWS service and requires an AWS account. It integrates natively with other AWS services like Lambda, S3, and CloudFront for scalable audio pipelines.
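
Once an account and credentials are in place, a basic request is short. The sketch below assumes the `boto3` SDK and takes the Polly client as a parameter so the function can be exercised without network access. The helper name `synthesize_to_file`, the output path, and the default voice and engine are choices made for this example.

```python
def synthesize_to_file(polly, text, path, voice_id="Joanna", engine="neural"):
    """Synthesize `text` to an MP3 file and return the number of bytes written.

    `polly` is expected to be a Polly client, for example one created with
    boto3.client("polly") using credentials from your AWS account.
    """
    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,
    )
    # The audio comes back as a streaming body; read it fully into memory.
    audio = response["AudioStream"].read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   synthesize_to_file(boto3.client("polly"), "Hello from Polly.", "hello.mp3")
```

Reading the whole stream into memory is fine for short prompts; for long documents, an asynchronous synthesis task that writes to S3 is usually a better fit.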

How does Amazon Polly compare to OpenAI TTS?

Amazon Polly offers more voices, SSML support, and lower pricing at the Standard tier. OpenAI TTS provides more natural-sounding voices with a simpler API but no SSML support. Polly is better for enterprise AWS workflows.

Can I create custom voices with Amazon Polly?

Amazon Polly offers Brand Voices as a custom engagement for enterprise customers. This requires working directly with AWS to train a unique voice on your brand's audio data.

Pricing

Pay per character

Standard: $4.80 per 1M characters
  • Basic concatenative synthesis
  • 5M chars/mo free (12mo)

Neural: $19.20 per 1M characters
  • Deep learning synthesis
  • 1M chars/mo free (12mo)

Long-form: $100.00 per 1M characters
  • Optimized for articles
  • Improved prosody for long content

Generative: $30.00 per 1M characters
  • Latest AI technology
  • Most expressive