Amazon Polly
Amazon Polly is a cloud service that converts text into lifelike speech, enabling you to create applications that talk and build entirely new categories of speech-enabled products. Polly supports multiple voices, languages, and audio output formats, and offers neural and generative engines for natural-sounding speech.
APIs
Amazon Polly API
The Amazon Polly API enables you to synthesize speech from text (plain text or SSML), manage custom pronunciation lexicons, list available voices across multiple languages and e...
Capabilities
Amazon Polly Text-to-Speech
Workflow capability for converting text to lifelike speech using Amazon Polly. Combines speech synthesis, voice discovery, and lexicon management for developers building voice-e...
Features
Produce natural-sounding speech using neural network-based text-to-speech technology.
New generative engine delivers the highest quality, most human-like speech synthesis.
Choose from 60+ voices across 30+ languages including male, female, and child voices.
Use Speech Synthesis Markup Language (SSML) to control pronunciation, volume, pitch, and speech rate.
Create custom pronunciation lexicons to control how specific words and phrases are spoken.
Generate speech marks metadata to synchronize spoken text with animations or visual highlights.
Process large text bodies asynchronously with S3 output for long-form content.
Output audio in MP3, OGG Vorbis, and PCM formats, and speech marks as JSON.
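As a rough illustration of how the SSML and engine features above fit together, the following Python sketch builds a request for the SynthesizeSpeech operation and, optionally, sends it with boto3. The helper names (build_ssml, build_synthesize_request, synthesize_to_file), the default voice "Joanna", and the default rate are assumptions chosen for the example, not part of the original page. It also assumes boto3 is installed and AWS credentials are configured before synthesize_to_file is called.

```python
from xml.sax.saxutils import escape


def build_ssml(text, rate="medium"):
    """Wrap plain text in SSML, escaping XML-reserved characters (&, <, >)."""
    return f'<speak><prosody rate="{rate}">{escape(text)}</prosody></speak>'


def build_synthesize_request(text, voice_id="Joanna", engine="neural",
                             output_format="mp3", rate="medium"):
    """Build keyword arguments for the Polly SynthesizeSpeech operation.

    The voice, engine, and rate defaults are illustrative assumptions.
    """
    return {
        "Text": build_ssml(text, rate),
        "TextType": "ssml",             # the text is SSML, not plain text
        "VoiceId": voice_id,
        "Engine": engine,               # e.g. "standard", "neural", "generative"
        "OutputFormat": output_format,  # "mp3", "ogg_vorbis", "pcm", or "json"
    }


def synthesize_to_file(text, path, **options):
    """Send the request to Polly and write the returned audio stream to path."""
    import boto3  # imported lazily so the builders above work without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_synthesize_request(text, **options))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Keeping the request builder separate from the network call makes it possible to inspect or unit-test the parameters without an AWS account; a call such as synthesize_to_file("Hello from Polly", "hello.mp3") would then perform the actual synthesis.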
Use Cases
Build conversational interfaces that speak responses to users.
Add text-to-speech reading to applications for visually impaired users.
Convert written articles and content into audio podcasts automatically.
Add spoken narration to educational courses and training materials.
Create interactive voice response systems with natural-sounding speech.
Provide native-speaker pronunciation examples for language education.
Integrations
Store synthesized speech output from asynchronous synthesis tasks in S3 buckets.
Combine Polly speech synthesis with Lex conversational AI for voice chatbots.
Trigger speech synthesis from Lambda functions for event-driven voice applications.
Pair Polly text-to-speech with Transcribe speech-to-text for round-trip voice applications.
Power Amazon Connect contact center voice responses with Polly neural speech.
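The S3 and Lambda integrations above can be combined in a small handler. The sketch below is one possible shape, not a reference implementation: it starts an asynchronous synthesis task whose output lands in an S3 bucket. The event fields ("text", "bucket"), the key prefix "polly/", and the helper name build_task_request are hypothetical. The handler assumes boto3 is available and that the function's role is allowed to call Polly and write to the bucket.

```python
def build_task_request(text, bucket, key_prefix="polly/",
                       voice_id="Joanna", engine="neural"):
    """Build keyword arguments for the Polly StartSpeechSynthesisTask operation.

    The key prefix, voice, and engine defaults are illustrative assumptions.
    """
    return {
        "Text": text,
        "TextType": "text",
        "VoiceId": voice_id,
        "Engine": engine,
        "OutputFormat": "mp3",
        "OutputS3BucketName": bucket,     # bucket that receives the audio file
        "OutputS3KeyPrefix": key_prefix,  # prefix for the generated object key
    }


def lambda_handler(event, context):
    """Hypothetical Lambda entry point: expects 'text' and 'bucket' in the event."""
    import boto3  # imported lazily so build_task_request is usable without it

    polly = boto3.client("polly")
    response = polly.start_speech_synthesis_task(
        **build_task_request(event["text"], event["bucket"])
    )
    task = response["SynthesisTask"]
    # The task runs asynchronously; the audio appears at OutputUri once it completes.
    return {
        "taskId": task["TaskId"],
        "status": task["TaskStatus"],
        "outputUri": task.get("OutputUri"),
    }
```

Because the task is asynchronous, the returned status is typically not final; a caller would poll the task by its ID or react to the object appearing in the bucket.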