Text to speech

An AI Speech feature that converts text to lifelike speech.

Try Text to speech free Create a pay-as-you-go account

Bring your apps to life with natural-sounding voices

Build apps and services that speak naturally. Differentiate your brand with a customized, realistic voice generator, and access voices with different speaking styles and emotional tones to fit your use case—from text readers and talkers to customer support chatbots.

Start with USD200 Azure credit

Lifelike synthesized speech

Enable fluid, natural-sounding text to speech that matches the intonation and emotion of human voices.

Customizable text-talker voices

Create a unique AI voice generator that reflects your brand's identity.

Fine-grained text-to-talk audio controls

Tune voice output for your scenarios by easily adjusting rate, pitch, pronunciation, pauses, and more.

Flexible deployment

Run Text to Speech anywhere—in the cloud, on-premises, or at the edge in containers.

Tailor your speech output

Fine-tune synthesized speech audio to fit your scenario. Define lexicons and control speech parameters such as pronunciation, pitch, rate, pauses, and intonation with Speech Synthesis Markup Language (SSML) or with the audio content creation tool.

Deploy Text to Speech anywhere, from the cloud to the edge

Run Text to Speech wherever your data resides. Build lifelike speech synthesis into applications optimized for both robust cloud capabilities and edge locality using containers.

Build a custom voice for your brand

Differentiate your brand with a unique custom voice. Develop a highly realistic voice for more natural conversational interfaces using the Custom Neural Voice capability, starting with 30 minutes of audio. Here are a few examples of organizations that are doing AI voice generation today:

Fuel App Innovation with Cloud AI Services

Learn five key ways your organization can get started with AI to realize value quickly.

Read the report

Comprehensive privacy and security

AI Speech, part of Azure AI Services, is certified by SOC, FedRAMP, PCI DSS, HIPAA, HITECH, and ISO.

View and delete your custom voice data and synthesized speech models at any time. Your data is encrypted while it’s in storage.

Your data remains yours. Your text data isn't stored during data processing or audio voice generation.

Backed by Azure infrastructure, AI Speech offers enterprise-grade security, availability, compliance, and manageability.

Comprehensive security and compliance, built in

Microsoft invests more than USD1 billion annually on cybersecurity research and development.

A security center overview in Azure showing policy and compliance data and resource security hygiene

We employ more than 3,500 security experts who are dedicated to data security and privacy.

The security center compute and apps tab in Azure showing a list of recommendations

Azure has more certifications than any other cloud provider. View the comprehensive list.

Learn more about security on Azure

Flexible pricing gives you the power and control you need

Pay only for what you use, with no upfront costs. With Text to Speech, you pay as you go based on the number of characters you convert to audio.

See pricing

Get started with an Azure free account

Start free. Get USD200 credit to use within 30 days. While you have your credit, get free amounts of many of our most popular services, plus free amounts of 55+ other services that are always free.

After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.

After 12 months, you'll keep getting 55+ always-free services—and still pay only for what you use beyond your free monthly amounts.

Guidelines for building responsible synthetic voices

Learn about responsible deployment

Synthetic voices must be designed to earn the trust of others. Learn the principles of building synthesized voices that create confidence in your company and services.

Obtain consent from voice talent

Help voice talent understand how neural text-to-speech (TTS) works and get information on recommended use cases.

Be transparent

Transparency is foundational to responsible use of computer voice generators and synthetic voices. Help ensure that users understand when they’re hearing a synthetic voice and that voice talent is aware of how their voice will be used. Learn more with our disclosure design guidelines.

Learn how to get started with the Custom Neural Voice capability, a limited access feature

Documentation and resources

Get started

Read the documentation

Take the Microsoft Learn course

Get started with a 30-day learning journey

Explore code samples

Check out the sample code

See customization resources

Customize your speech solution with Speech studio. No code required.

Start building with AI Services

Try Text to speech free

Featured

AI + Machine Learning

Analytics

Compute

Containers

Databases

DevOps

Developer Tools

Hybrid + multicloud

Identity

Integration

Internet of Things

Management and Governance

Media

Migration

Mixed Reality

Mobile

Networking

Security

Storage

Web

Virtual desktop infrastructure

Use cases

Application development

AI

Cloud migration and modernisation

Data and analytics

Hybrid cloud and infrastructure