Azure AI Speech

A managed service offering industry-leading speech capabilities such as speech-to-text, text-to-speech, speech translation, and speaker recognition.


Quickly develop high-quality voice-enabled apps

Build voice-enabled apps confidently and quickly with the Speech SDK. Transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and use speaker recognition during conversations. Explore with a no-code experience and create custom models tailored to your app with Speech Studio.


Industry-leading quality

Get state-of-the-art speech to text, lifelike text to speech, and award-winning speaker recognition.

Compliant and secure

Your data stays yours—your speech input is not logged during processing.

Customizable voices and models

Create custom voices, add specific words to your base vocabulary, or build your own models.

Flexible deployment

Run Speech anywhere, in the cloud or at the edge in containers.

Convert speech to text

Quickly and accurately transcribe audio in more than 100 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, capture key discussions in meetings, and more.

Learn more about Speech to Text

Get started transcribing speech today

Give your app a voice

Use text to speech to create apps and services that speak conversationally. Create natural-sounding audio content, improve accessibility with read-aloud functionality, and create custom voice assistants.

Learn more about Text to Speech

Learn how to convert text to audio

Translate speech in real time

Translate audio from more than 30 languages and customize translations for your organization's specific terms—all in your preferred programming language.

Learn more about Speech Translation

Get started translating speech in real time

Verify and recognize speakers

Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to your app.

Learn more about Speaker Recognition

Learn how to recognize speakers in your app

Activate your assistant or IoT device with a custom keyword

Create a custom keyword for IoT devices and voice-enabled assistants to set your brand apart—making it more personal, personable, and secure.

Learn how to create a custom keyword

Get started building voice-controlled apps

Add voice commands for hands-free scenarios

Build a touchless, voice-first experience to improve safety and support back-to-work scenarios.

Learn more about Custom Commands

Get started adding Custom Commands

Comprehensive security and compliance, built in

Microsoft invests more than $1 billion annually in cybersecurity research and development.

We employ more than 3,500 security experts who are dedicated to data security and privacy.

Azure has more certifications than any other cloud provider. View the comprehensive list.

Learn more about security on Azure

Flexible pricing gives you the power and control you need

Pay for only what you use, with no upfront costs. With Speech, pay as you go based on:

The number of hours of audio you transcribe or translate for speech to text and speech translation.

The number of characters you convert to audio for text to speech.

The number of transactions for Speaker Recognition.
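
To make the three billing dimensions above concrete, here is a minimal Python sketch of a monthly cost estimate. The `SpeechRates` class, the `estimate_monthly_cost` function, and every rate in `EXAMPLE_RATES` are hypothetical placeholders chosen for easy arithmetic. They are not actual Azure prices, and the sketch ignores free monthly amounts and per-feature price differences. Check the pricing page for real figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechRates:
    """Hypothetical unit prices. Placeholders only, NOT real Azure prices."""
    per_audio_hour: float             # speech to text / speech translation
    per_million_characters: float     # text to speech
    per_thousand_transactions: float  # Speaker Recognition


def estimate_monthly_cost(rates: SpeechRates,
                          audio_hours: float,
                          tts_characters: int,
                          speaker_transactions: int) -> float:
    """Sum the three usage-based charges and round to cents."""
    audio_cost = audio_hours * rates.per_audio_hour
    tts_cost = tts_characters / 1_000_000 * rates.per_million_characters
    speaker_cost = speaker_transactions / 1_000 * rates.per_thousand_transactions
    return round(audio_cost + tts_cost + speaker_cost, 2)


# Placeholder rates, assumed purely for illustration.
EXAMPLE_RATES = SpeechRates(
    per_audio_hour=1.00,
    per_million_characters=10.00,
    per_thousand_transactions=2.00,
)

if __name__ == "__main__":
    # 10 audio hours + 500,000 characters + 2,000 transactions
    print(estimate_monthly_cost(EXAMPLE_RATES, 10, 500_000, 2_000))
```

With these placeholder rates, 10 hours of audio, 500,000 characters, and 2,000 transactions come to 10.00 + 5.00 + 4.00 = 19.00. Swapping in the published rates for your region gives a rough first estimate.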

Azure AI Speech pricing

Get started with an Azure free account

Start free. Get $200 credit to use within 30 days. While you have your credit, get free amounts of many of our most popular services, plus 55+ other services that are always free.

After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.

After 12 months, you'll keep getting 55+ always-free services—and still pay only for what you use beyond your free monthly amounts.

Trusted by companies of all sizes

AT&T delights customers with immersive experiences

AT&T is showcasing its 5G network with an immersive experience that allows customers to talk directly to Bugs Bunny.

Read the story for AT&T

Progressive brings Flo directly to customers

Progressive used Custom Neural Voice to build a natural-sounding, virtual version of Flo to help customers with everything from getting a free car insurance quote to general insurance questions.

Read the story for Progressive

KPMG streamlines call transcription

KPMG uses Speech to Text to transcribe and catalog thousands of calls, reducing compliance costs for its clients by as much as 80 percent.

Motorola helps first responders access vital data

Motorola Solutions helps first responders in the field access vital information with a voice-first virtual assistant.

Speech documentation and resources

Get started with AI Speech

Browse the documentation

Take the Microsoft Learn Speech course

Explore popular developer resources

Check out our sample code and SDKs

Build speech models quickly with Speech Studio

Stack Overflow
