Azure AI Speech
Build multimodal, multilingual AI apps faster with pre-built or customizable speech models.
OVERVIEW
Add multimodality to your generative AI apps
- Build voice-enabled, multilingual generative AI apps with fast transcriptions and natural-sounding voices.
- Customize speech models for your domain, including the OpenAI Whisper model, or give your copilot a branded voice.
- Enable real-time, multi-language speech to speech translation and speech to text transcription of audio streams.
- Run AI models wherever your data resides. Deploy your apps in the cloud or at the edge with containers.
USE CASES
Develop multimodal generative AI apps with speech models
Transcribe speech to text
Transcribe call center or meeting conversations. Go global with audio-captioning in more than 100 languages.
Convert text to speech
Build bots that speak naturally. Differentiate your brand with customized, realistic voices and speaking styles.
Speech analytics
Analyze audio or video call recordings to gain deep insights. Summarize key topics and extract or redact personally identifiable information.
Transcribe audio with OpenAI Whisper
Transform your call centers using the latest OpenAI Whisper model in Azure AI Speech or Azure OpenAI Service.
Build custom voices
Build natural-sounding voices with custom neural voice.
Build your avatars
Bring your brand to life using pre-built or custom avatars with natural-sounding voices.
Verify and recognize speakers
Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to your app.
Enable multilingual communication
Translate audio or video data from and into an ever-growing list of supported languages. Customize translations to your industry.
Embed speech
Use embedded speech to power on-device speech to text and text to speech scenarios where cloud connectivity is intermittent or unavailable.
SECURITY
Built-in security and compliance
Microsoft has committed to investing USD$20 billion in cybersecurity over five years.
We employ more than 8,500 security and threat intelligence experts across 77 countries.
Azure has one of the largest compliance certification portfolios in the industry.
PRICING
Flexible pricing to meet your needs
Pay for only what you use—no upfront costs. Azure AI Speech pay-as-you-go pricing is based on:
RELATED PRODUCTS
Azure products work better together
Build comprehensive solutions using Azure AI Speech and other Azure AI products.
Azure OpenAI Service
Incorporate multimodality and enhance apps with models that combine multiple types of data, such as text, images, video, and audio.
Azure AI Studio
Get everything you need to develop generative AI applications and custom copilots on one platform.
Azure AI Content Safety
Deliver secure and trustworthy solutions with built-in tools that put responsible AI principles into practice.
CUSTOMER STORIES
See what customers are building with Azure AI Speech
RESOURCES
Get started with Azure AI Speech
FAQ
Frequently asked questions
- Azure AI Speech offers a number of features and capabilities, including speech to text, text to speech, and speech translation. These are offered through SDKs in several programming languages, including C#, C++, Java, and more.
- Yes, Azure AI Speech supports OpenAI’s Whisper model, especially for batch transcriptions.
- Azure AI Speech supports an ever-growing set of languages. For the current list of supported languages, refer to the Azure AI Speech language support documentation.
- Customers are building interesting applications using Azure AI services. Get started with Speech analytics in Azure AI Studio for conversational AI, post-call analytics, video summarization, and more use cases.
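To make the speech to text capability mentioned above concrete, here is a minimal Python sketch that builds, but does not send, an HTTP request for transcribing a short WAV clip. It reflects one reading of the Speech service REST API for short audio: the endpoint path, the `Ocp-Apim-Subscription-Key` header, and the content type should be checked against the current documentation. The function name `build_short_audio_request`, the `westus` region, and the `YOUR_KEY` placeholder are illustrative assumptions, not part of the product.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_short_audio_request(region: str, key: str, wav_bytes: bytes,
                              language: str = "en-US") -> Request:
    """Build (but do not send) a speech to text request for a short WAV clip.

    Hypothetical helper: endpoint and header names are assumptions based on
    the REST API for short audio and should be verified before use.
    """
    # Query string: recognition language and the simple result format.
    query = urlencode({"language": language, "format": "simple"})
    url = (f"https://{region}.stt.speech.microsoft.com"
           f"/speech/recognition/conversation/cognitiveservices/v1?{query}")
    headers = {
        # Resource key for the Speech resource (placeholder value in the demo).
        "Ocp-Apim-Subscription-Key": key,
        # Assumes 16 kHz PCM audio in a WAV container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return Request(url, data=wav_bytes, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder region and key; no network call is made here.
    req = build_short_audio_request("westus", "YOUR_KEY", b"")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. For production apps, the SDKs in C#, C++, Java, and other languages are likely the more convenient route, since they also handle audio streaming.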
Account signup
Get started with a free account
Start with USD$200 Azure credit
Get started with pay-as-you-go pricing
There’s no upfront commitment—cancel anytime.