Azure Speech in Foundry Tools

OVERVIEW

Discover the latest Azure Speech capabilities

Develop using best-in-class models

Build voice-enabled, multilingual generative AI apps with fast transcriptions and natural-sounding voices.

Explore Azure Speech

Integrate voice with your AI agents

Enable AI agents with end-to-end speech, including customized transcription, voice, and avatars.

Explore Voice Live API

Translate audio or text

Enable real-time, multi-language speech-to-speech translation and speech-to-text transcription of audio streams.

Learn more

Deploy anywhere

Run AI models wherever your data resides. Deploy your apps in the cloud or at the edge with containers.

Develop with containers

USE CASES

Develop multimodal generative AI apps with speech models

Build voice-enabled agents

Use foundation models along with customized audio-in and audio-out models to power agents with voice.

Transcribe speech to text

Transcribe call center or meeting conversations. Go global with audio captioning in more than 100 languages.

Convert text to speech

Build bots that speak naturally. Differentiate your brand with customized, realistic voices and speaking styles.

Use post-call analytics

Analyze audio or video call recordings to gain deep insights using foundation models in Azure Content Understanding in Foundry Tools.

Transcribe audio with OpenAI Whisper

Transform your call centers using the latest OpenAI Whisper model in Azure Speech or Azure OpenAI in Foundry Models.

Build custom voices

Build natural-sounding voices with custom neural voice.

Build your avatars

Bring your brand to life using prebuilt or custom avatars with natural-sounding voices.

Enable multilingual communication

Translate audio or video data from and into an ever-growing list of supported languages. Customize translations to your industry.

Embed speech

Use embedded speech to power on-device speech-to-text and text-to-speech scenarios where cloud connectivity is intermittent or unavailable.

Pricing

Flexible pricing to meet your needs

Pay for only what you use, with no upfront costs. Azure Speech pricing is pay-as-you-go.

Azure Speech pricing

Azure products work better together

Azure OpenAI

Incorporate multimodality and enhance apps with models that combine multiple types of data, such as text, images, video, and audio.

Learn more

Microsoft Foundry

Get everything you need to develop generative AI applications and custom agents on one platform.

Learn more

Content Safety in Foundry Control Plane

Deliver secure and trustworthy solutions with built-in tools that put responsible AI principles into practice.

Learn more

Azure Content Understanding

Accelerate the transformation of multimodal data into insights.

Learn more

Azure Translator

Translate documents and text in real time or in batches across more than 100 languages for global reach.

Learn more

Azure Language

Build conversational interfaces, summarize documents, and analyze text using prebuilt AI-powered features.

Learn more

RESOURCES

Get started with Azure Speech

Explore Azure Speech documentation

Discover resources such as tutorials and API references.

Learn more

Build voice-enabled apps

Design and build enterprise-grade, voice-enabled apps.

Download the infographic

GitHub resources

Explore sample code and SDKs.

Browse samples on GitHub

Start building now

Build models quickly in Foundry.

Explore Azure Speech in Foundry.

Azure Speech learning paths

Develop natural language processing solutions with Azure.

Learn more

Create agentic AI

Integrate AI agents into apps seamlessly and learn advanced model fine-tuning techniques.

Learn more

Find the best AI model

Explore multimodal models, model selection, and benchmarking, and build multimodal applications.

Learn more

Secure and responsible AI

Understand the fundamentals of AI security, evaluations, and managing harmful content.

Learn more

FAQ

Frequently asked questions

What is Azure Speech in Foundry Tools (formerly Azure AI Speech)?

Azure Speech is part of Foundry Tools (formerly Azure AI Services) and provides APIs for speech-to-text, text-to-speech, translation, and speaker recognition. It was previously known as Azure AI Speech.

I see that Azure AI Speech is now called Azure Speech in Foundry Tools. How does that change the service?

Yes, we’re rebranding many of our former Azure AI Services as Foundry Tools. This shift reflects a broader platform unification under Foundry and signals that these services are now positioned as core tools for building agentic AI applications.

Azure Speech in Foundry Tools still offers the same capabilities, such as speech recognition, text-to-speech, and translation, but is now part of a cohesive toolkit designed for developers building intelligent agents.

The rebrand clarifies how these APIs fit into the Foundry ecosystem and makes it easier to discover, orchestrate, and integrate them into modern AI workflows.

What capabilities does Azure Speech support?

Azure Speech offers speech-to-text, text-to-speech, and speech translation, among other capabilities. These are available through SDKs in several programming languages, including C#, C++, and Java.

Learn more

What languages are supported for speech translation in Azure Speech?

Azure Speech supports an ever-growing set of languages. For supported languages, refer to the current list.

What are some ways I can use speech-to-text with Azure OpenAI’s GPT models to build intelligent solutions?

Customers are building applications using Foundry Tools. Get started with Azure Speech for use cases including conversational AI, post-call analytics, and video summarization.

Next steps

Choose the Azure account that’s right for you

Pay as you go or try Azure free for up to 30 days.

Get started with Azure

AI development tools

Design and manage AI applications

Create, customize, and scale AI apps and agents efficiently.

Explore Foundry

Business Solutions Hub

Drive results with innovative cloud solutions

Browse the Business Solutions Hub to find products and solutions to achieve your goals.

Explore Microsoft solutions