Quickly develop high-quality voice-enabled apps
Get state-of-the-art speech to text, lifelike text to speech, and award-winning speaker recognition.
Compliant and secure
Your data stays yours—your speech input is not logged during processing.
Customizable voices and models
Create custom voices, add specific words to your base vocabulary, or build your own models.
Run Speech anywhere, in the cloud or at the edge in containers.
Convert speech to text
Quickly and accurately transcribe audio in more than 92 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, capture key discussions in meetings, and more.
Give your app a voice
Use text to speech to create apps and services that speak conversationally, choosing from more than 215 neural voices across 119 languages. Create natural-sounding audio content, improve accessibility with read-aloud functionality, and create custom voice assistants.
Translate speech in real time
Translate audio from more than 30 languages and customize translations for your organization's specific terms—all in your preferred programming language.
Verify and recognize speakers
Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to your app.
Activate your assistant or IoT device with a custom keyword
Create a custom keyword for IoT devices and voice-enabled assistants to set your brand apart—making it more personal, personable, and secure.
Add voice commands for hands-free scenarios
Build a touchless, voice-first experience to improve safety and support back-to-work scenarios.
Comprehensive security and compliance, built in
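Microsoft invests more than USD 1 billion annually in cybersecurity research and development.
Microsoft employs more than 3,500 security experts who are dedicated to data security and privacy.
Azure has more certifications than any other cloud provider. View the comprehensive list.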
Flexible pricing gives you the power and control you need
Pay for only what you use, with no upfront costs. With Speech, pay as you go based on:
- The number of hours of audio you transcribe or translate for speech to text and speech translation.
- The number of characters you convert to audio for text to speech.
- The number of transactions for Speaker Recognition.
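As a rough illustration of how these three meters add up, the short Python sketch below estimates a monthly bill. The rates are placeholders invented for the example, not actual Speech prices, and the names `Rates` and `estimate_monthly_cost` are hypothetical. The billing granularity (per hour, per million characters, per thousand transactions) is also an assumption made for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; replace with figures from the current price list."""
    per_audio_hour: float              # speech to text and speech translation
    per_million_characters: float      # text to speech
    per_thousand_transactions: float   # Speaker Recognition


def estimate_monthly_cost(rates, audio_hours, characters, transactions):
    """Sum the three usage-based charges: audio hours, characters, transactions."""
    audio = audio_hours * rates.per_audio_hour
    tts = characters / 1_000_000 * rates.per_million_characters
    speaker = transactions / 1_000 * rates.per_thousand_transactions
    return round(audio + tts + speaker, 2)


# Placeholder rates for illustration only, NOT real prices.
example_rates = Rates(
    per_audio_hour=1.0,
    per_million_characters=16.0,
    per_thousand_transactions=5.0,
)

# 120 hours of audio, 2.5 million characters, 4,000 transactions.
print(estimate_monthly_cost(example_rates, 120, 2_500_000, 4_000))  # prints 180.0
```

Because there are no upfront costs, a month with no usage comes out to zero in this model.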
Trusted by companies of all sizes
AT&T delights customers with immersive experiences
AT&T is showcasing its 5G network with an immersive experience that allows customers to talk directly to Bugs Bunny.*
*LOONEY TUNES and all related characters and elements © & ™ Warner Bros. Entertainment Inc. (s21)
Progressive brings Flo directly to customers
Progressive used Custom Neural Voice to build a natural-sounding, virtual version of Flo to help customers with everything from getting a free car insurance quote to general insurance questions.
KPMG streamlines call transcription
KPMG uses Speech to Text to transcribe and catalog thousands of calls, reducing compliance costs for its clients by as much as 80 percent.
Motorola helps first responders access vital data
Motorola Solutions helps first responders in the field access vital information with a voice-first virtual assistant.
Hochtief documents construction defects using voice
A voice-enabled virtual assistant helps construction project managers identify and document defects at building sites.
Zencity improves quality of life with AI solutions
Data and analytics startup Zencity uses Speech Translation to analyze data from a variety of sources—social media, maintenance requests, and more—helping governments make data-driven decisions that provide better services for their residents.