Bing Speech

Convert audio to text, understand intent and convert text back to speech for natural responsiveness

Speech Recognition

Convert spoken audio to text. The API can recognise audio from the microphone in real time, from another real-time audio source, or from a file. In all cases, real-time streaming is available, so partial recognition results are returned while the audio is still being sent to the server.
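The streaming behaviour described above can be sketched offline. This is a minimal illustration, not the service's client library: `FakeRecognizer`, `chunk_audio` and `stream_recognise` are hypothetical names, the recogniser simply replays a scripted phrase one word per chunk, and the 3200-byte chunk size assumes 100 ms of 16 kHz, 16-bit mono PCM audio.

```python
from typing import Callable, Iterable, Iterator, List


def chunk_audio(audio: bytes, chunk_size: int = 3200) -> Iterator[bytes]:
    """Yield consecutive fixed-size slices of an audio buffer.

    3200 bytes is an assumed default: 100 ms of 16 kHz, 16-bit mono PCM.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


class FakeRecognizer:
    """Hypothetical stand-in for a streaming recognition service.

    It "recognises" one scripted word per chunk received, so the
    control flow can be exercised without any network access.
    """

    def __init__(self, script: List[str]) -> None:
        self._script = script
        self._chunks_seen = 0

    def send(self, chunk: bytes) -> str:
        # Return the partial hypothesis after this chunk.
        self._chunks_seen += 1
        return " ".join(self._script[:self._chunks_seen])

    def finish(self) -> str:
        # Return the final result once the audio has ended.
        return " ".join(self._script)


def stream_recognise(
    chunks: Iterable[bytes],
    recognizer: FakeRecognizer,
    on_partial: Callable[[str], None],
) -> str:
    """Send chunks one by one, reporting each partial result as it arrives."""
    for chunk in chunks:
        on_partial(recognizer.send(chunk))
    return recognizer.finish()


if __name__ == "__main__":
    silence = bytes(8000)  # placeholder audio: 8000 zero bytes
    recognizer = FakeRecognizer(["turn", "on", "the", "lights"])
    final = stream_recognise(chunk_audio(silence), recognizer, print)
    print("final:", final)
```

The point of the design is that the caller receives a callback per chunk rather than waiting for the whole utterance, which is what lets an app show text while the user is still speaking.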

The Speech to Text API enables you to build voice-triggered smart apps. To see how it works, select your target language, then click the microphone and start speaking. Alternatively, click one of the sample speech phrases. When you use this demo, you consent to providing your voice input data to Microsoft for service improvement purposes.

Text to Speech

Convert text to spoken audio. When an application needs to “talk” back to its users, this API converts text generated by the app into audio that can be played back to the user.
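Speech synthesis services commonly take their input as SSML (the W3C Speech Synthesis Markup Language) rather than bare text. The sketch below assumes that is the case here and shows only how app-generated text might be wrapped in a well-formed SSML document. `build_ssml` is a hypothetical helper and `example-voice` is a placeholder, not a real voice name; no request is sent to any service.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, lang: str = "en-US",
               voice_name: str = "example-voice") -> str:
    """Wrap plain text in a minimal SSML document.

    voice_name is a placeholder (assumption); a real service would
    publish its own list of supported voices.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    # ElementTree escapes characters such as & and < on serialisation.
    voice.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Your order has shipped & is on its way.", lang="en-GB"))
```

Building the document with an XML library instead of string concatenation means user-supplied text containing `&` or `<` cannot break the markup.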

The Text-To-Speech API enables you to build smart apps that can speak. To test it, choose your target language, enter your sentences, then click the play button to hear how speech synthesis works. When you use this demo, you consent to providing your voice input data to Microsoft for service improvement purposes.

Explore the Cognitive Services APIs

Computer Vision

Distill actionable information from images

Face

Detect, identify, analyse, organise, and tag faces in photos

Video Indexer PREVIEW

Unlock video insights

Content Moderator

Automated image, text and video moderation

Custom Vision PREVIEW

Easily customise your own state-of-the-art computer vision models for your unique use case

Text Analytics

Easily evaluate sentiment and topics to understand what users want

Translator Text

Easily conduct machine translation with a simple REST API call

Bing Spell Check

Detect and correct spelling mistakes in your app

Language Understanding

Teach your apps to understand commands from your users

Bing Speech

Convert speech to text and back again to understand user intent

Speaker Recognition PREVIEW

Use speech to identify and verify individual speakers

Translator Speech

Easily conduct real-time speech translation with a simple REST API call

Custom Speech PREVIEW

Overcome speech recognition barriers like speaking style, background noise and vocabulary

Speech Services PREVIEW

Unified speech services for speech-to-text, text-to-speech and speech translation

QnA Maker

Distill information into conversational, easy-to-navigate answers
