Announcing Custom Speech Service (Preview) from Microsoft Cognitive Services

Posted on February 6, 2017

We are excited to announce the public preview release of the Custom Speech Service from Microsoft Cognitive Services. The Custom Speech Service (formerly the Custom Recognition Intelligent Service) lets you customize Microsoft’s speech-to-text engine. By uploading text and/or speech data that reflects your application and your users, you can create custom models that are combined with Microsoft’s state-of-the-art speech models and deployed to a custom speech-to-text endpoint, accessible from any device.

Why customize the speech-to-text engine?

Speech recognition systems are composed of several components. Two of the most important are the acoustic model and the language model. The acoustic and language models behind Microsoft’s world-class speech recognition engine have been optimized for common usage scenarios, such as interacting with Cortana on your smartphone, tablet or PC, searching the web by voice, or sending text messages to a friend.

If your application uses particular vocabulary items that rarely occur in typical speech, such as product names or jargon, you can likely improve recognition performance by customizing the language model.

For example, if you are building an app to assist automotive mechanics, terms like “powertrain,” “catalytic converter,” or “limited slip differential” will appear more frequently in this application than in typical voice applications. Customizing the language model enables the system to learn this.
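To make the idea concrete, here is a toy sketch of how domain text can shift word probabilities. It is not how the Custom Speech Service builds its models. It assumes a simple unigram (word-frequency) model and a linear mix of a general model with a domain model, and the sample sentences, function names, and mixing weight are all hypothetical.

```python
from collections import Counter


def unigram_probs(text):
    """Return the relative frequency of each lowercased word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def interpolate(general, domain, weight):
    """Mix two unigram distributions; `weight` is the share given to the domain model."""
    vocab = set(general) | set(domain)
    return {
        word: (1 - weight) * general.get(word, 0.0) + weight * domain.get(word, 0.0)
        for word in vocab
    }


# Hypothetical sample text standing in for general usage and for mechanic jargon.
general_text = "send a text message to my friend and search the web for the weather"
domain_text = "check the powertrain and replace the catalytic converter on the powertrain"

general = unigram_probs(general_text)
domain = unigram_probs(domain_text)
custom = interpolate(general, domain, weight=0.5)

# The general model has never seen "powertrain"; the mixed model gives it real weight.
print("general:", general.get("powertrain", 0.0))
print("custom: ", custom["powertrain"])
```

In this sketch the general model assigns “powertrain” zero probability, while the mixed model assigns it a nonzero one. That is the kind of shift a custom language model is meant to produce.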

Similarly, customizing the acoustic model can enable the system to do a better job recognizing speech in particular environments or from particular user populations. For example, if you have a voice-enabled app designed for use in a warehouse or factory, a custom acoustic model can more accurately recognize speech in the presence of the noises found in these environments.

How do I get started?

Visit www.cris.ai to learn how to create and deploy custom speech-to-text models. The site provides a simple interface for importing text and/or audio data, creating custom acoustic and language models, and evaluating performance. The custom models can be deployed in conjunction with Microsoft’s existing state-of-the-art models to create custom speech-to-text endpoints.
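This post does not say which metric the service reports when evaluating performance. Word error rate (WER) is a common measure for speech-to-text output, so the sketch below assumes it. The function name and the sample transcripts are hypothetical and only illustrate how a baseline and a custom model could be compared against a reference transcript.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions) per reference word."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: what was said, and what two models might return.
reference = "replace the catalytic converter"
baseline_output = "replace the cat a lytic converter"
custom_output = "replace the catalytic converter"

print("baseline WER:", word_error_rate(reference, baseline_output))
print("custom WER:  ", word_error_rate(reference, custom_output))
```

A lower value means fewer word-level mistakes relative to the reference, so a custom model that handles domain terms well should score lower than the baseline on in-domain test audio.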

We’ve made some sample text data for building and testing a custom language model available on the Custom Speech Service GitHub page. The model will enable you to build an application that can transcribe facts about dinosaurs, because, you know, everybody loves dinosaurs.

We welcome your feedback and questions at https://cognitive.uservoice.com/.
