Microsoft previews neural network text-to-speech

Posted on December 13, 2018

Applying the latest in deep learning innovation, Speech Service, part of Azure Cognitive Services, now offers a neural network-powered text-to-speech capability. The preview is available today.

Neural Text-to-Speech makes the voices of your apps nearly indistinguishable from the voices of people. Use it to make conversations with chatbots and virtual assistants more natural and engaging, to convert digital texts such as e-books into audiobooks, and to give in-car navigation systems a natural-sounding voice.

This release includes significant enhancements since we first revealed Neural Text-to-Speech at Ignite earlier this year.

Enhanced voice quality

The voices sound more robust and natural across a wider variety of user scenarios. This was achieved through the following:

Large-scale supervised training with transfer learning across diverse speakers
More features from unsupervised pretraining
A more robust neural model design

Accelerated runtime performance

The Neural Text-to-Speech engine now responds almost instantly, thanks to extensive code optimization for hardware accelerators, parallel inference, and model simplifications that balance sound quality against performance. The real-time factor has improved from the previous version to less than 0.05x, meaning 1 second of audio can be generated in less than 50 milliseconds. The first byte of audio is now produced 6 times faster than before.
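To make the real-time factor concrete, here is a small sketch of the arithmetic. It assumes the common definition of real-time factor as synthesis time divided by the duration of the audio produced; the function names are illustrative and are not part of the Speech Service.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Real-time factor, assumed here to be synthesis time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time_bound(audio_seconds: float, rtf: float) -> float:
    """Upper bound on synthesis time (seconds) for a given real-time factor."""
    return audio_seconds * rtf


# 1 second of audio at a real-time factor of 0.05 takes at most 50 ms.
one_second_ms = synthesis_time_bound(1.0, 0.05) * 1000
print(f"{one_second_ms:.0f} ms")

# A hypothetical 10-minute audiobook chapter at the same factor.
chapter_seconds = synthesis_time_bound(600.0, 0.05)
print(f"{chapter_seconds:.0f} s")
```

A factor below 1.0 means audio is generated faster than it plays back, so at 0.05 the engine would stay well ahead of playback when streaming.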

Greater service availability

Neural Text-to-Speech is now available in three datacenters across the US, Europe, and Asia. Wherever you are in the world, you can integrate neural voices with reduced latency overhead.

With these updates, the Speech Services Neural Text-to-Speech capability offers the most natural-sounding voice experience for your users compared with traditional and hybrid system approaches.

You can use this capability starting today with two pre-built neural voices in English – meet Jessa and Guy. Hear what they sound like.
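As a rough illustration of selecting a voice, the sketch below builds a request body in SSML (the W3C Speech Synthesis Markup Language), which text-to-speech services commonly accept. The helper name `build_ssml` and the voice name are placeholders invented for this example; check the Speech Services documentation for the exact voice identifiers and request format.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document that requests one voice."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


# Placeholder only: substitute the identifier of the neural voice you want.
PLACEHOLDER_VOICE = "your-neural-voice-name"

ssml = build_ssml("Turn left at Main & 5th.", PLACEHOLDER_VOICE)
print(ssml)
```

Escaping the text keeps characters such as `&` and `<` from breaking the markup when the input comes from users or from e-book content.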

Discounts are available during the preview. Visit the Speech Services pricing page for more details.

If you would like to access this capability in Chinese or German, please submit your request.
