Speech services July 2018 update

By Grace Sturman, Senior Program Manager, Speech Services

Posted on July 18, 2018

A lot has happened since we announced that Speech services is now in preview. Since then, we have released the Cognitive Services Speech SDK June 2018 update.

Today, we are excited to announce that we have just released version 0.5.0 of the Speech SDK. With this update, we have added support for UWP (on Windows version 1709), .NET Standard 2.0 (on Windows), and Java on Android 6.0 (Marshmallow, API level 23) or higher. We have also made some feature changes and fixed some bugs. Most notably, we now support long-running audio and automatic reconnection. This makes the Speech service more resilient overall in the event of timeouts, network failures, or service errors. We’ve also improved the error messages to make errors easier to handle. Please visit the Release Notes page for details. We will continue to add support for more platforms and programming languages as we work toward making the Speech SDK generally available this fall.

Besides the Speech SDK, Custom Voice has also released a new feature to support more training data formats. All ‘.wav’ files (RIFF) with a sampling rate equal to or higher than 16 kHz are now accepted. Furthermore, we have extended support to more plain text encoding types (ANSI/UTF-8/UTF-8-BOM/UTF-16-LE/UTF-16-BE). For more details, visit our docs about how to prepare data and customize voice fonts. A new document has been released to help you create high-quality audio samples of human speech, with a focus on issues that you are likely to encounter while preparing your voice training data. For more details, see how to record voice samples for a custom voice.
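As a rough illustration of the sampling-rate requirement above, the following sketch checks a '.wav' file locally before you upload it. It uses only Python's standard `wave` module and is not part of the Speech SDK or the Custom Voice tooling; the function names and the `MIN_SAMPLE_RATE_HZ` constant are hypothetical, and the sketch assumes plain PCM RIFF files, which is what the `wave` module reads.

```python
import wave

# Threshold taken from the post: 16 kHz or higher is accepted.
MIN_SAMPLE_RATE_HZ = 16_000


def sample_rate_hz(path):
    """Return the sampling rate, in Hz, of a PCM RIFF .wav file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def meets_min_sample_rate(path, min_rate_hz=MIN_SAMPLE_RATE_HZ):
    """Return True if the file's sampling rate is at least min_rate_hz."""
    return sample_rate_hz(path) >= min_rate_hz
```

Calling `meets_min_sample_rate("sample.wav")` on each recording gives a quick local check, though the service may apply further validation that this sketch does not cover.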

In addition, we are very happy to announce new content for our Speech (Preview) documentation.

The content update aims to help developers to quickly navigate to the right content, based on the type of application they are developing.

We have a new, separate section on the end-to-end customization process, including acoustic adaptation, language adaptation, pronunciation, and voice fonts. We’ve also added documentation about the Batch Transcription API, which is ideal for customers who have large quantities of audio files in storage.

The documentation also complements this SDK update with the following sections:

A brand new Scenario section to help you navigate the documentation according to your application's needs.
A consolidated end-to-end Customization section (including data and a tutorial on GitHub).
A brand new Batch Transcription API section, including a GitHub sample.
More detailed and elaborate FAQ sections for each of the sub-services, under Resources.

The documentation is live now. Please use the Feedback section at the bottom of the documentation pages to tell us what you think.

Interested in the Microsoft Speech services? You can try them out for free. To learn more and review sample code, please see our documentation page. Follow us on Twitter @msspeech3 to be notified of future updates.
