Azure Media Indexer Spanish (v1.2)!

By Adarsh Solanki, Program Manager, Azure Media Services

Posted on April 13, 2015

Tag: Media Services & CDN

Today I am proud to announce the general availability of Azure Media Indexer v1.2! From a bird’s eye view, this release comprises the following:

Support for Spanish Language
New configuration XML format

Language Support

Azure Media Indexer launched with support for English only, which was a pain point for many customers. Lighting up each new language model (and its associated acoustic model) requires a large, coordinated effort across Research and Engineering. Today, Indexer takes its first step towards universal support by introducing the Spanish language!

Stay tuned to the Azure blog for upcoming announcements of the languages that are in the works, such as Italian and Mandarin.

Configuration

As Azure Media Indexer matures and grows as a product, it has become necessary to change the schema of the configuration XML provided alongside each Indexer job. This change to our XML format will not break existing workflows, but it does expose some new nodes that allow for more powerful and customizable Indexer jobs.

I’ll start by showing an example of the old configuration (1.1.x.x), followed by a working configuration in the new format that runs an equivalent Indexing job. This way, you can drop the new configuration into your workflow right away.

Old configuration

The following configuration in the new format defines a Speech Recognition job in the English language that produces the same outputs as the old configuration:

TTML
SAMI
WebVTT
AIB
keyword XML file

New Configuration

As you may have noticed, all of the additions come after the (still unused) global node. The new configuration reflects the possibility of Azure Media Indexer eventually supporting functions beyond pure Automatic Speech Recognition (ASR), and hence includes a node for features. As of version 1.2, the only feature available is Speech Recognition.

The Speech Recognition feature has the following settings keys:

Key	Description	Example values
Language	The natural language to be recognized in the multimedia file.	English, Spanish
CaptionFormats	A semicolon-separated list of the desired output caption formats (if any).	ttml;sami;webvtt, ttml;sami, webvtt
GenerateAIB	A boolean flag specifying whether or not an AIB file is required (for use with SQL Server and the customer Indexer IFilter). For more information on the AIB file, check out this previous blog post.	True, False
GenerateKeywords	A boolean flag specifying whether or not a keyword XML file is required.	True, False
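
To make the table concrete, here is a minimal Python sketch that assembles the four settings as key/value strings and rejects values the table does not list. The helper name `build_speech_settings` and its parameters are hypothetical and not part of any Azure SDK. The sketch only mirrors the table and does not generate the configuration XML itself. Whether the service accepts other spellings or casings of these values is not something this post states, so the checks here are deliberately strict.

```python
from typing import Dict, Iterable, Union

# Values taken from the settings table above (as of v1.2).
SUPPORTED_LANGUAGES = ("English", "Spanish")
SUPPORTED_CAPTION_FORMATS = ("ttml", "sami", "webvtt")


def build_speech_settings(
    language: str = "English",
    caption_formats: Union[str, Iterable[str]] = ("ttml", "sami", "webvtt"),
    generate_aib: bool = True,
    generate_keywords: bool = True,
) -> Dict[str, str]:
    """Return the Speech Recognition settings as key/value strings.

    Hypothetical helper for illustration only: it validates against the
    table in this post and does not emit XML or call any service.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")

    # Accept either an already-joined string ("ttml;sami") or a list.
    if isinstance(caption_formats, str):
        caption_formats = caption_formats.split(";")
    formats = [f.strip().lower() for f in caption_formats if f.strip()]

    unknown = [f for f in formats if f not in SUPPORTED_CAPTION_FORMATS]
    if unknown:
        raise ValueError(f"unsupported caption formats: {unknown}")

    return {
        "Language": language,
        # The table specifies a semicolon-separated list.
        "CaptionFormats": ";".join(formats),
        # str(True) / str(False) match the "True" / "False" example values.
        "GenerateAIB": str(generate_aib),
        "GenerateKeywords": str(generate_keywords),
    }


if __name__ == "__main__":
    # A Spanish job with two caption formats and no AIB file.
    print(build_speech_settings("Spanish", ["ttml", "webvtt"], generate_aib=False))
```

Each entry in the returned dictionary corresponds to one row of the table, so the values can be copied into the matching settings keys of your configuration file.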

With the new configuration laid out here, transitioning to v1.2 shouldn’t be very difficult, and we at Azure Media Services can’t wait to see the cool experiences you can develop with the addition of the Spanish language to Azure Media Indexer.

Special thanks go out to the Microsoft Machine Translation and Speech teams in Microsoft Research for the new language model, as well as the Microsoft Research Asia team for their continued help in development of the Azure Media Indexer media processor!

As always, feel free to reach out to us with any questions or comments at indexer@microsoft.com.

Not sure what Azure Media Indexer is? Check out the introductory blog post here!

Indexer has been updated! Check out the release notes for version 1.2.1 to learn more.
