Azure Media Indexer Spanish (v1.2)!

Today I am proud to announce the general availability of Azure Media Indexer v1.2!  At a high level, this release includes the following:

  • Support for the Spanish language
  • New configuration XML format


Language Support

Azure Media Indexer launched with support for only the English language, and this was a pain point for many customers.  Lighting up each new language model (and its associated acoustic model) requires a large coordinated effort across Research and Engineering.  Today, Indexer takes its first step towards universal support by introducing the Spanish language!

Stay tuned to the Azure blog for upcoming announcements of the languages that are in the works, such as Italian and Mandarin.



As Azure Media Indexer matures and grows as a product, it has become necessary to change the schema of the configuration XML provided alongside each Indexer job.  This extension of our XML format will not break existing workflows, but it does expose some new nodes that allow for more powerful and customizable Indexer jobs.

I’ll start by showing an example of the old configuration (1.1.x.x), followed by a working configuration in the new format that runs an equivalent Indexing job.  This way, you can drop the new configuration into your workflow and start using the new format right away.


Old configuration



The following configuration in the new format defines an English-language Speech Recognition job that produces the same outputs as the old configuration:

  • TTML
  • SAMI
  • WebVTT
  • AIB
  • keyword XML file


New Configuration




As you may have noticed, all of the additions come after the (still unused) global node.  The new configuration reflects the possibility of Azure Media Indexer eventually supporting functions beyond pure Automatic Speech Recognition (ASR), and hence includes a node for features.  As of version 1.2, the only feature available is Speech Recognition.

The Speech Recognition feature has the following settings keys:

| Key | Description | Example value(s) |
| --- | --- | --- |
| Language | the natural language to be recognized in the multimedia file | English |
| CaptionFormats | a semicolon-separated list of the desired output caption formats (if any) | ttml;sami;webvtt |
| GenerateAIB | a boolean flag specifying whether or not an AIB file is required (for use with SQL Server and the customer Indexer IFilter).  For more information on the AIB file, check out this previous blog post | True |
| GenerateKeywords | a boolean flag specifying whether or not a keyword XML file is required | True |
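
If you generate your configurations from code, the settings keys above can be assembled and sanity-checked before they are written into the configuration file. The snippet below is only a rough Python sketch of that idea: the function name `build_asr_settings` and the two lookup sets are hypothetical helpers, not part of Azure Media Indexer, and the allowed values are simply the ones mentioned in this post (English and Spanish; ttml, sami and webvtt). It produces the key/value strings only and makes no assumption about the surrounding XML.

```python
# Hypothetical helper: builds the Speech Recognition settings as strings.
# The allowed values below are taken from this post, not from an official list.
SUPPORTED_LANGUAGES = {"English", "Spanish"}
KNOWN_CAPTION_FORMATS = {"ttml", "sami", "webvtt"}


def build_asr_settings(language, caption_formats=(),
                       generate_aib=False, generate_keywords=False):
    """Return the four settings keys from the table as a dict of strings."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")

    formats = [fmt.lower() for fmt in caption_formats]
    unknown = [fmt for fmt in formats if fmt not in KNOWN_CAPTION_FORMATS]
    if unknown:
        raise ValueError(f"Unknown caption format(s): {unknown}")

    return {
        "Language": language,
        # CaptionFormats is a semicolon-separated list (may be empty).
        "CaptionFormats": ";".join(formats),
        # Boolean flags are written as "True" / "False".
        "GenerateAIB": str(bool(generate_aib)),
        "GenerateKeywords": str(bool(generate_keywords)),
    }


if __name__ == "__main__":
    settings = build_asr_settings(
        "Spanish", ["ttml", "sami", "webvtt"],
        generate_aib=True, generate_keywords=True,
    )
    for key, value in settings.items():
        print(f"{key} = {value}")
```

Each key/value pair returned here corresponds to one row of the table; how the pairs are embedded in the configuration XML is up to your own tooling.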

With the new configuration laid out here, transitioning to v1.2 shouldn’t be very difficult, and we at Azure Media Services can’t wait to see the cool experiences you can develop with the addition of the Spanish language to Azure Media Indexer.

Special thanks go out to the Microsoft Machine Translation and Speech teams in Microsoft Research for the new language model, as well as to the Microsoft Research Asia team for their continued help in the development of the Azure Media Indexer media processor!

As always, feel free to reach out to us with any questions or comments at indexer@microsoft.com.

Not sure what Azure Media Indexer is?  Check out the introductory blog post here!

Indexer has been updated!  Check out the release notes for version 1.2.1 to learn more.