Today I am proud to announce the general availability of Azure Media Indexer v1.2!  From a bird’s eye view, this release comprises the following:

  • Support for the Spanish language
  • New configuration XML format

 

Language Support

Azure Media Indexer launched with support for only the English language, and this was a pain point for many customers.  Lighting up each new language model (and its associated acoustic model) requires a large coordinated effort across Research and Engineering.  Today, Indexer takes its first step towards universal support by introducing the Spanish language!

Stay tuned to the Azure blog for upcoming announcements of the languages that are in the works, such as Italian and Mandarin.

 

Configuration

As Azure Media Indexer matures and grows as a product, we have needed to change the schema of the configuration XML provided alongside each Indexer job.  This extension of our XML format will not break existing workflows, but it does expose some new nodes that allow for more powerful and customizable Indexer jobs.

I’ll start by showing an example of the old configuration (1.1.x.x), followed by a new configuration that runs an equivalent Indexing job.  This way, you can drop the new configuration into your workflow right away and start using the new format.

 

Old configuration


  
    
    
  
  
  

 

The following configuration in the new format defines a Speech Recognition job in the English language which has the same outputs:

  • TTML
  • SAMI
  • WebVTT
  • AIB
  • keyword XML file

 

New Configuration


  
    
    
  
  
  

  
  
      
      
        
        
        
        
      
    
  
  

 

As you may have noticed, all of the additions come after the (still unused) global node.  The new configuration reflects the possibility of Azure Media Indexer eventually supporting functions beyond pure Automatic Speech Recognition (ASR), and hence includes a node that lists features.  As of version 1.2, the only feature available is Speech Recognition.

The Speech Recognition feature has the following settings keys:

Key | Description | Example value(s)
Language | The natural language to be recognized in the multimedia file | English or Spanish
CaptionFormats | A semicolon-separated list of the desired output caption formats (if any) | ttml;sami;webvtt or ttml;sami or webvtt
GenerateAIB | A boolean flag specifying whether an AIB file is required (for use with SQL Server and the custom Indexer IFilter).  For more information on the AIB file, check out this previous blog post | True or False
GenerateKeywords | A boolean flag specifying whether a keyword XML file is required | True or False
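To make the four settings keys concrete, here is a minimal Python sketch that validates a set of values against the table above and serializes them into an XML fragment.  The key names and example values come from the table; everything else is an assumption made for illustration, including the element and attribute names (`configuration`, `features`, `feature`, `settings`, `add`, `name`, `key`, `value`), the feature name `ASR`, and the helper names `build_asr_settings` and `build_configuration`.  The sketch also leaves out the input metadata and the global node, so treat it as a starting point rather than the exact format Indexer expects.

```python
import xml.etree.ElementTree as ET

# Values taken from the settings table above.
SUPPORTED_LANGUAGES = {"English", "Spanish"}
SUPPORTED_CAPTION_FORMATS = {"ttml", "sami", "webvtt"}


def build_asr_settings(language="English",
                       caption_formats=("ttml", "sami", "webvtt"),
                       generate_aib=True,
                       generate_keywords=True):
    """Validate the Speech Recognition settings and return them as strings."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    formats = [fmt.lower() for fmt in caption_formats]
    unknown = [fmt for fmt in formats if fmt not in SUPPORTED_CAPTION_FORMATS]
    if unknown:
        raise ValueError(f"Unsupported caption format(s): {unknown}")
    return {
        "Language": language,
        # The table describes this value as a semicolon-separated list.
        "CaptionFormats": ";".join(formats),
        "GenerateAIB": str(bool(generate_aib)),
        "GenerateKeywords": str(bool(generate_keywords)),
    }


def build_configuration(settings):
    """Serialize the settings into an XML string.

    ASSUMPTION: every element and attribute name below is illustrative and
    is not confirmed by this post.
    """
    root = ET.Element("configuration")
    features = ET.SubElement(root, "features")
    feature = ET.SubElement(features, "feature", {"name": "ASR"})
    node = ET.SubElement(feature, "settings")
    for key, value in settings.items():
        ET.SubElement(node, "add", {"key": key, "value": value})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    spanish = build_asr_settings(language="Spanish",
                                 caption_formats=["ttml", "webvtt"],
                                 generate_aib=False)
    print(build_configuration(spanish))
```

Validating the values before submitting a job is a design choice of this sketch: it surfaces a misspelled language or caption format locally instead of waiting for the job to fail.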

With the new configuration laid out here, transitioning to v1.2 shouldn’t be very difficult, and we at Azure Media Services can’t wait to see the cool experiences you can develop with the addition of the Spanish language to Azure Media Indexer.

Special thanks go out to the Microsoft Machine Translation and Speech teams in Microsoft Research for the new language model, as well as the Microsoft Research Asia team for their continued help in the development of the Azure Media Indexer media processor!

As always, feel free to reach out to us with any questions or comments at indexer@microsoft.com.

Not sure what Azure Media Indexer is?  Check out the introductory blog post here!

Indexer has been updated!  Check out the release notes for version 1.2.1 to learn more.
