Empowering remote learning with Azure Cognitive Services

Announcing the general availability of Immersive Reader and other enhancements to improve learning engagement.

This blog post was co-authored by Anny Dow, Product Marketing Manager, Azure Cognitive Services.

As schools and organizations around the world prepare for a new school year, remote learning tools have never been more critical. Educational technology, and especially AI, has a huge opportunity to facilitate new ways for educators and students to connect and learn.

Today, we are excited to announce the general availability of Immersive Reader, and shine a light on how new improvements to Azure Cognitive Services can help developers build AI apps for remote education that empower everyone.

Make content more accessible with Immersive Reader, now generally available

Immersive Reader is an Azure Cognitive Service within the Azure AI platform that helps readers read and comprehend text. With today’s general availability, developers and partners can add Immersive Reader directly to their products, enabling students of all abilities to translate text into more than 70 languages, have text read aloud, focus their attention through highlighting and other design elements, and more.

Immersive Reader has become a critical resource for distance learning, with more than 23 million people using the tool every month to improve their reading and writing comprehension. Between February and May 2020, when many schools moved to a distance learning model, we saw a 560 percent increase in Immersive Reader usage. As the education community embarks on a new school year in the fall, we expect to see continued momentum for Immersive Reader as a tool for educators, parents, and students.

With the general availability of Immersive Reader, we are also rolling out the following enhancements:

  • Immersive Reader SDK 1.1: Updates include support for reading a page aloud automatically, pre-translating content, and more. Learn about SDK updates.
  • New Neural Text to Speech (TTS) languages: Immersive Reader is adding 15 new Neural Text to Speech voices, enabling students to have content read aloud in even more languages. Learn about the new Neural Text to Speech languages.
  • New Translator languages: Translator is adding five new languages that will also be available in Immersive Reader—Odia, Kurdish (Northern), Kurdish (Central), Pashto, and Dari. Learn about the latest Translator languages.

Today, we’re adding new partners who are integrating Immersive Reader to make content more accessible: Code.org and SAFARI Montage. Code.org is a nonprofit dedicated to expanding access to computer science in schools. To ensure that students of all backgrounds and abilities can access their resources and course content, Code.org is integrating Immersive Reader into their platform.

“We’re thrilled to partner with Microsoft to bring Immersive Reader to the community. The inclusive capabilities of Immersive Reader to improve reading fluency and comprehension in learners of varied backgrounds, abilities, and learning styles directly aligns with our mission to ensure every student in every school has the opportunity to learn computer science.” – Hadi Partovi, Founder and CEO of Code.org

SAFARI Montage, a leading learning object repository, is integrating Immersive Reader so that students of any language background or accessibility need can engage with content, and so that families who don’t speak the language of instruction can be more involved in their students’ learning journeys.

“Immersive Reader is a crucial support for CPS students and families. During remote learning, particularly for our younger learners, student learning is often supported by parents, guardians, or other caregivers. Since Immersive Reader can be used to translate the student-facing instructions in our digital curriculum, families can support student learning in over 80 languages, making digital learning far more equitable and accessible than ever before! In addition, read-aloud and readability supports are game-changers for diverse learners.” – Giovanni Benincasa, UX Manager, Department of Curriculum, Instruction, and Digital Learning, Chicago Public Schools

With Immersive Reader, all it takes is a single API call to help users boost literacy. To start exploring how to integrate Immersive Reader into your app or service, see the Immersive Reader documentation.

To see the growing list of Immersive Reader partners and learn more, check out our partners page and Immersive Reader education blog.
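
To make the "single API call" a little more concrete: the call itself is made in the browser through the Immersive Reader JavaScript SDK, as `ImmersiveReader.launchAsync(token, subdomain, content, options)`. It needs an Azure AD token and the subdomain of your Immersive Reader resource. The Python sketch below covers only the server-side step of preparing the `content` argument, a title plus a list of text chunks. The helper name `build_reader_content` and the sample text are hypothetical, and token acquisition is left out.

```python
import json


def build_reader_content(title, text, lang="en", mime_type="text/plain"):
    """Build a content object in the shape the Immersive Reader SDK accepts.

    Hypothetical helper for illustration: a title plus one chunk of text,
    tagged with its language and MIME type.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "title": title,
        "chunks": [{"content": text, "lang": lang, "mimeType": mime_type}],
    }


# Sample lesson text (made up for this sketch).
content = build_reader_content(
    "Photosynthesis",
    "Plants turn sunlight into energy.",
)

# Serialize for the page that will pass it to ImmersiveReader.launchAsync.
payload = json.dumps(content)
print(payload)
```

A page would typically embed this JSON and hand it to `launchAsync` along with the token and subdomain. Splitting longer material into several chunks, each with its own `lang`, is one way to handle mixed-language content.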

Bring online courses to life with speech-enabled apps

With the shift to remote learning, another challenge that educators may face is continuing to drive student engagement.

Text to Speech, a Speech service feature that allows users to convert text to lifelike audio, can facilitate new ways for students to interact with content. In addition to powering features like Read Aloud in Immersive Reader and the Microsoft Edge browser, Text to Speech enables developers to build apps that speak naturally in more than 110 voices across more than 45 languages and variants.

With the Audio Content Creation tool, users can more easily bring audiobooks to life and fine-tune audio characteristics like voice style, rate, pitch, and pronunciation to fit their scenarios, with no code required. Voices can even be customized for specific characters or personas; the Custom Neural Voice capability makes it possible to build one-of-a-kind voices, starting with 30 minutes of audio. Duolingo, for example, is using the Custom Neural Voice capability to create unique voices to represent different characters in its language courses.
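
For developers who prefer to script these adjustments rather than use the no-code tool, the same characteristics (voice, style, rate, pitch) can be expressed in SSML, the markup that Text to Speech accepts. The following is a minimal sketch, not the tool's own implementation: `build_ssml` is a hypothetical helper, and the voice name `en-US-AriaNeural` and style `cheerful` are assumptions that you should check against the voices available to your Speech resource. Sending the markup to the service is omitted.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-AriaNeural", style=None,
               rate="medium", pitch="medium", lang="en-US"):
    """Return an SSML document that sets voice, style, rate, and pitch.

    Hypothetical helper; the default voice name is an assumption.
    """
    # Escape the text so characters like & and < stay valid XML.
    body = (f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
            f"{escape(text)}</prosody>")
    if style:
        # Speaking styles use the mstts extension namespace.
        body = (f"<mstts:express-as style={quoteattr(style)}>"
                f"{body}</mstts:express-as>")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


ssml = build_ssml(
    "Once upon a time, a fox & a crow met.",
    style="cheerful",
    rate="+10%",
    pitch="-5%",
)

# Raises xml.etree.ElementTree.ParseError if the markup is not well formed.
ET.fromstring(ssml)
print(ssml)
```

Keeping the markup generation in one function makes it easy to give each character or persona in a course its own voice, style, and pacing by changing only the arguments.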

To learn more about how to start creating speech-enabled apps for remote learning, check out the technical Text to Speech blog.

Improve productivity and accessibility with transcription and voice commands 

AI can also be a useful tool for more seamless note-taking, making it possible for students and teachers to type with their voice. Transcribe in Word uses Speech to Text in Azure Cognitive Services to automatically transcribe your conversations. Now, with speaker diarization, you can get a transcript that identifies who said what and when.

In addition, adding voice enables more seamless experiences in Microsoft 365. Students who have difficulty writing things down can use AI-powered tools in Office not just for dictation but also for voice commands such as adding, formatting, editing, and organizing text. Word uses Language Understanding, an Azure Cognitive Service that enables you to add custom natural language understanding to your apps, to make it possible to capture ideas easily. To learn more about Language Understanding and how it is powering voice commands, check out our Language Understanding blog.

For more details on how AI is powering experiences in Microsoft 365, read the Microsoft 365 blog.

Get started today

We can’t wait to see what you’ll build. Get started today with Azure Cognitive Services and an Azure free account.