Solution architecture: Keyword search/speech-to-text/OCR digital media

A speech-to-text solution lets you identify speech in static video files so that you can manage it as standard content. For example, employees can search training videos for spoken words or phrases and then jump to the specific moment in the video where they occur. In this solution, you upload static videos to an Azure website. Azure Media Indexer uses the Speech API to index the speech within the videos and stores the results in SQL Azure. You can then search for words or phrases by using Azure Web Apps and retrieve a list of results. Selecting a result shows where in the video the word or phrase is mentioned.

This solution is built on the Azure managed services Content Delivery Network and Azure Search. These services run in a patched, supported, high-availability environment, so you can focus on your solution instead of the environment they run in.

[Diagram: Architecture of a keyword search/speech-to-text/OCR digital media solution, built on the Azure managed services Content Delivery Network and Azure Search. Labels shown: Source A/V Files; Azure Blob Storage; Azure Encoder (Standard or Premium); Azure Media Indexer/OCR Media Processor; TTML, WebVTT; Keywords; Azure Search; Streaming Endpoint; Multi-Protocol Dynamic Packaging/Multi-DRM; Azure CDN; Web Apps; Azure Media Player.]

Implementation guidance

Products Documentation

Blob storage

Stores large amounts of unstructured data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS. You can use Blob storage to expose data publicly to the world, or to store application data privately.

Azure encoding

Encoding jobs are one of the most common processing operations in Media Services. You create encoding jobs to convert media files from one encoding to another.

Azure streaming endpoint

Represents a streaming service that can deliver content directly to a client player application, or to a content delivery network (CDN) for further distribution.

Content Delivery Network

Provides secure, reliable content delivery with broad global reach and a rich feature set.

Azure Media Player

Uses industry standards, such as HTML5 (MSE/EME), to provide an enriched adaptive streaming experience. Regardless of the playback technology used, developers have a unified JavaScript interface to access APIs.

Azure Search

Delegates search-as-a-service server and infrastructure management to Microsoft, leaving you with a ready-to-use service that you can populate with your data, and then use to add search to your web or mobile application.
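
As a rough illustration of adding search to a web application, the sketch below builds (but does not send) a phrase query against the Azure Search REST endpoint. The service name, index name, and the field names transcript, videoId, and startSeconds are hypothetical and not taken from this architecture; the API version is also an assumption, so substitute one that your service supports.

```python
import json
import urllib.request
from typing import Dict, Tuple
from urllib.parse import quote

# Assumption: replace with an api-version supported by your search service.
API_VERSION = "2020-06-30"


def build_search_request(
    service: str, index: str, api_key: str, phrase: str, top: int = 10
) -> Tuple[str, Dict[str, str], bytes]:
    """Build the URL, headers, and JSON body for a phrase search."""
    url = (
        f"https://{quote(service)}.search.windows.net"
        f"/indexes/{quote(index)}/docs/search?api-version={API_VERSION}"
    )
    headers = {"Content-Type": "application/json", "api-key": api_key}
    # Double quotes ask for a phrase match; embedded quotes are escaped.
    escaped = phrase.replace('"', '\\"')
    body = {
        "search": f'"{escaped}"',
        # Hypothetical index fields, for illustration only.
        "searchFields": "transcript",
        "select": "videoId,startSeconds,transcript",
        "top": top,
    }
    return url, headers, json.dumps(body).encode("utf-8")


# Hypothetical service, index, and key values.
url, headers, body = build_search_request(
    "contoso-media", "video-transcripts", "<query-key>", "protective goggles", top=5
)
# The request object is only constructed here; pass it to
# urllib.request.urlopen to actually call the service.
request = urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Keeping request construction separate from sending makes the query shape easy to inspect and test without a live search service.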

Web Apps

Hosts the website or web application.

Azure Media Indexer

Enables you to make the content of your media files searchable and to generate a full-text transcript for closed-captioning and keywords. You can process one media file or multiple media files in a batch.
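
To make the idea of a searchable transcript concrete, here is a minimal sketch that parses WebVTT caption text (one of the output formats named in the architecture diagram) and finds the cues that contain a phrase. It is not Azure Media Indexer's own API; the Cue type, the function names, and the sample transcript are all illustrative assumptions, and the parser handles only basic cues.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # seconds from the beginning of the video
    end: float
    text: str


# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.500".
# The hours field is optional in WebVTT.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis) -> float:
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000.0


def parse_webvtt(vtt: str) -> List[Cue]:
    """Parse basic WebVTT cues; blocks without a timing line are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                g = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_to_seconds(*g[:4]), _to_seconds(*g[4:]), text))
                break
    return cues


def find_phrase(cues: List[Cue], phrase: str) -> List[Cue]:
    """Return the cues whose text contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [cue for cue in cues if needle in cue.text.lower()]


# Hypothetical transcript used only for illustration.
SAMPLE = """WEBVTT

00:00:01.000 --> 00:00:04.500
Welcome to the safety training video.

00:01:12.250 --> 00:01:15.000
Always wear protective goggles in the lab.
"""

hits = find_phrase(parse_webvtt(SAMPLE), "goggles")
```

Each hit carries a start time in seconds, which a player can use as the seek position when a user selects a search result.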

Related solution architectures

Video-on-demand digital media

A basic video-on-demand solution lets you stream recorded video content, such as movies, news clips, sports segments, training videos, and customer support tutorials, to any video-capable endpoint device, mobile application, or desktop browser. Video files are uploaded to Azure Blob storage, encoded to a multi-bitrate standard format, and then distributed via all major adaptive bitrate streaming protocols (HLS, MPEG-DASH, Smooth Streaming) to the Azure Media Player client.
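
The protocols named above are typically selected through the manifest URL that the player requests. The sketch below derives the three URLs from one base, following the format suffixes commonly documented for Azure Media Services dynamic packaging; treat the suffixes as an assumption to verify against your account. The host name, locator ID, and manifest name are hypothetical.

```python
from typing import Dict

# Suffix appended to the manifest path for each protocol.
# Assumption: these match the dynamic packaging formats of your account.
_FORMAT_SUFFIX = {
    "smooth": "",
    "hls": "(format=m3u8-aapl)",
    "dash": "(format=mpd-time-csf)",
}


def streaming_urls(endpoint_host: str, locator_id: str, manifest_name: str) -> Dict[str, str]:
    """Return one streaming URL per protocol for a single published asset."""
    base = f"https://{endpoint_host}/{locator_id}/{manifest_name}.ism/manifest"
    return {protocol: base + suffix for protocol, suffix in _FORMAT_SUFFIX.items()}


# Hypothetical endpoint, locator, and manifest values.
urls = streaming_urls(
    "contoso.streaming.mediaservices.windows.net",
    "00000000-0000-0000-0000-000000000000",
    "training",
)
```

Because all three URLs point at the same encoded asset, the client can pick whichever protocol its platform supports without the content being stored once per format.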

Live streaming digital media

A live streaming solution lets you capture video and broadcast it to consumers in real time, for example to stream interviews, conferences, and sporting events online. In this solution, video is captured by a video camera and sent to a channel input endpoint. The channel receives the live input stream and makes it available for streaming through a streaming endpoint to a web browser or mobile app. The channel also provides a preview monitoring endpoint so that you can preview and validate your stream before further processing and delivery. The channel can also record and store the ingested content so that it can be streamed later (video-on-demand).
