
Solution architecture: Keyword search/speech-to-text/OCR digital media

A speech-to-text solution identifies speech in static video files so that you can manage it as standard content. For example, employees can search training videos for spoken words or phrases and then jump straight to the moment in the video where they occur. In this solution, you upload static videos to an Azure website. The Azure Media Indexer uses the Speech API to index the speech within the videos and stores it in SQL Azure. You can then search for words or phrases by using Azure Web Apps and retrieve a list of results. Selecting a result shows you where in the video the word or phrase is mentioned.
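The search-and-seek flow described above can be sketched in a few lines of Python. This is only an illustration, not the Azure Media Indexer or Azure Search API: the `Cue` type, the `find_phrase` helper, and the sample transcript are hypothetical, and a real deployment would query the search service rather than scan cues in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """One timed transcript segment (hypothetical shape, for illustration)."""
    video_id: str
    start_seconds: float
    text: str


def find_phrase(cues: list[Cue], phrase: str) -> list[tuple[str, float]]:
    """Return (video_id, start_seconds) for every cue containing the phrase.

    Matching is a case-insensitive substring match that ignores
    differences in whitespace.
    """
    needle = " ".join(phrase.lower().split())
    if not needle:
        return []
    hits = []
    for cue in cues:
        haystack = " ".join(cue.text.lower().split())
        if needle in haystack:
            hits.append((cue.video_id, cue.start_seconds))
    return hits


# Sample data standing in for the indexed transcripts of two training videos.
SAMPLE_CUES = [
    Cue("onboarding", 0.0, "Welcome to the onboarding course."),
    Cue("onboarding", 12.5, "First, reset your password from the portal."),
    Cue("security", 3.0, "Never share your password with anyone."),
    Cue("security", 47.25, "Report phishing emails to the help desk."),
]

if __name__ == "__main__":
    for video_id, start in find_phrase(SAMPLE_CUES, "your password"):
        print(f"{video_id}: seek to {start:.1f}s")
```

Each hit carries a start time, which is what lets the player seek to the moment the phrase is spoken.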

This solution is built on Azure managed services: Content Delivery Network and Azure Search. These services run in a high-availability environment that is patched and supported, so you can focus on your solution instead of the environment it runs in.

[Architecture diagram. Components shown: Source A/V Files; Azure Blob Storage; Azure Encoder (Standard or Premium); Azure Media Indexer/OCR Media Processor; TTML, WebVTT, Keywords; Azure Search; Web Apps; Streaming Endpoint (Multi-Protocol Dynamic Packaging/Multi-DRM); Azure CDN; Azure Media Player]

Implementation guidance

Products and descriptions

Blob Storage

Stores large amounts of unstructured data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS. You can use Blob Storage to expose data publicly to the world or to store application data privately.

Azure encoding

Encoding jobs are one of the most common processing operations in Media Services. You create encoding jobs to convert media files from one encoding to another.

Azure streaming endpoint

Represents a streaming service that can deliver content directly to a client player application, or to a content delivery network (CDN) for further distribution.

Content Delivery Network

Provides secure, reliable content delivery with a broad global reach and rich feature set.

Azure Media Player

Uses industry standards, such as HTML5 (MSE/EME), to provide an enriched adaptive streaming experience. Regardless of the playback technology used, developers have a unified JavaScript interface to access APIs.

Azure Search

Delegates search-as-a-service server and infrastructure management to Microsoft, leaving you with a ready-to-use service that you can populate with your data and use to add search to your web or mobile application.

Web Apps

Hosts the website or web application.

Azure Media Indexer

Enables you to make the content of your media files searchable and to generate a full-text transcript for closed-captioning and keywords. You can process one media file or multiple media files in a batch.
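To make a generated transcript searchable, an application first has to turn the caption file into timed text segments. The sketch below parses WebVTT, one of the caption formats named in the architecture diagram, into `(start, end, text)` tuples. It is a minimal illustration under stated assumptions: `parse_webvtt` and the sample file are hypothetical, the parser handles only the basic cue layout (optional identifier, timing line, text lines), and it is not part of any Azure SDK.

```python
import re

# Matches "hh:mm:ss.ttt --> hh:mm:ss.ttt"; the hours part is optional in WebVTT.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis):
    """Convert captured timestamp parts to seconds (hours may be None)."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_webvtt(text):
    """Return a list of (start_seconds, end_seconds, caption_text) tuples."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    # Blocks (header, notes, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                parts = match.groups()
                start = _to_seconds(*parts[:4])
                end = _to_seconds(*parts[4:])
                # Everything after the timing line is caption text.
                caption = " ".join(s.strip() for s in lines[i + 1:] if s.strip())
                cues.append((start, end, caption))
                break  # Blocks without a timing line (e.g. the header) are skipped.
    return cues


SAMPLE_VTT = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Welcome to the safety training.

2
01:05.500 --> 01:09.000 align:start
Always wear
protective gloves.
"""

if __name__ == "__main__":
    for start, end, caption in parse_webvtt(SAMPLE_VTT):
        print(f"{start:7.1f}s to {end:7.1f}s  {caption}")
```

Processing several files in a batch then amounts to calling the parser once per caption file and tagging each cue with the video it came from.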

Related solution architectures

[Architecture diagram. Components shown: Mezzanine Video Files; Azure Blob Storage; Azure Encoder (Standard or Premium); Streaming Endpoint (Multi-Protocol Dynamic Packaging/Multi-DRM); Azure CDN; Cloud DRM License/Key Delivery Server; Token; License/Key; Azure Media Player in Browser; Azure Media Player in Mobile App]

Video-on-demand digital media

A basic video-on-demand solution gives you the capability to stream recorded video content, such as films, news clips, sports segments, training videos and customer support tutorials, to any video-capable endpoint device, mobile application or desktop browser. Video files are uploaded to Azure Blob Storage, encoded to a multi-bitrate standard format, then distributed via all major adaptive bitrate streaming protocols (HLS, MPEG-DASH, Smooth) to the Azure Media Player client.
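Encoding to multiple bitrates matters because an adaptive bitrate player switches between renditions as network conditions change. The sketch below shows one simple throughput-based selection rule. It is an assumption made for illustration, not the documented behavior of Azure Media Player: the bitrate ladder, the `pick_rendition` helper, and the 0.8 safety factor are all hypothetical.

```python
# Hypothetical bitrate ladder (kbps) produced by a multi-bitrate encode.
LADDER_KBPS = [400, 1000, 2200, 4500]


def pick_rendition(measured_kbps: float, ladder: list[int], safety: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a fraction of measured throughput.

    The safety factor leaves headroom so that small throughput dips do not
    immediately stall playback. If nothing fits, fall back to the lowest
    rendition so playback can still start.
    """
    if not ladder:
        raise ValueError("ladder must contain at least one bitrate")
    budget = measured_kbps * safety
    candidates = [bitrate for bitrate in sorted(ladder) if bitrate <= budget]
    return candidates[-1] if candidates else min(ladder)


if __name__ == "__main__":
    for throughput in (300, 3000, 10000):
        chosen = pick_rendition(throughput, LADDER_KBPS)
        print(f"measured {throughput} kbps -> play {chosen} kbps rendition")
```

Real players typically also weigh buffer level and screen size, so treat this as the simplest possible form of the idea.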

[Architecture diagram. Components shown: Live Source; Azure Live Encoder; 3rd Party On-Premises Live Encoder; Channel; Program; Preview Monitoring; Azure Blob Storage; Streaming Endpoint (Multi-Protocol Dynamic Packaging/Multi-DRM); Azure CDN; Cloud DRM License/Key Delivery Server; Token; License/Key; Azure Media Player in Browser; Azure Media Player in Mobile App]

Live-streaming digital media

A live-streaming solution allows you to capture video and broadcast it to consumers in real time, for example to stream interviews, conferences and sporting events online. In this solution, video is captured by a video camera and sent to a channel input endpoint. The channel receives the live input stream and makes it available for streaming through a streaming endpoint to a web browser or mobile app. The channel also provides a preview monitoring endpoint so that you can preview and validate your stream before further processing and delivery. The channel can also record and store the ingested content so that it can be streamed later (video on demand).
