Dynamic Manifests and Rendered Sub-Clips, Essential Tools for Live Streaming | Azure Blog


I’m very excited to announce the arrival of two new sets of capabilities in our media platform which significantly enhance the toolbox of anyone building live and linear workflows. In this post, I’ll give you a high-level introduction to these capabilities and their usage scenarios. Stay tuned for follow-up posts when these feature sets are available for public use. Cenk (PM owner of dynamic manifests) and Anil (PM owner of sub-clipping) will go into usage/implementation details, including code samples, in the following weeks.

Dynamic manifests (a.k.a. Dynamic Manifest Manipulation or DMM)

Probably one of the more geeky names for a feature set you will hear all year but stick with me because I think you’ll like what you read. Dynamic manifests allow you to define filters on the streaming manifests for your assets to do a number of cool things. As with all our streaming features, it works equally well across all our supported protocols (HLS, MPEG-DASH, Smooth Streaming, and HDS) so when you define a filter, it will be applied the same when we dynamically package into each of those protocols.

Dynamic manifests are essentially filters that can be applied to the streaming manifests for your assets. They can be defined at either the asset level or a global level, which gives you a bunch of flexibility. For instance, at the asset level you may want to define a timeline trim type filter (described below) because more than likely the locations where you want to perform your trims will vary asset by asset. However, if you’re using the rendition filtering capabilities (also described below) to define the quality levels you want to deliver to different device types, it will probably be more efficient to describe those filters once at a global level, since you probably have a business policy that applies to all your streaming assets.

If one filter is good, two is even better. In fact, you can define multiple filters globally and at the asset level. So for instance you may want different timeline trims on your asset for different audiences and you will certainly have different device types and need to have a rendition filter for each of them. You may even want to combine filters to filter renditions based on device type and filter the timeline to cut it down to the start of the event.
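To make the idea of named, combinable filters concrete, here is a minimal sketch in Python. It is not the platform's API: the filter stores, the filter names, the field names (`max_height`, `min_bitrate_kbps`, `start_s`, `end_s`), and the rule that an asset-level filter shadows a global one of the same name are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical filter stores; names and fields are illustrative only.
GLOBAL_FILTERS: Dict[str, Dict[str, int]] = {
    "mobile": {"max_height": 720},
    "ott": {"min_bitrate_kbps": 1000},
}
ASSET_FILTERS: Dict[str, Dict[str, Dict[str, int]]] = {
    "baseball-archive": {
        "eventonly": {"start_s": 300, "end_s": 11100},
    },
}


def resolve_filters(asset_id: str, names: List[str]) -> Dict[str, int]:
    """Merge the named filters into one set of constraints.

    Assumption: an asset-level filter shadows a global filter of the same name.
    """
    asset_level = ASSET_FILTERS.get(asset_id, {})
    merged: Dict[str, int] = {}
    for name in names:
        if name in asset_level:
            definition = asset_level[name]
        elif name in GLOBAL_FILTERS:
            definition = GLOBAL_FILTERS[name]
        else:
            raise KeyError(f"unknown filter: {name}")
        merged.update(definition)
    return merged


# Combine a device-type rendition filter with an asset-specific timeline trim.
combined = resolve_filters("baseball-archive", ["mobile", "eventonly"])
```

Here `combined` carries both the rendition constraint from the global "mobile" filter and the trim points from the asset-level "eventonly" filter, which mirrors the idea of stacking a device filter on top of a timeline filter.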

There are two ways that you can filter your streaming manifests which I will detail in the sections below.

Rendition filtering

We live in a multi-platform world and for that, we have our dynamic packaging capabilities to make your assets available for streaming in all the major protocols. However, we also live in a world with dramatically different screen sizes from 4″ mobile phones to 72″ LCD displays. To cover all those devices from resolution and bandwidth perspectives, you may need upwards of 10 renditions on your assets and that’s just for video. On the audio side, you may have a number of different renditions as well including AAC for the mobile devices and Dolby Digital+ for the OTT devices.

If you put all those video and audio renditions into a single asset the problem you face without dynamic manifests is how to prevent devices from playing renditions that are not appropriate for them. For instance on large screen OTT devices you probably don’t want to serve a manifest which includes the 240p video rendition that you’ve created for your mobile devices since those viewers will need to ramp up through some really blocky video.

There are two ways I’ve seen customers solve this problem without DMM. On devices that support it, they write client-side logic to filter the manifest or specify the renditions they want to play. The problem with this approach is that building, testing, and maintaining client-side logic for all the platforms you need to reach can be very time consuming and problematic. The other option, and the one that is most commonly used, is to create different assets for different devices. The obvious pain with that approach is that you need to build, test, and maintain a much more complicated workflow to output an asset for each device type, and your costs rise dramatically due to the additional development, asset preparation processing, and storage.

However with dynamic manifests you only need one asset preparation workflow which creates one asset with all the renditions needed for all devices. Then on that asset you can apply filters to the renditions it contains which are applied at request time on the server. The net result is that each device receives a manifest that only contains the renditions that are appropriate for it. Thus no client-side logic is needed. You can define your filtering rules based on bit rate, resolution, and codec and as mentioned earlier you can also define multiple filters that can be applied to your asset so you can have one for each device type you want to serve.

So to summarize, rendition filtering is a very powerful tool that will enable you to deliver the correct renditions to the right devices while also reducing the complexity of the workflows and client logic you need to build as well as your costs for preparing and storing your assets.

In the example above I prepared my asset using Azure Media Encoder’s “H264 Adaptive Bitrate MP4 Set 1080p” preset, which transcoded my mezzanine asset into seven ISO MP4 video renditions from 180p to 1080p that can be dynamically packaged into any of our supported streaming protocols. At the top of the diagram I’ve requested the HLS manifest for the asset with no filters specified, so the response I got had all seven renditions. In the bottom left I’ve again asked for the HLS manifest, but this time I’ve specified that I also want the manifest filtered to remove all bitrates below 1 Mbps, which has resulted in the bottom two quality levels being stripped off in the response. In the bottom right I’m again using HLS, and this time I’ve specified a “mobile” filter which says that I don’t want any renditions where the resolution is larger than 720p, which gave me a response with the two 1080p renditions stripped off.
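The rendition filtering described in this section can be sketched in a few lines of Python. This is only an illustration of the concept, not the service's implementation: the `Rendition` and `RenditionFilter` types are hypothetical, and the bitrates in the ladder are made-up values chosen so that the results match the example (seven renditions, two below 1 Mbps, two at 1080p).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution in pixels, e.g. 720 for 720p
    bitrate_kbps: int  # encoded bitrate in kilobits per second
    codec: str         # e.g. "h264"


@dataclass(frozen=True)
class RenditionFilter:
    # Hypothetical filter shape; None means "no constraint on this property".
    min_bitrate_kbps: Optional[int] = None
    max_height: Optional[int] = None
    codec: Optional[str] = None

    def allows(self, r: Rendition) -> bool:
        if self.min_bitrate_kbps is not None and r.bitrate_kbps < self.min_bitrate_kbps:
            return False
        if self.max_height is not None and r.height > self.max_height:
            return False
        if self.codec is not None and r.codec != self.codec:
            return False
        return True


def filter_manifest(renditions: List[Rendition], flt: RenditionFilter) -> List[Rendition]:
    """Return only the renditions the filter allows; the asset itself is untouched."""
    return [r for r in renditions if flt.allows(r)]


# Illustrative seven-rung ladder from 180p to 1080p (bitrates are assumptions).
LADDER = [
    Rendition(180, 400, "h264"),
    Rendition(270, 650, "h264"),
    Rendition(360, 1000, "h264"),
    Rendition(540, 1500, "h264"),
    Rendition(720, 2250, "h264"),
    Rendition(1080, 3400, "h264"),
    Rendition(1080, 4700, "h264"),
]

unfiltered = filter_manifest(LADDER, RenditionFilter())                   # all seven
no_low = filter_manifest(LADDER, RenditionFilter(min_bitrate_kbps=1000))  # drops bottom two
mobile = filter_manifest(LADDER, RenditionFilter(max_height=720))         # drops both 1080p
```

Because the filter is evaluated against the list at request time, the same stored ladder can answer all three requests without any per-device copies of the asset.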

Timeline trimming (for removing pre- and post-roll slates)

In a typical live event workflow, you will start a live stream several minutes before your event starts to ensure that everything is working well and to allow your audience to connect early and thus not miss the start of the event. During this time, you will probably display a slate that says something on the order of “Please stand by, your event will start shortly.” I will call that a pre-roll slate. Then, when the event ends you may want to keep the stream running with a slate that says something like “Thank you for watching, your event has now ended.” I’ll call this a post-roll slate.

This works well when the stream is live, but when you stop your Program and the asset instantly becomes available for on-demand viewing, it will also have the pre-roll and post-roll. That means anyone viewing the on-demand stream, and thus starting playback from the beginning of the archive, will be left to seek through the timeline looking for the start of the event. Some customers choose to tackle this problem by writing client-side logic to seek the video player forward to the start of the event, but there are a couple of problems with this approach. First, you are again faced with the challenges of writing, testing, and maintaining client-side logic across all of the platforms you want to reach. Second, the pre-roll is still there in the manifest delivered to the client, so anyone who decides to seek back to the beginning may find themselves diving into a pre-roll slate and need to seek again to find the beginning. Additionally, the timeline duration in the video player doesn’t properly reflect the actual duration of the event.

With dynamic manifests you can specify (again through a filter) that you want all fragments in the manifest before the event start and after the event end to be removed, and we will do that for you on the server, dynamically at the time of request. That last part is important to understand: the whole asset is still there, including the whole manifest, so you can have different filters with different trim points and you can still request the stream with no filters. Again, no client-side logic is needed since all the video player has to do is play the manifest it is given.

Even better, dynamic manifest trims can actually be applied while an event is still live. In other words, as soon as you drop your slate and the event starts, you can define and apply a filter to the stream to trim off the pre-roll slate. That way any of your viewers who arrived late and decide to watch the event from the beginning by seeking back aren’t left looking for the start of the event.

NOTE: Trimming removes whole fragments from the manifest which means the granularity of your trim placement is limited by the size of the fragments you’re using (a.k.a. your GOP length). Typically with our live platform most people are using 2 second fragments for ingest which means you will have a granularity of 2 seconds for your cut/trim placement.

In the diagram above, I’m showing an example of using a trim filter to remove pre- and post-roll slates from a baseball game archive. The timeline on the top shows the full event archive with no filter specified in the manifest request, so what is returned in the response is a manifest which contains everything in the asset, including the pre- and post-roll slates. After the event ended, I created a filter on the asset named “eventonly” which specifies that I want all fragments with a time code value of less than X (start time of the game) or greater than Y (end time of the game) to be filtered out of the manifest. In the second line of the diagram, I’ve now applied the filter by adding “filter=eventonly” to my manifest request. What is returned to me in the response is a manifest that only contains the fragments from the start to the end of the event.
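The trimming behavior and the granularity note can be sketched in Python as follows. This is a simplified model under stated assumptions, not the service's logic: fragments are assumed to have a fixed 2-second duration, each fragment is identified by its start time in seconds, and a fragment is kept when its start time falls inside the half-open range from X to Y. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def trim_fragments(fragment_starts: List[int], start_s: int, end_s: int) -> List[int]:
    """Keep fragments whose start time falls inside [start_s, end_s)."""
    return [t for t in fragment_starts if start_s <= t < end_s]


def effective_window(
    fragment_starts: List[int], fragment_s: int, start_s: int, end_s: int
) -> Optional[Tuple[int, int]]:
    """Return the (start, end) actually covered after trimming whole fragments."""
    kept = trim_fragments(fragment_starts, start_s, end_s)
    if not kept:
        return None
    return kept[0], kept[-1] + fragment_s


FRAGMENT_S = 2                               # assumed fixed fragment duration
archive = list(range(0, 600, FRAGMENT_S))    # a 10-minute archive: 0, 2, 4, ..., 598

# Trim points that sit on fragment boundaries are honored exactly.
aligned = effective_window(archive, FRAGMENT_S, 60, 120)    # (60, 120)

# Trim points between boundaries snap to whole fragments.
unaligned = effective_window(archive, FRAGMENT_S, 61, 125)  # (62, 126)
```

The second call shows the practical effect of the granularity limit: a requested window of 61 to 125 seconds ends up covering 62 to 126 seconds, because only whole fragments can be kept or dropped. The full `archive` list is never modified, which matches the point that the unfiltered manifest remains available.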

Timeline trimming (for creating sub-clips)

Another way you can use the timeline trimming capabilities of dynamic manifests is to cut your manifest all the way down to a sub-clip length. Suppose the event you are covering is a baseball game, and in the fourth inning there’s a home run which you want to offer as a short-form highlight (a.k.a. sub-clip) for on-demand viewing. You can use dynamic manifests for this purpose as well. Much like above, you’re again specifying cut or trim points; the only difference is that you’re cutting the timeline down much further, to what is most likely only a 1-2 minute clip. Since filter definitions don’t require any processing (i.e. transcoding), they are available almost instantly, which means your sub-clip of the home run can be available to your audience as fast as you can specify the cut points.

Using dynamic manifests in this way comes with two caveats. #1, as noted above, your trim/cut points can only be at fragment/GOP boundaries. #2, you are not creating a new asset, only a filter on the full event archive. That means that the sub-clip will have the same life-cycle and security attributes as the full event archive. In other words, if you delete the full event archive, all of the sub-clips you’ve created in this way go with it. Likewise, if the full event archive is configured for dynamic encryption, all of its sub-clips will be encrypted as well.

In the diagram above I am again using an asset which contains the full event archive of a baseball game, and in the first row I’ve specified that I want a manifest that contains all the fragments in the archive. However, in the second row, instead of trimming down to the event start and end as I did above, I’ve this time built a filter that trims all the way down to a short-form clip of the game-winning home run.

Rendered Sub-Clips

I think every customer I’ve talked to over the last year that does any live streaming has wanted to be able to create sub-clips from those live streams. In other words, they want to be able to cut out some portion of the live stream in order to create a new on-demand asset. The prototypical example is in live sports coverage where there will typically be highlight moments like the home run I described above that you want to cut out and make available on your site, in your app, or on social media.

As I described above in the dynamic manifest section, timeline trimming could be used for this purpose if you’re willing to live with the caveats. However, I think most people will want to use another new feature we have, which is rendered sub-clipping. In contrast to dynamic manifests, which trim down the full event archive of the live stream, rendered sub-clips extract the portion of the stream you want and create a new asset, hence the “rendered” part of the name. The benefits of this approach are that your sub-clips can then have their own life-cycle, independent of the stream they were cut from, and can have their own security properties. For instance, the full event may be DRM’d but the sub-clips can be in the clear.

Typically, the name of the game with sub-clipping is speed: how quickly can you have the highlight of that home run available to fans on your site after it occurs? Performance is definitely top of mind for us in building this feature. The first thing to note is that rendered sub-clips can be described and rendered while the event is still live. Second, we provide two rendering options depending on your requirements. If you’re OK with cutting at fragment/GOP boundaries, we offer a no-transcode option which is very fast since we’re only copying fragments from the live asset to a new asset. However, if you need frame accuracy, we offer another option which does a transcode to achieve frame accuracy.

In the diagram above, I’ve again gone back to the same example event of a baseball game and the first row again shows the full archive of the event. This time rather than using a dynamic manifest filter to cut down the streaming manifest to create my highlight of the home run I’ve instead created a sub-clipping job which I have processed. The end result is a new asset which contains a frame accurate clip of the home run so when I request the streaming manifest for the asset with no filters I get back the 30 seconds of the home run.
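The choice between the two rendering options can be thought of as a small decision rule, sketched below in Python. This is an illustration only: the function name, the `"copy"` and `"transcode"` labels, the 2-second default fragment size, and the idea that cut points already sitting on fragment boundaries need no transcode are all assumptions, not the sub-clipping API.

```python
def choose_render_mode(
    start_ms: int,
    end_ms: int,
    fragment_ms: int = 2000,
    need_frame_accuracy: bool = False,
) -> str:
    """Pick a hypothetical rendering mode for a sub-clip.

    "copy"      -> copy whole fragments into the new asset (fast, no transcode)
    "transcode" -> re-encode so the clip starts and ends on the exact frames
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")

    # Cut points that already sit on fragment boundaries lose nothing by copying.
    on_boundary = start_ms % fragment_ms == 0 and end_ms % fragment_ms == 0
    if need_frame_accuracy and not on_boundary:
        return "transcode"
    return "copy"


fast_clip = choose_render_mode(60_000, 90_000)                             # "copy"
exact_clip = choose_render_mode(61_500, 91_500, need_frame_accuracy=True)  # "transcode"
rough_clip = choose_render_mode(61_500, 91_500)                            # "copy"
```

The trade-off is the one described in this section: copying is the quick path when boundary-level precision is acceptable, while the transcode path costs processing time in exchange for frame accuracy.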

Conclusion

Dynamic manifests and rendered sub-clipping are powerful new tools to add to your toolbox. Watch here for the detailed implementation posts in the coming weeks. Please ask any questions you have in the comments section below and I’ll respond as quickly as I can.