Calculating Billable Gigabytes for Media Encoding Jobs

With Media Services, when you submit a job with a task that uses the Azure Media Encoder Media Processor, you are charged based on the amount of data processed. The Azure billing portal shows an aggregate number of gigabytes processed over a billing period, but it doesn’t break the charges down by job or task. In this post, I will go over some sample code that you can use to generate a breakdown of billable gigabytes at the job and task level. The post also covers how to use Excel Power Query to analyze the data generated by the sample code.

Media Assets in Storage

When you create a Media Asset, Media Services generates a GUID and uses that GUID to create a Media Asset Id. The Media Asset Id consists of the prefix “nb:cid:UUID:” followed by the GUID. In other words, the Media Asset Id takes the form “nb:cid:UUID:<GUID>”. Media Services also creates a container named “asset-<GUID>” in the specified Storage Account. Once the Asset is created, you can upload Asset files to the storage container. When you submit a job with an encoding task, the output files from the encoder are placed in the Storage container associated with the output asset.
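The mapping from an Asset Id to its container name can be sketched as a small helper. This is only an illustration: the class and method names (AssetNaming, GetContainerName) are hypothetical and not part of the Media Services SDK, and the helper assumes the “nb:cid:UUID:” prefix described above.

```csharp
using System;

public static class AssetNaming
{
    // Prefix that Media Services places in front of the GUID in an Asset Id.
    private const string AssetIdPrefix = "nb:cid:UUID:";

    // Hypothetical helper: returns the storage container name ("asset-<GUID>")
    // for an Asset Id of the form "nb:cid:UUID:<GUID>".
    public static string GetContainerName(string assetId)
    {
        if (assetId == null)
        {
            throw new ArgumentNullException("assetId");
        }

        if (!assetId.StartsWith(AssetIdPrefix, StringComparison.Ordinal))
        {
            throw new ArgumentException("Unexpected Asset Id format: " + assetId, "assetId");
        }

        return "asset-" + assetId.Substring(AssetIdPrefix.Length);
    }
}
```

For example, with a made-up Asset Id, AssetNaming.GetContainerName("nb:cid:UUID:11111111-2222-3333-4444-555555555555") returns "asset-11111111-2222-3333-4444-555555555555".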

Sample Code

Given the above, here is how the provided sample code works:

  • The code enumerates through all the jobs for a given Media Services account.
  • For each job, all tasks are enumerated.
  • For tasks that are finished, all input and output assets are enumerated.
  • For each input and output asset, all blobs in the storage container associated with the asset are enumerated, and the size of each blob is added up to calculate the size of the asset.
  • The code then creates an Azure Table Entity called JobAndTaskTableEntity with JobId as the partition key and TaskId as the row key.
    • Other members of JobAndTaskTableEntity, such as StartTime, EndTime, MediaProcessor, RunningDuration, InputAssetSize and OutputAssetSize, are also populated.
  • The code then writes out the JobAndTaskTableEntity to an Azure Table called JobAndTaskMetadata.

The App.Config file for the sample code looks as follows:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" />
  </startup>
  <appSettings>
    <add key="MediaServicesAccountName" value="<MediaAccountName>" />
    <add key="MediaServicesAccountKey" value="<MediaAccountKey>" />
    <add key="StorageConnectionString" value="DefaultEndpointsProtocol=https;AccountName=<StorageAccountName>;AccountKey=<StorageAccountKey>"/>
  </appSettings>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.WindowsAzure.Storage" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.1.0.0" newVersion="4.1.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

In the above App.Config, replace <MediaAccountName> and <MediaAccountKey> with your Media Services Account Name and Key. Also replace <StorageAccountName> and <StorageAccountKey> with the name and key of the storage account associated with your Media Services account.

The sample code is as follows.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Table;
using Microsoft.WindowsAzure.MediaServices.Client;
using System.Configuration;

namespace JobAndTaskBilling
{
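    /// <summary>
    /// Table entity that holds the billing metadata for one task.
    /// PartitionKey is the Job Id and RowKey is the Task Id.
    /// </summary>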
    public class JobAndTaskTableEntity : TableEntity
    {
        public DateTime StartTime { get; set; }
        public DateTime EndTime { get; set; }
        public string MediaProcessor { get; set; }
        public TimeSpan RunningDuration { get; set; }
        public Double InputAssetSize { get; set; }
        public Double OutputAssetSize { get; set; }
    }

    /// <summary>
    /// Console application that enumerates jobs and tasks and writes their asset sizes to an Azure Table.
    /// </summary>
    class Program
    {
        // Read values from the App.config file.
        private static readonly string _mediaServicesAccountName =
            ConfigurationManager.AppSettings["MediaServicesAccountName"];
        private static readonly string _mediaServicesAccountKey =
            ConfigurationManager.AppSettings["MediaServicesAccountKey"];
        private static readonly string _storageConnectionString =
            ConfigurationManager.AppSettings["StorageConnectionString"];

        // Storage and Media Services clients shared by the methods below.
        private static CloudStorageAccount _cloudStorage = null;
        private static CloudBlobClient _blobClient = null;

        private static CloudTableClient _tableClient = null;
        private static CloudTable _taskTable = null;         

        private static CloudMediaContext _context = null;
        private static MediaServicesCredentials _cachedCredentials = null;

        static void Main(string[] args)
        {
            try
            {
                // Create and cache the Media Services credentials in a static class variable.
                _cachedCredentials = new MediaServicesCredentials(_mediaServicesAccountName, _mediaServicesAccountKey);

                // Use the cached credentials to create the CloudMediaContext.
                _context = new CloudMediaContext(_cachedCredentials);

                // Use the Storage Connection String from App.Config to create a CloudStorageAccount instance
                _cloudStorage = CloudStorageAccount.Parse(_storageConnectionString);

                _blobClient = _cloudStorage.CreateCloudBlobClient();   // Create the CloudBlobClient instance to perform Blob operations
                _tableClient = _cloudStorage.CreateCloudTableClient(); // Create the CloudTableClient instance to perform Table operations

                _taskTable = _tableClient.GetTableReference("JobAndTaskMetadata"); 
                _taskTable.CreateIfNotExists();  // Create the Table if it doesn't exist

                ProcessJobs();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        /// <summary>
        /// This function loops through all jobs in a Media Services account 
        /// </summary>
        static void ProcessJobs()
        {
            try
            {
                Dictionary<string, string> _dictMPs = GetMediaProcessors();

                int skipSize = 0;
                int batchSize = 1000;
                int currentBatch = 0;                

                while (true)
                {
                    // Loop through all Jobs (1000 at a time) in the Media Services account
                    IQueryable _jobsCollectionQuery = _context.Jobs.Skip(skipSize).Take(batchSize);
                    foreach (IJob job in _jobsCollectionQuery)
                    {
                        currentBatch++;
                        Console.WriteLine("Processing Job Id:" + job.Id);

                        ProcessTasks(job, _dictMPs);
                    }

                    if (currentBatch == batchSize)
                    {
                        skipSize += batchSize;
                        currentBatch = 0;
                    }
                    else
                    {
                        break;
                    }                    
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        /// <summary>
        /// Enumerates all the Media Processors available in the Media Services account and creates a dictionary with MediaProcessorId as key and MediaProcessorName as value
        /// </summary>
        /// <returns>A dictionary that maps Media Processor Ids to Media Processor names.</returns>
        static Dictionary<string, string> GetMediaProcessors()
        {
            Dictionary<string, string> _dictMPs = new Dictionary<string, string>();
            foreach (IMediaProcessor mp in _context.MediaProcessors)
            {
                _dictMPs.Add(mp.Id, mp.Name);                
            }

            return _dictMPs;
        }

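        /// <summary>
        /// This function loops through all the tasks associated with a job.
        /// For all finished tasks, it calculates the input and output asset size and writes an entity to the JobAndTaskMetadata table.
        /// </summary>
        /// <param name="job">The job whose tasks are processed.</param>
        /// <param name="_dictMPs">Dictionary that maps Media Processor Ids to Media Processor names.</param>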
        static void ProcessTasks(IJob job, Dictionary<string, string> _dictMPs)
        {
            try
            {
                foreach (ITask task in job.Tasks)
                {
                    Console.WriteLine("Processing Task Id:" + task.Id);

                    // Loop through the HistoricalEvents associated with the Task to find Tasks that have Finished.
                    // Task.State only has the Completed state, so on its own it cannot tell whether the task finished successfully or had an error.
                    for (int i = 0; i < task.HistoricalEvents.Count; i++)
                    {
                        if (task.HistoricalEvents[i].Code == "Finished")
                        {
                            try
                            {
                                JobAndTaskTableEntity tme = new JobAndTaskTableEntity();
                                tme.PartitionKey = job.Id;
                                tme.RowKey = task.Id;
                                tme.StartTime = Convert.ToDateTime(task.StartTime);
                                tme.EndTime = Convert.ToDateTime(task.EndTime);
                                tme.MediaProcessor = _dictMPs[task.MediaProcessorId]; // Use the MediaProcessor dictionary to figure out the MediaProcessorName
                                tme.RunningDuration = task.RunningDuration;
                                tme.InputAssetSize = 0;
                                tme.OutputAssetSize = 0;

                                for (int j = 0; j < task.InputAssets.Count; j++)
                                {
                                    tme.InputAssetSize += GetAssetSize(task.InputAssets[j]);
                                }

                                for (int k = 0; k < task.OutputAssets.Count; k++)
                                {
                                    tme.OutputAssetSize += GetAssetSize(task.OutputAssets[k]);
                                }

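                                // InsertOrReplace keeps the sample re-runnable: a plain Insert fails if the entity already exists.
                                TableOperation op = TableOperation.InsertOrReplace(tme);
                                _taskTable.Execute(op);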
                            }
                            catch (Exception x)
                            {
                                Console.WriteLine(x.Message);
                            }
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

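        /// <summary>
        /// Gets the size of an Asset by enumerating all the blobs in the asset container and adding the size of each blob
        /// </summary>
        /// <param name="_asset">The asset whose blobs are measured.</param>
        /// <returns>The total size of the asset in bytes, or zero if the asset container is not found.</returns>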
        static double GetAssetSize(IAsset _asset)
        {
            double _assetSize = 0;

            try
            {
                // The asset container has "asset-" prefixed to the Asset GUID and the Asset Id has "nb:cid:UUID:" prefixed to the Asset GUID
                foreach (CloudBlockBlob _blobItem in _blobClient.ListBlobs("asset-" + _asset.Id.Replace("nb:cid:UUID:", "") + "/", true))
                {
                    _assetSize += _blobItem.Properties.Length;
                }
            }
            catch (Exception ex)
            {
                // If the Asset is not found in storage a value of zero will be returned for the Asset Size
                // This can happen if the Asset was deleted after the Task finished but before this sample code was run
                Console.WriteLine(ex.Message);
            }

            return _assetSize;
        }        

    }
}

A brief description of the functions in the code above is as follows:

ProcessJobs

This function loops through all the jobs in the provided Media Services account. A query against the Jobs collection returns at most 1000 jobs, so the function uses Skip and Take to make sure that all jobs are enumerated (in case you have more than 1000 jobs in your account).
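The Skip and Take pattern can be looked at in isolation, without the Media Services SDK. The following is a minimal sketch, not the code used by ProcessJobs: the Paging class and its ReadAll method are hypothetical, and the page source is passed in as a delegate so the pattern can be tried against in-memory data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Hypothetical helper: repeatedly calls fetchPage(skip, take) and yields every item.
    // Enumeration stops when a page comes back with fewer items than pageSize.
    public static IEnumerable<T> ReadAll<T>(Func<int, int, IEnumerable<T>> fetchPage, int pageSize)
    {
        int skip = 0;

        while (true)
        {
            int itemsInPage = 0;

            foreach (T item in fetchPage(skip, pageSize))
            {
                itemsInPage++;
                yield return item;
            }

            if (itemsInPage < pageSize)
            {
                // A short (or empty) page means there is nothing left to read.
                yield break;
            }

            skip += pageSize;
        }
    }
}
```

With a list named source that holds 2500 items, Paging.ReadAll<int>((skip, take) => source.Skip(skip).Take(take), 1000) fetches pages of 1000, 1000 and 500 items and then stops. Like the loop in ProcessJobs, this assumes that the underlying collection returns items in a stable order between calls.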

GetMediaProcessors

This function loops through all the Media Processors available in the provided Media Services account and creates a dictionary with the MediaProcessorId as the key and the MediaProcessorName as the value. This dictionary is used when writing entities to the JobAndTaskMetadata table.

ProcessTasks

This function loops through all the tasks associated with a given job. It then loops through all the historical events associated with the task to check if the task actually finished successfully. This is to avoid logging entries for tasks that had an error (as those tasks are not billable). For finished tasks, the function also gets the input and output asset size and then creates an entity for the JobAndTaskMetadata table.

GetAssetSize

This function calculates the asset size in bytes by looping through all the blobs in the asset container in the storage account.

Using Excel Power Query to analyze the data

Once the sample code above finishes running, you will have an Azure Table called JobAndTaskMetadata with data about finished tasks. You can use Excel Power Query to import that data into Excel for analysis. If you have never used Excel Power Query, you can download it from the “Download Microsoft Power Query for Excel” web page. Once it is installed, start Excel and you will see a tab called “POWER QUERY”. Click on that tab and then click on the “From Other Sources” button. You will see a menu item called “From Windows Azure Table Storage”, as shown in the screenshot below:

[Screenshot: the “From Windows Azure Table Storage” menu item under “From Other Sources”]

Upon selecting that menu item, you will see a dialog which looks as follows:

[Screenshot: dialog asking for the storage account name]

Enter your Storage account name and you will be presented with the following dialog:

[Screenshot: dialog asking for the storage account key]

Enter the account key and click Save. You will now see a pane called “Navigator” on the right-hand side that lists all the tables in the storage account. Double-click on the JobAndTaskMetadata table and a new window will open. A screenshot of that window is as follows:

[Screenshot: query window for the JobAndTaskMetadata table]

Click on the button next to the column labeled “Content” and you will see a popup which looks as follows:

[Screenshot: column selection popup for the “Content” column]

Click OK and all the columns will be loaded. Then click on the “Apply & Close” button at the top and all of the data will be loaded into a worksheet.

Now you can add a column called “Billable Gigabytes” with a formula that adds the InputAssetSize and OutputAssetSize and divides the total by 1024 * 1024 * 1024 (as the asset sizes are in bytes). Since Media Services only bills for tasks submitted with the “Windows Azure Media Encoder” Media Processor, you can filter out the remaining rows.
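The same arithmetic can be written in C# if you prefer to compute the value in code instead of in the worksheet. This is a small sketch of the formula described above; the BillingMath class and ToBillableGigabytes method are hypothetical names.

```csharp
public static class BillingMath
{
    // Number of bytes in one gigabyte, using the 1024 * 1024 * 1024 divisor from the text.
    private const double BytesPerGigabyte = 1024d * 1024d * 1024d;

    // Hypothetical helper: adds the input and output asset sizes (in bytes)
    // and converts the total to gigabytes.
    public static double ToBillableGigabytes(double inputAssetSizeBytes, double outputAssetSizeBytes)
    {
        return (inputAssetSizeBytes + outputAssetSizeBytes) / BytesPerGigabyte;
    }
}
```

For example, an input asset of 1610612736 bytes and an output asset of 536870912 bytes give BillingMath.ToBillableGigabytes(1610612736, 536870912) == 2.0.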

Considerations

Finally, please note the following as you consider using this sample code for your application:

  • A Media Services account has limits on the max number of jobs, assets and tasks. These are documented in the “Quotas and Limitations” MSDN page. The sample code above relies on the fact that no assets, jobs or tasks have been deleted from the provided Media Services account but that may be hard to sustain if your application submits a lot of jobs. In such a case, you can consider using the ProcessTasks function from the code above right before you delete the job and/or asset.
  • If your application adds or deletes media blobs from an asset container after the job finishes, you will get inaccurate billable gigabytes. To avoid this, it’s best to run the ProcessTasks function for a given job as soon as it finishes. If you use “Job Notifications”, then you can call the ProcessTasks function upon receiving the “Finished” message for the Job.
  • The sample code provided in this blog is designed to work with a Media Services account that has all assets in a single storage account but it can be easily adapted to work with multiple storage accounts.
  • The code above writes the InputAssetSize and OutputAssetSize in bytes. You can modify the code to write the sizes in gigabytes and also add another column in the JobAndTaskMetadata table to write out the sum of the InputAssetSize and OutputAssetSize.
  • Instead of Excel, you can use one of the existing Azure Storage Explorer applications that support Tables to view the data in the JobAndTaskMetadata table. Note, however, that Excel provides powerful data manipulation functionality, such as Pivot Tables, that you can use for further analysis of the data in the table.
  • Your monthly Azure bill is a function of your billing anniversary as well as the discounts (volume as well as commitment) that apply to your account. Take that into account if you try to compare the results from the above code with your monthly Azure bill.
  • A side benefit of the above code is that you can use the data loaded in the worksheet to see how many bytes are being processed with the other Media Processors as well.
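The per-processor totals mentioned in the last bullet can also be computed in code rather than in a Pivot Table. The following is a minimal sketch under some assumptions: TaskRecord is a hypothetical plain class that mirrors three of the columns written to JobAndTaskMetadata, UsageReport is a hypothetical helper, and every record is assumed to have a non-null MediaProcessor name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain record mirroring columns of the JobAndTaskMetadata table.
public class TaskRecord
{
    public string MediaProcessor { get; set; }
    public double InputAssetSize { get; set; }   // bytes
    public double OutputAssetSize { get; set; }  // bytes
}

public static class UsageReport
{
    private const double BytesPerGigabyte = 1024d * 1024d * 1024d;

    // Returns the total gigabytes processed (input + output) per Media Processor name.
    // Assumes MediaProcessor is never null, since it is used as a dictionary key.
    public static Dictionary<string, double> GigabytesByProcessor(IEnumerable<TaskRecord> records)
    {
        return records
            .GroupBy(r => r.MediaProcessor)
            .ToDictionary(
                g => g.Key,
                g => g.Sum(r => r.InputAssetSize + r.OutputAssetSize) / BytesPerGigabyte);
    }
}
```

The result has one entry per Media Processor, so the entry for “Windows Azure Media Encoder” corresponds to the billable gigabytes and the other entries show how much data the remaining processors handled.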