Simple Web Site Backup retention policy with WebJobs

Azure WebJobs let you run code in the context of your website. They are versatile: a WebJob can address many website scenarios, access other Azure resources, and be built in several different ways. The website backup feature runs on a schedule and stores its output in Azure storage, which makes it a particularly interesting example. This post covers how to build a simple WebJob that cleans up your old website backup files on a regular basis.

Backup files layout

To set the context, let’s cover some details of the backup feature that are relevant for understanding this solution. One of the best aspects of its design is how accessible and manageable the backup content is made for customers. After each successful backup run, two files are stored in the specified storage account: an XML file describing the backup contents, and a ZIP file with the actual files that have been backed up. The files’ naming convention is _.. For example:

Most customers will set up the backup operation to run on a schedule, such as once a day. This creates two files every day, which accumulate over time in the backup storage account. While this feature is in preview mode, a retention policy isn’t offered out of the box yet, but you can configure your site to do this cleanup.

Solution outline

One simple way to clean up old backups is to run a scheduled WebJob that applies your backup file retention policy. To familiarize yourself with WebJobs, check out this article on the Azure website. To use WebJobs for this, you need to create a simple application and configure a WebJob to run it every day. The application goes through all the files in the backup folder, checks when they were created, and deletes those that are older than your defined policy allows.

Sample Code

The application can be written and deployed in a variety of ways. The example below is written in C# and deployed as a console app (.exe), but it could easily be done in other languages. Note: if you would like to use this code directly, follow the instructions in the next section on how to download the full version that can be used to build and deploy this as a WebJob.

This first part of the code gathers the configuration for the backup storage account, the backup folder location, the max age to keep (in days) and the Website name. The configuration values (uppercase strings below) are obtained from the application settings for your website. Later in this post you’ll see how to set up values for those settings.
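As a rough illustration of this configuration step, here is a minimal sketch in Python (standard library only) rather than the post's C#. It assumes the app settings are visible to the job as environment variables. The names `BACKUP_STORAGE_CONNECTION`, `BACKUP_FOLDER` and `BACKUP_MAX_AGE_DAYS` are hypothetical, chosen only for this sketch; `WEBSITE_SITE_NAME` is the setting the post says is already provided.

```python
import os
from dataclasses import dataclass

# Hypothetical setting names used only in this sketch; WEBSITE_SITE_NAME is
# the one the post says every website already has.
REQUIRED_KEYS = (
    "BACKUP_STORAGE_CONNECTION",
    "BACKUP_FOLDER",
    "BACKUP_MAX_AGE_DAYS",
    "WEBSITE_SITE_NAME",
)


@dataclass(frozen=True)
class RetentionSettings:
    storage_connection: str  # connection string for the backup storage account
    backup_folder: str       # folder that holds the backup files
    max_age_days: int        # backups older than this many days are deleted
    site_name: str           # used to recognise this site's backup files


def load_settings(env=os.environ):
    """Read the four settings from a mapping (the process environment by default)."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing app settings: " + ", ".join(missing))

    max_age_days = int(env["BACKUP_MAX_AGE_DAYS"])
    if max_age_days < 0:
        raise ValueError("BACKUP_MAX_AGE_DAYS must not be negative")

    return RetentionSettings(
        storage_connection=env["BACKUP_STORAGE_CONNECTION"],
        backup_folder=env["BACKUP_FOLDER"],
        max_age_days=max_age_days,
        site_name=env["WEBSITE_SITE_NAME"],
    )
```

Failing early on a missing or malformed setting keeps a misconfigured job from deleting anything.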

With the settings in place, the second part of the code connects to the blob storage where the backup files are located. It then calculates the cutoff date and time by subtracting the max-age policy from the current time. Next, it uses a foreach loop to go through each file in the folder, considering only files that have the zip or xml extension and are named according to the backup process convention. For each of those files, the program calculates the file’s age and, if it is older than the defined maximum, deletes the file.
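The selection logic described above can be sketched independently of the storage library. The Python sketch below works on plain `(name, last_modified)` pairs instead of real blobs, so it only decides which files to delete and does not delete anything itself. It assumes, for illustration only, that a backup file name starts with the site name followed by an underscore; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

BACKUP_EXTENSIONS = (".zip", ".xml")


def is_backup_file(name, site_name):
    """True for .zip/.xml files whose name starts with '<site_name>_'.

    The '<site_name>_' prefix is an assumption made for this sketch.
    """
    base = name.rsplit("/", 1)[-1].lower()  # drop any folder prefix
    return base.startswith(site_name.lower() + "_") and base.endswith(BACKUP_EXTENSIONS)


def select_expired(files, site_name, max_age_days, now=None):
    """Return the names of backup files last modified before the cutoff.

    files: iterable of (name, last_modified) pairs, where last_modified is a
    timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        name
        for name, last_modified in files
        if is_backup_file(name, site_name) and last_modified < cutoff
    ]
```

In a real job, each returned name would then be passed to the storage client's delete call. Keeping the decision separate from the deletion makes the policy easy to try out against a listing before anything is removed.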

Building and packaging the binaries (.exe)

If you’re not familiar with building command-line executables (.exe), here are the steps to build one using Visual Studio 2013 and the Azure SDK (both need to be installed beforehand):

  1. Create a new console app: Templates –> Visual C# –> Console Application –> name it, e.g. “WebJobProject”
  2. Replace the content of Program.cs with the full version of the code above, available here.
    1. It has the references to the libraries (using statements), logging progress + errors to standard output, namespace/class definitions, and additional code comments.
  3. Add references (in Solution Explorer –> right-click References –> Add Reference)
    1. Framework –> System.Configuration
    2. Extensions –> Microsoft.WindowsAzure.Storage
  4. Build the solution (Ctrl+Shift+B)
  5. Zip the contents of the project’s “Debug” folder and name the archive, e.g. “WebJobProject.zip”
    1. To get to the Debug folder, go to Solution Explorer –> click the Show All Files icon –> expand the “bin” folder to show the “Debug” folder –> right-click and choose Open Folder in File Explorer
    2. To zip up the contents on a Windows 8 machine: select all content –> right-click and choose Send To –> “Compressed (zipped) folder”
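
If you prefer to script the packaging step rather than use File Explorer, a minimal Python sketch could look like the following. The function name and the paths in the usage note are hypothetical. It zips the contents of the folder (not the folder itself), which matches step 5 above.

```python
import shutil
from pathlib import Path


def package_webjob(debug_dir, zip_base):
    """Zip the contents of debug_dir into '<zip_base>.zip' and return the archive path.

    Paths inside the archive are relative to debug_dir, so the .exe ends up
    at the top level of the zip.
    """
    debug_dir = Path(debug_dir)
    if not debug_dir.is_dir():
        raise NotADirectoryError(str(debug_dir))
    return shutil.make_archive(str(zip_base), "zip", root_dir=str(debug_dir))
```

For example, `package_webjob("WebJobProject/bin/Debug", "WebJobProject")` would write `WebJobProject.zip` to the current directory. Write the archive outside the Debug folder so it does not end up inside itself.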

WebJob setup

Here’s what the WebJob setup looks like. The previous section covers how to create the WebJob content zip file.

These are the schedule details. As the backup feature is running daily, the WebJob schedule is also set up the same way.

Application parameters setup

This WebJob looks for specific website app settings, which need to be set up. To set them up, go to your site on the Azure portal, select the “CONFIGURE” tab and scroll down to “app settings”. This is where you define your backup file retention policy in days and indicate the storage account used for the backup process (actual info hidden in the picture). The “WEBSITE_SITE_NAME” setting is already provided for all websites and doesn’t need to be added. To view all your application settings, go to .scm.azurewebsites.net and click on “Environment”.


Here’s how everything looks after setup and a successful run:

WebJobs dashboard post successful run

Log Files from the WebJob

Following the link to the logs on the WebJobs dashboard shows a web page with the results of all previous runs. The logs are very useful for understanding failures in your WebJob execution, or just for confirming that the job performed as expected.

Clicking through to a particular run’s results shows the text output from the WebJob. In this example you see log entries written by the WebJob infrastructure (“SYS INFO”) and entries written by this WebJob (“INFO”).

Retention policy extension ideas

This can easily be changed to a more sophisticated retention policy based on content size, on logarithmic periodicity (keeping more of the recent backups and fewer of the older ones), or even on comparing zip content differences to select which backups to delete. There are also many other ways to code and deploy your WebJob, as posted here. Please share any feedback you might have, as well as ideas for improvements.
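
As one possible reading of the “keep more recent backups and fewer older backups” idea, the Python sketch below uses two tiers. It keeps every backup from the last `daily_days` days, then the newest backup of each ISO week for a further `weekly_weeks` weeks, and marks the rest for deletion. The tier sizes, the function name and the notion of one id per backup run (covering both its ZIP and XML file) are assumptions made for this sketch, not part of the original code.

```python
from datetime import datetime, timedelta, timezone


def select_to_delete_tiered(backups, now, daily_days=7, weekly_weeks=4):
    """Return the ids of backup runs to delete under a two-tier policy.

    backups: iterable of (backup_id, created) pairs, one per backup run,
    where created is a timezone-aware datetime.
    """
    daily_cutoff = now - timedelta(days=daily_days)
    weekly_cutoff = daily_cutoff - timedelta(weeks=weekly_weeks)

    kept_weeks = set()  # (ISO year, ISO week) pairs that already have a survivor
    to_delete = []

    # Newest first, so the newest backup of each week is the one that survives.
    for backup_id, created in sorted(backups, key=lambda b: b[1], reverse=True):
        if created >= daily_cutoff:
            continue  # recent tier: keep everything
        week = tuple(created.isocalendar()[:2])
        if created >= weekly_cutoff and week not in kept_weeks:
            kept_weeks.add(week)
            continue  # weekly tier: keep one backup per week
        to_delete.append(backup_id)

    return to_delete
```

Both files of each returned backup run would then be deleted together, so an XML description is never left without its ZIP file.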