Update as of September 23, 2016: Automatic online backup and restore functionality is now available and documented in the Automatic online backup and restore with DocumentDB documentation article.
One of the questions the DocumentDB team frequently gets asked is “What are the recommended patterns for backing up my database?” Backing up is a basic building block of a Business Continuity and Disaster Recovery (BCDR) plan, and something you absolutely cannot afford to get wrong.
The DocumentDB service is internally backed up with geo-redundancy built in. This is a data protection measure we take to ensure that even in the face of regional failures, customer data remains safe. However, we have heard from customers that they would like to have their own, additional backups that can be archived and restored based on individual business needs. Through engagement with our customers, we developed and verified backup strategies to enable all customers to build a successful BCDR plan.
Scenarios
Common events that would trigger a BCDR response can be broken down into two scenarios:
Oops! I deleted data by mistake
Programmatic or user error that results in data being accidentally deleted or malformed.
Uh oh! The fiber link to the datacenter was cut by a backhoe!
Availability impacting incidents that take the service region offline.
Blob-based backup, the “Oops” scenario
The recommended strategy to provide business continuity for this failure case is to maintain a collection level backup of all the data in your DocumentDB account.
You have two options for implementation:
- In-Cloud backup using Azure Data Factory: Information on how to do this is available here: Moving data to and from DocumentDB using Azure Data Factory. This option works well for more uniform data sets and at scale.
- On-premises (+ Cloud) backup using the DocumentDB Data Migration Tool: You can use the tool to perform backups using a physical or virtual Windows machine (source: GitHub). It enables you to output your data either to local storage or Azure Blob storage (for additional geo-redundancy).
Using the DocumentDB Migration Tool
Use this tool to run an export operation as described here. Follow all the steps, but instead of executing the export operation on the Summary page, view the command by hitting the View Command button on the top right of the Data Migration Tool window, as shown in the following screenshot. The underlying command is then displayed in the Command Line Preview window.
Note: The Collection field in the Source Information pane accepts regular expressions.
This command can now be used, copied, and run with the command line core of the Data Migration tool, dt.exe.
An easy way to run this command on a regular schedule is to use the Windows Task Scheduler.
To launch the Task Scheduler, press Win+S (search) and type Task Scheduler. In the right-hand pane, click Create Task, which will pop up the window shown below. Here you can name your task and add a description. You can also set the task to Run whether user is logged on or not (recommended).
Switching to the Triggers tab, click New to set up a trigger. Here you can set the conditions that will cause the task to run. In the screenshot below, the task would be triggered daily at 12:00 AM. Hit OK to save that Trigger.
Switching to the Actions tab, click New to set up the command to launch the actual backup task. Click Browse and select dt.exe from the folder where you unzipped the Migration Tool.
In the Add arguments (optional) text box, paste in the command that you copied from the Migration Tool.
Hit OK to close the New Action pane, and then OK again to save and close the Create Task dialog.
This scheduled task will now launch the Migration Tool with the settings you configured, on the schedule you set, automating backups of your DocumentDB database.
You may have noticed that, as configured, the task would fail on its second run because it would try to overwrite the existing backup file. The way to maintain snapshots is to wrap the dt.exe command in a script, such as a PowerShell script, that generates a new output file name with a unique component (for example, the date) before calling dt.exe.
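As a minimal sketch of that idea (shown here in Python; any scripting language that can build a dated file name and launch dt.exe will do), the script below appends a UTC timestamp to the output file name and then invokes dt.exe. The tool path, backup folder, and dt.exe arguments are all placeholders; substitute the exact command you copied from the Command Line Preview window.

```python
# Minimal sketch: build a unique, date-stamped output file name and invoke dt.exe.
# The dt.exe path, backup folder, and every dt.exe argument below are placeholders;
# paste in the exact arguments shown by the Migration Tool's View Command button.
import datetime
import subprocess

DT_EXE = r"C:\DocumentDB.DataMigrationTool\dt.exe"
output_file = r"D:\backups\documentdb-backup-{0}.json".format(
    datetime.datetime.utcnow().strftime("%Y%m%d-%H%M%S"))

subprocess.check_call([
    DT_EXE,
    "/s:DocumentDB",
    "/s.ConnectionString:AccountEndpoint=https://<account>.documents.azure.com:443/;AccountKey=<key>;Database=<database>",
    "/s.Collection:<collection>",
    "/t:JsonFile",
    "/t.File:" + output_file,
])
```

You can then point the scheduled task's action at this script (for example, python.exe with the script path as its argument) instead of at dt.exe directly.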
Restoring Data
If the data is stored in Azure Blob storage, you can use Azure Data Factory to quickly write it back to a new collection (or to a collection you deleted and recreated, in the case of bad writes) in DocumentDB. This option is well suited for large datasets with uniform structure.
The other option for restoring data is to run the Migration Tool with the source and destination reversed, i.e., the source is the backup file and the destination is a DocumentDB account.
Double writes, the “Outage” scenario
This scenario is best covered by performing double writes to a secondary DocumentDB database account in a different region. The recommended pattern would be one of the following:
- Design your application’s data access layer to transparently double-commit writes to the two DocumentDB accounts (a minimal sketch of this pattern follows the list).
- Run an Azure Web App that exposes a simple REST interface through which all requests to your DocumentDB account pass. This pass-through service would manage the double-commit of any writes for all clients. This pattern has the advantage of reducing the durability burden on the clients.
- Run an Azure Cloud Service that, on a regular schedule, performs the following steps:
- Look for all documents whose “_ts” (timestamp) property is greater than the last time this job was run
- Create/Update each document (addressed by its unique ID) in the database in the backup region. This pattern trades off sub-second Recovery Point Objective (the maximum time window for data loss) for reduced write latency and increased throughput.
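For the first pattern above, the sketch below shows one shape a double-committing write path could take using the DocumentDB Python SDK (pydocumentdb). The account endpoints, keys, and database/collection names are placeholders, and error handling (for example, retrying or queuing a failed secondary write) is omitted for brevity.

```python
# Minimal sketch of a double-committing write path using the DocumentDB Python SDK
# (pydocumentdb). Endpoints, keys, and the database/collection names are placeholders.
import pydocumentdb.document_client as document_client

PRIMARY = document_client.DocumentClient(
    "https://primary-account.documents.azure.com:443/", {"masterKey": "<primary-key>"})
SECONDARY = document_client.DocumentClient(
    "https://secondary-account.documents.azure.com:443/", {"masterKey": "<secondary-key>"})

# Name-based link to the same database/collection created in both accounts.
COLLECTION_LINK = "dbs/mydb/colls/mycollection"

def write_document(document):
    """Commit the write to both regions; treat it as successful only if both calls succeed."""
    PRIMARY.UpsertDocument(COLLECTION_LINK, document)
    SECONDARY.UpsertDocument(COLLECTION_LINK, document)

write_document({"id": "order-1001", "status": "shipped"})
```

The pass-through REST service in the second pattern centralizes this same double-commit so that individual clients do not have to implement it.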
Restoring Data
In the event that availability of the primary region is impacted, reads and writes can continue to the secondary region. When the primary region is available again, the easiest recovery strategy is to suspend writes from clients to the secondary region, and use the DocumentDB Migration tool to rebuild the primary region.
If you are not running in a Windows environment, you have two options:
- Start up a Windows VM in Azure to run the Migration Tool.
- Write a recovery script using the DocumentDB SDK that queries the secondary region and writes all the JSON objects back to the primary region (a minimal sketch follows).
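As a minimal sketch of such a recovery script, again assuming the DocumentDB Python SDK (pydocumentdb) with placeholder endpoints, keys, and collection names:

```python
# Minimal sketch of a recovery script using the DocumentDB Python SDK (pydocumentdb):
# read every document from the secondary (backup) account and upsert it back into the
# primary account. Endpoints, keys, and the database/collection names are placeholders.
import pydocumentdb.document_client as document_client

SECONDARY = document_client.DocumentClient(
    "https://secondary-account.documents.azure.com:443/", {"masterKey": "<secondary-key>"})
PRIMARY = document_client.DocumentClient(
    "https://primary-account.documents.azure.com:443/", {"masterKey": "<primary-key>"})

COLLECTION_LINK = "dbs/mydb/colls/mycollection"

# enableCrossPartitionQuery is only needed when the collection is partitioned.
documents = SECONDARY.QueryDocuments(
    COLLECTION_LINK, "SELECT * FROM c", {"enableCrossPartitionQuery": True})

for document in documents:
    # Upsert by id so the script can safely be re-run if it is interrupted part-way.
    PRIMARY.UpsertDocument(COLLECTION_LINK, document)
```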
Trade-offs
The above suggested patterns involve the following trade-offs:
| Feature | Blob-based backup | Double Writes |
| --- | --- | --- |
| Point-in-time Backup | Requires suspending writes from the client | Requires suspending writes from the client |
| Cross-Collection Consistency | Requires suspending writes from the client | It is possible to maintain cross-collection consistency if the client makes sure both writes are always successful |
| Geo-redundancy | Low if using on-premises storage, higher if using Blob storage | Higher redundancy due to geographical separation of data |
| Bad Write Rollback | Possible | Bad writes will leak through and write to the backup |
Summary
Using at least one of the above strategies, DocumentDB administrators can ensure that their data remains safe and available in the event that a natural disaster or other disruption strikes. Using both strategies simultaneously will cover both of the scenarios discussed and provide the most comprehensive safety net.
The DocumentDB team is actively improving backup support for the service, but we wanted to share our current guidance as it is a hot topic.
To learn more about DocumentDB, visit the DocumentDB product page. If you need any help or have questions or feedback, please reach out to us on the developer forums on Stack Overflow or schedule a 1:1 chat with the DocumentDB engineering team.
Stay up-to-date on the latest DocumentDB news and features by following us on Twitter @DocumentDB.