Import Sample Data to Azure DocumentDB

We’re always looking for ways to make it even easier to get started with DocumentDB. You likely already know about our great Query Playground, which lets you quickly learn and try out DocumentDB’s rich querying capabilities for yourself.

Hopefully, you also know about our Data Migration Tool, which lets you quickly and easily import data to DocumentDB from various sources, such as JSON files, CSV files, MongoDB, SQL Server, Azure Table storage, and even existing DocumentDB collections.

Now, we’re making available the nutrition data set* that helps power the Query Playground. Use the Data Migration Tool to import the data set to your DocumentDB account:

  1. Download the sample data set.
  2. Start the Data Migration Tool and choose JSON as the source.
  3. Click Add Files, select the file you downloaded in step 1, and click Next.
  4. Choose DocumentDB – Bulk Import as your target, provide the required information, and click Next.
  5. Click Import.
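
Before starting the import, it can be useful to check how many documents the downloaded file contains, so you have a number to compare against once the import finishes. The following is a minimal sketch, not part of the Data Migration Tool. It assumes the file is either a single JSON array of documents or a single JSON document; the file name `nutrition.json` in the usage note is hypothetical.

```python
import json


def count_documents(path):
    """Count the top-level documents in a JSON file.

    Assumption: the file holds either one JSON array of documents
    or one JSON document. Other layouts are not handled.
    """
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    with open(path, encoding="utf-8-sig") as f:
        data = json.load(f)
    return len(data) if isinstance(data, list) else 1
```

For example, `count_documents("nutrition.json")` returns the number of documents in a file with that (hypothetical) name, assuming it has the layout described above.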

  In my tests, the import took approximately 8 minutes using an S3 collection. Actual import time will vary based on your proximity to the Azure region in which your DocumentDB account is hosted, the performance tier of the target collection, and your available network bandwidth. Once the import is finished, you should have 8,618 items.

  Here are a couple of sample queries you can execute using the DocumentDB Query Explorer in the preview Azure portal. Note that if you copy and paste a query and the quotation marks arrive as curly quotes (such as ”), you will need to replace them with straight quotes ("). If you don't, you'll receive a syntax error when running the query in the Query Explorer.

  Find the nutritional information for various granola bars (for a full bar):

    SELECT food.description,
           serving.amount,
           serving.description AS servingDescription,
           serving.weightInGrams
    FROM food
    JOIN tag IN food.tags
    JOIN serving IN food.servings
    WHERE tag.name = "granola bars"
      AND CONTAINS(serving.description, "bar")

  Get documents in a batch by their ids:

    SELECT Food.id, Food.description
    FROM Food
    WHERE Food.id IN (
        "01236", "01237", "01263", "06152", "21224",
        "21225", "21226", "21227", "21505", "22903",
        "14003", "14004", "14005", "14006", "14007"
    )

  Let us know what else we can do to make it easy to get started with DocumentDB by submitting feedback. To learn more about DocumentDB, please visit our service page.
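
The copy/paste caveat about quotation marks in the sample queries can be handled with a small helper. This is a minimal sketch and not part of any DocumentDB tooling; the function name `normalize_quotes` is made up for illustration. It replaces every typographic quote with its straight equivalent, so it would also alter curly quotes that are intentionally part of a string value.

```python
# Map typographic (curly) quotes to their straight ASCII equivalents.
_QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def normalize_quotes(query: str) -> str:
    """Return the query text with curly quotes replaced by straight quotes."""
    return query.translate(_QUOTE_MAP)


pasted = "SELECT * FROM food JOIN tag IN food.tags WHERE tag.name = “granola bars”"
print(normalize_quotes(pasted))
```

Running the cleaned-up text, rather than the pasted original, should avoid the syntax error described above.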

*This sample dataset has been modified for use from its original source, www.ars.usda.gov, the official website of the United States Department of Agriculture.