How to get great performance when using Azure Storage is a topic we’ve talked with you about many times: during talks at TechEd and Build, in threads on forums, on our blog, and in person. It’s always exciting to see how passionate you are about making your applications perform as well as possible!
To help you further with this goal, we’ve now released the Azure Storage Performance Checklist, which consolidates our performance guidance in a single easy-to-use document in one easy-to-find location. It’s a short document (about 15 printed pages) that a developer should be able to read in about 30 minutes. It contains details of over 40 proven practices structured as a checklist, which will help you improve the performance of your applications. Here is a small selection from the checklist:
| Done | Area | Category | Question |
| --- | --- | --- | --- |
| | Blobs | Use Metadata | Are you storing frequently used metadata in blob metadata to avoid having to download each blob to extract it each time? |
| | Blobs | Uploading Fast | To upload one blob fast, are you uploading blocks in parallel? |
| | Tables | Configuration | Are you using JSON for your table requests? |
| | Tables | Limiting Returned Data | Are you using projection to avoid retrieving unneeded properties? |
| | Queues | Update Message | Are you using UpdateMessage to store progress in processing a message and avoid having to reprocess from the start if the processing component encounters an error? |
| | Queues | Architecture | Are you using queues to make your entire application more scalable by keeping long-running workloads out of the critical path and scaling them independently? |
Developers can use this checklist to help design a new application or to validate an existing design, and while not every recommendation is relevant to every application, each of them is a broadly applicable practice that most applications will benefit from following.
We will keep this checklist up to date as we identify more proven practices and add to it when we introduce new Azure Storage features. If you have a recommendation for a proven practice that you don’t see in the current checklist, then please let us know.
Example Scenarios
Many of the recommendations in the checklist are simple to implement in your code. Here are three examples, each of which may have a significant effect on the performance of your application if you apply it in the correct context:
Scenario #1: Queues: Configuration
Have you turned Nagle off to improve the performance of small requests?
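The Nagle algorithm is enabled by default. It holds back small TCP segments so that they can be combined into larger ones, which adds latency to small requests such as queue messages. To disable it for a queue endpoint, you can use the following code. This code must execute before you make any calls to the queue endpoint: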
```csharp
var storageAccount = CloudStorageAccount.Parse(connStr);

// Look up the service point for the queue endpoint and turn Nagle off
// before any request is sent to that endpoint.
ServicePoint queueServicePoint =
    ServicePointManager.FindServicePoint(storageAccount.QueueEndpoint);
queueServicePoint.UseNagleAlgorithm = false;

CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
```
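If you would rather turn Nagle off for every endpoint your process connects to, and not only the queue endpoint, one option is the static `ServicePointManager.UseNagleAlgorithm` property in `System.Net`. The following is a minimal sketch under that assumption; the class and method names are illustrative and are not part of the checklist. As with the per-endpoint setting, it should run before the first connection is opened, because it only affects service points created afterwards.

```csharp
using System;
using System.Net;

public static class NagleConfig
{
    // Illustrative helper: disables Nagle for all service points
    // created after this call (existing connections are unaffected).
    public static void DisableNagleForNewConnections()
    {
        ServicePointManager.UseNagleAlgorithm = false;
    }

    public static void Main()
    {
        DisableNagleForNewConnections();
        // Confirms the process-wide default that new service points inherit.
        Console.WriteLine(ServicePointManager.UseNagleAlgorithm);
    }
}
```

The per-endpoint approach shown above is the narrower change; the process-wide setting also affects any other HTTP traffic your application sends, so measure before adopting it.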
Scenario #2: Blobs: Copying Blobs
Are you copying blobs in an efficient manner?
To copy blob data from a container in one storage account to a container in another storage account, you could first download and then upload the data as shown here:
```csharp
// Download the source blob to a local file...
CloudBlockBlob srcBlob = srcBlobContainer.GetBlockBlobReference("srcblob");
srcBlob.DownloadToFile(@"C:\Temp\copyblob.dat", System.IO.FileMode.Create);

// ...then upload that file to the destination container.
CloudBlockBlob destBlob = destBlobContainer.GetBlockBlobReference("destblob");
destBlob.UploadFromFile(@"C:\Temp\copyblob.dat", System.IO.FileMode.Open);
```
However, a much more efficient approach is to use one of the copy blob methods such as StartCopyFromBlob as shown here:
```csharp
CloudBlockBlob srcBlob = srcBlobContainer.GetBlockBlobReference("srcblob");
CloudBlockBlob destBlob = destBlobContainer.GetBlockBlobReference("destblob");

// The copy runs inside the storage service; no data passes through the client.
destBlob.StartCopyFromBlob(GenerateSASUri(srcBlob));
```
Note that this example uses a Shared Access Signature (SAS) to access the private blob in the source container.
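The `GenerateSASUri` helper is not defined in the snippet above. A minimal sketch of what it might look like follows, assuming the .NET Storage Client Library (`Microsoft.WindowsAzure.Storage`) and a read-only SAS that expires after one hour; the expiry value, the class name, and the `WaitForCopy` helper are illustrative assumptions, not part of the checklist. Because the copy started by `StartCopyFromBlob` completes asynchronously, the sketch also shows one way to poll the destination blob's copy state.

```csharp
using System;
using System.Threading;
using Microsoft.WindowsAzure.Storage.Blob;

public static class BlobCopyHelpers
{
    // Hypothetical implementation of GenerateSASUri: returns the blob URI
    // with a read-only SAS token appended (one-hour expiry is an assumption).
    public static Uri GenerateSASUri(CloudBlockBlob blob)
    {
        var policy = new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddHours(1)
        };

        // GetSharedAccessSignature returns the query string, including "?".
        string sasToken = blob.GetSharedAccessSignature(policy);
        return new Uri(blob.Uri.AbsoluteUri + sasToken);
    }

    // Illustrative helper: polls until the server-side copy leaves the
    // Pending state, then returns the final status.
    public static CopyStatus WaitForCopy(CloudBlockBlob destBlob)
    {
        destBlob.FetchAttributes();
        while (destBlob.CopyState.Status == CopyStatus.Pending)
        {
            Thread.Sleep(TimeSpan.FromSeconds(1));
            destBlob.FetchAttributes();
        }
        return destBlob.CopyState.Status;
    }
}
```

Choose an expiry time long enough for the copy to finish; a large cross-account copy may take longer than the hour assumed here.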
Scenario #3: Blobs: Uploading Fast
When trying to upload one blob quickly, are you uploading blocks in parallel?
If you are using the .NET Storage Client Library, it has the capability to manage parallel block uploads for you. The following code sample shows how you can use the BlobRequestOptions class to specify the number of threads to use for a parallel block upload (four in this example):
```csharp
CloudBlockBlob blob = srcBlobContainer.GetBlockBlobReference("uploadinparallelblob");
byte[] buffer = ...

// Upload up to four blocks at a time.
var requestOptions = new BlobRequestOptions()
{
    ParallelOperationThreadCount = 4
};
blob.UploadFromByteArray(buffer, 0, buffer.Length, null, requestOptions);
```
Note that the Storage Client Library may upload small blobs as a single blob upload instead of multiple block uploads: the SingleBlobUploadThresholdInBytes property of the BlobRequestOptions class sets the size threshold above which the Storage Client Library uses block uploads.
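To make it more likely that a moderately sized blob is actually split into blocks, you can lower the threshold and the block size alongside the thread count. The following is a sketch only, assuming the .NET Storage Client Library; the 1 MB values and the `UploadInBlocks` method name are illustrative choices, and the allowed ranges and defaults for these properties depend on the library version you use.

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

public static class ParallelUploadExample
{
    // Illustrative helper: uploads a buffer using parallel block uploads.
    public static void UploadInBlocks(CloudBlockBlob blob, byte[] buffer)
    {
        // Size of each block written to the service (1 MB is an assumption).
        blob.StreamWriteSizeInBytes = 1 * 1024 * 1024;

        var requestOptions = new BlobRequestOptions
        {
            // Blobs larger than this are uploaded as blocks rather than
            // in a single request (1 MB is an assumption).
            SingleBlobUploadThresholdInBytes = 1 * 1024 * 1024,
            // Number of blocks uploaded concurrently.
            ParallelOperationThreadCount = 4
        };

        blob.UploadFromByteArray(buffer, 0, buffer.Length, null, requestOptions);
    }
}
```

Smaller blocks allow more parallelism but mean more requests per blob, so it is worth measuring a few combinations against your own blob sizes and network conditions.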
Summary and Call to Action
We have developed the Azure Storage Performance Checklist, which contains over 40 proven practices pulled together from a wide variety of sources. This checklist will help you make a significant difference to the performance of your applications that use the Azure Storage services.
For now, you should take a look at the checklist, print it out, and then see what you can do to improve the performance of your application! You should check back regularly for updates as we incorporate more proven practices into the checklist.
Jeff Irwin
Azure Storage Program Manager