This article shows how to perform common tasks with Azure Blob storage. The samples are written in Python and use the Python Azure Storage package. The scenarios covered are uploading, listing, downloading, and deleting blobs.
Azure Blob storage is a service for storing large amounts of unstructured data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS. You can use Blob storage to expose data publicly to the world, or to store application data privately.
Common uses of Blob storage include:
The Blob service contains the following components:
Storage Account: All access to Azure Storage is done through a storage account. See Azure Storage Scalability and Performance Targets for details about storage account capacity.
Container: A container provides a grouping of a set of blobs. All blobs must be in a container. An account can contain an unlimited number of containers. A container can store an unlimited number of blobs.
Blob: A file of any type and size. Azure Storage offers three types of blobs: block blobs, page blobs, and append blobs.
Block blobs are ideal for storing text or binary files, such as documents and media files. Append blobs are similar to block blobs in that they are made up of blocks, but they are optimized for append operations, so they are useful for logging scenarios. A single block blob or append blob can contain up to 50,000 blocks of up to 4 MB each, for a total size of slightly more than 195 GB (4 MB × 50,000).
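The size limit quoted above follows directly from the block count and the block size. A quick check of the arithmetic, assuming binary units (1 GB = 1024 MB):

```python
# Maximum size of a block blob or append blob, from the limits stated above.
MAX_BLOCKS = 50_000      # blocks per blob
MAX_BLOCK_MB = 4         # MB per block

max_blob_mb = MAX_BLOCKS * MAX_BLOCK_MB   # total size in MB
max_blob_gb = max_blob_mb / 1024          # total size in binary GB

print(f"{max_blob_mb} MB = {max_blob_gb:.2f} GB")
```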
Page blobs can be up to 1 TB in size, and are more efficient for frequent read/write operations. Azure Virtual Machines use page blobs as OS and data disks.
For more information about blobs, see Understanding Block Blobs, Page Blobs, and Append Blobs.
You can address a blob in your storage account using the following URL format:
For example, here is a URL that addresses one of the blobs in the diagram above:
A container name must be a valid DNS name and conform to the following rules:
A blob name must conform to the following rules:
The Blob service is based on a flat storage scheme. You can create a virtual hierarchy by including a character or string delimiter within the blob name. For example, the following list shows some valid and unique blob names:
/a
/a.txt
/a/b
/a/b.txt
You can use the delimiter character to list blobs hierarchically.
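One way to picture hierarchical listing is to group names one level at a time. The sketch below is plain Python over an in-memory list of names, not a call to the Blob service; `list_level` is a hypothetical helper introduced only for illustration.

```python
def list_level(names, prefix="", delimiter="/"):
    """Split the names under `prefix` into blobs and virtual directories."""
    blobs, prefixes = [], []
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No further delimiter: the name is a blob at this level.
            blobs.append(name)
        else:
            # Collapse everything below the next delimiter into one entry.
            virtual_dir = prefix + rest[:cut + len(delimiter)]
            if virtual_dir not in prefixes:
                prefixes.append(virtual_dir)
    return blobs, prefixes


names = ["/a", "/a.txt", "/a/b", "/a/b.txt"]
top_blobs, top_dirs = list_level(names, prefix="/")
nested_blobs, nested_dirs = list_level(names, prefix="/a/")
```

Here the blob `/a` and the virtual directory `/a/` coexist without conflict, because the storage scheme is flat and the hierarchy exists only in how the names are grouped.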
The BlobService object lets you work with containers and blobs. To use it, add the following import near the top of any Python file in which you want to access Azure Storage programmatically.
from azure.storage.blob import BlobService
The following code creates a BlobService object using the storage account name and account key. Replace 'myaccount' and 'mykey' with your own account name and key.
blob_service = BlobService(account_name='myaccount', account_key='mykey')
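Hard-coding the key is convenient for a sample but easy to leak in shared code. As a minimal sketch, the credentials could instead be read from environment variables. The variable names `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_ACCESS_KEY` and the helper `load_credentials` are assumptions chosen for this example, not something the package requires.

```python
import os


def load_credentials(environ=os.environ):
    """Return keyword arguments for BlobService, read from the environment."""
    try:
        return {
            # Hypothetical variable names; use whatever your deployment defines.
            "account_name": environ["AZURE_STORAGE_ACCOUNT"],
            "account_key": environ["AZURE_STORAGE_ACCESS_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError("missing environment variable: %s" % missing)


# Usage (requires the Azure Storage package and both variables to be set):
# blob_service = BlobService(**load_credentials())
```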
Every blob in Azure storage must reside in a container. The container forms part of the blob name. For example, mycontainer is the name of the container in these sample blob URIs:
Note that the name of a container must always be lowercase. If you include an upper-case letter in a container name, or otherwise violate the container naming rules, you may receive a 400 error (Bad Request). For rules on naming containers, see Naming and Referencing Containers, Blobs, and Metadata.
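Checking a name locally before calling the service can turn a 400 error into a clearer message. The sketch below is a hypothetical helper, `is_valid_container_name`. The rules it encodes (3 to 63 characters; lowercase letters, digits, and hyphens; no leading, trailing, or consecutive hyphens) are an assumption drawn from the naming reference mentioned above, so verify them against that reference before relying on this check.

```python
import re

# Assumed rule: runs of lowercase letters or digits, separated by single hyphens.
_CONTAINER_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name):
    """Return True if `name` satisfies the assumed container naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None
```

For example, `is_valid_container_name('mycontainer')` passes, while `'MyContainer'` fails because of the upper-case letters.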
The following code example uses a BlobService object to create the container if it doesn't exist.
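A minimal sketch of this step, wrapped in a hypothetical helper named `ensure_container` so it can be reused. It assumes the `create_container` method of the legacy `BlobService` class, which by default reports an existing container through its return value rather than by raising an error.

```python
def ensure_container(blob_service, container_name):
    """Create the container if needed.

    Returns whatever create_container returns; in the legacy package this is
    assumed to be True when the container was created and False when it
    already existed.
    """
    return blob_service.create_container(container_name)


# Usage, with the blob_service object created earlier:
# ensure_container(blob_service, 'mycontainer')
```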
By default, the new container is private, so you must specify your storage access key (as you did earlier) to download blobs from this container. If you want to make the files within the container available to everyone, you can create the container and pass the public access level using the following code.
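As a sketch, creating a publicly readable container might look like the following. It assumes the `x_ms_blob_public_access` keyword of the legacy `create_container` method; the wrapper name `create_public_container` is hypothetical.

```python
def create_public_container(blob_service, container_name):
    """Create a container whose blobs can be read anonymously."""
    # 'container' is assumed to grant public read access to the container
    # and to every blob in it.
    return blob_service.create_container(
        container_name,
        x_ms_blob_public_access='container',
    )


# Usage:
# create_public_container(blob_service, 'mycontainer')
```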
Alternatively, you can modify a container after you have created it using the following code.
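A sketch of changing the access level of an existing container, assuming the `set_container_acl` method of the legacy `BlobService` class and its `x_ms_blob_public_access` keyword. The wrapper name `make_container_public` is hypothetical.

```python
def make_container_public(blob_service, container_name):
    """Switch an existing container to public read access."""
    blob_service.set_container_acl(
        container_name,
        x_ms_blob_public_access='container',
    )


# Usage:
# make_container_public(blob_service, 'mycontainer')
```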
After this change, anyone on the Internet can see blobs in a public container, but only you can modify or delete them.
To upload data to a blob, use the put_block_blob_from_path, put_block_blob_from_file, put_block_blob_from_bytes or put_block_blob_from_text methods. They are high-level methods that perform the necessary chunking when the size of the data exceeds 64 MB.
put_block_blob_from_path uploads the contents of a file from the specified path, and put_block_blob_from_file uploads the contents from an already opened file/stream. put_block_blob_from_bytes uploads an array of bytes, and put_block_blob_from_text uploads the specified text value using the specified encoding (defaults to UTF-8).
The following example uploads the contents of the sunset.png file into the myblob blob.
blob_service.put_block_blob_from_path(
    'mycontainer',
    'myblob',
    'sunset.png',
    x_ms_blob_content_type='image/png'
)
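To make the 64 MB chunking threshold concrete, the following sketch estimates how many data-carrying requests an upload needs. It assumes that chunks are the size of the 4 MB block limit mentioned earlier; the chunk size the package actually uses may differ, and `requests_needed` is a hypothetical helper, not part of the package.

```python
import math

SINGLE_REQUEST_LIMIT = 64 * 1024 * 1024   # 64 MB threshold described above
CHUNK_SIZE = 4 * 1024 * 1024              # assumed chunk size (4 MB block limit)


def requests_needed(size_in_bytes):
    """Estimate the number of data-carrying requests for one upload."""
    if size_in_bytes <= SINGLE_REQUEST_LIMIT:
        # Small enough to send in a single request.
        return 1
    # Larger data is split into fixed-size chunks; the last may be partial.
    return math.ceil(size_in_bytes / CHUNK_SIZE)


small = requests_needed(10 * 1024 * 1024)    # 10 MB file
large = requests_needed(100 * 1024 * 1024)   # 100 MB file
```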
To list the blobs in a container, use the list_blobs method. Each call to list_blobs will return a segment of results. To get all results, check the next_marker of the results and call list_blobs again as needed. The following code outputs the name of each blob in a container to the console.
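blobs = []
marker = None
while True:
    # Fetch one segment, starting from the marker returned by the previous call.
    batch = blob_service.list_blobs('mycontainer', marker=marker)
    blobs.extend(batch)
    # No next_marker means this was the last segment.
    if not batch.next_marker:
        break
    marker = batch.next_marker
for blob in blobs:
    print(blob.name)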
Each segment of results can contain a variable number of blobs up to a maximum of 5000. If next_marker exists for a particular segment, there may be more blobs in the container.
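For containers with many blobs, holding every result in memory may be unnecessary. As an alternative sketch, a generator can fetch segments on demand. `iter_blobs` is a hypothetical helper; it relies only on `list_blobs` and `next_marker` as described above.

```python
def iter_blobs(blob_service, container_name):
    """Yield every blob in the container, fetching one segment at a time."""
    marker = None
    while True:
        batch = blob_service.list_blobs(container_name, marker=marker)
        for blob in batch:
            yield blob
        # An empty next_marker means there are no further segments.
        marker = batch.next_marker
        if not marker:
            break


# Usage:
# for blob in iter_blobs(blob_service, 'mycontainer'):
#     print(blob.name)
```

The next segment is requested only when the caller has consumed the current one, so iteration can stop early without listing the whole container.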
To download data from a blob, use get_blob_to_path, get_blob_to_file, get_blob_to_bytes, or get_blob_to_text. They are high-level methods that perform the necessary chunking when the size of the data exceeds 64 MB.
The following example uses get_blob_to_path to download the contents of the myblob blob and store them in the out-sunset.png file.
blob_service.get_blob_to_path('mycontainer', 'myblob', 'out-sunset.png')
Finally, to delete a blob, call delete_blob.
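A minimal sketch of the delete call, assuming `delete_blob` takes the container name and the blob name as its first two arguments. The wrapper name `remove_blob` is hypothetical.

```python
def remove_blob(blob_service, container_name, blob_name):
    """Delete a single blob from the given container."""
    blob_service.delete_blob(container_name, blob_name)


# Usage, with the names from the upload example:
# remove_blob(blob_service, 'mycontainer', 'myblob')
```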
Now that you have learned the basics of Blob storage, follow these links to learn about more complex storage tasks.