
Azure Data Lake Storage Gen2 pricing

Massively scalable, secure data lake storage

Azure Data Lake Storage Gen2 is the world’s most productive data lake. It combines the power of a Hadoop-compatible file system and an integrated hierarchical namespace with the massive scale and economy of Azure Blob Storage to help speed your transition from proof of concept to production.

  1. Is fully integrated with the Azure Blob Storage platform: every Blob Storage capability (Azure Active Directory integration, encryption at rest, high availability and disaster recovery, automated lifecycle policy management, storage account firewalls, and so on) can be used by Azure Data Lake Storage
  2. Is optimized for leading cloud analytic engines
  3. Is tightly integrated with all elements of the Azure big data analytics stack to deliver fast insights
  4. Allows analytics data to coexist with object data in the same store with no programming changes or data copying
  5. Performs faster than other data stores, making your analytics workloads run faster and lowering your TCO

General Purpose v2 provides access to the latest Azure storage features, including Cool and Archive storage, with pricing optimized for the lowest GB storage prices. These accounts provide access to Data Lake Storage, Block Blobs, Page Blobs, Files, and Queues.

Azure Data Lake Storage Gen2 provides the choice of organizing data in two different ways. With the hierarchical namespace option, customers can organize their data lake into structured directories, folders, and files. With the flat namespace option, customers can operate their data lake as an unstructured blob store. With either option, customers pay the same storage price, per the table below. However, with the hierarchical namespace option, customers are also charged for the additional metadata associated with the folder and directory structure.

Data storage prices

Prices are the same for the flat and hierarchical namespace options.

| | Hot | Cool | Archive |
| First 50 TB / month | $- per GB | $- per GB | $- per GB |
| Next 450 TB / month | $- per GB | $- per GB | $- per GB |
| Over 500 TB / month | $- per GB | $- per GB | $- per GB |

Metadata storage prices

| | Hot | Cool | Archive |
| Flat namespace (per GB / month) | N/A | N/A | N/A |
| Hierarchical namespace (per GB / month) | $- | N/A | N/A |

Archive early deletion

In addition to the per-GB, per-month charge, any blob that is moved to Archive is subject to an Archive early deletion period of 180 days. This charge is prorated. For example, if a blob is moved to Archive and then deleted or moved to the Hot tier after 45 days, the customer is charged an early deletion fee equivalent to 135 (180 minus 45) days of storing that blob in Archive.
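The proration described above can be sketched as a short calculation. The 30-day month and the $0.002/GB/month Archive rate are assumptions for illustration only, since actual prices are not shown on this page.

```python
ARCHIVE_EARLY_DELETION_PERIOD_DAYS = 180
DAYS_PER_MONTH = 30  # simplifying assumption for prorating the monthly rate

def early_deletion_fee(blob_size_gb, days_in_archive, archive_rate_per_gb_month):
    """Prorated charge for a blob deleted or moved out of Archive early."""
    remaining_days = max(ARCHIVE_EARLY_DELETION_PERIOD_DAYS - days_in_archive, 0)
    # Fee = cost of storing the blob in Archive for the remaining days.
    return blob_size_gb * archive_rate_per_gb_month * remaining_days / DAYS_PER_MONTH

# Example from the text: a blob deleted after 45 days is charged for the
# remaining 135 (180 minus 45) days of Archive storage.
fee = early_deletion_fee(100, 45, 0.002)  # 100 GB at a hypothetical rate
```

A blob kept in Archive for the full 180 days incurs no early deletion fee, since the remaining-days term goes to zero.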

Operations and data transfer prices

Prices are the same for the flat and hierarchical namespace options.

| | Hot | Cool | Archive |
| Write operations* (first 4 MB, per 10,000) | $- | $- | $- |
| Write operations* (beyond 4 MB, per 10,000) | $- | $- | $- |
| List and create container operations (per 10,000) | $- | $- | $- |
| Read operations** (first 4 MB, per 10,000) | $- | $- | $- |
| Read operations** (beyond 4 MB, per 10,000) | $- | $- | $- |
| Iterative write operations (per 100)+++ | $- | $- | $- |
| All other operations (per 10,000), except Delete, which is free | $- | $- | $- |
| Data retrieval (per GB) | Free | $- | $- |
| Data write (per GB) | Free | Free | Free |

*The following API calls are considered write operations: AppendFile, CreateFilesystem, CreatePath, CreatePathFile, FlushFile, SetFileProperties, SetFilesystemProperties, RenameFile, RenamePathFile, CopyFile.
**The following API calls are considered read operations: ReadFile, ListFilesystemFile.
+++The following API calls are considered iterative write operations: RenameDirectory, RenamePath, RenamePathDir.

Data transfer prices for Block Blobs

Azure Data Lake Storage Gen2 is currently available only with LRS. The service is currently in preview; more replication options such as ZRS, GRS, and RA-GRS will be available soon. Data transfer pricing for GRS and RA-GRS will be added as soon as those options are available.

See pricing for Azure Data Lake Storage Gen1 here.

FAQ

  • Azure Data Lake Storage is optimized for running analytic workloads on unstructured data. Azure Data Lake Storage Gen2 is optimized for fast I/O of high volume data, thereby making analytic workloads run faster and lowering the TCO for analytic jobs. Further, Azure Data Lake Storage Gen2 provides the added flexibility of organizing data either in a flat or hierarchical namespace.

  • With hierarchical namespaces, you can organize data into structured folders and directories. With a flat namespace, your files are organized in a flat structure, just like Blob Storage. A hierarchical namespace allows operations like folder renames and deletes to be performed in a single atomic operation, which with a flat namespace requires a number of operations proportional to the number of objects in the structure. A hierarchical namespace stores additional metadata for your directory and folder structure. However, as your data volume grows, a hierarchical namespace keeps your data organized and, more importantly, yields better storage performance on your analytic jobs, thus lowering your overall TCO to run analytic jobs.

  • While Blob Storage and Data Lake Storage with a flat namespace are similar in the way data is stored, Data Lake Storage with a flat namespace performs better for analytic workloads. You should put data into Blob Storage if you are confident there is no need to run analytics services like Azure Databricks or Azure HDInsight against it. However, if you need to run occasional analytics jobs, you should put such data into Azure Data Lake Storage Gen2 with a flat namespace. For data that will be used constantly for analytics, we recommend that you put such data into Azure Data Lake Storage Gen2 with a hierarchical namespace.

    • Billing using flat namespaces

    Let’s say you store 120 TB of data for the whole month in Azure Data Lake Storage Gen2 using flat namespaces in the Hot tier. During this month, you perform 100 million operations, and let’s assume each operation is 6 MB in size. Further, let’s say that 20% of these operations are write operations and the other 80% are read operations. Finally, let’s assume you also rename 10K directories during the month.

    For flat namespaces, you will not incur additional charges for metadata related to your files and folders, and will therefore be charged for 120 TB of data. Also, since every operation is 6 MB, you will be charged two transactions per operation (4 MB + 2 MB), because every 4 MB of data read or written is charged as a transaction. Note that the second transaction in each operation (reading or writing the last 2 MB) is charged less than the first. Finally, directory renames are charged using a separate meter.

    This is how your total cost will be calculated:

    | Resource Used | Usage Volume | Price | Monthly Cost |
    | Storage | 120 TB | $- for first 50 TB; $- for next 450 TB | $- * 50 TB = $-; $- * 70 TB = $-; total $- |
    | Write transactions | 20M first 4 MB + 20M beyond 4 MB | $- per 10K; $- per 10K | $- per 10K * 20M = $-; $- per 10K * 20M = $- |
    | Read transactions | 80M first 4 MB + 80M beyond 4 MB | $- per 10K; $- per 10K | $- per 10K * 80M = $-; $- per 10K * 80M = $-; transactions total $- |
    | Directory renames | 10K | $- per 100 | $- per 100 * 10K = $- |

    Total monthly cost = storage ($-) + transactions ($-) + directory renames ($-) = $-
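Under the assumptions of the billing walkthrough above, the calculation can be sketched as a short script. Every dollar rate below is a hypothetical placeholder (the page shows only $-), and the function name and parameters are illustrative, not part of any Azure API.

```python
import math

# Sketch of the monthly billing arithmetic for Azure Data Lake Storage Gen2.
# All rates are hypothetical placeholders; nothing here is an Azure API.

def monthly_cost(stored_tb, ops, write_frac, op_size_mb, dir_renames,
                 gb_rate_first_50tb, gb_rate_next_450tb,
                 write_first_4mb_per_10k, write_beyond_4mb_per_10k,
                 read_first_4mb_per_10k, read_beyond_4mb_per_10k,
                 rename_rate_per_100):
    # Tiered storage: the first 50 TB and the next 450 TB bill at different GB rates.
    gb = stored_tb * 1024
    first_tier_gb = min(gb, 50 * 1024)
    next_tier_gb = max(gb - 50 * 1024, 0)
    storage = first_tier_gb * gb_rate_first_50tb + next_tier_gb * gb_rate_next_450tb

    # A 6 MB operation bills as one "first 4 MB" transaction plus one
    # "beyond 4 MB" transaction: every started 4 MB block is a transaction.
    extra_tx = max(math.ceil(op_size_mb / 4) - 1, 0)
    writes = ops * write_frac
    reads = ops * (1 - write_frac)
    transactions = (writes * (write_first_4mb_per_10k + extra_tx * write_beyond_4mb_per_10k)
                    + reads * (read_first_4mb_per_10k + extra_tx * read_beyond_4mb_per_10k)) / 10_000

    # Directory renames (iterative writes) bill on a separate per-100 meter.
    renames = dir_renames / 100 * rename_rate_per_100
    return storage + transactions + renames

# Flat-namespace example: 120 TB stored, 100M 6 MB operations (20% writes),
# 10K directory renames, with made-up rates in place of the $- placeholders.
total = monthly_cost(120, 100_000_000, 0.2, 6, 10_000,
                     0.02, 0.01, 0.05, 0.04, 0.004, 0.003, 3.0)
```

For the hierarchical-namespace example, only the stored volume changes (a little over 120 TB once metadata is included); the transaction and rename math is identical.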


    • Billing using hierarchical namespaces

    Let’s say you store 120 TB of data for the whole month in Azure Data Lake Storage Gen2 using hierarchical namespaces in the Hot tier. During this month, you perform 100 million operations, and let’s assume each operation is 6 MB in size. Further, let’s say that 20% of these operations are write operations and the other 80% are read operations. Finally, let’s assume you also rename 10K directories during the month.

    For hierarchical namespaces, you will incur additional charges for metadata related to your files and folders, and will therefore be charged for a little more than 120 TB of data (132 TB in this example). Also, since every operation is 6 MB, you will be charged two transactions per operation (4 MB + 2 MB), because every 4 MB of data read or written is charged as a transaction. Note that the second transaction in each operation (reading or writing the last 2 MB) is charged less than the first. Finally, directory renames are charged using a separate meter.

    This is how your total cost will be calculated:

    | Resource Used | Usage Volume | Price | Monthly Cost |
    | Storage | 132 TB | $- for first 50 TB; $- for next 450 TB | $- * 50 TB = $-; $- * 82 TB = $-; total $- |
    | Write transactions | 20M first 4 MB + 20M beyond 4 MB | $- per 10K; $- per 10K | $- per 10K * 20M = $-; $- per 10K * 20M = $- |
    | Read transactions | 80M first 4 MB + 80M beyond 4 MB | $- per 10K; $- per 10K | $- per 10K * 80M = $-; $- per 10K * 80M = $-; transactions total $- |
    | Directory renames | 10K | $- per 100 | $- per 100 * 10K = $- |

    Total monthly cost = storage ($-) + transactions ($-) + directory renames ($-) = $-

  • Yes, larger files are more cost-effective and yield better analytic performance. For files larger than 4 MB, Azure Data Lake Storage Gen2 offers a lower price for every 4 MB block of data read beyond the first 4 MB. Reading a single 16 MB file is cheaper than reading four 4 MB files. In both cases, the total number of transactions is four. However, the last 12 MB of the 16 MB file are read as three cheaper "beyond 4 MB" transactions, making the total cost of reading one 16 MB file lower than that of reading four 4 MB files.

    More importantly, Azure Data Lake Storage Gen2 is highly optimized to perform faster on larger files. This means your analytics jobs run faster when operating on larger files, further lowering your TCO for running analytics jobs.
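The transaction arithmetic in this answer can be checked with a small sketch. Both per-transaction rates below are hypothetical placeholders; the only assumption carried over from the text is that reads beyond the first 4 MB are priced lower.

```python
import math

# Compares the read cost of one large file vs several small files, per the
# FAQ answer above. Both per-transaction rates are hypothetical placeholders.

def read_cost(file_size_mb, n_files, first_4mb_rate, beyond_4mb_rate):
    """Cost of reading n_files files of file_size_mb each."""
    chunks = math.ceil(file_size_mb / 4)            # 4 MB transactions per file
    per_file = first_4mb_rate + (chunks - 1) * beyond_4mb_rate
    return n_files * per_file

# Assumed rates: reads beyond the first 4 MB are cheaper than the first 4 MB.
one_16mb = read_cost(16, 1, 0.004, 0.003)    # 1 "first" + 3 "beyond" transactions
four_4mb = read_cost(4, 4, 0.004, 0.003)     # 4 "first" transactions
print(one_16mb < four_4mb)  # → True: the single 16 MB file is cheaper
```

Both scenarios use four transactions; the saving comes entirely from the lower "beyond 4 MB" rate on the last three blocks of the large file.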

Support & SLA

  • Free billing and subscription management support.
  • Flexible support plans starting at $29/month. Shop for a plan.
  • Guaranteed 99.9 percent or greater availability (excludes preview services). Read the SLA.

Resources

Estimate your monthly costs for Azure services

Review Azure pricing frequently asked questions

Learn more about Storage

Review technical tutorials, videos, and more resources

