Azure Data Lake Storage

Massively scalable data lake storage

Azure Data Lake Storage Gen2 is a highly scalable and cost-effective data lake solution for big data analytics. It combines the power of a high-performance file system with massive scale and economy to help you speed your time to insight. Data Lake Storage Gen2 extends Azure Blob Storage capabilities and is optimized for analytics workloads. Store data once and access it via existing Blob Storage and HDFS-compliant file system interfaces with no programming changes or data copying. Data Lake Storage Gen2 is the most comprehensive data lake available.
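As a rough illustration of the "store once, access through either interface" idea, the sketch below builds the two addresses that would point at the same stored object: a Blob Storage HTTPS URL and an `abfss://` URI of the kind used by Hadoop-compatible tools. The account, container, and path names are hypothetical placeholders, and `storage_addresses` is an illustrative helper, not part of any Azure SDK.

```python
def storage_addresses(account: str, container: str, path: str) -> dict:
    """Return the Blob endpoint URL and the ABFS URI for one stored object."""
    clean = path.lstrip("/")  # avoid a doubled slash after the container name
    return {
        "blob": f"https://{account}.blob.core.windows.net/{container}/{clean}",
        "abfs": f"abfss://{container}@{account}.dfs.core.windows.net/{clean}",
    }


# Hypothetical account and container names, used only for illustration.
addresses = storage_addresses("examplelake", "raw", "/sales/2024/orders.parquet")
print(addresses["blob"])
print(addresses["abfs"])
```

Both strings name the same data; nothing is copied between the two interfaces.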

Why Data Lake Storage?

Productive

Test models faster with a Hadoop-compatible file system that supports atomic file and folder operations and is optimized to execute jobs at lightning speed.

Trusted

Our data lake file system is built to meet the most stringent enterprise data security requirements. It features POSIX-compliant fine-grained ACL support, object store security with at-rest encryption, Azure Active Directory integration, and storage account firewalls.
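To make "POSIX-compliant fine-grained ACL" more concrete, here is a minimal sketch of how a POSIX-style ACL string might be evaluated. It is a simplified model, not the service's actual implementation: it ignores the ACL mask, superuser rules, and parent-folder traversal checks, and every user and group name in it is hypothetical.

```python
def parse_acl(acl: str) -> dict:
    """Parse 'scope:qualifier:perms' entries into {(scope, qualifier): perms}."""
    entries = {}
    for entry in acl.split(","):
        scope, qualifier, perms = entry.strip().split(":")
        entries[(scope, qualifier)] = perms
    return entries


def is_allowed(acl, user, groups, owner, owning_group, want):
    """Simplified check order: owner, named user, matching groups, then other.

    `want` is one of 'r', 'w', 'x'. The ACL mask is ignored in this sketch.
    """
    entries = parse_acl(acl)
    if user == owner:
        return want in entries[("user", "")]
    if ("user", user) in entries:
        return want in entries[("user", user)]
    group_perms = []
    if owning_group in groups:
        group_perms.append(entries[("group", "")])
    group_perms += [entries[("group", g)] for g in groups if ("group", g) in entries]
    if group_perms:
        # Any matching group entry that grants the permission is enough.
        return any(want in perms for perms in group_perms)
    return want in entries[("other", "")]


# Hypothetical ACL: owner has full access, alice may read, analysts may write.
acl = "user::rwx,user:alice:r-x,group::r-x,group:analysts:rw-,other::---"
print(is_allowed(acl, "alice", [], "bob", "staff", "r"))
print(is_allowed(acl, "dave", [], "bob", "staff", "r"))
```

The point of the sketch is the granularity: permissions can differ per named user and per named group on a single file or folder, rather than being set only for a whole container.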

Scalable

Harness the global scale, durability, and performance of Azure Blob Storage, including support for massive storage accounts.

Cost-effective

Get data lake functionality at cloud object store pricing levels. Data Lake Storage Gen2 leverages the lifecycle policy management and object-level tiering functionality built into Azure Blob Storage to optimize data storage costs, with no data copying between services.
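A lifecycle policy of the kind mentioned above is typically expressed as a JSON document. The sketch below assembles one in Python and prints it. The rule name, path prefix, and day thresholds are hypothetical, and the field names reflect the Blob Storage lifecycle management schema as commonly documented; check them against the current Azure documentation before relying on them.

```python
import json

# Hypothetical rule: move raw logs to the cool tier after 30 days without
# modification, and delete them after 730 days.
lifecycle_policy = {
    "rules": [
        {
            "name": "age-out-raw-logs",  # hypothetical rule name
            "enabled": True,
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["raw/logs"],  # hypothetical container/path
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 30},
                        "delete": {"daysAfterModificationGreaterThan": 730},
                    }
                },
            },
        }
    ]
}

print(json.dumps(lifecycle_policy, indent=2))
```

Because tiering is applied to objects in place, a rule like this changes storage cost without moving data to another service.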

Data Lake Storage capabilities

Build sophisticated analytics workflows quickly

Data Lake Storage Gen2 natively integrates with other Azure data services, including Azure Databricks and Azure Data Factory, for building end-to-end big data and advanced analytics solutions.

Run jobs faster and more efficiently

Big data analytics workloads can incur significant transaction costs during job execution, such as when files and folders are created, renamed, or deleted. Data Lake Storage Gen2 supports atomic file operations, which significantly reduce the transaction overhead required for job execution and the time it takes for big data analytics jobs to complete.
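The transaction overhead described here can be illustrated with a toy model. Analytics jobs often write output to a temporary folder and rename it on completion. In a flat object namespace, a folder rename means copying and deleting every object under the prefix, while a hierarchical namespace can treat it as a single metadata update. The operation counts below are illustrative assumptions, not Azure billing figures, and the two functions are hypothetical stand-ins for the two storage models.

```python
def rename_flat(store: dict, src: str, dst: str) -> int:
    """Rename a 'folder' in a flat namespace: copy and delete every object."""
    operations = 0
    for key in [k for k in store if k.startswith(src + "/")]:
        store[dst + key[len(src):]] = store.pop(key)
        operations += 2  # one copy plus one delete per object
    return operations


def rename_hierarchical(tree: dict, src: str, dst: str) -> int:
    """Rename a folder in a hierarchical namespace: one metadata update."""
    tree[dst] = tree.pop(src)
    return 1


# A job that wrote 1,000 part files into a temporary folder.
flat = {f"tmp/job1/part-{i}": b"" for i in range(1000)}
tree = {"tmp/job1": {f"part-{i}": b"" for i in range(1000)}}

flat_ops = rename_flat(flat, "tmp/job1", "out/job1")
tree_ops = rename_hierarchical(tree, "tmp/job1", "out/job1")
print(flat_ops, tree_ops)
```

In this model the flat rename also passes through intermediate states in which only some files have moved, which is the failure mode an atomic folder operation avoids.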

Unleash analytics efforts globally

Data Lake Storage Gen2 complies with regional data management requirements. Once it reaches general availability, Data Lake Storage Gen2 will be offered in all Azure regions.

What can you do with Data Lake Storage?

Modern data warehouse

Advanced analytics on big data

Real-time analytics

Related products and services

Azure Databricks

Fast, easy, and collaborative Apache Spark-based analytics platform

Data Factory

Hybrid data integration at enterprise scale, made easy

HDInsight

Provision cloud Hadoop, Spark, R Server, HBase, and Storm clusters