Azure

Azure Data Lake Storage

Massively scalable and secure data lake for your high-performance analytics workloads.
Overview

Build a foundation for your high-performance analytics

Eliminate data silos with a single storage platform. Optimize costs with tiered storage and policy management. Authenticate users with Microsoft Entra ID (formerly Azure Active Directory) and authorize access with role-based access control (RBAC). And help protect data with security features like encryption at rest and advanced threat protection.
  • Meet any capacity requirement and manage data with ease on Azure global infrastructure. Run large-scale analytics queries with consistently high performance.
  • Safeguard your data lake with capabilities that span encryption, data access, and network-level control—all designed to help you drive insights more securely.
  • Ingest data at scale using a wide range of data ingestion tools. Process data using Azure Databricks, Azure Synapse Analytics, or Azure HDInsight. And visualize the data with Microsoft Power BI for transformational insights.
  • Optimize costs by scaling storage and compute independently, which you can’t do with on-premises data lakes. Tier up or down based on usage, and use automated lifecycle management policies to reduce storage costs.
Features

Key storage platform capabilities

Scalability

Limitless scale and 16 nines (99.99999999999999%) of data durability with automatic geo-replication.

Security

Highly secure storage with flexible mechanisms for protection across data access, encryption, and network-level control.

Analytics

Single storage platform for ingestion, processing, and visualization that supports the most common analytics frameworks.

Optimization

Cost optimization via independent scaling of storage and compute, lifecycle policy management, and object-level tiering.
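The lifecycle policy management and object-level tiering described above can be pictured as an age-based rule applied to each object. The following is a minimal sketch of that idea, not the service's own policy engine: the function `pick_tier`, the day thresholds, and the sample object names are all hypothetical, and only the tier names (hot, cool, archive) correspond to Azure Blob Storage access tiers.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 180


def pick_tier(last_modified: date, today: date) -> str:
    """Return the tier a simple age-based rule would assign to one object."""
    age_days = (today - last_modified).days
    if age_days > ARCHIVE_AFTER_DAYS:
        return "archive"
    if age_days > COOL_AFTER_DAYS:
        return "cool"
    return "hot"


# Hypothetical objects with their last-modified dates.
today = date(2024, 7, 1)
objects = {
    "logs/2024-06-25.json": date(2024, 6, 25),
    "logs/2024-04-01.json": date(2024, 4, 1),
    "logs/2023-01-15.json": date(2023, 1, 15),
}

# Evaluate the rule per object, which is what object-level tiering implies.
tiers = {name: pick_tier(modified, today) for name, modified in objects.items()}
print(tiers)
```

Because the rule is evaluated per object, recently written data can stay in the hot tier while older data in the same container moves to cheaper tiers.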
Security

Embedded security and compliance

  • 34,000 full-time equivalent engineers dedicated to security initiatives at Microsoft.
  • 15,000 partners with specialized security expertise.
  • More than 100 compliance certifications, including over 50 specific to global regions and countries.
Pricing

Flexible pricing for building data lakes

Choose from pricing options including tiering, reservations, and lifecycle management.
Customer stories

Trusted by companies of all sizes

FAQ

Frequently asked questions

  • Adding the Hierarchical Namespace on top of blobs allows the cost benefits of cloud storage to be retained, without compromising the file system interfaces that big data analytics frameworks were designed for.

    A simple example is the common pattern of an analytics job writing output data to a temporary directory, and then renaming that directory to its final name during the commit phase. In an object store (which, by design, doesn’t support the notion of directories), such a rename can be lengthy because it involves N copy-and-delete operations, where N is the number of files in the directory. With the Hierarchical Namespace, these directory manipulation operations are atomic, improving performance and cost. Additionally, supporting directories as elements of the file system permits the application of POSIX-compliant access control lists (ACLs) that use parent directories to propagate permissions.
  • Similar to other cloud storage services, Data Lake Storage is billed according to the amount of data stored plus any costs of operations performed on that data.
  • Data Lake Storage is primarily designed to work with Hadoop and all frameworks that use the Hadoop FileSystem as their data access layer (for example, Spark and Presto).

    In Azure, Data Lake Storage is interoperable with:

    • Azure Data Factory
    • Azure HDInsight
    • Azure Databricks
    • Azure Synapse Analytics
    • Power BI
    The service is also included in the Azure Blob Storage ecosystem.
  • Data Lake Storage provides multiple mechanisms for data access control. By offering the Hierarchical Namespace, the service is the only cloud analytics store that features POSIX-compliant access control lists (ACLs) that form the basis for Hadoop Distributed File System (HDFS) permissions. Data Lake Storage also includes capabilities for network and transport security via storage firewalls, private endpoints, and TLS 1.2 enforcement, as well as encryption at rest using system- or customer-supplied keys.
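The directory-rename example in the first answer above can be made concrete with a toy model. This is a sketch under stated assumptions rather than how the service is implemented: plain Python dictionaries stand in for a flat object store and for a namespace tree, the functions `rename_flat` and `rename_hierarchical` are hypothetical, and each copy-and-delete pair is counted as one operation.

```python
# Toy model only: dictionaries stand in for the two storage layouts.

def rename_flat(store: dict, src: str, dst: str) -> int:
    """Rename a 'directory' prefix in a flat object store.

    Returns the number of copy-and-delete pairs performed.
    """
    ops = 0
    for key in [k for k in store if k.startswith(src + "/")]:
        store[dst + key[len(src):]] = store[key]  # copy to the new key
        del store[key]                             # delete the old key
        ops += 1
    return ops


def rename_hierarchical(tree: dict, src: str, dst: str) -> int:
    """Rename a directory entry in a namespace tree.

    Returns the number of operations performed.
    """
    tree[dst] = tree.pop(src)  # one metadata update, independent of file count
    return 1


# A job has written 1,000 output files to a temporary directory.
files = [f"part-{i:05d}" for i in range(1000)]
flat = {f"_temporary/{name}": b"" for name in files}
tree = {"_temporary": {name: b"" for name in files}}

# Commit phase: rename the temporary directory to its final name.
flat_ops = rename_flat(flat, "_temporary", "output")
tree_ops = rename_hierarchical(tree, "_temporary", "output")
print(flat_ops, tree_ops)  # prints "1000 1"
```

In the flat model the work grows with the number of files and a failure partway through leaves a half-renamed directory, whereas the tree model changes a single entry regardless of how many files sit beneath it.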
Next steps

Get started with an Azure free account

Pay as you go or try Azure free for up to 30 days.
Azure Solutions

Learn about more Azure cloud solutions

Solve your business problems with proven combinations of Azure cloud services, as well as sample architectures and documentation.
Business Solutions Hub

Find the right Microsoft Cloud solution

Browse the Microsoft Business Solutions Hub to find the products and solutions that can help your organization reach its goals.