Azure Data Lake Storage

Massively scalable, secure data lake functionality built on Azure Blob Storage

Get powerful data lake functionality at cloud scale

Azure Data Lake Storage Gen2 is a highly scalable and cost-effective data lake solution for big data analytics. It combines the power of a high-performance file system with massive scale and economy to help you shorten your time to insight. Data Lake Storage Gen2 extends Azure Blob Storage capabilities and is optimised for analytics workloads. Data Lake Storage Gen2 is the most comprehensive data lake available.

Fast

Test models faster with a Hadoop-compatible file system that supports atomic file and folder operations and is optimised to execute jobs at lightning speed.

Scalable

Extend the global scale, durability and performance of Azure Blob Storage, and get support for massive storage accounts.

Secure

Meet the most stringent enterprise data security requirements with tools and resources such as POSIX-compliant, fine-grained ACL support, object store security with at-rest encryption, Azure Active Directory integration and storage account firewalls.

Cost-effective

Get data lake functionality at cloud object store pricing levels. Data Lake Storage Gen2 provides the same lifecycle policy management and object-level tiering that’s built into Blob Storage.
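As a small illustration of the Hadoop-compatible file system mentioned above: Hadoop and Spark jobs usually address Data Lake Storage Gen2 through the ABFS driver, using a URI of the form abfss://<file system>@<account>.dfs.core.windows.net/<path>. The sketch below only builds such a URI as a string. The helper name abfss_uri and the account and file system names are hypothetical and are not taken from this page.

```python
def abfss_uri(account: str, filesystem: str, path: str = "") -> str:
    """Build an ABFS (TLS) URI for a path in a Data Lake Storage Gen2 account.

    `account` and `filesystem` are placeholders supplied by the caller;
    this helper performs no network access and no name validation.
    """
    if not account or not filesystem:
        raise ValueError("account and filesystem must be non-empty")
    # Normalise the path so the URI never contains a doubled or trailing slash.
    clean_path = path.strip("/")
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{clean_path}"


if __name__ == "__main__":
    # Hypothetical account "contosolake" with a file system named "raw".
    print(abfss_uri("contosolake", "raw", "/logs/2019/"))
```

A URI built this way would typically be passed to an analytics engine as the input or output location of a job; authentication is configured separately and is not shown here.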

Service capabilities

Massive scalability

Near limitless storage for analytics data

Cloud object store pricing

Same low-cost data storage model as Azure Blob Storage

Fewer file and folder transactions

Atomic transactions for fewer compute cycles and faster job execution

Granular file and folder security

POSIX-compliant, fine-grained access control lists (ACLs)

Simplified ingestion in a single store

Consolidated data storage using the Data Lake Storage Gen2 or Blob Storage REST API

Full Azure Blob Storage feature set

Data lifecycle policy management; hot, cool and archive tiers; and high availability/disaster recovery support

Role-based access and storage account firewalls

Multi-layer security to govern data access so only users from authorised IPs can perform analytics

Common data model (CDM) support

Ability to exchange data with powerful applications like Microsoft Dynamics 365 (for CRM) and Power BI
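To make the POSIX-style access control lists listed above more concrete, here is a minimal sketch that parses an ACL written in the common short text form, for example user::rwx,group::r-x,other::---. It is a simplified illustration, not the service's implementation: the AclEntry type and parse_acl function are hypothetical names, the principal alice is a placeholder, and the parser ignores details such as the sticky bit and mask evaluation.

```python
from dataclasses import dataclass

VALID_KINDS = {"user", "group", "mask", "other"}


@dataclass(frozen=True)
class AclEntry:
    scope: str      # "access" or "default"
    kind: str       # "user", "group", "mask" or "other"
    qualifier: str  # named principal, or "" for the owning user/group/other
    perms: str      # three characters such as "r-x"


def parse_acl(text: str) -> list[AclEntry]:
    """Parse a comma-separated short-form ACL string into entries."""
    entries = []
    for raw in text.split(","):
        parts = raw.strip().split(":")
        scope = "access"
        # A leading "default" marks an entry inherited by new child items.
        if parts[0] == "default":
            scope = "default"
            parts = parts[1:]
        if len(parts) != 3:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        kind, qualifier, perms = parts
        if kind not in VALID_KINDS:
            raise ValueError(f"unknown entry type: {kind!r}")
        # Each position must be its flag letter (r, w, x) or a dash.
        if len(perms) != 3 or any(c not in (flag, "-") for c, flag in zip(perms, "rwx")):
            raise ValueError(f"invalid permissions: {perms!r}")
        entries.append(AclEntry(scope, kind, qualifier, perms))
    return entries


if __name__ == "__main__":
    for entry in parse_acl("user::rwx,user:alice:r-x,group::r-x,other::---"):
        print(entry)
```

Keeping access and default entries distinct in the parsed result mirrors how a folder can carry both its own permissions and the permissions that newly created children inherit.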

Trusted partners

  • Informatica Cloud
  • Attunity
  • WANDisco
  • Striim
  • Qubole
  • Cloudera

What can you do with Data Lake Storage?

Modern data warehouse

Overview

A modern data warehouse lets you bring together all your data at any scale easily, and means you can get insights through analytical dashboards, operational reports or advanced analytics for all your users.

Flow

  1. Combine all your structured, unstructured and semi-structured data (logs, files and media) in Azure Blob Storage using Azure Data Factory.
  2. Use the data in Azure Blob Storage to perform scalable analytics with Azure Databricks and produce cleansed and transformed data.
  3. Move the cleansed and transformed data to Azure SQL Data Warehouse to combine it with existing structured data, creating one hub for all your data. Use the native connectors between Azure Databricks and Azure SQL Data Warehouse to access and move data at scale.
  4. Build operational reports and analytical dashboards on top of Azure SQL Data Warehouse to derive insights from the data, and use Azure Analysis Services to serve thousands of end users.
  5. Run ad hoc queries directly on data within Azure Databricks.

Advanced analytics on big data

Overview

Transform your data into actionable insights using best-in-class machine learning tools. This architecture allows you to combine any data at any scale, and to build and deploy custom machine learning models at scale.

Flow

  1. Bring together all your structured, unstructured and semi-structured data (logs, files and media) in Azure Blob Storage using Azure Data Factory.
  2. Use Azure Databricks to clean and transform the unstructured datasets and combine them with structured data from operational databases or data warehouses.
  3. Use scalable machine learning/deep learning techniques to derive deeper insights from this data using Python, R or Scala, with inbuilt notebook experiences in Azure Databricks.
  4. Use the native connectors between Azure Databricks and Azure SQL Data Warehouse to access and move data at scale.
  5. Power users can take advantage of the inbuilt capabilities of Azure Databricks to perform root cause determination and raw data analysis.
  6. Run ad hoc queries directly on data within Azure Databricks.
  7. Move the insights from Azure Databricks to Cosmos DB to make them accessible through web and mobile apps.

Real-time analytics

Overview

Get insights from live streaming data with ease. Capture data continuously from any IoT device, or logs from website clickstreams, and process it in near-real time.

Flow

  1. Ingest live streaming data for an application using an Apache Kafka cluster in Azure HDInsight.
  2. Bring together all your structured data in Azure Blob Storage using Azure Data Factory.
  3. Use Azure Databricks to clean, transform and analyse the streaming data, and combine it with structured data from operational databases or data warehouses.
  4. Use scalable machine learning/deep learning techniques to derive deeper insights from this data using Python, R or Scala, with inbuilt notebook experiences in Azure Databricks.
  5. Use the native connectors between Azure Databricks and Azure SQL Data Warehouse to access and move data at scale.
  6. Build analytical dashboards and embedded reports on top of Azure SQL Data Warehouse to share insights within your organisation, and use Azure Analysis Services to serve this data to thousands of users.
  7. Power users can take advantage of the inbuilt capabilities of Azure Databricks and Azure HDInsight to perform root cause determination and raw data analysis.
  8. Move the insights from Azure Databricks to Cosmos DB to make them accessible through real-time apps.

Related products and services

Azure Databricks

Fast, easy and collaborative Apache Spark-based analytics platform

Data Factory

Hybrid data integration at enterprise scale, made easy

SQL Data Warehouse

Elastic data warehouse as a service with enterprise-class features

Get started with Azure Data Lake Storage