Azure HDInsight

OVERVIEW

Manage your big data needs in an open-source platform

Run popular open-source frameworks like Hadoop, Spark, Hive, and Kafka with Azure HDInsight, an enterprise-grade, customizable analytics service that processes massive amounts of data at global scale and simplifies migrating big data workloads to the cloud.

Stay up to date with the newest releases of open-source frameworks, including Kafka, HBase, and Hive LLAP. HDInsight supports the latest open-source projects from the Apache Hadoop and Spark ecosystems.
Build your data lake through seamless integration with Azure data storage solutions and services including Azure Synapse Analytics, Azure Cosmos DB, Azure Data Lake Storage, Azure Blob Storage, Azure Event Hubs, and Azure Data Factory. Control costs by choosing from a wide variety of virtual machines and by leveraging load- or schedule-based autoscaling features. Monitor your entire data lake using Azure Monitor dashboards.
Use your preferred productivity tools, including Visual Studio, Eclipse, IntelliJ, Jupyter, and Zeppelin. Write code in familiar languages such as Scala, Python, R, JavaScript, and .NET.
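
To make the difference between the load-based and schedule-based autoscaling mentioned above more concrete, here is a minimal Python sketch. It is illustrative only and is not the HDInsight autoscale implementation: the names `ScheduleRule`, `schedule_based_workers`, and `load_based_workers`, and the utilization thresholds, are assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScheduleRule:
    """Hypothetical schedule entry: from `start` onward, run `workers` nodes."""
    start: time
    workers: int


def schedule_based_workers(rules, now, default):
    """Schedule-based: use the worker count of the latest rule that has started by `now`.

    Falls back to `default` when no rule has started yet today.
    """
    started = [r for r in sorted(rules, key=lambda r: r.start) if r.start <= now]
    return started[-1].workers if started else default


def load_based_workers(current, utilization, minimum, maximum, high=0.8, low=0.3):
    """Load-based: add or remove one worker depending on utilization (0.0 to 1.0).

    The `high` and `low` thresholds are illustrative assumptions, and the
    result is always clamped to the [minimum, maximum] range.
    """
    if utilization > high:
        target = current + 1
    elif utilization < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


# Example: a daytime schedule with more workers and a smaller nightly footprint.
rules = [ScheduleRule(time(8, 0), 10), ScheduleRule(time(20, 0), 3)]
daytime_workers = schedule_based_workers(rules, time(9, 30), default=3)
scaled_up = load_based_workers(current=4, utilization=0.9, minimum=2, maximum=8)
```

In this sketch, a schedule fits workloads with predictable daily peaks, while the load-based rule reacts to actual demand; in both cases the minimum and maximum bounds are what keep costs predictable.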

Features

Managed open-source clusters for secure, scalable analytics

Pricing

Pay for only what you need

HDInsight offers a broad range of memory- or compute-optimized platforms (virtual machines). Choose the one that best suits your performance and cost requirements.

See Azure HDInsight pricing

Customer stories

Trusted by companies of all sizes

Myntra accelerates its digital transformation

Myntra has worked closely with Microsoft to migrate its platform, from supply chain management to inventory to site capabilities, to Azure for trusted, always-on, hyperscale, and cost-effective computing.

Gap Inc. accelerates its digital transformation

By building and centralizing its data platform on Azure, Gap Inc. can now apply advanced analytics and machine learning to gain a comprehensive understanding of customers across channels in all brands in its portfolio.

Resources

HDInsight resources and documentation

Create Apache Spark cluster in Azure HDInsight using Azure portal

Learn how to use Apache Spark SQL with Jupyter notebooks in HDInsight.

Building Open Source Software (OSS) Analytics Solutions with Azure HDInsight

Explore training to build open-source analytical solutions on HDInsight.

What is Azure HDInsight?

Learn about the full capabilities and architecture of Azure HDInsight.

FAQ

You would benefit from Azure HDInsight if you use custom code to process and analyze extremely large datasets with the latest big data processing frameworks such as Spark, Hadoop, Hive, Kafka, or HBase. Azure HDInsight gives you full control over the configuration of your clusters and the software installed on them. You might also consider HDInsight if you are migrating Hortonworks, Cloudera, or MapR clusters from on-premises environments or other clouds.
Azure HDInsight can be used for a variety of big data processing scenarios. The data can be historical (already collected and stored) or real-time (streamed directly from the source). The scenarios for processing such data fall into the following categories: batch processing (ETL), data warehousing, Internet of Things (IoT), data science, and hybrid.
To learn more about HDInsight cluster types and provisioning methods, read our documentation about how to set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more.

Next steps

Get started with an Azure free account

Pay as you go or try Azure free for up to 30 days.

Try for free

Azure Solutions

Learn about more Azure cloud solutions

Solve your business problems with proven combinations of Azure cloud services, as well as sample architectures and documentation.

Explore Azure AI solutions

Business Solutions Hub

Find the right Microsoft Cloud solution

Browse the Microsoft Business Solutions Hub to find the products and solutions that can help your organization reach its goals.

Explore solutions

Manage your big data needs in an open-source platform

Build your projects in an open-source ecosystem

Integrate natively with Azure services

Get the flexibility of multiple languages and tools

Managed open-source clusters for secure, scalable analytics

Fast cluster setup

Flexible cost control

Trusted data protection

Modern open‑source support

Embedded security and compliance

HDInsight resources and documentation

Spark and Hive Tools for Visual Studio Code

PySpark for Visual Studio Code

Azure toolkit for IntelliJ

Azure toolkit for Eclipse

Install Jupyter Notebook locally for Apache Spark

Use Apache Zeppelin notebooks with Azure HDInsight

Frequently asked questions

Who should use HDInsight?

What are the scenarios for using HDInsight?

How do I provision an HDInsight cluster?
