Azure HDInsight

Provision cloud Hadoop, Spark and HBase clusters.

Manage your big data needs in an open-source platform

Run popular open-source frameworks—including Apache Hadoop, Spark, Hive, Kafka, and more—using Azure HDInsight, a customizable, enterprise-grade service for open-source analytics. Effortlessly process massive amounts of data and get all the benefits of the broad open-source project ecosystem with the global scale of Azure. Easily migrate your big data workloads and processing to the cloud.

Learn more about the new version of HDInsight, which includes a refresh of every element of the stack

Open-source projects and clusters are easy to spin up quickly without the need to install hardware or manage infrastructure

Big data clusters reduce costs through autoscaling and pricing tiers that allow you to pay for only what you use

Enterprise-grade security and industry-leading compliance with more than 30 certifications help protect your data

Optimized components for open-source technologies such as Hadoop and Spark keep you up to date

Build your projects in an open-source ecosystem

Stay up to date with the newest releases of open-source frameworks, including Kafka, HBase, and Hive LLAP. HDInsight supports the latest open-source projects from the Apache Hadoop and Spark ecosystems.

Integrate natively with Azure services

Build your data lake through seamless integration with Azure data storage solutions and services including Azure Synapse Analytics, Azure Cosmos DB, Azure Data Lake Storage, Azure Blob Storage, Azure Event Hubs, and Azure Data Factory. Control costs by choosing from a wide variety of virtual machines and by leveraging load- or schedule-based autoscaling features. Monitor your entire data lake using Azure Monitor dashboards.

Get the flexibility of multiple languages and tools

Use your preferred productivity tools, including Visual Studio, Eclipse, IntelliJ, Jupyter, and Zeppelin. Write code in familiar languages such as Scala, Python, R, JavaScript, and .NET.

Comprehensive security and compliance, built in

Get started with an Azure free account

1. Start free. Get $200 credit to use within 30 days. While you have your credit, get free amounts of many of our most popular services, plus free amounts of 55+ other services that are always free.

2. After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.

3. After 12 months, you'll keep getting 55+ always-free services, and still pay only for what you use beyond your free monthly amounts.

Trusted by companies of all sizes

Myntra accelerates its digital transformation

Myntra has worked closely with Microsoft to migrate its platform, from supply chain management to inventory to site capabilities, to Azure for trusted, always-on, hyperscale, and cost-effective computing.

Gap Inc. accelerates its digital transformation

By building and centralizing its data platform on Azure, Gap Inc. can now apply advanced analytics and machine learning to gain a comprehensive understanding of customers across channels in all brands in its portfolio.


Frequently asked questions about HDInsight

  • You would benefit from Azure HDInsight if you use custom code to process and analyze extremely large datasets with the latest big data processing frameworks such as Spark, Hadoop, Hive, Kafka, or HBase. Azure HDInsight gives you full control over the configuration of your clusters and the software installed on them. You might also consider HDInsight if you are migrating Hortonworks, Cloudera, or MapR clusters from on-premises environments or other clouds.

  • Azure HDInsight can be used for a variety of big data processing scenarios. The data can be historical (already collected and stored) or real-time (streamed directly from the source). The scenarios for processing such data fall into the following categories: batch processing (ETL), data warehousing, Internet of Things (IoT), data science, and hybrid.

  • To learn more about HDInsight cluster types and provisioning methods, read our documentation about how to set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more.

Ready when you are—let's set up your Azure free account

Try Azure for free