
HDInsight

A managed Apache Hadoop, Spark, R, HBase and Storm cloud service made easy

Comprehensive set of managed Apache big data projects

Scale elastically on demand

Azure HDInsight is an Apache Hadoop distribution powered by the cloud. This means that it handles any amount of data, scaling from terabytes to petabytes on demand. Spin up any number of nodes at any time. We only charge for the compute and storage that you use.

Beth Israel Deaconess Medical Center
It’s part of our audit requirements that we keep data for seven years, and some information has to be retained for as long as 30 years. With HDInsight, we can store more data and query it as needed.

–Don Wood, Beth Israel Deaconess Medical Center

Crunch all data – structured, semi-structured, unstructured

Because it’s 100 per cent Apache Hadoop, HDInsight can process unstructured or semi-structured data from web clickstreams, social media, server logs, devices and sensors and more. This lets you analyse new sets of data and uncover new business possibilities that drive your organisation forwards.
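As an illustration of the kind of processing described above, here is a minimal sketch in Python of the map and reduce steps that Hadoop jobs are built on, applied to web server log lines. It runs locally on an in-memory list rather than on a cluster. The log format, the sample lines and the function names (`map_line`, `reduce_counts`, `count_status_codes`) are assumptions made for this example and are not part of HDInsight.

```python
from collections import defaultdict
from itertools import chain


def map_line(line):
    """Map step: emit (status_code, 1) for a well-formed line, nothing otherwise."""
    parts = line.split()
    # Assumed (hypothetical) log format: "<ip> <method> <path> <status>"
    if len(parts) == 4 and parts[3].isdigit():
        yield parts[3], 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_status_codes(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    return reduce_counts(chain.from_iterable(map_line(line) for line in lines))


# Hypothetical sample data; the malformed line is skipped by the map step.
SAMPLE_LOG = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /missing 404",
    "garbled line",
    "10.0.0.3 POST /api/items 200",
]

if __name__ == "__main__":
    print(count_status_codes(SAMPLE_LOG))
```

On a real cluster the map and reduce steps would run in parallel across many nodes; this sketch only shows the shape of the computation.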

Ascribe
With a solution based on SQL Server and the Azure HDInsight service, we can capture data written in plain English and use it to improve services... This will reinvent the way we work with medical records in the future.

–Paul Henderson, Ascribe

Develop in your favourite language

HDInsight has programming extensions for languages and platforms including C#, Java and .NET. Use your language of choice to create, configure, submit and monitor Hadoop jobs.

Skip the hardware purchase and maintenance

With HDInsight, deploy Hadoop in the cloud without buying new hardware or incurring other up-front costs. There’s also no time-consuming installation or setup. Azure does it for you. Launch your first cluster in minutes.

McKesson
Because we’re on an elastic cloud with Azure, we don’t have to worry about setting up infrastructure or whether we can sustain growth with the current capacity in our data centres.

–Sujatha Bayyapureddy, McKesson

Use Excel or your favourite BI tool to visualise Hadoop data

Because it’s integrated with Excel, HDInsight lets you visualise and analyse your Hadoop data in compelling new ways, using a tool that’s familiar to your business users. From Excel, users can select HDInsight as a data source.

BlackBall
I looked at some of the other BI solutions on the market, and most were overly complex, especially from an end-user point of view.

–Andrew Cheong, BlackBall

Connect on-premises Hadoop clusters with the cloud

HDInsight is also integrated with Hortonworks Data Platform, letting you move Hadoop data from an on-site data centre to the Azure cloud for backup, Dev/Test and cloud-bursting scenarios. Using the Microsoft Analytics Platform System, you can even query your on-premises and cloud-based Hadoop clusters at the same time.

Customise clusters to run other Hadoop projects

The Apache Hadoop ecosystem is a portfolio of fast-moving open-source projects. HDInsight gives you the flexibility to deploy arbitrary Hadoop projects through custom scripts. This includes popular projects such as Spark, R, Giraph and Solr.

Use NoSQL transactional capabilities

HDInsight also includes Apache HBase, a columnar NoSQL database that runs on top of the Hadoop Distributed File System (HDFS). This lets you do large-scale transactional processing (OLTP) of non-relational data, enabling use cases such as interactive websites or writing sensor data to Azure Blob Storage.

Provide real-time stream processing

HDInsight includes Apache Storm, an open-source stream analytics platform that can process real-time events at large scale. This lets you process millions of events as they’re generated, enabling use cases such as Internet of Things (IoT) and gaining insights from your connected devices or web-triggered events. We make deploying and implementing Storm easier.
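To make the idea of stream processing concrete, here is a minimal sketch in Python of a tumbling-window count over device events. It does not use the Storm API: it is a plain local function over a list, and the event shape `(timestamp_seconds, device_id)`, the sample events and the name `tumbling_window_counts` are assumptions made for this example.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per device in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, device_id) pairs
    (a hypothetical event shape chosen for this sketch).
    """
    windows = {}
    for timestamp, device_id in events:
        # Align each event to the start of the window it falls into.
        window_start = (timestamp // window_seconds) * window_seconds
        windows.setdefault(window_start, Counter())[device_id] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


# Hypothetical sample events from two sensors.
SAMPLE_EVENTS = [
    (0, "sensor-a"),
    (3, "sensor-b"),
    (9, "sensor-a"),
    (12, "sensor-a"),
    (19, "sensor-b"),
]

if __name__ == "__main__":
    print(tumbling_window_counts(SAMPLE_EVENTS, window_seconds=10))
```

A streaming platform would apply the same grouping continuously as events arrive and spread the work across many machines, rather than looping over a finished list.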

Use Spark for interactive analysis

HDInsight includes Apache Spark, an open-source project in the Apache ecosystem that can run large-scale data analytics applications in-memory. Spark delivers queries up to 100 times faster than traditional big data queries. It provides a common execution model for tasks such as ETL, batch queries, interactive queries, real-time streaming, machine learning and graph processing on data stored in Azure Storage.

Use R to support predictive modelling and machine learning

HDInsight incorporates R Server for Hadoop, a scale-out implementation of one of the most popular programming languages for statistical computing and machine learning. R Server on HDInsight is a cloud implementation of 100 per cent open-source R integrated with Hadoop and Spark clusters. It combines the familiarity of R with the scalability and performance of Hadoop.

Deploy to Windows and Linux

Select Linux or Windows clusters when deploying big data workloads into Azure. With Windows, use existing Windows-based code, including .NET, to scale over all your data in Azure. With Linux, you can more easily move existing Hadoop workloads into the cloud and incorporate additional big data components that can run in the service. By offering both Windows and Linux clusters, Microsoft gives you the flexibility to use the operating system of your choice, gaining insights from the massive amounts of data being created in the cloud.

*Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.

Customers building Hadoop in Azure

Try HDInsight clusters for free