HDInsight

A cloud Spark and Hadoop service for your enterprise

Azure HDInsight is the only fully-managed cloud Apache Hadoop offering that gives you optimized open-source analytic clusters for Spark, Hive, MapReduce, HBase, Storm, Kafka, and Microsoft R Server backed by a 99.9% SLA. Deploy these big data technologies and ISV applications as managed clusters with enterprise-level security and monitoring.

Reliable open-source analytics with an industry-leading SLA

Spin up enterprise-grade, open-source cluster types with a 99.9% SLA and 24/7 support. Our SLA covers your entire Azure big data solution, not just the virtual machine instances. HDInsight is designed for full redundancy and high availability, including head node replication, data geo-replication, and a built-in standby NameNode, which makes HDInsight resilient to critical failures that standard Hadoop implementations do not address. Azure also gives you cluster monitoring and 24/7 enterprise support, backed by Microsoft and Hortonworks. Together they have 37 committers for Hadoop core (more than all other managed-cloud providers combined) who are ready to support your deployment and can fix and commit code back to Hadoop.

HDInsight works with Hadoop projects like Apache HBase, Apache Storm, Apache Hive, Apache Spark, and Apache Kafka
HDInsight meets compliance standards for HIPAA, PCI DSS, SOC, and ISO
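
To put the 99.9% SLA figure in concrete terms, the short sketch below converts an availability percentage into a downtime budget. It is illustrative arithmetic only: the function name `allowed_downtime_minutes` is hypothetical, the 30-day month and 365-day year are assumptions, and the actual SLA terms define how availability is measured and credited.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float) -> float:
    """Return the maximum downtime, in minutes, that still meets the SLA.

    Assumes availability is measured as uptime / total time over the period,
    which is a simplification of how a real SLA is evaluated.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # Assumed periods: a 30-day month and a 365-day year.
    print(round(allowed_downtime_minutes(99.9, 30), 1))   # 43.2
    print(round(allowed_downtime_minutes(99.9, 365), 1))  # 525.6
```

Under these assumptions, a 99.9% SLA leaves roughly 43 minutes of downtime per month, or about 8.8 hours per year.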

Enterprise-grade security and monitoring

Protect your data assets and extend your on-premises security and governance controls to the cloud with HDInsight. Get single sign-on (SSO), multi-factor authentication, and seamless management of millions of identities through Azure Active Directory. Authorize users and groups with fine-grained access control policies over all of your enterprise data with Apache Ranger. HDInsight meets Health Insurance Portability and Accountability Act (HIPAA), Payment Card Industry (PCI), and Service Organization Controls (SOC) compliance, helping you ensure that your enterprise data assets are always well-protected. To support the highest level of business continuity, HDInsight extends capabilities for alerting, monitoring, and defining preemptive actions, and it gives you enhanced workload protection through native integration with Microsoft Operations Management Suite.

High-productivity platform for developers and scientists

Use rich productivity suites for Hadoop and Spark in your preferred development environment, such as Visual Studio, Eclipse, or IntelliJ, with support for Scala, Python, R, Java, and .NET. Data scientists can combine code, statistical equations, and visualizations to tell a story about their data through integration with the two most popular notebooks, Jupyter and Zeppelin. HDInsight is also the only managed-cloud Hadoop solution that integrates with Microsoft R Server. Multithreaded math libraries and transparent parallelization let R Server handle up to 1000x more data and run up to 50x faster than open source R, which helps you train more accurate models for better predictions.

Cost-effective cloud scale

Cost-effectively scale workloads up or down through decoupled compute and storage. Local storage can still be used for caching and fast I/O. Spark and interactive Hive users can choose SSD memory for interactive performance, while Kafka users can retain all streaming data in premium managed disks. Choose any Azure virtual machine type that enables the best utilization of resources, and only pay for the compute and storage that you use. An IDC study found that HDInsight delivered 63% lower total cost of ownership (TCO) over five years than deploying Hadoop on-premises.

Integration with leading productivity applications

A thriving market of independent software vendors (ISVs) gives you value-added solutions across the broader Hadoop ecosystem. Because every cluster can be extended with edge nodes and script actions, HDInsight lets you spin up Hadoop and Spark clusters that are pre-integrated and pre-tuned with ISV applications out of the box, including Datameer, Cask, AtScale, and StreamSets.

Easy for administrators to manage

Deploy Hadoop in the cloud without buying new hardware or paying other up-front costs. Launch your first cluster in minutes. There’s no time-consuming installation or setup, and no need to patch the operating system or upgrade Hadoop versions—Azure does it for you.

Customers building Hadoop clusters in Azure

Microsoft was recognized by Forrester Research as a leader in the Cloud Hadoop Wave.

IDC found HDInsight delivered 63% lower TCO than deploying Hadoop on premises over five years.

Apache Hadoop® and associated open source project names are trademarks of The Apache Software Foundation.

Find and deploy applications built for the big data ecosystem

Find and deploy popular applications from trusted Hadoop partners available in the Azure Marketplace.

Try HDInsight clusters for free