Announcing general availability of Azure HDInsight 3.6

Published on April 4, 2017

Program Manager, Azure, Big Data

This week at DataWorks Summit, we are pleased to announce the general availability of Azure HDInsight 3.6, backed by our enterprise-grade SLA. HDInsight 3.6 brings updates to various open source components in the Apache Hadoop and Spark ecosystem to the cloud, allowing customers to deploy them easily and run them reliably on an enterprise-grade platform.

What’s new in Azure HDInsight 3.6

Azure HDInsight 3.6 is a major update to the core Apache Hadoop and Spark platform as well as to various open source components. HDInsight 3.6 is built on the latest Hortonworks Data Platform (HDP) 2.6, a collaborative effort between Microsoft and Hortonworks to bring HDP to market cloud-first. You can read more about this effort here.

HDInsight 3.6 GA also builds upon the public preview of 3.6, which included Apache Spark 2.1. We would like to thank you for trying the preview and providing feedback, which has helped us improve the product.

Apache Spark 2.1 is now generally available, backed by our existing SLA. We are introducing capabilities to support real-time streaming solutions: Spark integration with Azure Event Hubs, and the Structured Streaming connector for Kafka on HDInsight. This allows customers to use Spark to analyze millions of real-time events ingested into these Azure services, enabling IoT and other real-time scenarios. HDInsight 3.6 offers only Apache Spark 2.1 and above; older versions such as 2.0.2 and below are not supported. Learn more on how to get started with Spark on HDInsight.
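As a rough illustration of the Kafka path, the sketch below reads a Kafka topic through Spark's built-in Structured Streaming `kafka` source. It is not taken from the HDInsight documentation: the helper names `kafka_source_options` and `read_events`, the broker addresses, and the topic name are all hypothetical, and it assumes an existing `SparkSession` with the Spark Kafka connector package available on the cluster.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Options for Spark's Structured Streaming Kafka source (hypothetical helper)."""
    return {
        # Comma-separated host:port list of Kafka brokers.
        "kafka.bootstrap.servers": bootstrap_servers,
        # Topic (or comma-separated topics) to subscribe to.
        "subscribe": topic,
    }


def read_events(spark, bootstrap_servers: str, topic: str):
    """Return a streaming DataFrame of Kafka records with string key and value.

    `spark` is assumed to be an existing SparkSession, for example the one
    that a Jupyter or Zeppelin notebook provides.
    """
    reader = spark.readStream.format("kafka")
    for name, value in kafka_source_options(bootstrap_servers, topic).items():
        reader = reader.option(name, value)
    # Kafka keys and values arrive as binary columns; cast them to strings.
    return reader.load().selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")


if __name__ == "__main__":
    # Hypothetical broker addresses and topic name, for illustration only.
    print(kafka_source_options("broker-1:9092,broker-2:9092", "sensor-events"))
```

In a notebook, `read_events(spark, brokers, topic)` would return a streaming DataFrame that can then be aggregated and written out with `writeStream`.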

Apache Hive 2.1 enables ~2x faster ETL, with robust SQL-standard ACID MERGE support and many more improvements. This release also includes an updated preview of Interactive Hive using LLAP (Live Long and Process), which enables 25x faster queries. With the new version of Hive, customers can expect sub-second performance, enabling enterprise data warehouse scenarios without the need for data movement. Learn more on how to get started with Interactive Hive on HDInsight.
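To give a sense of what an ACID MERGE looks like, here is a minimal sketch that builds a HiveQL `MERGE` statement as a string. The helper `merge_statement`, the table names, and the columns (`id`, `name`, `amount`) are hypothetical, and the statement assumes the target is a transactional (ACID) Hive table; how you submit the text to Hive (Hive view, Beeline, or a client library) is left open.

```python
def merge_statement(target: str, source: str) -> str:
    """Build a HiveQL MERGE that upserts rows from `source` into `target` by `id`.

    Hypothetical helper: the column names are illustrative, and `target` is
    assumed to be a transactional (ACID) table.
    """
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        "ON t.id = s.id\n"
        # Rows present in both tables are updated in place.
        "WHEN MATCHED THEN UPDATE SET name = s.name, amount = s.amount\n"
        # Rows present only in the source are inserted.
        "WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.name, s.amount)"
    )


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(merge_statement("customers", "customer_updates"))
```

A single statement of this shape replaces the separate update and insert passes that an upsert-style ETL job would otherwise need.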

This release also includes the new Hive View 2.0, which provides an easy-to-use graphical user interface for developers to get started with Hadoop. Developers can use it to upload data to HDInsight, define tables, write queries, and get insights from data faster. The following screenshot shows the new Hive View 2.0 interface.

[Screenshot: Hive View 2.0 interface]

We are expanding our interactive data analysis options by including Apache Zeppelin notebooks alongside Jupyter. Zeppelin is pre-installed when you use HDInsight 3.6, and you can easily launch it from the portal. The following screenshot shows the Zeppelin notebook interface.

[Screenshot: Apache Zeppelin notebook interface]

Getting started with Azure HDInsight 3.6

Getting started with Azure HDInsight 3.6 is simple: go to the Microsoft Azure portal and create an Azure HDInsight service.

[Screenshot: HDInsight in the Azure portal]

Once you’ve selected HDInsight, you can pick the specific version and workload for your scenario. Azure HDInsight supports a wide range of workloads, including Hive, Spark, Interactive Hive (Preview), HBase, Kafka (Preview), Storm, and R Server. Learn more on creating clusters in HDInsight.

[Screenshot: HDInsight cluster options]

Once you’ve completed the wizard, the appropriate cluster will be created. Apart from the Azure portal, you can also automate creation of the HDInsight service using the command-line interface (CLI). Learn more on how to create a cluster using the CLI.

We hope that you like the enhancements included in this release. The following resources can help you learn more about the HDInsight 3.6 release:

Learn more and get help
