Today, we are pleased to announce a preview of Azure HDInsight 3.6. We are enabling this preview to get feedback on Apache Spark 2.1. You can try out all the features available in the open source release of Apache Spark 2.1, along with the rich experience of using notebooks on Azure HDInsight. This post is a short summary of how to get started with this preview.
What’s new in Spark 2.1
The open source Apache Spark 2.1 release brings a ton of improvements for developers. These range from Structured Streaming to support for using Apache Kafka (version 0.10) with Spark Streaming.
To learn more about all of the improvements in Apache Spark 2.1, please read the release notes on the Apache Spark project site.
Get started with Apache Spark 2.1 on HDInsight
It is very simple to get started with the Apache Spark 2.1 preview. Go to the Microsoft Azure portal and create an Azure HDInsight service.
Once you select HDInsight, you can pick the Spark cluster type with version Spark 2.1 (HDI 3.6 Preview).
After creating the cluster, you will have access to all the tools, services, and notebooks, including Jupyter. You can access the Jupyter notebook by clicking “Cluster dashboard”.
We hope that you like this preview. The following resources can help you learn more about using Spark on HDInsight.
Learn more and get help
- Apache Spark 2.1 release notes
- Apache Spark on HDInsight
- Getting started with Spark on HDInsight
- Get help on Spark questions
- Ask HDInsight questions on Stack Overflow
Frequently Asked Questions (FAQs)
The following is a set of commonly asked questions and known issues in this preview.
Can I use any other cluster besides Spark in HDInsight 3.6?
For this preview release, we are only enabling the Spark cluster type, with Spark version 2.1.
I cannot connect to BI tools with Spark 2.1.
In this preview, you cannot connect BI tools to Spark 2.1 using the ODBC driver.
I cannot use Azure Data Lake Store with Spark 2.1.
In this preview, you can only store data in Azure Blob Storage and use it from your Spark 2.1 cluster. Azure Data Lake Store is not yet supported.
Why is Spark 2.1 / HDInsight 3.6 in preview?
We are releasing HDInsight 3.6 as a preview so that users can try the improvements in Spark 2.1 and give us feedback. We are working on improving the experience of Spark 2.1 in HDInsight, and once it is ready, we will make it generally available.
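As a rough illustration of reading data in Azure Blob Storage from the cluster, the sketch below builds a `wasb`/`wasbs` URI in the usual `scheme://container@account.blob.core.windows.net/path` form. The helper `wasb_uri` and the container, account, and file names are hypothetical, made up for this example. The commented-out lines assume a PySpark notebook in which a `spark` session is already provided.

```python
def wasb_uri(container, account, path, secure=True):
    """Build a URI for a blob in Azure Blob Storage.

    `container`, `account`, and `path` are placeholders supplied by the caller;
    `secure=True` selects the TLS-enabled `wasbs` scheme.
    """
    scheme = "wasbs" if secure else "wasb"
    return "{}://{}@{}.blob.core.windows.net/{}".format(
        scheme, container, account, path.lstrip("/")
    )


# Hypothetical names, for illustration only.
uri = wasb_uri("mycontainer", "mystorageaccount", "/example/data/sample.csv")
print(uri)

# In a PySpark notebook on the cluster (assumes `spark` is already defined):
# df = spark.read.csv(uri, header=True, inferSchema=True)
# df.show(5)
```

Replace the placeholder names with the storage account and container attached to your own cluster before running the commented lines.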
What is the Support & SLA provided for this preview?
We are pleased to announce a preview of Microsoft Azure HDInsight 3.6 along with Apache Spark 2.1. We are inviting you to try this preview and give us feedback so we can improve the experience.