Enterprise Security Package GA for HDInsight 3.6
The HDInsight team is excited to announce the general availability of Enterprise Security Package (ESP) for Apache Spark, Apache Hadoop, and Interactive Query clusters in HDInsight 3.6. When enterprise customers share clusters among multiple employees, Hadoop admins must ensure those employees have the right access and permissions to perform big data operations. In enterprises, setting up multi-user access with granular authorization that uses the enterprise's existing identities is a complex and lengthy process. Enabling ESP with the new experience provides authentication and authorization for these clusters in a more streamlined and secure manner.
For authentication, open source Apache Hadoop relies on Kerberos. Customers can enable Azure AD Domain Services (AAD-DS) as the main domain controller and use it to domain-join their clusters. The same identities available in AAD-DS can then log in to the cluster.
For authorization, customers can set Apache Ranger policies to get fine-grained authorization in their clusters. Ranger plugins for Apache Hive and YARN are available for setting these policies.
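To make fine-grained authorization a bit more concrete, here is a minimal sketch of what a Ranger policy for Hive might look like when expressed as a JSON document. The field names follow the general shape of policies in Apache Ranger's public REST API as we understand it; the service name, policy name, database, table, and user below are hypothetical placeholders, and the helper function is ours for illustration only. Check the Ranger documentation for your version before relying on any field.

```python
import json


def build_hive_select_policy(service, policy_name, database, table, users):
    """Build a dict shaped like a Ranger Hive policy that grants SELECT.

    Assumption: this mirrors the general layout of a Ranger policy document.
    It only builds the payload; it does not call any Ranger endpoint.
    """
    return {
        "service": service,          # hypothetical Ranger service name
        "name": policy_name,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": ["*"]},  # all columns of the table
        },
        "policyItems": [
            {
                "users": list(users),  # domain users, e.g. synced from AAD-DS
                "accesses": [{"type": "select", "isAllowed": True}],
            }
        ],
    }


# Hypothetical values for illustration only.
policy = build_hive_select_policy(
    service="mycluster_hive",
    policy_name="sales-read-only",
    database="sales",
    table="orders",
    users=["alice"],
)
print(json.dumps(policy, indent=2))
```

The point of the sketch is that a policy ties a set of resources (database, table, column) to a set of domain identities and the accesses they are allowed, which is what the Ranger admin UI lets you configure interactively.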
To learn more about ESP and how to enable it, see our documentation.
Public preview of ESP for Apache Kafka and HBase
We are also expanding ESP to HDInsight 3.6 Apache Kafka and Apache HBase clusters. Apache Kafka and HBase customers can now use domain accounts and Apache Ranger for authentication and authorization. With ESP enabled, the Ranger plugins for Apache Kafka and HBase are available out of the box. To learn more about secure Kafka clusters, see our documentation.
Managed Identity support in HDInsight
With the most recent set of security enhancements in HDInsight, we are excited to announce that user-assigned managed identity is now supported. Previously, customers had to provide a service account with a password to enable ESP. Managed identity simplifies this process: customers create a managed identity and provide it at cluster creation time without entering any password. Today, two main scenarios rely on managed identity:
- As a prerequisite for enabling ESP, users should enable AAD-DS, then create a managed identity and give it the correct permission in the AAD-DS Access control (IAM) blade. This ensures that the managed identity can perform domain operations seamlessly without any additional password. Users then use this identity to create a secure HDInsight cluster. For more information on how to configure this, see our documentation.
- For Apache Kafka clusters, customers can now authorize the managed identity to access an encryption key stored in Azure Key Vault, which is then used to encrypt disks at rest. This scenario is also known as BYOK (Bring Your Own Key). For more information, see our documentation.
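The ESP prerequisites in the first scenario can be read as an ordered checklist. The following is a minimal sketch of that ordering only. Every name in it (`EspPrerequisites`, `missing_prerequisites`, and the field names) is hypothetical and is not part of any HDInsight or Azure SDK API.

```python
from dataclasses import dataclass


@dataclass
class EspPrerequisites:
    """Hypothetical snapshot of the setup state before creating an ESP cluster."""
    aadds_enabled: bool
    managed_identity_created: bool
    identity_has_aadds_permission: bool


def missing_prerequisites(state):
    """Return the steps still to be done, in the order described above."""
    checks = [
        (state.aadds_enabled, "enable AAD-DS"),
        (state.managed_identity_created,
         "create a user-assigned managed identity"),
        (state.identity_has_aadds_permission,
         "grant the identity permission in the AAD-DS Access control (IAM) blade"),
    ]
    return [step for done, step in checks if not done]


# Example: AAD-DS and the identity exist, but the permission is not yet granted.
state = EspPrerequisites(
    aadds_enabled=True,
    managed_identity_created=True,
    identity_has_aadds_permission=False,
)
for step in missing_prerequisites(state):
    print("To do:", step)
```

An empty list means all three prerequisites are in place and the identity can be supplied at cluster creation time.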
New UX to create ESP clusters
As part of the improvements for GA, we have created a brand new user experience in the Azure portal for enabling ESP. This new experience automatically detects and validates common misconfigurations related to AAD-DS, which saves time and lets users fix errors before they hit the create button. To learn more about the new configuration steps, see our documentation.
Try Azure HDInsight now
We are excited to see what you will build next with Azure HDInsight. Read this developer guide and follow the quick start guide to learn more about implementing open source analytics pipelines on Azure HDInsight. Stay up-to-date on the latest Azure HDInsight news and features by following us on Twitter #HDInsight and @AzureHDInsight. For questions and feedback, please reach out to AskHDInsight@microsoft.com.
Azure HDInsight is an easy, cost-effective, enterprise-grade service for open source analytics that lets customers run popular open source frameworks including Apache Hadoop, Spark, Kafka, and others. The service is available in 27 public regions and Azure Government Clouds in the US and Germany. Azure HDInsight powers mission-critical applications in a wide variety of sectors and enables a wide range of use cases including ETL, streaming, and interactive querying.