Enterprises get deeper insights with Hadoop and Spark updates on Azure HDInsight

Posted on June 27, 2018

Program Manager, Azure Big Data

Azure HDInsight is one of the most popular services among enterprises for open source Hadoop and Spark analytics on Azure. With the price cut of more than 50 percent on HDInsight, customers moving to the cloud are reaping more savings than ever.

“PROS is a pioneer in using machine learning to give companies an accurate and profitable pricing. PROS Guidance product runs enormously complex pricing calculations based on variables that comprise multiple terabytes of data. In Azure HDInsight, a process that formerly took several days now takes just a few minutes.” - Ed Gonzalez, Product Manager, PROS

Today we are announcing updates to Apache Spark, Apache Kafka, ML Services, and Azure Data Lake Storage Gen2, along with enhancements to the Enterprise Security Package. These new capabilities will continue to drive savings for many of our customers. In addition, Microsoft is continuing to deepen its commitment to the Apache Hadoop ecosystem and has extended its partnership with Hortonworks to bring the best of Apache Hadoop and open source big data analytics to the cloud.

Continued investment in Open Source for new capabilities and reliability

Microsoft is contributing to the Apache Hadoop ecosystem and ensuring that Azure is the most reliable place to run it. The ecosystem depends on many open source projects that must be configured together for the stack to work. HDInsight provides pre-tuned clusters out of the box for the best performance. Today we are enabling updates to Apache Hadoop, Apache Spark 2.3, and Apache Kafka 1.0, along with more than 2,000 bug fixes across more than 20 open source frameworks that are part of HDInsight.

“HDInsight is the best place to run open source frameworks for big data. It meets key enterprise needs allowing them to easily modernize their on-premise solutions on a fully managed HDInsight cloud service.” - Rohan Kumar, Corporate Vice President, Microsoft

Enabling data scientists with Machine Learning Services 9.3

Today, we are excited to announce the general availability of Machine Learning (ML) Services 9.3 on Azure HDInsight. With this release, we are providing data scientists and engineers with the best of open source enhanced with algorithmic innovations and ease of operationalization, all available in their preferred language with the speed of Apache Spark. This release expands upon the capabilities offered in R Server with added support for Python, leading to the cluster name change from R Server to ML Services. Get started with ML Services on HDInsight.

Introducing an all-new Azure Data Lake Storage Gen2, the Data Lake for everyone

Today’s data lake options force customers to trade off scalability, availability, and cost against the security they require. Azure Data Lake Storage provides a no-compromise foundation for building scalable and highly performant big data analytics solutions without trading off security and cost. Microsoft is announcing a preview of Azure Data Lake Storage Gen2, a globally available HDFS-compatible filesystem to store and analyze petabyte-size files and trillions of objects. Today we are excited to announce a preview of HDInsight with Azure Data Lake Storage Gen2.

Uncompromising on security: Virtual Network Service Endpoints

Today we are enhancing HDInsight to include support for Virtual Network Service Endpoints, which allow customers to securely connect to Azure Blob Storage, Azure Data Lake Storage Gen2, Cosmos DB, and SQL databases. When a Service Endpoint is enabled for Azure HDInsight, traffic flows through a secured route from within the Azure data center.

Secure and compliant

HDInsight brings enterprise-grade protection of your data with encryption, virtual networks, Active Directory-based authentication, role-based authorization, fine-grained access control, a single pane of glass for monitoring, and more. The service is globally available in more than 20 regions, including sovereign clouds in the US, Germany, and China, and meets key compliance standards such as HIPAA, PCI, ISO, and more.

Delivering cost-effectiveness

With pay-per-use billing, on-demand clusters, the ability to scale up or down, and separation of compute and storage, customers can run their big data jobs more efficiently. We reduced the price of HDInsight by more than 50 percent, so you can enjoy even more cost savings.

Try HDInsight now

We hope you take full advantage of today’s announcements, and we are excited to see what you will build with Azure HDInsight. Read this developer guide and follow the quick start guide to learn more about implementing these pipelines and architectures on Azure HDInsight. Stay up to date on the latest Azure HDInsight news and features by following us on Twitter at #HDInsight and @AzureHDInsight. For questions and feedback, please reach out to AskHDInsight@microsoft.com.

About HDInsight

Azure HDInsight is Microsoft’s premium managed offering for running open source workloads on Azure. Today, we are excited to announce several new capabilities across a wide range of OSS frameworks.

Azure HDInsight powers mission-critical applications for some of our top customers across a wide variety of sectors, including manufacturing, retail, education, nonprofit, government, healthcare, media, banking, telecommunication, insurance, and many more, with use cases ranging from ETL to data warehousing and from machine learning to IoT.