Spark for Azure HDInsight
Date updated: 15 May 2017
Apache Spark is an open-source processing framework that runs large-scale data analytics applications. Spark is built on an in-memory compute engine, which enables high-performance querying on big data. It takes advantage of a parallel data processing framework that persists data in memory and on disk if needed. This allows Spark to deliver up to 100x faster performance for some workloads and a common execution model for tasks such as extract, transform, load (ETL), batch processing, interactive queries, and others on data in an Apache Hadoop Distributed File System (HDFS). Azure makes Apache Spark easy and cost-effective to deploy, with no hardware to buy, no software to configure, a full notebook experience to author compelling narratives, and integration with partner business intelligence tools.
Spark for Azure HDInsight offers customers an enterprise-ready solution that's fully managed, secured, and highly available. It is also made simpler for users through compelling and interactive experiences.