Getting started with Apache Spark on Azure Databricks

Posted on May 30, 2018

Product Marketing Manager, Azure Databricks

Data is growing at an astounding rate, with an estimated 2.5 quintillion bytes being created every day. Data analysts predict that by 2020, the world’s collected data will quadruple. In the sea of all this data, we are continually exploring new ways of analyzing and interpreting data in a way that’s productive, meaningful and insightful.

Designed in collaboration with the original founders of Apache® Spark™, Azure Databricks combines the best of Databricks and Microsoft Azure to help customers accelerate innovation with streamlined workflows, an interactive workspace and one-click setup. Azure Databricks is an analytics engine built for large-scale data processing that enables collaboration between data scientists, data engineers and business analysts.

Azure Databricks can be used to run workloads faster and write applications in the language of your choice, whether that’s Scala, SQL, R or Python. With Azure Databricks, businesses can innovate within the safe, protected cloud environment of Microsoft Azure and benefit from the native integration with other Azure services such as Power BI, Azure SQL Data Warehouse, and Azure Cosmos DB.

When you’re getting started with Apache Spark on Azure Databricks, you’ll have questions that are unique to your business’s implementation and use case. In this introductory webinar provided by Microsoft, we’ll answer your questions in real time and cover the following topics:

  • RDDs, DataFrames, Datasets and other fundamentals of Apache Spark
  • How to set up Azure Databricks
  • How to use Azure Databricks interactive notebooks, which give your analytics team a collaborative workspace
  • How to put your work into immediate production by scheduling notebooks

Register for this webinar to get answers to your questions and learn how to get started with Apache Spark on Azure Databricks.