Azure Databricks – VNet injection, DevOps Version Control and Delta availability

Posted on 13 March, 2019

Group Product Manager, Azure Data

Azure Databricks provides a fast, easy, and collaborative Apache® Spark™-based analytics platform to accelerate and simplify the process of building big data and AI solutions that drive the business forward, all backed by industry-leading SLAs.

With Azure Databricks, you can set up your Spark environment in minutes and autoscale quickly and easily. You can also apply your existing skills and collaborate on shared projects in an interactive workspace with support for Python, Scala, R, and SQL, as well as data science frameworks and libraries like TensorFlow and PyTorch.

We’re continuously listening to customers and answering questions as we evolve this service. This blog post outlines important service announcements that we are proud to deliver for our customers.

Azure Databricks Delta available in Standard and Premium SKUs

Azure Databricks Delta brings new levels of reliability and performance for production workloads based on a number of improvements including transaction support, schema validation, indexing, and data versioning.

Since the preview of Delta was announced, we have received overwhelmingly positive feedback on how it has helped customers build complex pipelines for both batch and streaming data and simplify their ETL pipelines. We are excited to announce that Delta is now available in the Standard SKU in addition to the Premium SKU, so you can leverage its capabilities to the fullest and build pipelines more efficiently. Now everyone can get the benefits of Databricks Delta’s reliability and performance.

You can read more about Azure Databricks Delta in our guide, “Introduction to Databricks Delta,” and import our quickstart notebook.

Azure DevOps Services Version Control

Azure DevOps is a collection of services that provide an end-to-end solution for the five core practices of DevOps: planning and tracking, development, build and test, delivery, and monitoring and operations.

Initially, we started with GitHub integration for Azure Databricks notebooks. By popular demand, we have introduced the ability to set your Git provider to Azure DevOps Services.

Authentication with Azure DevOps Services is done automatically when you authenticate using Azure Active Directory (Azure AD). The Azure DevOps Services organization must be linked to the same Azure AD tenant as Databricks. You can easily set your Git provider to Azure DevOps Services as shown in the documentation, “Azure DevOps Services Version Control.”

Deploy Azure Databricks in your own Azure virtual network (VNet injection) preview

By default, we deploy and manage your clusters for you in managed VNets, with peering enabled. We create and manage these VNets, but they reside in your subscription. We also manage the accompanying network security group rules.

Some customers, however, require network customization. I am pleased to announce that you can now deploy Azure Databricks in your own existing virtual network (VNet injection). This lets you connect Azure Databricks securely to other Azure services, such as Azure Storage, using service endpoints, or to on-premises data sources using user-defined routes. You can also connect Azure Databricks to a network virtual appliance to inspect all outbound traffic and take actions according to allow and deny rules. In addition, you can configure Azure Databricks to use custom DNS and configure network security group (NSG) rules to restrict egress traffic.
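To make the idea of egress restrictions through allow and deny rules more concrete, here is a minimal Python sketch of how a prioritized rule list could be evaluated. It is a simplified model for illustration only, not the Azure NSG API: the EgressRule class, the evaluate_egress function, the rule names, and the addresses are all hypothetical, and real NSG rules carry more fields (protocol, source, service tags). The one behavior it borrows from NSGs is that rules are checked in ascending priority order and the first match wins.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional


@dataclass(frozen=True)
class EgressRule:
    """Hypothetical, simplified stand-in for an outbound NSG rule."""
    name: str
    priority: int          # lower number is evaluated first
    destination: str       # destination CIDR block
    port: Optional[int]    # None means "any port"
    allow: bool


def evaluate_egress(rules: Iterable[EgressRule], dest_ip: str, dest_port: int,
                    default_allow: bool = False) -> bool:
    """Return True if traffic to dest_ip:dest_port would be allowed.

    Rules are checked in ascending priority order; the first rule whose
    destination and port both match decides the outcome. The fallback
    (default_allow) is an assumption of this sketch.
    """
    addr = ip_address(dest_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if addr in ip_network(rule.destination) and rule.port in (None, dest_port):
            return rule.allow
    return default_allow


# Illustrative rule set: allow HTTPS to one internal range, deny everything else.
rules = [
    EgressRule("allow-storage-https", 100, "10.1.0.0/16", 443, True),
    EgressRule("deny-all-outbound", 4096, "0.0.0.0/0", None, False),
]

allowed = evaluate_egress(rules, "10.1.2.3", 443)
blocked = evaluate_egress(rules, "8.8.8.8", 53)
```

In this sketch, allowed is True because the first rule matches, while blocked is False because only the catch-all deny rule matches.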

Deploying Azure Databricks to your own virtual network also lets you take advantage of flexible CIDR ranges. See the documentation to quickly and easily configure Azure Databricks in your VNet using the Azure portal.
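When choosing CIDR ranges yourself, it can help to sanity-check the address plan before deploying. The following Python sketch, which uses only the standard library, verifies that a set of candidate subnets fits inside a VNet's address space and that no two subnets overlap. The check_subnets function, the subnet names, and the address ranges are hypothetical and chosen purely for illustration; consult the documentation for the actual subnet requirements.

```python
from ipaddress import ip_network
from itertools import combinations
from typing import Dict, List


def check_subnets(vnet_cidr: str, subnet_cidrs: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan is consistent."""
    vnet = ip_network(vnet_cidr)
    subnets = {name: ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []

    # Every subnet must lie entirely within the VNet address space.
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside {vnet}")

    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")

    return problems


# Hypothetical address plan: two non-overlapping /24 subnets inside a /16 VNet.
plan = {"databricks-a": "10.0.1.0/24", "databricks-b": "10.0.2.0/24"}
issues = check_subnets("10.0.0.0/16", plan)
```

Here issues is an empty list because both subnets sit inside the VNet range and do not overlap. A plan with a subnet outside the range, or with two intersecting subnets, would return one message per problem.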

Get started today!

Try Azure Databricks and let us know your feedback!