Azure Databricks Delta now in preview
Published date: September 24, 2018
The Delta feature is now available in preview at no additional cost in the premium SKU of Azure Databricks. With Delta, customers get better data reliability, improved performance for their jobs and queries, and an opportunity to simplify their data pipelines.

With explosive growth in the volume of data being analyzed, the proliferation of different data types, and the need for real-time analytics, data pipelines have become extremely complex. Most customers build multi-stage pipelines that require resiliency at each step to handle issues like schema irregularities and conflicting writes. This complexity leads to performance issues at scale. With Delta in Azure Databricks, customers can significantly simplify their pipelines.

Delta is a transactional storage layer in Azure Databricks. You interact with it by reading and writing data to a Delta table, which is an optimized version of a Spark table. A Delta table stores your data in Parquet format and adds metadata that provides additional functionality over a Spark table. It provides better reliability and higher performance for Spark jobs and queries in Azure Databricks. Delta also simplifies data pipelines by allowing both batch and streaming jobs to use the same table while providing data consistency, which makes it simpler to build high-performance analytics solutions at scale.

You can start taking advantage of Delta on Azure Databricks with minimal code changes. It works with all existing Spark APIs that customers use for Spark tables. To get started with Delta on Azure Databricks, visit the Databricks Delta quickstart notebook, and read more about Azure Databricks Delta and its capabilities in the Delta documentation.
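
As a rough illustration of the "minimal code changes" point, the sketch below shows how reading and writing a Delta table might look with the standard Spark DataFrame APIs in Python. It assumes an existing `SparkSession` named `spark` (Databricks notebooks provide one) and a runtime where the `delta` format is available. The helper names and any paths are hypothetical and not part of the announcement.

```python
# Minimal sketch: the only Delta-specific part is the "delta" format string.
# `spark` is assumed to be an existing SparkSession; the helper names below
# are hypothetical and exist only for illustration.

DELTA_FORMAT = "delta"


def write_delta(df, path, mode="append"):
    """Write a Spark DataFrame to the Delta table stored at `path`."""
    df.write.format(DELTA_FORMAT).mode(mode).save(path)


def read_delta(spark, path):
    """Batch-read the Delta table at `path` into a DataFrame."""
    return spark.read.format(DELTA_FORMAT).load(path)


def stream_delta(spark, path):
    """Open the same Delta table as a streaming source."""
    return spark.readStream.format(DELTA_FORMAT).load(path)
```

In a notebook, one might call `write_delta(spark.range(5), "/tmp/delta/demo")` and then `read_delta(spark, "/tmp/delta/demo")`, where the path is only a placeholder. The same path can be passed to `stream_delta`, which reflects the idea of batch and streaming jobs sharing one table.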