Big data analytics and AI with optimized Apache Spark
Unlock insights from all your data and build artificial intelligence (AI) solutions with Azure Databricks. Set up your Apache Spark™ environment in minutes, autoscale, and collaborate on shared projects in an interactive workspace. Azure Databricks supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
Apache Spark™ is a trademark of the Apache Software Foundation.
Reliable data engineering
Large-scale data processing for batch and streaming workloads.
Analytics for all your data
Enable analytics for the most complete and recent data.
Collaborative data science
Simplify and accelerate data science on large datasets.
Rooted in open source
Fast, optimized Apache Spark environment.
Start quickly with an optimized Apache Spark environment
Boost productivity with a shared workspace and common languages
Turbocharge machine learning on big data
Get high-performance modern data warehousing
Key service capabilities
-
Optimized Spark engine
Simple data processing on autoscaling infrastructure, powered by highly optimized Apache Spark™ for up to 50x performance gains.
-
Machine learning runtime
One-click access to preconfigured machine learning environments for augmented machine learning with state-of-the-art and popular frameworks such as PyTorch, TensorFlow, and scikit-learn.
-
MLflow
Track and share experiments, reproduce runs, and manage models collaboratively from a central repository.
-
Choice of language
Use your preferred language, including Python, Scala, R, Spark SQL, and .NET, whether you use serverless or provisioned compute resources.
-
Collaborative notebooks
Quickly access and explore data, find and share new insights, and build models collaboratively with the languages and tools of your choice.
-
Delta Lake
Bring data reliability and scalability to your existing data lake with an open source transactional storage layer designed for the full data lifecycle.
-
Native integrations with Azure services
Complete your end-to-end analytics and machine learning solution with deep integration with Azure services such as Azure Data Factory, Azure Data Lake Storage, Azure Machine Learning, and Power BI.
-
Interactive workspaces
Enable seamless collaboration between data scientists, data engineers, and business analysts.
-
Enterprise-grade security
Effortless native security protects your data where it lives and creates compliant, private, and isolated analytics workspaces across thousands of users and datasets.
-
Production-ready
Run and scale your most mission-critical data workloads with confidence on a trusted data platform, with ecosystem integrations for CI/CD and monitoring.
Learn more from solution architecture examples
Ingestion, ETL, and stream processing pipelines with Azure Databricks
Get insights from live-streaming data with ease. Capture data continuously from any IoT device, or logs from website clickstreams, and process it in near real time.
Modern analytics architecture with Azure Databricks
Transform your data into actionable insights using best-in-class machine learning tools. This architecture allows you to combine any data at any scale, and to build and deploy custom machine learning models at scale.
Data science and machine learning with Azure Databricks
Accelerate and manage your end-to-end machine learning lifecycle with Azure Databricks, MLflow, and Azure Machine Learning to build, share, deploy, and manage machine learning applications.
Comprehensive security and compliance, built in
-
Microsoft invests more than $1 billion annually in cybersecurity research and development.
-
We employ more than 3,500 security experts who are dedicated to data security and privacy.
-
Azure has more certifications than any other cloud provider. View the comprehensive list.
Learn more about Azure Databricks products and services
Azure Data Factory
Hybrid data integration service that simplifies ETL at scale.
Azure Data Lake Storage Gen2
Massively scalable, secure data lake functionality built on Azure Blob Storage.
Azure Machine Learning
Enterprise-grade machine learning service to build and deploy models faster.
Power BI
Add analytics and interactive reporting to your applications.
-
Azure Databricks pricing
Spin up clusters quickly and autoscale up or down based on your usage needs. Explore all Azure Databricks pricing options.
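As a rough illustration of what an autoscaling cluster definition can look like, the sketch below builds a cluster specification as a Python dictionary and prints it as JSON. The field names (`autoscale`, `min_workers`, `max_workers`, `autotermination_minutes`) follow the Databricks Clusters API. The helper `build_cluster_spec`, the cluster name, and the placeholder runtime and VM size values are assumptions made for this example.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes=30):
    """Return a cluster specification with an autoscaling worker range."""
    if not 0 <= min_workers <= max_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        # Placeholders: choose a runtime and VM size offered in your workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<vm-size>",
        # The cluster may grow or shrink between these two worker counts.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Shut the cluster down after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("shared-analytics", min_workers=2, max_workers=8)
print(json.dumps(spec, indent=2))
```

A narrow worker range keeps costs predictable, while a wide one leaves more room for bursts of activity.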
Get started with an Azure free account
1. Start free. Get $200 credit to use within 30 days. While you have your credit, get free amounts of many of our most popular services, plus free amounts of 55+ other services that are always free.
2. After your credit, move to pay as you go to keep building with the same free services. Pay only if you use more than your free monthly amounts.
3. After 12 months, you'll keep getting 55+ always-free services, and still pay only for what you use beyond your free monthly amounts.
Community and Azure support
Ask questions and get support from Microsoft engineers and Azure community experts on MSDN Forum and Stack Overflow, or contact Azure support.
Popular labs and templates
Discover self-paced labs and popular quickstart templates for common configurations made by Microsoft and the community.
Frequently asked questions about Azure Databricks
-
The Azure Databricks SLA guarantees 99.95 percent availability.
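To put that percentage in perspective, here is a small sketch that converts an availability figure into the downtime it permits. It assumes a 30-day period for simplicity; the SLA itself may define the measurement period differently, and the function name `allowed_downtime_minutes` is made up for this example.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime permitted in a period at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


# 99.95 percent over an assumed 30-day period
print(round(allowed_downtime_minutes(99.95), 1))  # 21.6
```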
-
A Databricks unit, or DBU, is a unit of processing capability per hour, billed on per-second usage.
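The arithmetic behind per-second DBU billing can be sketched as follows. All numbers here are hypothetical: the DBU rating, the runtime, and the price per DBU are illustrative values, not published rates, and `dbu_cost` is a name invented for this example. The sketch covers only the DBU portion of a bill.

```python
def dbu_cost(dbu_per_hour, runtime_seconds, price_per_dbu):
    """Cost of a run billed per second: DBUs consumed times price per DBU."""
    dbus_consumed = dbu_per_hour * runtime_seconds / 3600
    return dbus_consumed * price_per_dbu


# Hypothetical: a cluster rated at 4 DBUs per hour runs for 15 minutes
# (900 seconds) at an assumed price of 0.50 per DBU.
print(dbu_cost(dbu_per_hour=4, runtime_seconds=900, price_per_dbu=0.50))  # 0.5
```

Because usage is metered by the second, a 15-minute run consumes a quarter of the hourly DBU rating rather than a full hour.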
-
A data engineering workload is a job that automatically starts and terminates the cluster on which it runs. For example, a workload may be triggered by the Azure Databricks job scheduler, which launches an Apache Spark cluster solely for the job and automatically terminates the cluster after the job is complete.
The data analytics workload isn’t automated. For example, commands within Azure Databricks notebooks run on Apache Spark clusters until they’re manually terminated. Multiple users can share a cluster to analyze data collaboratively.
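The lifecycle of a data engineering workload described above can be pictured with a small simulation. This sketch makes no Azure Databricks API calls; `job_cluster`, `run_job`, and the name `nightly-etl` are hypothetical and only model the idea that the cluster starts for one job and is terminated when the job ends.

```python
from contextlib import contextmanager


@contextmanager
def job_cluster(name, log):
    """Simulate a cluster that exists only for the duration of one job."""
    log.append(f"start {name}")
    try:
        yield name
    finally:
        # Termination is recorded even if the job raises an error.
        log.append(f"terminate {name}")


def run_job(log):
    """Run a single simulated job on its own short-lived cluster."""
    with job_cluster("nightly-etl", log) as cluster:
        log.append(f"run job on {cluster}")


events = []
run_job(events)
print(events)
```

A data analytics workload, by contrast, would correspond to starting the cluster once and leaving it running for several users until someone ends it manually.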