Smarter, faster, safer: Azure SQL Data Warehouse is simply unmatched

Posted on April 8, 2019

Partner Director Software Engineer, Azure SQL Data Warehouse

Today, we want to call attention to the exciting news that Azure SQL Data Warehouse has again outperformed other cloud providers in the most recent GigaOm benchmark report.

This is the result of relentless innovation and laser-focused execution on providing new features our customers need, all while reducing prices so customers get industry-leading performance at the best possible value. In just the past year, SQL Data Warehouse has released 130+ features focused on providing customers with enhanced speed, flexibility, and security. And today we are excited to announce three additional enhancements that continue to make SQL Data Warehouse the industry leader:

  • Unparalleled query performance
  • Intelligent workload management
  • Unmatched security and privacy

In this blog, we’ll take a closer look at the technical capabilities of these new features and, most importantly, how you can start using them today.

Unparalleled query performance

In our March 2019 release, a collection of newly available features improved workload performance by up to 22x compared to previous versions of Azure SQL Data Warehouse, which contributed to our leadership position in both the TPC-H and TPC-DS benchmark reports.

This didn’t just happen overnight. With decades of experience building industry-leading database systems, like SQL Server, Azure SQL Data Warehouse is built on top of the world’s largest cloud architectures.

Key innovations that have improved query performance include:

  • Query Optimizer enhancements
  • Instant Data Movement
  • Additional advanced analytic functions

Query Optimizer enhancements

Query Optimizer is one of the most critical components in any database. Making optimal choices on how to best execute a query can and does yield significant improvement. When executing complex analytical queries, the number of operations to be executed in a distributed environment matters. Every opportunity to eliminate redundant computation, such as repeated subqueries, has a direct impact to query performance. For instance, the following query is reduced from 13 down to 5 operations using the latest Query Optimizer enhancements.

Animated GIF displaying Query Optimizer enhancements

Instant Data Movement

For a distributed database system, having the most efficient data movement mechanism is also a critical ingredient in achieving great performance. Instant Data Movement was introduced with the launch of the second generation of Azure SQL Data Warehouse. To improve instant data movement performance, broadcast and partition data movement operations were added. In addition, performance optimizations around how strings are processed during the data movement operations yielded improvements of up to 2x.

Advanced analytic functions

Having a rich set of analytic functions simplifies how you can write SQL across multiple dimensions that not only streamlines the query, but improves its performance. A set of such functions is GROUP BY ROLLUP, GROUPING(), GROUPING_ID(). See the example of a GROUP BY query from the online documentation below:

SELECT Country
,Region
,SUM(Sales) AS TotalSales
FROM Sales
GROUP BY ROLLUP(Country, Region)
ORDER BY Country
,Region

Intelligent workload management

The new workload importance feature in Azure SQL Data Warehouse enables prioritization over workloads that need to be executed on the data warehouse system. Workload importance provides administrators the ability to prioritize workloads based on business requirements (e.g., executive dashboard queries, ELT executions).

Workload classification

It all starts with workload classification. SQL Data Warehouse classifies a request based on a set of criteria, which administrators can define. In the absence of a matching classifier, the default classifier is chosen. SQL Data Warehouse supports classification at different levels including at the SQL query level, a database user, database role, Azure Active Directory login, or Azure Active Directory group, and maps the request to a system defined workload group classification.

Workload importance

Each workload classification can be assigned one of five levels of importance: low, below_normal, normal, above_normal, and high. Access to resources during compilation, lock acquisition, and execution are prioritized based on the associated importance of a request.

The diagram below illustrates the workload classification and importance function:

ADW_GIF2_v3H

Classifying requests with importance

Classifying requests is done with the new CREATE WORKLOAD CLASSIFIER syntax. Below is an example that maps the login for the ExecutiveReports role to ABOVE_NORMAL importance and the AdhocUsers role to BELOW_NORMAL importance. With this configuration, members of the ExecutiveReports role have their queries complete sooner because they get access to resources before members of the AdhocUsers role.

CREATE WORKLOAD CLASSIFIER ExecReportsClassifier
   WITH (WORKLOAD_GROUP = 'mediumrc'
        ,MEMBERNAME     = 'ExecutiveReports'
        ,IMPORTANCE     =  above_normal);
 
CREATE WORKLOAD CLASSIFIER AdhocClassifier
    WITH (WORKLOAD_GROUP = 'smallrc'
         ,MEMBERNAME     = 'AdhocUsers'
         ,IMPORTANCE     =  below_normal);

For more information on workload importance, refer to the classification importance and CREATE WORKLOAD CLASSIFIER documents.

Unmatched security and privacy

When using a data warehouse, customers often have questions regarding security and privacy. As illustrated by Donald Farmer, a well-respected thought leader in the analytics space, Azure SQL Data Warehouse has the most advanced security and privacy features in the market. This wasn’t achieved by chance. In fact, SQL Server, the core technology of SQL Data Warehouse, has been the least vulnerable database over the last eight years in the NIST vulnerabilities database.

One of our newest security and privacy features in SQL Data Warehouse is Data Discovery and Classification. This feature enables automated discovery of columns potentially containing sensitive data, recommends metadata tags to associate with the columns, and can persistently attach those tags to your tables.

These tags will appear in the Audit log for queries against sensitive data, in addition to being included alongside the query results for clients which support this feature.

The Azure SQL Database Data Discovery & Classification article walks you through enabling the feature via the Azure portal. While the article was written for Azure SQL Database, it is now equally applicable to SQL Data Warehouse.

Next steps

Azure is the best place for data analytics

Azure continues to be the best cloud for analytics. Learn more why analytics in Azure is simply unmatched.