Secondary Indexes on Column Store accelerate SQL Data Warehouse look up queries

By Shivani Gupta Senior Program Manager

Secondary Indexes on Column Store accelerate SQL Data Warehouse look up queries • 2 min read

Posted on December 5, 2016
2 min read

We are pleased to announce that Azure SQL Data Warehouse now supports the creation of secondary B-Tree indexes (also referred to as non-clustered indexes or NCI) on column store tables (also referred to as clustered column store indexes or CCI). Most analytic queries aggregate large amounts of data and are served well by scanning the column store segments directly. However, there is often a need to look for a ‘needle in a haystack’ which translates to a query that does a lookup of a single row or a small range of rows. Such look up queries can get orders of magnitude (even 1000x) improvement in response time and potentially run in sub-second if there is a B-Tree index on the filter column.

SQL Data Warehouse is your go-to SQL-based view across data, offering a fast, fully managed, petabyte-scale cloud solution. It is highly elastic, enabling you to provision and scale up to 60 times larger in seconds. You can scale compute and storage independently, allowing you to range from burst to archival scenarios, and pay based off what you're using instead of being locked into a confined bundle. Plus, SQL Data Warehouse offers the unique option to pause compute, giving you even more freedom to better manage your cloud costs.

Prior to the availability of secondary B-Tree indexes on column store tables, users could meet response time requirements for their point look up queries by duplicating column store data in a clustered B-Tree index. However, the duplication of data adds implementation complexity, storage cost as well as latency. Some of these users have now tried the new secondary indexes over column store and are delighted that they can get the same interactive response time without the data duplication.

How to Create a Secondary Index on a Column Store Table

This follows the same syntax as the generic Create Index Transact-SQL statements. A simple test on 1TB TPC-H data demonstrated that the query time for selecting orders for a given orderkey from lineitem went down from 41 seconds to under a second after a secondary index on orderkey was added.

VSCapture6

Best Practices for Using Secondary Indexes

Here are some guidelines to bear in mind when using secondary indexes on column store tables.

Use them for high cardinality columns that are used as filters in queries returning a small number of rows.
Don’t be heavy handed with secondary indexes as there is an overhead to maintaining them during loads. Best to limit to 1 or 2 secondary indexes per table.

Secondary indexes can be created on partitioned column store tables as well. However, as they are local to each distribution and partition, they cannot be used to implement UNIQUE constraint.

Next Steps

In this blog post we talked about benefits of the new functionality offered by secondary B-Tree indexes on column store tables. This is now available in all SQL Data Warehouse Azure regions worldwide. We encourage you to try it out if you have a use case for point lookups.

Learn More

Check out the many resources for learning more about SQL Data Warehouse, including:

What is Azure SQL Data Warehouse?
SQL Data Warehouse best practices
Video library
MSDN forum
Stack Overflow forum

Secondary Indexes on Column Store accelerate SQL Data Warehouse look up queries

How to Create a Secondary Index on a Column Store Table

Best Practices for Using Secondary Indexes

Next Steps

Learn More

Explore

Related posts

Enabling Diagnostic Logging in Azure API for FHIR®

Conformidad protegida por el programa IRAP desde la infraestructura hasta la capa de aplicaciones SAP en Azure

MileIQ and Azure Event Hubs: Billions of miles streamed

Join the conversation

Destacadas

IA y Machine Learning

Análisis

Compute

Contenedores

Bases de datos

DevOps

Herramientas para desarrolladores

Híbrido y multinube

Identidad

Integración

Internet de las cosas

Administración y Gobernanza

Multimedia

Migración

Realidad mixta

Movilidad

Redes

Seguridad

Almacenamiento

Web

Windows Virtual Desktop

Casos de uso

Desarrollo de aplicaciones

Inteligencia artificial

Migración y modernización en la nube

Datos y análisis

Nube e infraestructura híbridas

Internet de las cosas

Seguridad y gobernanza

Tipo de organización

Recursos

How to Create a Secondary Index on a Column Store Table

Best Practices for Using Secondary Indexes

Next Steps

Learn More

Explore

Related posts

Join the conversation