Clustered Columnstore Tables are default in Azure SQL Data Warehouse

2015년 12월 4일 금요일

Clustered columnstore indexes are the most efficient way you can store your data in Azure SQL Data Warehouse. Storing your data in tables that have a clustered columnstore index are the fastest way to query your data. It will give you the greatest data compression and lower your storage costs. To ensure your data is using this latest technology, clustered columnstore index is now the table default in Azure SQL Data Warehouse. With this change come some changes to CREATE TABLE and CREATE TABLE AS SELECT syntax. For example, the following code will now create a clustered columnstore table:
 CREATE TABLE myClusteredColumnstoreTable
 (
 id int NOT NULL,
 lastName varchar(20),
 zipCode varchar(6)
 )
 WITH
 (
 DISTRIBUTION = HASH (id)
 );
If you want to create your table with the previous default, HEAP, you simply need to add this option to your WITH clause. For example, the following code will create a HEAP table:
 CREATE TABLE myHeapTable
 (
 id int NOT NULL,
 lastName varchar(20),
 zipCode varchar(6)
 )
 WITH
 (
 DISTRIBUTION = HASH (id),
 HEAP
 );
If you want to see what types of tables you have in your database, you can run the following query:
 SELECT
 a.name as schema_name,
 b.name as table_name,
 c.type_desc as table_type,
 c.name as index_name
 FROM
 sys.schemas a
 INNER JOIN sys.tables b
 ON a.schema_id = b.schema_id
 INNER JOIN sys.indexes c
 ON b.object_id = c.object_id
 ORDER BY
 1, 2;
To convert a table from HEAP to a clustered columnstore, you can use CREATE CLUSTERED COLUMNSTORE INDEX, for example:
 CREATE CLUSTERED COLUMNSTORE INDEX cciIndex ON myHeapTable;
To convert a table from clustered columnstore to HEAP, you can use DROP INDEX, for example:
 DROP INDEX cciIndex ON myHeapTable;
There are two subtle impacts of this change that may impact you:
  1. If you create staging tables for loads, where the main purpose is to load data into your database quickly and then move that data to a permanent table, you may find that HEAP tables perform better.
  2. If you have automation that currently leverages a secondary index on a HEAP table, you will want to add the HEAP keyword to your DDL because secondary indexes are not yet supported on clustered columnstore tables.

무료 계정

지금 등록하여 $200 Azure 크레딧을 받으세요.

무료로 시작

Visual Studio

최대 $1800 연간 Azure 서비스를 구독자에게 제공

지금 활성화

시작

BizSpark 프로그램에 가입하고 무료 Azure 서비스를 이용하세요

자세한 정보