We’re excited to announce the first release of capabilities for the Azure SQL Data Warehouse, incorporating key feedback from early customers to provide a number of sought after improvements.
Parallelized Data Export
This release further expands PolyBase, SQL Data Warehouse’s fully parallelized loader to support export as well as import. With support for the CREATE EXTERNAL TABLE AS SELECT (CETAS) statement, customers can use PolyBase to export data from SQL Data Warehouse to Azure Blob Storage. Now customers can import multiple files, process them, and then export the results.
The new CETAS statement follows the current patterns Loading Data with PolyBase – creating/reusing credentials, creating/reusing a data source, creating/reusing a file format and then Creating an External table making it easy to use. For example, the results of a query can be exported with the below statement:
CREATE EXTERNAL TABLE Weblogs2014 WITH ( LOCATION='wasbs://mycontainer@test.blob.core.windows.net/', DATA_SOURCE=AzureStorage, FILE_FORMAT=TextFile ) AS SELECT Uri, DateRequested FROM dbo.Weblogs WHERE 1=1 AND DateRequested > '12/31/2013' AND DateRequested < '01/01/2015';
Loading Performance Enhancements
In addition to adding CETAS, we have also made stability and performance improvements for both PolyBase and BCP, two of SQL Data Warehouses primary loading mechanisms.
In PolyBase: customers can now CREATE STATISTICS on external tables giving better performance across all external table operations. With statistics, the SQL Data Warehouse is able to generate and execute better query plans which drives better decisions in pushdown and streaming scenarios. In addition, we have created additional validation during "Create External Table" to fail fast and improved object clean-up in the case of failed CTAS operations.
In BCP: We are now allowing insert bulk statements to take advantage of resource classes (via ALTER SERVER ROLE), giving users the ability to run BCP with extra resources for performance improvements. Additionally, we have also improved BCP memory management to improve reliability and performance during data loads.
Expanded Function Support
Users can now create functions in schemas other than the default dbo schema. This allows customers to organized functions into working/functional areas. See the CREATE FUNCTION syntax on MSDN.
Additional Improvements
In addition to the above we have also made subtler improvements to the system designed to make SQL Data Warehouse easier to use:
- Improved certain error messages for added clarity during data loading and other error conditions
- Ensure that objects are correctly cleaned up upon DROP operations
- Made the "location" option during temp table creation syntax optional to simplify migrations
- Improvements to DMV for more precise monitoring, added full support for the sys.dm_exec_dms_services, sys.dm_exec_dms_workers, sys.dm_exec_external_work, and sys.dm_exec_external_operations DMVs