Azure SQL Data Warehouse is Microsoft’s SQL analytics platform, the backbone of your Enterprise Data Warehouse (EDW). The service is designed to allow customers to elastically, and independently, scale compute and storage with massively parallel processing. SQL DW integrates seamlessly with big data stores and acts as a hub to your data marts and cubes for an optimized and tailored performance of your EDW. Azure SQL DW offers guaranteed 99.9% high availability, compliance, advanced security, and tight integration with upstream and downstream services so you can build a data warehouse that fits your needs. Azure SQL DW is the first and only service enabling enterprises to replicate their data everywhere with global availability in more than 30 regions. Today, we will show you how to push data quickly into Azure Analysis Services for optimized performance.
One of the key features of Azure SQL Data Warehouse is our decoupled compute and storage model. For key price and performance, it is important that you choose the performance level that matches your current workload, not just the maximum needed. This blog post is first of a two part blog series covering how to use Azure Function App to save money by automating compute levels.
The first solution we consider is a schedule based set of Azure functions, which allows you to set times during which you would like to have your data warehouse scaled up, down, or paused. Many data warehouse users have periods of activity and inactivity, usually corresponding with the workday. Shown below is a real customer data warehouse instance with their corresponding query usage over several days. Observing this daily workload, the query patterns display clear spikes during the working hours of each day with a drop off during the night.
Currently this customer is not changing its DWU regardless of its usage, and is constantly running at DW600. While a DW600 may be necessary during the workday, a DW200 may be more appropriate during off-hours. Over a month of scaling between these two values with a 16:8 hour ratio, one could potentially be saving around around 32%. Looking at its usage, we would recommend scaling up once before the working spike at 12am UTC and scale it back down to 5pm UTC daily. Using Azure TimeTrigger functions, we could reduce this customer’s monthly spend by 25%!
To learn more about SQL Data Warehouse schedule-based scaling with Azure functions, check out our documentation and its Github repository, or deploy now to your instance! This schedule based function is a proof of concept (POC) and we look forward to any community contributions around logging, retry-logic in the case of failures, and other features that enhance the user experience. This function defaults to using a consumption plan with blob storage. Consumption plan pricing includes a monthly free grant of 1 million requests and 400,000 GB-s of resource consumption per month.
If you need our help for a POC, contact us directly. Stay up-to-date on the latest Azure SQL DW news and features by following us on Twitter @AzureSQLDW.