PowerPivot for the DBA: Part 1

[This article was contributed by the SQL Azure team.]

In this article I am going to tie some simple terminology and methodology in business intelligence and PowerPivot with Transact-SQL – bring it down to earth for the DBA. You don’t need to know Transact-SQL to build awesome reports with PowerPivot, however if you already do these articles will attempt to bridge the learning gap.

History

I have been hearing the term Business Intelligence (BI) thrown around for a while, mostly from executives wanting their companies to start “doing” business intelligence. I have also been avoiding learning about business intelligence assuming it involved complicated mathematics and a whole new set of terminology – putting off what appeared to be a big learning curve until I had more time. I prefer to learn new technology by building on what I already know; I really wanted to relate business intelligence to my other SQL Server skills, including Transact-SQL. I finally took the time to learn the basics of business intelligence and realized that there is a close tie in with my Transact-SQL skills.

PowerPivot

PowerPivot, the Excel 2010 extension, is a great way to get started with business intelligence. It lets you experiment with relationships and report building. With it you can quickly and easily prototype reports and investigate data issues before you commit to an ERD or server set up etc

Let get going a do a simple example, connection to SQL Azure and the Adventure Works database. Then import these tables:

  • Sales.SalesOrderHeader
  • Sales.SalesOrderDetail
  • Production.Product
  • Production.ProductSubcategory
  • Production.ProductCategory

I covered how to connect to SQL Azure and import tables using PowerPivot in this blog post. The next step is to create a PivotTable. To do that go to the PowerPivot ribbon bar in Excel and choose PivotTable.

image

When the Pivot table appears and the docked PowerPivot Field List window add the LineTotal column from the SalesOrderDetail table in Values section of the field list, gives you the uninteresting PivotTable that looks like the one below.

image

This is the same as running this SELECT statement in Transact-SQL:

SELECT    SUM(LineTotal)  FROM    Sales.SalesOrderDetail

Now that we have gotten the “Hello World” example out of the way, add the ProductCategory.Name column to the Rows Labels in PowerPivot is just like creating a SELECT statement in Transact-SQL with a GROUP BY clause.

image

You would get the same output (without the Grand Total) in the PowerPivot sample above by executing this statement Transact-SQL Statement

SELECT    ProductCategory.Name, SUM(LineTotal)  FROM    Sales.SalesOrderDetail      INNER JOIN Production.Product ON           Product.ProductID = SalesOrderDetail.ProductID      INNER JOIN Production.ProductSubcategory ON           Product.ProductSubcategoryID =     ProductSubcategory.ProductSubcategoryID      INNER JOIN Production.ProductCategory ON           ProductSubcategory.ProductCategoryID =     ProductCategory.ProductCategoryID  GROUP BY ProductCategory.Name

If you continue to use the Adventure works database and move the ProductCategory.Name column to the Column Labels and add the Sales.SalesOrderHeader.OrderDate to the Row Labels PowerPivot will look like this:

image

Which is the same data as this Transact-SQL Statement containing a GROUP BY statement with both OrderDate and ProductCategory.Name:

SELECT    ProductCategory.Name, SalesOrderHeader.OrderDate,    SUM(LineTotal)  FROM    Sales.SalesOrderHeader      INNER JOIN Sales.SalesOrderDetail ON           SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID      INNER JOIN Production.Product ON           Product.ProductID = SalesOrderDetail.ProductID      INNER JOIN Production.ProductSubcategory ON           Product.ProductSubcategoryID =     ProductSubcategory.ProductSubcategoryID      INNER JOIN Production.ProductCategory ON           ProductSubcategory.ProductCategoryID =     ProductCategory.ProductCategoryID  GROUP BY ProductCategory.Name, SalesOrderHeader.OrderDate  ORDER BY SalesOrderHeader.OrderDate

However, here is where we see PowerPivot start to shine, with both a row label and a column label, PowerPivot pivots the category names into columns, sorts by date automatically, and provides grand totals for the rows and the columns. The DBA using Transact-SQL can get their tables pivoted also by using the PIVOT command in SQL Server 2008, here is what that Transact-SQL looks like for the sample above:

SELECT OrderDate, [1] AS Bikes, [2] AS Components, [3] AS Clothing,   [4] AS Accessories  FROM   (SELECT    Sales.SalesOrderDetail.LineTotal,    ProductCategory.ProductCategoryID,    SalesOrderHeader.OrderDate      FROM    Sales.SalesOrderHeader          INNER JOIN Sales.SalesOrderDetail ON               SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID          INNER JOIN Production.Product ON               Product.ProductID = SalesOrderDetail.ProductID          INNER JOIN Production.ProductSubcategory ON               Product.ProductSubcategoryID =     ProductSubcategory.ProductSubcategoryID          INNER JOIN Production.ProductCategory ON               ProductSubcategory.ProductCategoryID =     ProductCategory.ProductCategoryID  ) p  PIVOT  (      SUM(LineTotal)      FOR ProductCategoryID IN ( [1], [2], [3], [4] )  ) AS pvt  ORDER BY pvt.OrderDate;

Summary

After reading this blog post you might try to summarize Business Intelligence as GROUP BY for Mangers; which would be a great slogan for a bumper sticker.

image

However, this post only scratching the surface, in my next blog post I will discuss what a measure is in business intelligence and continue writing matching Transact-SQL. Do you have questions, concerns, comments? Post them below and we will try to address them.