[This article was contributed by the SQL Azure team.]
In this article I am going to tie some simple terminology and methodology in business intelligence and PowerPivot with Transact-SQL – bring it down to earth for the DBA. You don’t need to know Transact-SQL to build awesome reports with PowerPivot, however if you already do these articles will attempt to bridge the learning gap.
History
I have been hearing the term Business Intelligence (BI) thrown around for a while, mostly from executives wanting their companies to start “doing” business intelligence. I have also been avoiding learning about business intelligence assuming it involved complicated mathematics and a whole new set of terminology – putting off what appeared to be a big learning curve until I had more time. I prefer to learn new technology by building on what I already know; I really wanted to relate business intelligence to my other SQL Server skills, including Transact-SQL. I finally took the time to learn the basics of business intelligence and realized that there is a close tie in with my Transact-SQL skills.
PowerPivot
PowerPivot, the Excel 2010 extension, is a great way to get started with business intelligence. It lets you experiment with relationships and report building. With it you can quickly and easily prototype reports and investigate data issues before you commit to an ERD or server set up etc
Let get going a do a simple example, connection to SQL Azure and the Adventure Works database. Then import these tables:
- Sales.SalesOrderHeader
- Sales.SalesOrderDetail
- Production.Product
- Production.ProductSubcategory
- Production.ProductCategory
I covered how to connect to SQL Azure and import tables using PowerPivot in this blog post. The next step is to create a PivotTable. To do that go to the PowerPivot ribbon bar in Excel and choose PivotTable.
When the Pivot table appears and the docked PowerPivot Field List window add the LineTotal column from the SalesOrderDetail table in Values section of the field list, gives you the uninteresting PivotTable that looks like the one below.
This is the same as running this SELECT statement in Transact-SQL:
SELECT SUM(LineTotal) FROM Sales.SalesOrderDetail
Now that we have gotten the “Hello World” example out of the way, add the ProductCategory.Name column to the Rows Labels in PowerPivot is just like creating a SELECT statement in Transact-SQL with a GROUP BY clause.
You would get the same output (without the Grand Total) in the PowerPivot sample above by executing this statement Transact-SQL Statement
SELECT ProductCategory.Name, SUM(LineTotal) FROM Sales.SalesOrderDetail INNER JOIN Production.Product ON Product.ProductID = SalesOrderDetail.ProductID INNER JOIN Production.ProductSubcategory ON Product.ProductSubcategoryID = ProductSubcategory.ProductSubcategoryID INNER JOIN Production.ProductCategory ON ProductSubcategory.ProductCategoryID = ProductCategory.ProductCategoryID GROUP BY ProductCategory.Name
If you continue to use the Adventure works database and move the ProductCategory.Name column to the Column Labels and add the Sales.SalesOrderHeader.OrderDate to the Row Labels PowerPivot will look like this:
Which is the same data as this Transact-SQL Statement containing a GROUP BY statement with both OrderDate and ProductCategory.Name:
SELECT ProductCategory.Name, SalesOrderHeader.OrderDate, SUM(LineTotal) FROM Sales.SalesOrderHeader INNER JOIN Sales.SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID INNER JOIN Production.Product ON Product.ProductID = SalesOrderDetail.ProductID INNER JOIN Production.ProductSubcategory ON Product.ProductSubcategoryID = ProductSubcategory.ProductSubcategoryID INNER JOIN Production.ProductCategory ON ProductSubcategory.ProductCategoryID = ProductCategory.ProductCategoryID GROUP BY ProductCategory.Name, SalesOrderHeader.OrderDate ORDER BY SalesOrderHeader.OrderDate
However, here is where we see PowerPivot start to shine, with both a row label and a column label, PowerPivot pivots the category names into columns, sorts by date automatically, and provides grand totals for the rows and the columns. The DBA using Transact-SQL can get their tables pivoted also by using the PIVOT command in SQL Server 2008, here is what that Transact-SQL looks like for the sample above:
SELECT OrderDate, [1] AS Bikes, [2] AS Components, [3] AS Clothing, [4] AS Accessories FROM (SELECT Sales.SalesOrderDetail.LineTotal, ProductCategory.ProductCategoryID, SalesOrderHeader.OrderDate FROM Sales.SalesOrderHeader INNER JOIN Sales.SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID INNER JOIN Production.Product ON Product.ProductID = SalesOrderDetail.ProductID INNER JOIN Production.ProductSubcategory ON Product.ProductSubcategoryID = ProductSubcategory.ProductSubcategoryID INNER JOIN Production.ProductCategory ON ProductSubcategory.ProductCategoryID = ProductCategory.ProductCategoryID ) p PIVOT ( SUM(LineTotal) FOR ProductCategoryID IN ( [1], [2], [3], [4] ) ) AS pvt ORDER BY pvt.OrderDate;
Summary
After reading this blog post you might try to summarize Business Intelligence as GROUP BY for Mangers; which would be a great slogan for a bumper sticker.
However, this post only scratching the surface, in my next blog post I will discuss what a measure is in business intelligence and continue writing matching Transact-SQL. Do you have questions, concerns, comments? Post them below and we will try to address them.