The purpose of the Windows Azure ISV blog series is to highlight some of the accomplishments from the ISVs we’ve worked with during their Windows Azure application development and deployment. Today’s post, written by Windows Azure Architect Evangelist Ricardo Villalobos, is about how Digital Folio is using Windows Azure to deliver their online shopping service.
Digital Folio is an Internet browser plug-in that allows end users to compare prices and find product suggestions while shopping online. The client portion of the solution is displayed as a sidebar or widget that is easily accessible while searching. Once the product has been found, the end user can start comparing prices from different vendors or simply drag and drop the item into one of their “folios” to track the latest prices, price history, and other information from retailers such as Amazon, BestBuy, Sears, Target, and Wal-Mart. Users can share their folios with friends, family, and sales staff, creating a rich social shopping experience.
Although Digital Folio’s practical and collaborative user interface is extremely impressive, it’s important to understand the role that Windows Azure architecture, infrastructure, and technology play in supporting it.
Architecture
The Digital Folio browser plug-in was created using Silverlight, served from an IIS website running on a Windows Azure web role. Once installed on the client machine, it asynchronously communicates with a series of web services hosted on a second web role, using WCF as the Service layer. These services, in turn, talk to a Business / Data layer, which takes care of concurrency and transaction management. Up to this point, this is a typical line-of-business application architecture, taking advantage of the clustered nature of Windows Azure to easily scale out and adapt to different levels of traffic. However, what makes this architecture special is the use of Windows Azure Storage tables to save all the information generated by the multiple users comparing and shopping for products online.
Digital Folio first considered using SQL Azure as their primary storage mechanism, but quickly realized that the decision was not that simple, based on typical business drivers of consumer Internet applications like out-of-the-box scalability and capacity planning. Eventually, Digital Folio went with Windows Azure tables, but some functions and features – like reporting – were not as easy to implement when using table storage. The following section summarizes the lessons and best practices that they learned when working with Windows Azure tables.
Deciding between Windows Azure Storage tables and SQL Azure
When Digital Folio was making many of these decisions over one year ago, SQL Azure was still growing up. For instance, one could only purchase 1 GB – 10 GB database sizes, but today instances go up to 150 GB… and there were no supported options for SQL Azure sharding until recently, when SQL Azure Federation support went into production.
Digital Folio started by identifying the different characteristics that were relevant to their cloud architecture, and came up with the following:
- Cost
- Scalability
- Performance
- Reporting / Custom Views
- Capacity
Cost Analysis
In terms of hard costs, using non-relational Windows Azure tables represented a significant reduction in operation costs, given that each Gigabyte is priced at $0.14 USD per month, plus $0.01 per 10,000 transactions (compared to an average price of $9.99 per Gigabyte per month for SQL Azure). However, the learning curve and figuring out best practices for table storage were certainly costs for the Digital Folio team at the time. Today, this soft cost should be lower given the amount of guidance and tooling available to support development efforts on Windows Azure tables.
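To make the hard-cost comparison concrete, here is a minimal sketch that plugs in the prices quoted above. The function names and the sample workload (10 GB, 5 million transactions per month) are assumptions chosen for illustration, not Digital Folio's actual figures, and the prices are the ones cited in this post rather than current ones.

```python
# Prices as quoted in this post (USD per month); not current pricing.
TABLE_USD_PER_GB = 0.14
TABLE_USD_PER_10K_TRANSACTIONS = 0.01
SQL_AZURE_AVG_USD_PER_GB = 9.99


def table_storage_monthly_cost(gigabytes, transactions):
    """Storage charge plus per-transaction charge for table storage."""
    storage = gigabytes * TABLE_USD_PER_GB
    requests = (transactions / 10_000) * TABLE_USD_PER_10K_TRANSACTIONS
    return storage + requests


def sql_azure_monthly_cost(gigabytes):
    """Average per-gigabyte charge for SQL Azure, as quoted above."""
    return gigabytes * SQL_AZURE_AVG_USD_PER_GB


# Hypothetical workload: 10 GB stored, 5 million transactions per month.
table_cost = table_storage_monthly_cost(10, 5_000_000)
sql_cost = sql_azure_monthly_cost(10)
print(f"Table storage: ${table_cost:.2f}  SQL Azure: ${sql_cost:.2f}")
```

Note that the transaction charge dominates the table storage bill in this sample workload, so a chatty access pattern narrows the gap even though the per-gigabyte price is far lower.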
Scalability Analysis
Given the partitioning of Windows Azure tables, the Digital Folio team was confident in the ability to easily scale tables to hundreds of millions of rows as long as a proper partitioning strategy was employed on each table. At the time, Digital Folio was concerned about SQL Azure’s size restrictions and lack of clear scalability targets. Today, with SQL Azure Federations, combined with increased database sizes, these issues are less of a concern, but structured SQL storage with ACID properties will tend to need more TLC to attain similar levels of scalability as out-of-the-box NoSQL approaches. With a careful partitioning strategy for each Windows Azure table, the 500 requests/sec/partition metrics that Microsoft has targeted would work just fine for the number of expected users.
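As a rough illustration of what a partitioning strategy has to achieve against the 500 requests/sec/partition target, the sketch below estimates how many partitions a given peak load needs and spreads users across them with a stable hash. The hashing scheme, the key format, and the function names are assumptions made for this example; the post does not describe the partition keys Digital Folio actually chose, and the estimate assumes load is spread evenly across partitions.

```python
import hashlib
import math

# Per-partition throughput target quoted in this post.
TARGET_REQUESTS_PER_SEC_PER_PARTITION = 500


def partitions_needed(peak_requests_per_sec):
    """Minimum partition count if load were spread perfectly evenly."""
    return max(1, math.ceil(peak_requests_per_sec / TARGET_REQUESTS_PER_SEC_PER_PARTITION))


def partition_key(user_id, partition_count):
    """Map a user ID to one of partition_count buckets (hypothetical scheme).

    A cryptographic hash is used only because it is stable across processes,
    unlike Python's built-in hash() for strings.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % partition_count
    return f"p{bucket:04d}"


count = partitions_needed(1200)  # 1200 / 500 rounds up to 3 partitions
key = partition_key("user-42", count)
print(count, key)
```

In practice a natural key (such as a user or folio identifier) often serves directly as the partition key; hashing into a fixed number of buckets is just one way to keep hot spots from forming.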
Performance Analysis
The biggest performance differences between SQL Azure and Windows Azure tables depend on how many results are returned in a single query and how many indexes are required per entity. SQL Azure generally provides better performance for queries that return more than 1,000 rows, since each Windows Azure table query currently returns at most 1,000 results, along with a continuation token that is used to get additional results. Keeping this in mind, a query that returns 2,500 results would require a single SQL Azure call, but three Windows Azure table storage requests. Since Digital Folio had a small number of entity types to persist to storage, with small numbers of rows returned per query, Windows Azure tables were a great fit.
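The paging behavior described above can be simulated in a few lines. This is a toy in-memory model, not the storage client API: `query_page` and `fetch_all` are hypothetical names, and the continuation token here is simply an offset into a list, whereas the real service returns an opaque token.

```python
PAGE_LIMIT = 1000  # maximum results per table query, as described above


def query_page(rows, continuation_token=None, limit=PAGE_LIMIT):
    """Return one page of results plus a token for the next page (or None)."""
    start = continuation_token or 0
    page = rows[start:start + limit]
    next_token = start + limit if start + limit < len(rows) else None
    return page, next_token


def fetch_all(rows):
    """Follow continuation tokens until exhausted; count the requests made."""
    results = []
    requests = 0
    token = None
    while True:
        page, token = query_page(rows, token)
        requests += 1
        results.extend(page)
        if token is None:
            break
    return results, requests


results, requests = fetch_all(list(range(2500)))
print(len(results), requests)  # 2500 results in 3 requests
```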
The second major performance difference comes from tables that have more than one or two indexes. Since Windows Azure tables get scalability from partitioning every row by a single partition key per row, lookups outside the partition key are essentially full table scans (read “performance impact with large tables”). SQL Azure is a more traditional database in that multiple indexes can be added to each table. This limitation of table storage can certainly be overcome by creating tables that are essentially indexes into other tables, and in fact, Digital Folio has done this on a few occasions as the need arose. Most queries to Windows Azure table storage generally returned in 150 ms given the careful partitioning strategy that was built out across the Azure tables by the Digital Folio team.
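The idea of a table that acts as an index into another table can be sketched with a toy in-memory model. Everything here is hypothetical: the `Table` class, the `products` and `products_by_upc` tables, and the choice of retailer and UPC as keys are illustrative assumptions, not Digital Folio's schema or the storage client API.

```python
class Table:
    """Toy stand-in for a table addressed by (partition key, row key)."""

    def __init__(self):
        self._partitions = {}

    def insert(self, partition_key, row_key, entity):
        self._partitions.setdefault(partition_key, {})[row_key] = entity

    def get(self, partition_key, row_key):
        """Point lookup: cheap, touches a single partition."""
        return self._partitions.get(partition_key, {}).get(row_key)

    def partition(self, partition_key):
        """All entities in one partition."""
        return list(self._partitions.get(partition_key, {}).values())

    def scan(self, predicate):
        """Full table scan: visits every entity in every partition."""
        return [e for rows in self._partitions.values()
                for e in rows.values() if predicate(e)]


products = Table()         # partitioned by retailer, row key = SKU
products_by_upc = Table()  # index table: partitioned by UPC


def add_product(retailer, sku, upc, name):
    """Write the entity, then a pointer row in the index table."""
    products.insert(retailer, sku,
                    {"retailer": retailer, "sku": sku, "upc": upc, "name": name})
    products_by_upc.insert(upc, f"{retailer}:{sku}",
                           {"retailer": retailer, "sku": sku})


def find_by_upc(upc):
    """Resolve pointers from the index table instead of scanning products."""
    return [products.get(p["retailer"], p["sku"])
            for p in products_by_upc.partition(upc)]


add_product("RetailerA", "A-100", "012345678905", "Blender")
add_product("RetailerB", "B-7", "012345678905", "Blender")
add_product("RetailerA", "A-200", "999999999993", "Toaster")
matches = find_by_upc("012345678905")
print([m["retailer"] for m in matches])
```

The trade-off is that every write now touches two tables, and the sketch makes no attempt to keep them consistent if one of the writes fails; a real implementation would need to handle that case.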
Reporting/Custom Views
Windows Azure table storage is generally a poor choice as a repository for full reporting given that only 1,000 rows are returned per query and then each additional 1,000 rows requires an extra call with the continuation token provided by the previous one. To that end, the Digital Folio team placed all analytics events in a separate SQL Azure database system so that traditional reporting can occur.
Capacity
Windows Azure tables can scale to 100 TB for table, blob, and queue storage per storage account, which will be plenty for most applications. Currently, SQL Azure goes up to 150 GB per database, with larger databases possible with the use of Federations.
Conclusion
Digital Folio considered different factors before choosing Windows Azure Tables as the storage mechanism for their cloud solution. The same process can be followed by companies with similar requirements, as they consider factors such as cost, learning curve, scalability, performance, reporting, and capacity.
Stay tuned for the next post in the Windows Azure ISV Blog Series and feel free to tell us what you think about the series by posting a comment below. We look forward to hearing from you!