The "Cloud Service Fundamentals" application, referred to as "CSFundamentals," demonstrates how to build database-backed Azure services. This includes a description of the scenario, implementation architecture and reusable components for logging, configuration and data access. The code base is intended to be a tangible exploration of best practices for delivering scalable, available services on Azure based on production deployments by the Windows Azure Customer Advisory Team.
With the majority of companies these days looking to move their cloud initiatives forward, the business drivers for specific solutions vary greatly, from lowering costs to significantly increasing agility and scale. When solutions are trying to attain “cloud scale”, the strategy of “vertical scalability”, increasing capacity by upgrading the hardware, is replaced with “horizontal scalability”, increasing the number of machines that share a specific task. A great example of this tradeoff is the creation of web farms where many servers serve the same web site content versus having a single monolithic machine try to handle the load.
Most people that start down this path plan for horizontal scalability on the compute nodes but bypass the more complex and potentially more critical state tiers, such as the relational database management system (RDBMS) and caches. These services are often IO-intensive and bound by a single instance. One technique to implement horizontal scalability in the state tier is known as sharding. Sharding is when you logically separate your RDBMS data into multiple databases, typically with the same schema. For example, an employee table could be split across three distinct databases where each database stores a different department’s employees.
The benefits of sharding assists in far more than just capacity related scenarios. For the purposes of this post we will focus on sharding an RDBMS that is implemented in Azure SQL Database platform, and primarily serves OLTP scenarios. Some example scenarios that could benefit from a sharded database structure include:
- Throttling thresholds or throughput limits are hit too often.
- Size of the database becomes unwieldy (index rebuilds, backups).
- A single unavailable database affects all users (as opposed to a single shard).
- A database that has difficulty scaling up and down gracefully in respond to demand.
- Certain business models, such as multi-tenant or Software as a Service offerings.
When using a multi-tenant database as a service solution, such as Windows Azure SQL Database, there are typically Quality of Service (QOS) controls put in place that throttle clients under various conditions. Throttling typically occurs when resource pressure climbs. Sharding is a key strategy to help reduce resource pressure by taking the load that would typically affect a single server and spreading it across as multiple servers that each contain a shard. For example, assuming an even distribution, creating five shards reduces the load to approximately twenty percent on each database.
As with anything that grants greater power, there are sacrifices that must be made. Sharding increases the complexity of several key areas, requiring more planning. These include:
- Identity columns should be globally unique across all shards in case future business needs necessitate the reduction in shard count. If the identity is not unique across all shards, merging two shards can result in conflicts.
- Referential integrity cannot reference or enforce relationships to rows in other shards as they are independent databases.
- Queries that cross shards should be avoided if possible, because they require querying each shard and merging the results. The need to do “fan out” queries across the shards is not only costly from a performance point of view but increases the complexity of the sharding framework that must support it. If cross-shard queries are necessary, the typical strategy is to query each shard asynchronously. However, there are times where a synchronous approach offers more control of the resultset size.
In most cases, sharding is a Data Access Layer (DAL) concept, abstracting most of the intricacies from the higher-level application logic.
How you define a “tenant” is one of the most crucial decisions that you can make when building a sharded architecture. A tenant is the largest unique categorization of data that is guaranteed to be on the same shard. Queries that are constrained to a single tenant typically perform better as they do not need to execute fan-out operations during normal operational states. Some of the factors that influence the decision of the appropriate tenant definition include:
- The level of awareness of the higher-level application code of the identifier.
- The ability of most core business transactions to be performed at this level.
- The ability to avoid throttling in general day-to-day operations at the tenant granularity.
In an effort to surface these concepts and considerations at a high level, the Windows Azure Customer Advisory Team has built out a basic sharding Data Access Layer (DAL) in the Cloud Services Fundamentals (CSF) package (https://code.msdn.microsoft.com/Cloud-Service-Fundamentals-4ca72649).
In CSF the tenant is defined as an individual user. Some of the factors that led to selecting this tenant were:
- Most of the core business requirements do not require queries across multiple users.
- An unavailable shard only impacts a defined set of users, leaving all others to continue their typical usage of the system. The quantity of users that are on a single shard can be controlled to a number that is tolerable to a business.
The tenant is defined and implemented such that cross-database transactions are not required. In Figure 1, we refer to this set of data as a shardlet, a Data Model Transaction Boundary.
Figure 1 - Data Model Transaction Boundary
When the user connects to the database for the first time in the session they have a series of simple queries that they can execute to understand if any features are unavailable to them due to offline shards.
In an effort to simplify the sharding techniques being demonstrated in CSF, we decided to create a set of shards that have enough storage to fulfill capacity needs for the foreseeable future. By selecting this size, it eliminated the need to demonstrate increasing and reducing the number of shards, which would also include actions like tenant movement. An integer is generated by hashing the tenant’s name and this id is used to lookup a matching range in a “shard map”. CSF uses a range-based mechanism, where a range of these numbers are assigned to a specific shard (captured in the “shard map”).
In the event that shards need to be added or removed from the shard set, it requires that tenants become unavailable until they are transitioned into their new shards. Due to this significant limitation, it is expected that the shard set will be significantly over provisioned when it is first created to reduce or eliminate the need to do complex shard management.
This solution requires that the Data Access Layer (DAL) is aware of the tenant id to determine the placement of the tenants in the shard set. If a query is executed that includes a shard that is unavailable the entire query will fail. In the event that the DAL does not include the tenant id all shards will have to be queried, increasing the chance of failure and reducing performance.
There is currently some preliminary work underway to offer further code samples demonstrating more advanced sharding techniques. The sample will offer improvements in the following areas:
- Reactive and proactive shard management.
- Global uniqueness and identity management.
- Migration of tenants between shards within the set.
- Expansion and contraction of the shard set.
- Improvements in queries that are not tenant aware.
In conclusion, the Cloud Services Fundamentals code sample is a great way to start to explore the basic concepts of sharding, which is an important technique for creating “cloud scale” applications.