Azure #DocumentDB Service Level Agreements

Published on April 11, 2017

Senior Program Manager

Why enterprises trust us for their globally distributed applications.

Enterprise and massive-scale applications need a data store that is globally distributed, offers limitless scale and geographical reach, and is fast. Along with enterprise-grade security and compliance, a major criterion is the level of service guarantees the database provides for availability, performance, and durability. Azure DocumentDB is Microsoft’s globally distributed database service, designed for building planet-scale applications: you can elastically scale both throughput and storage across any number of geographical regions. The service offers guaranteed single-digit-millisecond latency at the 99th percentile, 99.99% high availability, predictable throughput, and multiple well-defined consistency models.

We recently updated our Service Level Agreements (SLAs) to make them comprehensive, covering latency, availability, throughput, and consistency. Thanks to its schema-agnostic, write-optimized database engine, DocumentDB by default automatically indexes all the data it ingests and serves SQL, MongoDB, and JavaScript language-integrated queries in a scale-independent manner. As one of the foundational services of Azure, DocumentDB has been used for many years as a backend for almost all first-party Microsoft services. Since its general availability in 2015, DocumentDB has been one of the fastest-growing services on Azure.

Industry leading comprehensive SLA

Since its inception, Azure DocumentDB has always offered the best SLA in the industry, with a 99.99% availability guarantee. Now we are the only cloud service offering a comprehensive SLA for:

  • Availability: The most classical SLA. Your system will be available more than 99.99% of the time, or you get a refund.
  • Throughput: At the collection level, we guarantee that your collection can always serve requests up to the maximum throughput you provisioned.
  • Latency: Since speed is important, we guarantee that 99% of your requests will have a latency below 10ms for document read or 15ms for document write operations.
  • Consistency: We ensure that we will honor the consistency guarantees in accordance with the consistency levels chosen for your requests.

While everyone is familiar with an SLA on availability or uptime, providing financial guarantees on throughput, latency, and consistency is an industry first. Such guarantees are not only difficult to implement but also hard to make transparent to users. Through the Azure portal, we provide full transparency on uptime, latency, throughput, and the number of requests and failures. In the rare case that we are unable to honor any of these SLAs, we will credit 10% to 25% of your monthly bill as a refund.

Availability SLA – 99.99%

The following equation shows the SLA formula for availability, given a month with 744 hours:

[Image: availability SLA formula]

[Image: screenshot]

A failed request is one that returns an HTTP 5xx or 408 status code (for document read, write, or query operations), as shown in the portal.
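As a rough illustration of how such a figure could be computed from request logs, here is a minimal Python sketch. It assumes that monthly availability is 100% minus the average hourly error rate; that form is an assumption, not the formula from the image, and the function names are hypothetical.

```python
from typing import Iterable, Tuple

HOURS_IN_MONTH = 744  # the 31-day month used in the text


def is_failed(status_code: int) -> bool:
    """A request counts as failed if it returns HTTP 5xx or 408."""
    return status_code == 408 or 500 <= status_code <= 599


def monthly_availability(hourly_counts: Iterable[Tuple[int, int]]) -> float:
    """Return availability in percent from per-hour (failed, total) counts.

    Assumption: availability = 100% - average hourly error rate.
    Hours without any requests are treated as having a 0% error rate.
    """
    rates = [failed / total if total else 0.0 for failed, total in hourly_counts]
    if not rates:
        return 100.0
    return 100.0 * (1.0 - sum(rates) / len(rates))


# 743 clean hours plus one hour in which half of the requests failed.
counts = [(0, 1000)] * (HOURS_IN_MONTH - 1) + [(500, 1000)]
availability = monthly_availability(counts)
```

Under this assumed formula, the example month comes out at roughly 99.93%, which is under the 99.99% target.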

Throughput SLA – 99.99%

The following equation shows the SLA formula for throughput, given a month with 744 hours:

[Image: throughput SLA formula]

[Image: screenshot]

"Throughput Failed Requests" are requests that the DocumentDB collection throttles, returning an error code, before the consumed RUs (request units) have exceeded the RUs provisioned for a partition in the collection in a given second. To avoid throttling caused by misuse, we highly recommend that you review the best practices for partitioning and scaling in DocumentDB.
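The definition above can be read as a simple predicate: a throttled request counts against the SLA only if the partition had not yet gone over its provisioned RUs in that second. The following Python sketch expresses that reading; the record shape and field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThrottledRequest:
    """A request rejected by throttling (hypothetical record shape)."""
    consumed_rus_in_second: float  # RUs the partition had consumed in that second
    provisioned_rus: float         # RUs provisioned for that partition


def is_throughput_failed(req: ThrottledRequest) -> bool:
    """True if the request was throttled before consumption exceeded
    the provisioned RUs, i.e. the throttling was not caused by overuse."""
    return req.consumed_rus_in_second <= req.provisioned_rus


within_budget = ThrottledRequest(consumed_rus_in_second=800.0, provisioned_rus=1000.0)
over_budget = ThrottledRequest(consumed_rus_in_second=1200.0, provisioned_rus=1000.0)
```

Here `within_budget` would count as a Throughput Failed Request, while `over_budget` would not, because the application had already used more than it provisioned.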

Consistency SLA – 99.99%

"Consistency Level" is the setting for a particular read request that supports consistency guarantees. You can monitor the consistency SLA through the Azure portal:

[Image: screenshot, eventual consistency]

Note: In this screenshot SLA = Actual

The following table captures the guarantees associated with the Consistency Levels. Please note:

  • "K" is the number of versions of a given document for which the reads lag behind the writes.
  • "T" is a given time interval.

 

Consistency level    Consistency guarantees
-----------------    ----------------------
Strong               Strong
Session              Read Your Own Write
                     Monotonic Read
                     Consistent Prefix
Bounded Staleness    Read Your Own Write (Within Write Region)
                     Monotonic Read (Within a Region)
                     Consistent Prefix
                     Staleness Bound < K,T
Consistent Prefix    Consistent Prefix
Eventual             Eventual

If a month has 744 hours, the SLA formula for consistency is:

[Image: consistency SLA formula]

[Image: screenshot]

Latency SLA – P99

[Image: screenshot, observed read]

For a given application deployed within a local Azure region, we count the one-hour intervals in a month during which Successful Requests submitted by the application had a P99 latency greater than or equal to 10ms for document read or 15ms for document write operations. We call these hours “Excessive Latency Hours.”

[Image: latency SLA formula]

If Monthly P99 Latency Attainment % is below 99%, we consider it a violation of the SLA and we will refund you up to 25% of your monthly bill.
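To make the counting concrete, here is a minimal Python sketch of the latency calculation. It assumes that Monthly P99 Latency Attainment % is 100% minus the share of Excessive Latency Hours in the month, and it uses a nearest-rank P99; both choices are assumptions for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

READ_LIMIT_MS = 10.0   # P99 threshold for document reads, from the text
WRITE_LIMIT_MS = 15.0  # P99 threshold for document writes, from the text


def p99(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile (assumed method; input must be non-empty)."""
    ordered = sorted(latencies_ms)
    rank = -(-99 * len(ordered) // 100)  # ceil(0.99 * n) using integer math
    return ordered[rank - 1]


def is_excessive_hour(read_ms: Sequence[float], write_ms: Sequence[float]) -> bool:
    """An hour is excessive if read P99 >= 10 ms or write P99 >= 15 ms."""
    read_bad = len(read_ms) > 0 and p99(read_ms) >= READ_LIMIT_MS
    write_bad = len(write_ms) > 0 and p99(write_ms) >= WRITE_LIMIT_MS
    return read_bad or write_bad


def latency_attainment(hours: List[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Assumed formula: 100% - (excessive hours / total hours)."""
    if not hours:
        return 100.0
    excessive = sum(1 for reads, writes in hours if is_excessive_hour(reads, writes))
    return 100.0 * (1.0 - excessive / len(hours))


good_hour = ([2.0] * 100, [5.0] * 100)
bad_hour = ([2.0] * 98 + [20.0] * 2, [5.0] * 100)  # read P99 is 20 ms
month = [good_hour] * 736 + [bad_hour] * 8         # 744 hours in total
attainment = latency_attainment(month)
```

Under these assumptions, eight excessive hours in a 744-hour month bring attainment to about 98.92%, which is below the 99% threshold.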

We hope that this short blog post helped you understand the broad coverage of our enterprise SLAs.

Azure DocumentDB, home for Mission Critical Applications

Azure DocumentDB hosts a growing number of mission-critical customer apps. Our customers come from diverse verticals such as banking and capital markets, professional services, discrete manufacturing, startups, and health solutions. However, they share a common need: to scale out globally without compromising on speed and availability. Thanks to its architecture, Azure DocumentDB can deliver on these promises, and at a very low cost.

Build your first globally distributed application

Our vision is to be the database for all modern applications. We want to enable developers to truly transform the world we live in through the apps they build, which matters even more than the individual features we put into DocumentDB. Developing applications is hard. Developing distributed applications at planet scale that are fast, scalable, elastic, always available, and yet simple is even harder. Yet it is a fundamental prerequisite for reaching people globally in our modern world. We spend countless hours talking to customers every day and adapting DocumentDB to make the experience truly stellar and fluid.

So what are the next steps you should take? Here are a few that come to mind:

If you need any help or have questions or feedback, please reach out to us on the developer forums on Stack Overflow. Stay up to date on the latest DocumentDB news and features by following us on Twitter (@DocumentDB) and joining our LinkedIn Group.