
Azure.com operates on Azure part 1: Design principles and best practices

In part one of our two-part series, we will peek behind the Azure.com web page to show you how we think about running a major brand website on a global scale.

Azure puts powerful cloud computing tools into the hands of creative people around the world. So, when your website is the face of that brand, you'd better use what you build, and it had better be good. As in, 99.99-percent composite SLA good.

That’s our job at Azure.com, the platform where Microsoft hopes to inspire people to invent the next great thing. Azure.com serves up content to millions of people every day. It reaches people in nearly every country and is localized in 27 languages. It does all this while running on the very tools it promotes.

In developing Azure.com, we practice what we preach. We follow the guiding principles that we advise our customers to adopt and the principles of sustainable software engineering (SSE). Even this blog post is hosted on the very infrastructure that it describes.

In this post, we share our design approach and best practices for security, resiliency, scalability, availability, environmental sustainability, and cost-effective operations at a global scale.

Products, features, and demos supported on Azure.com

As a content platform, Azure.com serves an audience of business and technical people—from S&P 500 enterprises to independent software vendors, and from government agencies to small businesses. To make sure our content reaches everyone, we follow Web Content Accessibility Guidelines (WCAG). We also adopted sustainable software engineering principles to help us responsibly achieve global scale and reduce our carbon footprint.

Azure.com supports static content, such as product and feature descriptions. But the fun is in the interactive components that let readers customize the details, like the products available by region page where we show service availability across 61 regions (and growing), the Azure updates page that keeps people informed about Azure changes, and the search box.

The Azure pricing page provides up-to-date pricing information for more than 200 services across multiple markets, and it factors in any discounts for which a signed-in user is eligible. We also built a comprehensive pricing calculator for all services. Prospective customers can calculate and share complex cost estimates in 24 currencies.

As a marketing channel, Azure.com also hosts demos. For example, we created in-browser interactive demos to display the benefits of Azure Cognitive Services, and we support streaming media for storytelling. We also provided a total cost of ownership (TCO) calculator for estimating cloud migration savings in 27 languages and 12 regions.

And did we mention the 99.99-percent composite SLA that Azure.com meets?

Azure pricing calculator.

Pricing calculator: Interactive cost estimation tool for all Azure products and services.

History of Azure.com

As the number of Azure services has grown, so has our website, and it has always run on Azure. Azure.com is always a work in progress, but here are a few milestones in our development history:

  • 2013: Azure.com begins life on the popular open-source Umbraco CMS. It markets seven Azure services divided into four categories: compute, data services, app services, and network.
  • 2015: Azure.com moves to a custom ASP.NET Model View Controller (MVC) application hosted on Azure. It now supports 16 Azure services across four categories.
  • 2020: Azure.com continues to expand its support of more categories of content. Today, the website describes more than 200 Azure offerings, including Azure services, capabilities, and features.

 

Azure.com timeline of supported products and services.

Azure.com timeline: Every year we support more great Azure products and services.

Design principles behind Azure.com

To create a solid architectural foundation for Azure.com, we follow the core pillars of great Azure architecture. These pillars are the design principles behind the security, performance, availability, and efficiency that make Azure.com run smoothly and meet our business goals.

The core pillars of great Azure architecture.

Design principles: Azure.com follows the tenets of Azure architectural best practices.

You can take a class on how to Build great solutions with the Microsoft Azure Well-Architected Framework.

A pillar of security and resiliency

Like any cloud application, Azure.com requires security at all layers. That means everything covered by the Open Systems Interconnection (OSI) model, from the network to the application, web page, and backend dependencies. This is our defense-in-depth approach to security.

Resiliency includes the ability to withstand malicious attacks, bad actors, or bots that saturate your compute resources and can cause unnecessary scale-out and cost overruns. The goal isn’t to avoid failure altogether, but to respond to failure in a way that avoids downtime and data loss.

One metric for resiliency is the recovery time objective (RTO), which specifies the maximum time an application can be offline after an outage. For us, it’s less than 30 minutes. Failure mode analysis (FMA) is another way to assess resiliency; it includes planning for failures and running live fire drills. We use both of these methods to assess the resiliency of Azure.com.
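To make the RTO idea concrete, here is a minimal Python sketch of how a drill result could be checked against a 30-minute objective. The function names and the drill timestamps are hypothetical and only illustrate the comparison; they are not taken from the Azure.com tooling.

```python
from datetime import datetime, timedelta

# RTO from the text: the application may be offline for less than 30 minutes.
RTO = timedelta(minutes=30)


def outage_duration(started: datetime, restored: datetime) -> timedelta:
    """Return how long the application was offline."""
    if restored < started:
        raise ValueError("restored must not precede started")
    return restored - started


def meets_rto(started: datetime, restored: datetime, rto: timedelta = RTO) -> bool:
    """Return True if the outage was strictly shorter than the RTO."""
    return outage_duration(started, restored) < rto


# Hypothetical fire drill: service restored 18 minutes after the simulated outage.
drill_start = datetime(2020, 1, 15, 9, 0)
drill_end = datetime(2020, 1, 15, 9, 18)
print(meets_rto(drill_start, drill_end))  # True
```

Recording each drill this way gives a simple pass/fail history that can be reviewed alongside the failure mode analysis.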

Super scalable and highly available

Any cloud application needs enough scalability to handle peak loads. For Azure.com, peaks occur during major events and marketing campaigns. Regardless of the load, Azure.com requires high availability to support around-the-clock operations. We trust the platform to support business continuity and guard against unexpected outages, overloaded resources, or failures caused by upstream dependencies.

As a case in point, we rely on Azure scalability to handle the big spikes in demand during Microsoft Build and Microsoft Ignite, the largest annual events handled by Azure.com. The number of requests per second (RPS) jumps 20 to 30 percent as tens of thousands of event attendees flock to Azure.com to learn about newly announced Azure products and services.
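As a rough illustration of what a 20 to 30 percent jump means for capacity planning, the sketch below estimates how many instances a spike would need. The baseline of 10,000 RPS and the per-instance capacity of 500 RPS are made-up numbers for the example, not Azure.com figures.

```python
import math


def instances_needed(baseline_rps: int, spike_percent: int, rps_per_instance: int) -> int:
    """Estimate the instance count needed to serve a traffic spike.

    spike_percent is the increase over baseline, for example 30 for +30%.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    # Work with integers until the final division to avoid rounding surprises.
    peak_times_100 = baseline_rps * (100 + spike_percent)
    return math.ceil(peak_times_100 / (100 * rps_per_instance))


# Hypothetical figures: 10,000 RPS baseline, 500 RPS per instance.
for spike in (0, 20, 30):
    print(spike, instances_needed(10_000, spike, 500))
```

In practice, autoscale rules make this adjustment automatically; the arithmetic only shows the headroom such rules have to provide.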

Whatever the scale, the Azure platform provides reliable, sustainable operations that enable Microsoft and other companies to deliver premium content to our customers.
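For readers unfamiliar with the term, a composite SLA combines the availability of every service a request depends on. The Python sketch below shows the usual arithmetic: dependencies in series multiply, while redundant deployments fail only if all copies fail. The availability figures are hypothetical, the failures are assumed to be independent, and the topology is not a description of the Azure.com architecture.

```python
from math import prod
from typing import Iterable


def serial_sla(availabilities: Iterable[float]) -> float:
    """Composite availability when every dependency must be up."""
    return prod(availabilities)


def redundant_sla(availabilities: Iterable[float]) -> float:
    """Composite availability when any one of the redundant copies suffices."""
    return 1 - prod(1 - a for a in availabilities)


def downtime_minutes(sla: float, days: int = 30) -> float:
    """Downtime allowed by an SLA over a period of the given length."""
    return (1 - sla) * days * 24 * 60


# Hypothetical single region: a web tier (99.95%) in front of a data tier (99.99%).
region = serial_sla([0.9995, 0.9999])
# Two such regions behind a global load balancer, assumed fully independent.
two_regions = redundant_sla([region, region])

print(f"{region:.6f}")                    # single-region composite
print(f"{two_regions:.8f}")               # two-region composite
print(f"{downtime_minutes(0.9999):.2f}")  # 4.32 minutes per 30 days at 99.99%
```

The example shows why multi-region redundancy matters: chaining services lowers the composite figure, and adding an independent second region raises it again.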

Cost-effective high performance is a core design principle

Our customers often tell us that they want to move to a cloud-based system to save money. It’s no different at Azure.com, where cost-efficient provisioning is a core design principle. Azure.com has a handy cost calculator to compare the cost of running on-premises to running on Azure.

Efficiency means having a way to track and optimize underutilized resources and use dynamic scaling to support seasonal traffic demands. This principle applies to all layers of the software development life cycle (SDLC), starting with managing all the work items, using a source code repository, and implementing continuous integration (CI) and continuous deployment (CD). Cost-efficiency extends to the way we provision and host resources in multiple environments, and maintain an inventory of our digital estate.
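One simple way to track underutilized resources is to compare average utilization against a threshold. The sketch below assumes CPU samples have already been exported from a monitoring tool into a dictionary; the resource names, the sample values, and the 20 percent threshold are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def underutilized(cpu_samples: Dict[str, List[float]], threshold: float = 20.0) -> List[str]:
    """Return the names of resources whose average CPU percent is below threshold."""
    return sorted(
        name
        for name, samples in cpu_samples.items()
        if samples and mean(samples) < threshold
    )


# Hypothetical CPU utilization samples (percent) per resource.
samples = {
    "web-eu": [55.0, 62.0, 48.0],
    "batch-test": [3.0, 5.0, 4.0],
    "staging": [12.0, 18.0, 15.0],
}
print(underutilized(samples))  # ['batch-test', 'staging']
```

Resources flagged this way are candidates for downsizing, consolidation, or scheduled shutdown.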

But being cost-conscious doesn’t mean giving up on speed. Top-notch performance takes minimal network latency, fast server response times, and consistent page load and render times. Azure.com performance always focuses on the user experience, so we make sure to optimize network routing and minimize round-trip time (RTT).

Operating with zero downtime

Uptime is important for any large web application. We aim for zero downtime. That means no service downtime—ever. It’s a lofty goal, but it’s possible when you use CI/CD practices that spare users from the effects of the build and deployment cycles.

For example, if we push a code update, we aim for no site downtime, no failed requests, and no adverse impact on Azure.com users. Our CI/CD pipeline is based on Azure DevOps and pumps out hundreds of builds and multiple deployments to the live production servers every day without a hitch.

Another service level indicator (SLI) that we use is mean time to repair (MTTR). With this metric, lower is better. To minimize MTTR, you need DevOps tools for identifying and repairing bottlenecks or crashing processes.
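MTTR is simply the average of the repair times across incidents. A minimal Python sketch, using three hypothetical incidents, looks like this:

```python
from datetime import timedelta
from typing import Sequence


def mean_time_to_repair(repair_times: Sequence[timedelta]) -> timedelta:
    """Return the average repair time across incidents."""
    if not repair_times:
        raise ValueError("at least one incident is required")
    return sum(repair_times, timedelta()) / len(repair_times)


# Hypothetical incidents, each measured from detection to fix.
incidents = [
    timedelta(minutes=12),
    timedelta(minutes=25),
    timedelta(minutes=8),
]
print(mean_time_to_repair(incidents))  # 0:15:00
```

Tracking this value over time shows whether monitoring and tooling improvements are actually shortening repairs.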

Next steps

From our experience working on Azure.com, we can say that following these design principles and best practices improves application resiliency, lowers costs, boosts security, and ensures scalability.

To review the workings of your Azure architecture, consider taking the architecture assessment.

For more information about the Azure services that make up Azure.com, see the next article in this blog series, How Azure.com operates on Azure part 2: Technology and architecture.