Running one of the largest clouds in the world, Microsoft has gained a lot of insight into building and managing a global, high-performance, highly available, and secure network. Experience has taught us that with hundreds of datacenters and tens of thousands of switches, we need to:

  • Use best-of-breed switching hardware for the various tiers of the network.
  • Deploy new features without impacting end users.
  • Roll out updates securely and reliably across the fleet in hours instead of weeks.
  • Utilize cloud-scale deep telemetry and fully automated failure mitigation.
  • Enable our Software-Defined Networking software to easily control all hardware elements in the network using a unified structure to eliminate duplication and reduce failures.

To address these requirements, Microsoft pioneered Software for Open Networking in the Cloud (SONiC), a breakthrough for network switch operations and management. Microsoft open-sourced this innovation to the community, making it available on our SONiC GitHub Repository. SONiC is a uniquely extensible platform, with a large and growing ecosystem of hardware and software partners, that offers multiple switching platforms and various software components.

Switch Abstraction Interface (SAI) accelerates hardware innovation

SONiC is built on the Switch Abstraction Interface (SAI), which defines a standardized API. Network hardware vendors can use it to develop innovative hardware platforms that achieve high speeds while keeping the programming interface to the ASIC (application-specific integrated circuit) consistent. Microsoft open sourced SAI in 2015. This approach enables operators to take advantage of the rapid innovation in silicon, CPU, power, port density, optics, and speed, while preserving their investment in one unified software solution across multiple platforms.


Figure 1. SONiC: one investment to unblock hardware innovation
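To make the idea of a consistent programming interface concrete, here is a minimal Python sketch. It is not the SAI API: `SwitchApi`, `VendorAAsic`, `VendorBAsic`, and `program_default_route` are hypothetical names invented for illustration. It only shows the pattern described above, in which network software is written once against an abstract interface and each vendor supplies its own implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class SwitchApi(ABC):
    """Hypothetical vendor-neutral interface; stands in for the role SAI plays."""

    @abstractmethod
    def create_route(self, prefix: str, next_hop: str) -> None:
        """Program one route into the forwarding table."""

    @abstractmethod
    def routes(self) -> Dict[str, str]:
        """Return the programmed routes as {prefix: next_hop}."""


class VendorAAsic(SwitchApi):
    """Imaginary ASIC whose driver keeps routes in a dictionary."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def create_route(self, prefix: str, next_hop: str) -> None:
        self._table[prefix] = next_hop

    def routes(self) -> Dict[str, str]:
        return dict(self._table)


class VendorBAsic(SwitchApi):
    """Imaginary ASIC whose driver keeps routes as an append-only list."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, str]] = []

    def create_route(self, prefix: str, next_hop: str) -> None:
        self._entries.append((prefix, next_hop))

    def routes(self) -> Dict[str, str]:
        # Later entries override earlier ones for the same prefix.
        return dict(self._entries)


def program_default_route(switch: SwitchApi, next_hop: str) -> None:
    """Network-stack code: written once, unaware of which ASIC is underneath."""
    switch.create_route("0.0.0.0/0", next_hop)
```

Calling `program_default_route` on either class produces the same visible result even though the internal storage differs, which is the property that lets one software investment span multiple hardware platforms.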

Modular design with containers accelerates software evolution

SONiC is the first solution to break monolithic switch software into multiple containerized components. SONiC enables fine-grained failure recovery and in-service upgrades with zero downtime. It does this in conjunction with Switch State Service (SWSS), a service that takes advantage of open source key-value pair stores to manage all switch state requirements and drives the switch toward its goal state. Instead of replacing the entire switch image for a bug fix, you can now upgrade the flawed container with the new code, including protocols such as Border Gateway Protocol (BGP), without data plane downtime. This capability is a key element in the serviceability and scalability of the SONiC platform.
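The goal-state idea can be sketched in a few lines of Python. This is an illustration only and not how SWSS is implemented: plain dictionaries stand in for the key-value store, and the `reconcile` and `apply_ops` functions and the `ROUTE:` key names are assumptions made for the example. The sketch compares desired state with actual state and emits only the operations needed to converge, which is why a component can restart or be upgraded and then pick up where the recorded state left off.

```python
from typing import Dict, List, Optional, Tuple

# One operation: (action, key, value). The value is None for deletes.
Op = Tuple[str, str, Optional[str]]


def reconcile(desired: Dict[str, str], actual: Dict[str, str]) -> List[Op]:
    """Compute the operations that move `actual` toward `desired`."""
    ops: List[Op] = []
    for key, value in desired.items():
        if key not in actual:
            ops.append(("create", key, value))
        elif actual[key] != value:
            ops.append(("update", key, value))
    for key in actual:
        if key not in desired:
            ops.append(("delete", key, None))
    return ops


def apply_ops(ops: List[Op], actual: Dict[str, str]) -> None:
    """Apply operations in place; a real system would program hardware here."""
    for action, key, value in ops:
        if action == "delete":
            actual.pop(key, None)
        else:
            actual[key] = value


if __name__ == "__main__":
    # Hypothetical keys and values, chosen only for the demonstration.
    desired = {"ROUTE:10.0.0.0/24": "nh1", "ROUTE:10.0.1.0/24": "nh2"}
    actual = {"ROUTE:10.0.0.0/24": "nh9", "ROUTE:10.0.2.0/24": "nh3"}

    apply_ops(reconcile(desired, actual), actual)
    # After one pass the states match, so a second pass has nothing to do.
    print(actual == desired, reconcile(desired, actual))
```

Because the second reconcile pass returns no operations, re-running the loop after a container restart is harmless, which is the property that makes fine-grained recovery practical.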

Containerization also enables SONiC to be extremely extensible. At its core, SONiC is aimed at cloud networking scenarios, where simplicity and managing at scale are the highest priority. Operators can plug in new components, third-party, proprietary, or open sourced software, with minimum effort, and tailor SONiC to their specific scenarios.

Configuration and management tools

Figure 2. SONiC: plug and play extensibility

Monitoring and diagnostic capabilities are also key for large-scale network management. Microsoft continuously innovates in areas such as early detection of failure, fault correlation, and automated recovery mechanisms without human intervention. These innovations, such as Netbouncer and Everflow, are all available in SONiC, and they represent the culmination of years of operations experience.

Rapidly growing ecosystem

SONiC and SAI have gained wide industry support over the last year. Most major network chip vendors are supporting SAI on their flagship ASICs:

The community is actively adding new extensions and advanced capabilities to SAI releases:

  • Broadcom, Marvell, Barefoot, and Microsoft are driving advanced monitoring and telemetry in SAI to enable deep visibility into the ASIC and powerful analytic capabilities.
  • Mellanox, Cavium, Dell, and Centec are contributing protocol enhancements to SAI for richer protocol support and large-scale network scenarios; for example, MPLS, Enhanced ACL model, Bridge Model, L2/L3 Multicast, segment routing, and 802.1BR.
  • Dell and Metaswitch are bringing failure resiliency and performance to SAI by adding L3 fast reroute and BFD proposals.
  • The pipeline model driven by Mellanox and Broadcom, and the multi-NPU support driven by Dell, broaden the range of infrastructure to which SAI and the network stacks built on top of it can apply.

At the Open Compute Project U.S. Summit 2017, we will demonstrate 100-gigabit switches from multiple switch hardware companies. SONiC is enabled on their latest and fastest SKUs. The platforms that support SONiC are:

With SONiC, the cloud community has choices: they can cherry-pick best-of-breed solutions. Partners are joining the ecosystem to make it richer:

  • Arista is offering containerized EOS components like EOS BGP to run on top of SONiC. The SONiC community now has easy access to Arista’s rich software suite of EOS.
  • Canonical enabled SONiC as a snap for Ubuntu. This lets MAAS deploy SONiC to switches in addition to deploying the servers. Unified network and server deployment is going to significantly improve the agility of operators.
  • Docker enabled using Swarm to manage the SONiC containers. With its simple and declarative service model, Swarm can manage and update SONiC at scale.
  • Mellanox is using SONiC to unleash the hardware-based packet generation capabilities in the Spectrum ASIC. This is a highly sought-after capability that will help with diagnosis and troubleshooting.

By working with the community and our partner ecosystem, we’re looking to revolutionize networking for today and into the future.

SONiC is fully open sourced on GitHub and is available to industrial collaborators, researchers, students, and innovators alike. With the SONiC containerized approach and software simulation tools, developers can experience the switch software used in Microsoft Azure, one of the world’s largest cloud platforms, and contribute components that will benefit millions of customers. SONiC will benefit the entire cloud community, and we’re very excited for the increasingly strong partner momentum behind the platform.

