
Advancing Reliability

Advancing anomaly detection with AIOps—introducing AiDice

Monday, October 3, 2022

We introduce AiDice, a novel anomaly detection algorithm developed jointly by Microsoft Research and Microsoft Azure that identifies anomalies in large-scale, multi-dimensional time series data. AiDice captures incidents quickly and gives engineers the context they need to diagnose issues more effectively, which helps provide the best possible experience for end customers.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing Azure virtual machine availability monitoring with Project Flash

Monday, February 14, 2022

Today, we are excited to announce the completion of the project's first two major milestones: the preview of virtual machine availability data in Azure Resource Graph, and the private preview of a virtual machine availability metric in Azure Monitor.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing service resilience in Azure Active Directory with its backup authentication service

Monday, November 22, 2021

The most critical promise of our identity services is ensuring that every user can access the apps and services they need without interruption. We have strengthened this promise through a multilayered approach, which led to our improved promise of 99.99% uptime for Azure Active Directory (Azure AD) authentication.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing reliability through a resilient cloud supply chain

Thursday, September 30, 2021

Microsoft’s cloud supply chain is essential to deliver the infrastructure—servers, storage, and networking gear—that enables cloud reliability and growth. Our vision is for cloud capacity to be available like a utility so that customers can seamlessly turn it on when and where they need it.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing application reliability with the Azure Well-Architected Framework

Monday, July 12, 2021

We created the Azure Well-Architected Framework to help improve the quality of your workloads, and reliability is one of its five core pillars. For the latest post in our series, I have asked Cloud Advocate David Blank-Edelman to run through how best to use the framework to guide your conversations and design decisions in this space.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing resiliency threat modeling for large distributed systems

Wednesday, July 7, 2021

All service engineering teams in Azure are already familiar with postmortems as a tool for better understanding what went wrong, how it went wrong, and the customer impact of the related outage. For today’s post in our Advancing Reliability blog series, we share insights into our journey as we work towards advancing our postmortem and resiliency threat modeling processes.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing safe deployment with AIOps—introducing Gandalf

Wednesday, June 30, 2021

The continuous monitoring of health metrics is a fundamental part of this process, and this is where AIOps plays a critical role. In the post that follows, we introduce how AI and machine learning are used to empower DevOps engineers, monitor the Azure deployment process at scale, detect issues early, and make rollout or rollback decisions based on impact scope and severity.

Chief Technology Officer and Technical Fellow, Microsoft Azure

Advancing in-datacenter critical environment infrastructure availability

Monday, June 7, 2021

There are many factors that can affect critical environment infrastructure availability—the reliability of the infrastructure building blocks, the controls during the datacenter construction stage, effective health monitoring and event detection schemes, a robust maintenance program, and operational excellence to ensure that every action is taken with careful consideration of related risk implications.

Chief Technology Officer and Technical Fellow, Microsoft Azure