|Data Lake Store|
|Data Lake Analytics|
|Azure Active Directory|
|Azure AD Domain Services|
|Azure Active Directory B2C|
|Access Control Service|
|Visual Studio Team Services|
|Visual Studio Application Insights|
|Azure DevTest Labs|
|Microsoft Azure Preview Portal|
From approximately 11:15 UTC on 05 Feb 2016, a subset of customers using Cloud Services and HDInsight in Central India may experience errors when attempting to create new deployments in the region. All existing deployments of Cloud Services and HDInsight in Central India are unaffected. Engineers are working to apply a mitigation. Further updates will be provided to affected customers directly in their Management Portal.
SUMMARY OF IMPACT: Between 12:26 and 14:08 UTC on 04 Feb 2016, a subset of customers using Cloud Services in Central India may have experienced errors when attempting to perform service management operations, including creating new Cloud Services or modifying existing instances. PRELIMINARY ROOT CAUSE: Engineers identified a configuration change that caused service management operations to fail in this region. MITIGATION: Engineers deployed a hotfix to rectify the misconfiguration, which has allowed service management operation requests to succeed. NEXT STEPS: Engineers will continue to investigate the underlying root cause of why the configuration change caused service management operations to fail.
SUMMARY OF IMPACT: Between 09:10 and 14:28 UTC on 04 Feb 2016, customers attempting to log into their Visual Studio Team Services accounts would have been unable to access their accounts. PRELIMINARY ROOT CAUSE: A SQL stored procedure that was being called was allocating too much memory in one of the critical backend SQL databases. After an extended period of time, this caused the SQL databases to fall into an unresponsive state, resulting in customers being unable to access their VSTS accounts. MITIGATION: Engineers performed a SQL database failover, which provided temporary mitigation; however, the same procedure quickly allocated memory on the newly assigned databases, which in turn became unresponsive. Engineers then manually assigned allocation limits for the procedure that was being called, which has ensured the backend SQL databases remain in a healthy state. NEXT STEPS: Engineers will review all allocation limits for called procedures to prevent further instances from occurring.
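The report does not name the mechanism used to cap the procedure's memory; on SQL Server, one common way to impose such per-workload limits is Resource Governor. The sketch below (Python with pyodbc) is a minimal, hypothetical illustration of that approach only; the pool, group, and application names are invented, and this is not presented as the actual VSTS mitigation.

import pyodbc

# Hypothetical connection; server, database, and credentials are placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=backend-sql.example.com;DATABASE=master;"
    "UID=admin_user;PWD=example_password",
    autocommit=True,
)
cursor = conn.cursor()

# Resource Governor setup: a pool that caps memory, a workload group bound to it,
# and a classifier function that routes the suspect workload (identified here by
# application name, purely as an assumption) into the capped group.
statements = [
    "CREATE RESOURCE POOL limited_pool WITH (MAX_MEMORY_PERCENT = 25);",
    "CREATE WORKLOAD GROUP limited_group USING limited_pool;",
    """
    CREATE FUNCTION dbo.rg_classifier() RETURNS sysname
    WITH SCHEMABINDING
    AS
    BEGIN
        DECLARE @grp sysname = 'default';
        -- Route only the hypothetical 'SuspectProcCaller' application into the capped group.
        IF APP_NAME() = 'SuspectProcCaller'
            SET @grp = 'limited_group';
        RETURN @grp;
    END;
    """,
    "ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);",
    "ALTER RESOURCE GOVERNOR RECONFIGURE;",
]

for sql in statements:
    cursor.execute(sql)

conn.close()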
SUMMARY OF IMPACT: Between 12:55 and 20:15 UTC on 03 Feb 2016, a subset of customers using Virtual Machines on a single scale unit in Southeast Asia were unable to connect to or access their Virtual Machines. Service management operations for existing Virtual Machines on the impacted scale unit would have failed during this time. PRELIMINARY ROOT CAUSE: A network configuration change unexpectedly rebooted multiple switches supporting this cluster. Due to an unrelated error, the DHCP relay addresses configured on the impacted switches pointed to a legacy DHCP server. This impacted the automated recovery systems we have in place, delaying the recovery of Virtual Machines after these switches came back online. MITIGATION: After mitigation steps were performed, the majority of Virtual Machines returned to a healthy state and service management operations recovered immediately. We have identified a small subset of Virtual Machines that may still be inaccessible. Our engineers are completing additional mitigation steps for these Virtual Machines, and we will provide further updates to these customers in their management portal (portal.azure.com). NEXT STEPS: Engineers will continue to investigate the underlying root cause of this issue and why automated recovery did not succeed. A detailed RCA will be performed and made available to impacted customers.
SUMMARY OF IMPACT: Between 03:09 and 12:17 UTC on 03 Feb 2016, customers using Visual Studio Team Services \ Load Testing, Visual Studio Team Services, and Visual Studio Team Services \ Build & Deployment/Build (XAML) may have experienced latency or timeouts when attempting to log in to their VSTS account. In addition, a subset of customers may have seen a blank project page once they had logged in. PRELIMINARY ROOT CAUSE: A recent update to VSTS configuration settings resulted in traffic being incorrectly routed to and from Visual Studio Team Services. This caused latency and blank pages for some customers. MITIGATION: Engineers restored the previous configuration to bring services back to a normal state. NEXT STEPS: Engineers will continue to investigate the underlying issue that caused traffic to route incorrectly.
SUMMARY OF IMPACT: Between 15:30 and 19:15 UTC on 02 Feb 2016, customers using Data Factory in West US may have experienced delays when creating new Data Factories or updating existing Data Factories. Existing jobs were not impacted by this incident. PRELIMINARY ROOT CAUSE: A pre-existing software issue impacted a recent deployment, causing nodes to enter an unhealthy state. MITIGATION: Engineers stopped backend processes to bring nodes back to a healthy state. NEXT STEPS: Engineers will continue to investigate the root cause and will introduce a newer backend system to prevent recurrences.
SUMMARY OF IMPACT: Between 04:00 and 07:35 UTC on 30 Jan 2016, a subset of customers using App Service \ Web App in Central US experienced issues accessing services or performing operations due to an availability issue impacting a Storage scale unit in the region. Detailed resolution messaging is available under Storage on the history page.
SUMMARY OF IMPACT: Between 04:00 and 06:30 UTC on 30 Jan 2016, a subset of customers using a number of services, including App Service \ Web App, Logic App, Key Vault, Redis Cache, Event Hubs, Media Services, and Virtual Machines in Central US experienced issues accessing services or performing operations due to an availability issue impacting a Storage scale unit in the region. PRELIMINARY ROOT CAUSE: A number of storage nodes entered an unhealthy state and were unable to recover automatically before quorum was lost in the locking service. MITIGATION: Engineers manually recovered the nodes to restore access to storage. NEXT STEPS: Engineers will determine the root cause and implement repairs to prevent future occurrences. Any customers whose Virtual Machines are still experiencing issues as a result of this interruption will receive direct communication through the Management Portal.
SUMMARY OF IMPACT: Between 04:00 and 06:50 UTC on 30 Jan 2016, a subset of customers using a number of services, including Logic App, Key Vault, Redis Cache, Event Hubs, Media Services, and Virtual Machines in Central US experienced issues accessing services or performing operations due to an availability issue impacting a Storage scale unit in the region. Detailed resolution messaging is available under Storage on the history page.
SUMMARY OF IMPACT: Between 12:00 and 16:55 UTC on 29 Jan 2016, a subset of customers using App Service \ Web App in North Europe experienced timeouts or 500 errors when accessing websites in this region. PRELIMINARY ROOT CAUSE: Engineers identified throttled storage accounts. MITIGATION: Engineers applied a change to a back-end load balancer, which mitigated the throttling issue. NEXT STEPS: Engineers will investigate the cause of the unbalanced load and develop a solution to prevent it from reoccurring.
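For context on the client-side symptoms rather than the Azure-side fix described above: when a storage account is being throttled, callers typically see HTTP 503 (ServerBusy) or 500 (OperationTimedOut) responses, and the usual guidance is to retry with exponential backoff. The sketch below is a generic, hypothetical illustration of that pattern (the URL and helper name are invented), not part of the mitigation described above.

import time
import requests

def get_with_backoff(url, max_attempts=5):
    """Fetch a URL, retrying throttling-style responses (503/500) with exponential backoff."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        resp = requests.get(url, timeout=30)
        if resp.status_code not in (500, 503):
            return resp  # success or a non-throttling error; hand it back to the caller
        if attempt == max_attempts:
            resp.raise_for_status()  # out of retries; surface the error
        time.sleep(delay)
        delay *= 2  # double the wait before the next attempt

# Hypothetical usage:
# resp = get_with_backoff("https://exampleaccount.blob.core.windows.net/container/blob.txt")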