Modern services generate large volumes of telemetry data to track operational health, system performance, usage insights, business metrics, alerting and more. However, IT departments often monitor and gather insights from this data with processes that are not fully automated and are error-prone (generally relying on rules or threshold-based alerts), making it hard to determine the health of the system accurately at any given point in time.
Cortana Intelligence IT Anomaly Insights addresses this problem with a low barrier to entry. It is built on Cortana Intelligence Solutions (for easy deployment of Azure services) and the Azure Machine Learning Anomaly Detection API (for fully automated tracking of historical and real-time data), so a business decision maker can evaluate it and realise value within minutes. Customers can also bring their own data and customise and extend the solution to fit their particular scenarios through quick proofs of concept. With this solution, organizations will be able to:
- Leverage the state-of-the-art Azure Machine Learning Anomaly Detection API to learn from and react to anomalies in both historical and real-time data. This removes the human in the loop otherwise needed to recalibrate thresholds in order to catch missed anomalies and minimise false positives.
- Quickly realise the potential of the solution by trying it out with their own data without any upfront investment. The 'Try it Now' experience also lets users determine the right set of sensitivity parameters for the use case at hand.
- Deploy an end-to-end pipeline into their subscription to ingest data from on-premises and cloud data sources and report anomalous events to downstream monitoring and ticketing systems in a plug-and-play manner within a matter of minutes.
Try It experience with Power BI
See Solution architecture and detailed instructions on GitHub.
As described in the solution diagram below, real-time metric streams originating from on-premises or cloud-based systems can be sent to an Azure Event Hub queue. These events (time series data points) are processed by Azure Stream Analytics, which aggregates them at five-minute intervals. Each time series is sent to the Azure Anomaly Detection API for evaluation every 15 minutes. The results from the API, along with the dimensions provided with the input, are then stored in Azure SQL DB. The detected anomalies are also published to Azure Service Bus so that downstream ticketing systems can consume them. The solution also provides directions for setting up a Power BI dashboard so that the anomalies can be visualised quickly for root cause analysis.
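To make the five-minute aggregation step concrete, the following is a minimal Python sketch of the kind of windowing the pipeline performs. It is an illustration only: in the deployed solution this work is done by Azure Stream Analytics, and the event shape `(timestamp, metric_name, value)`, the function names `bucket_start` and `aggregate`, and the choice of averaging within each window are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Assumed window width, matching the five-minute interval described above.
BUCKET = timedelta(minutes=5)


def bucket_start(ts, width=BUCKET):
    """Floor a timestamp to the start of its aggregation window."""
    epoch = datetime(1970, 1, 1, tzinfo=ts.tzinfo)
    whole_windows = (ts - epoch) // width  # timedelta // timedelta gives an int
    return epoch + whole_windows * width


def aggregate(events, width=BUCKET):
    """Average raw (timestamp, metric_name, value) events per metric and window.

    Returns {metric_name: [(window_start, mean_value), ...]} with each
    series ordered by window start.
    """
    groups = defaultdict(list)
    for ts, metric, value in events:
        groups[(metric, bucket_start(ts, width))].append(value)

    series = defaultdict(list)
    for (metric, start), values in sorted(groups.items()):
        series[metric].append((start, mean(values)))
    return dict(series)


if __name__ == "__main__":
    sample = [
        (datetime(2017, 1, 1, 0, 1), "cpu", 10.0),
        (datetime(2017, 1, 1, 0, 4), "cpu", 20.0),
        (datetime(2017, 1, 1, 0, 6), "cpu", 30.0),
    ]
    for metric, points in aggregate(sample).items():
        for start, value in points:
            print(metric, start.isoformat(), value)
```

Each aggregated series produced this way corresponds to what the pipeline would submit for evaluation on its 15-minute cadence. A different aggregate (sum, maximum, count) may suit some metrics better than the mean used here.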
Anomaly Detection API
The Anomaly Detection API is used in both the 'Try It Now' experience and the deployed solution. It helps detect different types of anomalous patterns in your time series data. It assigns an anomaly score to each data point in the time series, which can be used for generating alerts, monitoring through dashboards or connecting with your ticketing systems. The Anomaly Detection API can detect the following types of anomalies in time series data:
- Spikes and Dips: For example, when monitoring the number of login failures to a service or number of checkouts in an e-commerce site, unusual spikes or dips could indicate security attacks or service disruptions.
- Positive and negative trends: When monitoring memory usage in computing, for instance, shrinking free memory size is indicative of a potential memory leak; when monitoring service queue length, a persistent upward trend may indicate an underlying software issue.
- Level changes and changes in dynamic range of values: For example, level changes in latencies of a service after a service upgrade or lower levels of exceptions after upgrade can be interesting to monitor.
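To illustrate what assigning a score to each data point can look like, here is a minimal Python sketch that scores spikes and dips with a trailing-window z-score. This is not the algorithm used by the Anomaly Detection API, which is not described here; it is a simple stand-in technique, and the function names `spike_scores` and `flag`, the window size and the threshold are all assumptions chosen for the example.

```python
import math
from statistics import mean, pstdev


def spike_scores(values, window=12):
    """Score each point by its distance from the trailing-window mean,
    measured in standard deviations of that window.

    Points with fewer than two preceding values get a score of 0.0.
    """
    scores = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 2:
            scores.append(0.0)
            continue
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any deviation at all is infinitely surprising.
            scores.append(0.0 if value == mu else math.copysign(math.inf, value - mu))
        else:
            scores.append((value - mu) / sigma)
    return scores


def flag(values, window=12, threshold=3.0):
    """Return the indices whose absolute score meets the threshold."""
    scores = spike_scores(values, window)
    return [i for i, score in enumerate(scores) if abs(score) >= threshold]


if __name__ == "__main__":
    login_failures = [10, 11, 10, 9, 10, 11, 10, 50, 10, 9]
    print(flag(login_failures, window=6))
```

A positive score indicates a spike and a negative score a dip. The threshold plays a role loosely analogous to a sensitivity parameter: lowering it flags more points at the cost of more false positives. A z-score of this kind does not capture trends or level changes, which need methods that look at sustained shifts rather than single points.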
©2017 Microsoft Corporation. All rights reserved. This information is provided "as-is" and may change without notice. Microsoft makes no warranties, express or implied, with respect to the information provided here. Third party data was used to generate the Solution. You are responsible for respecting the rights of others, including procuring and complying with relevant licences in order to create similar datasets.