Anomaly detection using built-in machine learning models in Azure Stream Analytics

Posted on February 13, 2019

Principal Program Manager, Azure Big Data

Built-in machine learning (ML) models for anomaly detection in Azure Stream Analytics significantly reduce the complexity and cost of building and training machine learning models. This feature is now available in public preview worldwide, both in the cloud and on IoT Edge.

What is Azure Stream Analytics?

Azure Stream Analytics is a fully managed, serverless PaaS offering on Azure that enables customers to analyze and process fast-moving streams of data and deliver real-time insights for mission-critical scenarios. Developers can use a simple SQL language (extensible to include custom code) to author and deploy powerful analytics processing logic that can scale up and scale out to deliver insights with millisecond latencies.

Traditional way to incorporate anomaly detection capabilities in stream processing

Many customers use Azure Stream Analytics to continuously monitor massive, fast-moving streams of data in order to detect issues that do not conform to expected patterns and to prevent catastrophic losses. This, in essence, is anomaly detection.

For anomaly detection, customers have traditionally either hard-coded control limits in their queries, which is a sub-optimal method, or used custom machine learning models. Developing custom models requires not only time but also a high level of data science expertise along with nuanced data pipeline engineering skills. Such high barriers to entry precluded adoption of anomaly detection in streaming pipelines, despite its value for many Industrial IoT sites.
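As a rough illustration of the hard-coded approach, a query along the following lines flags any reading that falls outside fixed limits. This is only a sketch: the limits (10 and 90) and the sink name output are hypothetical, while input, sensorid, and temperature follow the naming used in the example later in this post.

```sql
-- Sketch of hard-coded control limits (hypothetical values, not a recommendation).
-- Every sensor is judged against the same fixed limits, regardless of its
-- normal operating range or how that range drifts over time.
SELECT
    sensorid,
    System.Timestamp AS time,
    temperature AS temp
INTO output                                   -- hypothetical output sink
FROM input
WHERE temperature < 10 OR temperature > 90    -- hypothetical control limits
```

The weakness is visible in the WHERE clause: the limits have to be chosen by hand and revisited whenever a sensor's normal behavior changes.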

Built-in machine learning functions for anomaly detection in Stream Analytics

With built-in machine learning based anomaly detection capabilities, Azure Stream Analytics reduces the complexity of building and training custom machine learning models to simple function calls. Two new unsupervised machine learning functions are being introduced to detect two of the most commonly occurring kinds of anomalies: temporary and persistent.

  • AnomalyDetection_SpikeAndDip function to detect temporary or short-lasting anomalies such as spikes or dips. This is based on the well-documented kernel density estimation algorithm.
  • AnomalyDetection_ChangePoint function to detect persistent or long-lasting anomalies such as bi-level changes and slowly increasing or slowly decreasing trends. This is based on another well-known algorithm called exchangeability martingales.


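-- Score each sensor's temperature readings for spikes and dips.
-- PARTITION BY sensorid trains a separate model per sensor.
SELECT sensorid, System.Timestamp as time, temperature as temp,
    AnomalyDetection_SpikeAndDip(temperature, 95, 120, 'spikesanddips')
        OVER(PARTITION BY sensorid LIMIT DURATION(second, 120)) as SpikeAndDipScores
FROM input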

In the example above, the AnomalyDetection_SpikeAndDip function helps monitor a set of sensors for spikes or dips in the temperature readings. The underlying ML model uses a user-supplied confidence level of 95 percent to set the model sensitivity. A training event count of 120, corresponding to the 120-second sliding window, is also supplied as a function parameter. Note that the job is partitioned by sensorid, which results in multiple ML models being trained under the hood, one for each sensor, all within the same single query.
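The persistent-anomaly function can be called in much the same way. The sketch below assumes, per the feature documentation, that AnomalyDetection_ChangePoint takes a value, a confidence level, and a history size, and that both functions return a nested record with Score and IsAnomaly fields that can be read with GetRecordPropertyValue. The confidence level (80), history size (1200), 20-minute window, and sink name output are illustrative choices, not recommendations.

```sql
-- Sketch only: parameter values and the "output" sink are assumptions.
WITH AnomalyDetectionStep AS
(
    SELECT
        sensorid,
        System.Timestamp AS time,
        CAST(temperature AS float) AS temp,
        -- One change-point model per sensor, trained on up to 1200 past events
        AnomalyDetection_ChangePoint(CAST(temperature AS float), 80, 1200)
            OVER(PARTITION BY sensorid LIMIT DURATION(minute, 20)) AS ChangePointScores
    FROM input
)
SELECT
    sensorid,
    time,
    temp,
    -- Unpack the nested record returned by the function
    CAST(GetRecordPropertyValue(ChangePointScores, 'Score') AS float) AS ChangePointScore,
    CAST(GetRecordPropertyValue(ChangePointScores, 'IsAnomaly') AS bigint) AS IsChangePointAnomaly
INTO output
FROM AnomalyDetectionStep
```

Splitting the query into two steps keeps the scoring logic separate from the projection, so a downstream consumer can filter on IsChangePointAnomaly without parsing the record itself.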

Get started today

We’re excited for you to try out anomaly detection functions in Azure Stream Analytics. To try this new feature, please refer to the feature documentation, "Anomaly Detection in Azure Stream Analytics."