MATCH_RECOGNIZE in Azure Stream Analytics significantly reduces the complexity and cost of building, modifying, and maintaining queries that match sequences of events for alerts or further data computation.
What is Azure Stream Analytics?
Azure Stream Analytics is a fully managed, serverless PaaS offering on Azure that enables customers to analyze and process fast-moving streams of data and deliver real-time insights for mission-critical scenarios. Developers can use a simple SQL language, extensible with custom code, to author and deploy powerful analytics processing logic that can scale up and scale out to deliver insights with millisecond latencies.
Traditional way to incorporate pattern matching in stream processing
Many customers use Azure Stream Analytics to continuously monitor massive amounts of data, detecting sequences of events and deriving alerts or aggregating data from those events. This, in essence, is pattern matching.
For pattern matching, customers traditionally relied on multiple joins, each one detecting a single event. These joins are then combined to find a sequence of events, compute results, or create alerts. Queries built this way are complex to develop, error-prone, and difficult to maintain and debug. They also make it hard to express more complex patterns such as Kleene star, Kleene plus, or wildcards.
To address these issues and improve the customer experience, Azure Stream Analytics provides a MATCH_RECOGNIZE clause to define patterns and compute values from the matched events. The MATCH_RECOGNIZE clause increases user productivity because it is easy to read, write, and maintain.
Typical scenario for MATCH_RECOGNIZE
Event matching is an important aspect of data stream processing. The ability to express and search for patterns in a data stream enables users to create simple yet powerful algorithms that can trigger alerts or compute values when a specific sequence of events is found.
An example scenario would be a food preparation facility with multiple cookers, each with its own temperature monitor. A shutdown operation for a specific cooker needs to be generated if its temperature doubles within five minutes. In this case, the cooker must be shut down because the temperature is increasing too rapidly and could either burn the food or cause a fire hazard.
Query

    SELECT * INTO ShutDown FROM Temperature
    MATCH_RECOGNIZE (
        LIMIT DURATION (minute, 5)
        PARTITION BY cookerId
        AFTER MATCH SKIP TO NEXT ROW
        MEASURES
            1 AS shouldShutDown
        PATTERN (temperature1 temperature2)
        DEFINE
            temperature1 AS temperature1.temp > 0,
            temperature2 AS temperature2.temp > 2 * MAX(temperature1.temp)
    ) AS T
In the example above, MATCH_RECOGNIZE defines a limit duration of five minutes, the measures to output when a match is found, the pattern to match, and lastly how each pattern variable is defined. Once a match is found, an event containing the MEASURES values is output into ShutDown. Matching is partitioned by cookerId, so each cooker is evaluated independently of the others.
MATCH_RECOGNIZE brings an easier way to express pattern matching, decreases the time spent writing and maintaining pattern matching queries, and enables richer scenarios that were practically impossible to write or debug before.
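For contrast, here is a rough sketch of how the same cooker scenario might be written with the traditional join-based approach. It is an illustration under stated assumptions, not an exact equivalent of the query above: the `eventTime` column is a hypothetical timestamp field that the original example does not name, and the self-join pairs any two readings from the same cooker that fall within the five-minute window rather than matching a sequence of consecutive events.

```sql
-- Sketch only: join-based approximation of the two-event pattern.
-- cookerId and temp come from the example above; eventTime is an assumed column.
SELECT
    t1.cookerId,
    1 AS shouldShutDown
INTO ShutDown
FROM Temperature t1 TIMESTAMP BY eventTime
JOIN Temperature t2 TIMESTAMP BY eventTime
    ON t1.cookerId = t2.cookerId
    -- temporal joins need a bounded window: t2 occurs 0 to 5 minutes after t1
    AND DATEDIFF(minute, t1, t2) BETWEEN 0 AND 5
WHERE t1.temp > 0
    AND t2.temp > 2 * t1.temp
```

Even for a two-event pattern, the ordering and time-window logic has to be spelled out by hand, and each additional step in a sequence would need another join with its own conditions.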
Get started with Azure Stream Analytics
Azure Stream Analytics enables the processing of fast-moving streams of data from IoT devices, applications, clickstreams, and other data streams in real time. To get started, refer to the Azure Stream Analytics documentation.