Describe, diagnose, and predict with IoT Analytics

In a previous blog article, Extracting actionable insights from IoT data, I discussed the value of collecting IoT data from machines, assets, and products. The whole point of collecting IoT data is to extract actionable insights: insights that trigger an action that yields business value, such as optimized factory operations, improved product quality, a better understanding of customer demand, new sources of revenue, and an improved customer experience.

In this blog post, I focus on the analytics part of the story: how to extract value from IoT data.

Generally speaking, data analytics comes in four types (Figure 1):

  • Descriptive, to answer the question: What’s happening?
  • Diagnostic, to answer the question: Why is it happening?
  • Predictive, to answer the question: What will happen?
  • Prescriptive, to answer the question: What actions should we take?


Figure 1: IoT Analytics Flavors

Since IoT analytics is a subcase of data analytics, these types map nicely onto IoT analytics as follows:

Data analytics type: Descriptive (What’s happening?)

IoT application: Monitor the status of machines, devices, products, and assets. Assess whether things are going according to plan, and alert people if anomalies arise.

Representative questions:

  • What’s the throughput and utilization of this machine?
  • Are there any anomalies that require immediate attention?
  • How much energy is this machine consuming?
  • How many parts are we producing with this tool?
  • How are customers using our products?
  • Where are my assets?

Descriptive analytics usually takes the form of dashboards that display current and historical sensor data, statistics, KPIs, and alerts.

Data analytics type: Diagnostic (Why is something happening?)

IoT application: Examine data from multiple angles to understand why something is happening. The goal is to find the root cause of a problem in order to fix or improve something (a process, a service, or a product).

Representative questions:

  • Why is the OEE of this machine so low?
  • Why is this machine producing more defective parts than the others?
  • Why is this machine consuming so much energy?
  • Why are we producing so few parts with this tool?
  • Why are we getting so many returns of this product?
  • Why are we getting so many product returns from our European customers?

Diagnostic capabilities are often extensions to dashboards that allow users to drill into the data, pivot it in multiple ways, compare it, and visualize trends and correlations in an ad hoc way.

The users performing diagnostics from data are normally domain experts (i.e., experts on the specific machine, process, device, or product) as opposed to pure data scientists. Data scientists play a supporting role, enabling domain experts to extract insights.

Data analytics type: Predictive (What will happen?)

IoT application: Calculate the probability that something will happen within a specific timeframe, based on historical data. The goal is to take corrective action before something (usually bad) happens, to mitigate risk, or to identify opportunities to gain a competitive advantage.

Representative questions:

  • What’s the probability of this machine failing in the next 24 hours?
  • What is the expected remaining useful life of this tool?
  • When should I plan service for this machine?
  • What will be the demand for this product or feature?

Predictive analytics is usually implemented through machine learning models that are trained on historical data and deployed to the cloud so that end-user applications can use them.

Data analytics type: Prescriptive (What actions should I take?)

IoT application: Recommend actions as a result of a diagnosis or a prediction, or at least provide some visibility into the reasoning behind a diagnosis or prediction. Often the recommendations are about how to fix or optimize something.

Representative questions:

  • This machine is 90 percent likely to fail in the next 24 hours. What should I do to prevent it?
  • The OEE of this machine is low. What can I do to improve it?
  • This machine is producing too many defective parts. What should I do to avoid this?
  • This design is causing many manufacturing issues. How can I improve the design to reduce them?

Prescriptive analytics is still in its early stages.

It is often an extension of predictive analytics in which the user is shown the steps a machine learning model took to reach a conclusion or prediction. While this is not quite a recommendation, it may provide enough insight into the reasoning of the ML algorithm to hint at one.
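
To make the descriptive questions above more concrete, here is a minimal sketch of how a dashboard backend might compute utilization and throughput for one machine. The `Interval` record and its fields are hypothetical, chosen only for illustration; real telemetry schemas will differ.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One contiguous period a machine spent in a single state (hypothetical schema)."""
    start_s: float   # interval start, in seconds since the start of the shift
    end_s: float     # interval end, in seconds since the start of the shift
    running: bool    # True if the machine was producing during the interval
    parts: int       # parts completed during the interval


def utilization(intervals):
    """Fraction of the observed time during which the machine was running."""
    total = sum(i.end_s - i.start_s for i in intervals)
    busy = sum(i.end_s - i.start_s for i in intervals if i.running)
    return busy / total if total > 0 else 0.0


def throughput_per_hour(intervals):
    """Parts produced per hour of observed time (running or not)."""
    total = sum(i.end_s - i.start_s for i in intervals)
    parts = sum(i.parts for i in intervals)
    return parts / (total / 3600.0) if total > 0 else 0.0


# Two hours of observation with 30 minutes of downtime in the middle.
shift = [
    Interval(0, 3600, True, 120),
    Interval(3600, 5400, False, 0),
    Interval(5400, 7200, True, 60),
]
print(f"utilization: {utilization(shift):.2f}")
print(f"throughput: {throughput_per_hour(shift):.1f} parts/hour")
```

A diagnostic view would typically start from the same numbers and break them down further, for example by machine, shift, or tool.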

The role of Machine Learning in IoT Analytics

Machine learning (ML) is playing an increasingly important role in IoT analytics. One could argue that the recent emergence of real-world applications of ML in manufacturing is thanks to the explosion of data, most of which we can attribute to the IoT. Other factors include the availability of better ML algorithms, and the compute power of the cloud.

A deeper discussion of ML for manufacturing will be the focus of future blogs. For now, the important thing is that there are many ML algorithms to choose from and selecting the right one depends on:

  • What are you trying to do? (predict values, predict categories, find unusual data points, etc.)
  • What is the specific scenario?
  • What is the desired accuracy?
  • What is the desired training time?
  • How linear is the data?
  • How many parameters are there?
  • How many features are there?

And more. See the Azure Machine Learning: Algorithm Cheat Sheet to get a better idea.

Even for a specific use case, you may have multiple algorithm choices. For example, predictive maintenance:

If you want to… → You would likely choose the following ML algorithm

  • Predict the probability that a piece of equipment will fail within a future time period → Binary classification
  • Compute the remaining useful life of an asset → Regression
  • Predict a range of time to failure for an asset → Multi-class classification
  • Calculate the likelihood of failure in a future period due to one of multiple root causes → Multi-class classification
  • Predict the most likely root cause of a given failure → Multi-class classification
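
The choice of algorithm also shapes how the training data is labeled. As a rough sketch, assuming a single recorded failure time per asset and sample times expressed in hours, the same history can yield a yes/no label for binary classification (does the asset fail within the horizon?) and a numeric remaining-useful-life value that a regression model could be trained on. The function name, the 24-hour horizon, and the data layout are illustrative assumptions.

```python
def build_targets(sample_times_h, failure_time_h, horizon_h=24.0):
    """Build training targets for each sample taken at or before the failure.

    Returns a list of (fails_within_horizon, remaining_useful_life_h) tuples:
    the first element is a 0/1 label for binary classification, the second
    is a numeric target in hours.
    """
    targets = []
    for t in sample_times_h:
        if t > failure_time_h:
            continue  # samples after the failure carry no label in this sketch
        remaining = failure_time_h - t
        label = 1 if remaining <= horizon_h else 0
        targets.append((label, remaining))
    return targets


# Hypothetical asset that failed 72 hours after monitoring started.
samples = [0.0, 24.0, 48.0, 60.0, 71.0]
print(build_targets(samples, failure_time_h=72.0))
# [(0, 72.0), (0, 48.0), (1, 24.0), (1, 12.0), (1, 1.0)]
```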

One important point is that ML is not only for predictive and prescriptive analytics, although it is commonly associated with these types of analytics. ML can also be used for descriptive and diagnostic analytics. For example, raising alerts when something is abnormal with a machine is a descriptive analytics scenario and can be enabled with anomaly detection ML algorithms.
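
To illustrate the alerting idea, the sketch below flags readings that deviate strongly from the recent past. It is a simple statistical stand-in (a z-score against a trailing window), not a trained anomaly detection model; the window size, threshold, and sample values are assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that lie more than `threshold` standard
    deviations away from the mean of the preceding `window` readings."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: the z-score is undefined, so skip
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical vibration readings with one obvious spike at index 6.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 4.0, 1.0, 0.95]
print(find_anomalies(vibration))  # [6]
```

In a real solution the returned indices would feed an alerting mechanism, and a learned model would usually replace the fixed threshold.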

IoT Analytics solution overview

At a high level, an IoT analytics solution contains the following components (Figure 2):

  • Data ingestion: where we connect to the devices or field gateways to collect the data records they stream. In IoT, data records can be events, messages, alerts, or telemetry (sensor measurements). They are usually timestamped and may arrive at different frequencies, in different formats, and over different communication protocols. This data is ingested into the cloud through a cloud gateway.
  • Stream processing: in some applications, data is analyzed live as it is streamed. This is usually done with a combination of visualization, live queries, and actions. For example, a dashboard can display the temperature of a device over a period of time. Live queries may constantly check that the average temperature over the last 5 minutes doesn’t exceed 90 °F, and execute some action (for example, shut down the machine and/or send an alert) if it does. Data used in stream processing may or may not be stored.
  • Data storage: where we store the IoT data records being ingested. IoT solutions can generate significant amounts of data depending on how many devices are in the solution, how often they send data, and the size of the payload in the data records sent from devices. Data is often timestamped and must be stored where it can be accessed for further processing and used in visualization and reporting. It is common to have IoT data split into “warm” and “cold” data stores:
    • Warm storage: for data that needs to be available for reporting and visualization immediately. Warm storage holds recent data that needs to be accessed with low latency, high throughput, and full query capabilities.
    • Cold storage: for data that is stored longer term and used for batch processing. Data stored in cold storage is typically historical data that needs to be accessed in the future for reporting, analysis, training machine learning models, etc. Most often, the database technology chosen for cold storage will be cheaper but offer fewer query and reporting features than the warm database solution.

A common implementation for storage is to keep a recent range (e.g. the last day, week, or month) of telemetry data in warm storage and to store historical data in cold storage. With this implementation, the application has access to the most recent data and can quickly observe recent telemetry data and trends. Retrieving historical information for devices can be accomplished using cold storage, generally with higher latency than if the data were in warm storage.
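
As a rough sketch of this warm/cold split, the functions below decide which tier a record belongs to based on its age, and which tiers a query over a time range has to read. The 7-day retention window and the function names are assumptions for illustration; an actual solution would map the tiers onto specific database services.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window for warm storage (could equally be a day or a month).
WARM_RETENTION = timedelta(days=7)


def storage_tier(record_time, now, warm_retention=WARM_RETENTION):
    """Return "warm" for records within the retention window, else "cold"."""
    age = now - record_time
    return "warm" if age <= warm_retention else "cold"


def tiers_for_query(start, end, now, warm_retention=WARM_RETENTION):
    """Return the stores that a query over [start, end] has to read from."""
    cutoff = now - warm_retention
    tiers = []
    if start < cutoff:
        tiers.append("cold")   # part of the range is older than the cutoff
    if end >= cutoff:
        tiers.append("warm")   # part of the range is recent
    return tiers


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
print(storage_tier(now - timedelta(days=2), now))    # warm
print(storage_tier(now - timedelta(days=30), now))   # cold
print(tiers_for_query(now - timedelta(days=10), now - timedelta(days=1), now))
# ['cold', 'warm']
```

A query that spans both tiers pays the higher cold-storage latency for the older part of the range, which is why dashboards usually restrict their default view to the warm window.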

  • Data transformation: involves manipulation or aggregation of the telemetry stream either before or after it is received by the cloud gateway service. Manipulation can include protocol transformation, combining data points, and more.
  • Machine learning: highly structured data stored in cold storage is normally used to train the ML models. Once created, the ML models are deployed and can be consumed (used) by applications.
  • User interface & reporting tools: the end-user applications used to visualize and analyze the IoT data.
  • Integrations with other systems: the insight extracted from the IoT solution will normally result in some sort of action. Often, this action takes place in an external line-of-business system such as CRM, PLM, or ERP. For example, when a predictive maintenance machine learning model predicts that a machine will fail, it can trigger an action in the CRM system to schedule its service.
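
The live query mentioned under stream processing (average temperature over the last 5 minutes must not exceed 90 °F) can be sketched in a few lines. This is a plain in-memory illustration, not the API of any particular stream processing service; the class name and the sample readings are assumptions.

```python
from collections import deque


class RollingAverageAlert:
    """Tracks readings over a sliding time window and reports when their
    average exceeds a limit."""

    def __init__(self, window_s=300.0, limit_f=90.0):
        self.window_s = window_s   # window length in seconds (5 minutes)
        self.limit_f = limit_f     # temperature limit in degrees Fahrenheit
        self._samples = deque()    # (timestamp_s, temperature_f), oldest first

    def add(self, timestamp_s, temperature_f):
        """Record a reading; return True if the windowed average is above the limit."""
        self._samples.append((timestamp_s, temperature_f))
        # Drop readings that have fallen out of the window.
        while self._samples[0][0] <= timestamp_s - self.window_s:
            self._samples.popleft()
        average = sum(t for _, t in self._samples) / len(self._samples)
        return average > self.limit_f


monitor = RollingAverageAlert()
readings = [(0, 85.0), (60, 88.0), (120, 92.0), (180, 95.0), (240, 97.0), (300, 99.0)]
alerts = [monitor.add(ts, temp) for ts, temp in readings]
print(alerts)  # [False, False, False, False, True, True]
```

In a deployed solution, a `True` result would trigger the action described above, such as shutting down the machine or sending an alert.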


Figure 2: Components of an IoT Analytics Solution


This blog article focused on the analytics portion of an IoT solution. It discussed the four types of analytics (descriptive, diagnostic, predictive, and prescriptive) and how they map to IoT, as well as the role of machine learning in IoT analytics. Finally, it gave an overview of a high-level solution architecture to highlight the components of an IoT solution that come into play in IoT analytics.

Recommended next steps

Learn more about IoT analytics by exploring Extracting actionable insights from IoT data.