Foretell and prevent downtime with predictive maintenance

The story of predictive maintenance (PdM) starts back in the 1990s. Technologies began to arrive that sense the world in new ways: ultrasound, infrared, thermal, and vibration, to name a few. However, until recently the supporting technology was not available to make predictive maintenance a reality. Now, with advances in cloud storage, machine learning, edge computing, and the Internet of Things, predictive maintenance looms as the next step for the manufacturing industry.

What is predictive maintenance?

There are three strategies for machine maintenance:

  • Reactive: the “don’t fix what isn’t broken” approach. You extract the maximum possible lifetime from a machine. However, costs balloon with unexpected downtime and collateral damage from failures.
  • Preventative: service on a fixed schedule based on the regularity of previous failures. You maximize uptime by fixing machines before they fail. The downside is that components may be replaced while they still have life left, and there is still a chance that they will fail before the scheduled maintenance.
  • Predictive: use data about previous breakdowns to model when failures are about to occur, and intervene as soon as sensors detect the same conditions. Until recently this was not a realistic option, because the modeling techniques did not exist and real-time processing power was too expensive. But Azure solves that problem.

The figure below shows how the three strategies differ.


Predictive maintenance is superior by far to the other methods. It allows you to maximize uptime while getting the most value out of your machinery. Also, because the machine learning model can be refined continuously, you will experience fewer failures over time.

How Azure can help

To use machine learning for a PdM solution, there are three requirements: a machine learning model, data to train the model, and a data ingestion mechanism to gather the training data. Once you have an ingestion point, you need to collect data from a normally operating machine until the machine fails. You can then characterize the received data as normal, failing, or failed. This data is used to train the model, which means the model is successively adjusted until it can predict failure with some certainty.
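The labeling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name label_readings, the hourly timestamps, and the three-hour warning window are hypothetical, and a real solution would choose the window from domain knowledge about how early a failure shows up in the sensor data.

```python
from datetime import datetime, timedelta


def label_readings(timestamps, failure_time, warning_window):
    """Label each reading as 'normal', 'failing', or 'failed'.

    Readings at or after failure_time are 'failed'; readings inside the
    warning_window just before the failure are 'failing'; the rest are 'normal'.
    """
    labels = []
    for ts in timestamps:
        if ts >= failure_time:
            labels.append("failed")
        elif ts >= failure_time - warning_window:
            labels.append("failing")
        else:
            labels.append("normal")
    return labels


# Hypothetical run-to-failure log: one reading per hour, failure at hour 8.
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(10)]
failure_time = start + timedelta(hours=8)

# Assumed warning window: the three hours before the failure count as 'failing'.
labels = label_readings(timestamps, failure_time, timedelta(hours=3))
for ts, label in zip(timestamps, labels):
    print(ts.isoformat(), label)
```

The labeled readings would then serve as the training set, with the 'failing' class being the one the model most needs to recognize early.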

To accurately assess the system states that lead to failure, you need to collect as much data as possible. In other words, start collecting your data now. The more comprehensive the data used to train the model, the more accurate the analysis will be.

Azure options for data ingestion

Azure offers these options for data ingestion: Event Hubs, Azure IoT Hub, and Kafka on HDInsight. (For more information, see Choosing a real-time message ingestion technology in Azure.) Ingesting data directly from many distributed systems is often not a sustainable approach. As more systems talk to each other, the number of connections grows as O(n²). A much better architecture is to have all your systems talk to a central hub. This can be implemented efficiently using Azure IoT Hub. All your systems can speak to the IoT Hub, and the hub feeds the data into Azure. Loading the data with Azure Data Factory is a great option, as it can move and transform data as part of an automated pipeline.

Once the data has been ingested, it is used to train your machine learning model. Azure options for doing this include Azure Machine Learning Studio, Azure Databricks, Data Science Virtual Machine, and Azure Batch AI, to name a few. The choice depends on the complexity of your problem, the experience of your team, and the size of the data to be processed.
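To make the connection-count argument from the ingestion discussion concrete, here is a rough comparison of a fully connected set of systems against a hub topology. The function names are hypothetical, and the sketch assumes the worst case in which every system talks directly to every other system.

```python
def point_to_point_links(n):
    """Links needed when each of n systems talks directly to every other one."""
    return n * (n - 1) // 2


def hub_links(n):
    """Links needed when each of n systems talks only to a central hub."""
    return n


# Compare the two topologies for a few fleet sizes.
for n in (5, 50, 500):
    print(f"{n} systems: {point_to_point_links(n)} direct links, {hub_links(n)} hub links")
```

With 50 systems the fully connected layout needs 1,225 links while the hub needs 50, which is why the hub approach stays manageable as the fleet grows.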

Once the model has been trained and is ready for use, the results can be presented. This means building a mechanism to predict future failures and generating notifications for action. The workflow looks something like this:

In a working system, the results are presented to the maintenance team in real time, along with recommendations for action. The team can then decide the best course of action.
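As a minimal sketch of the notification step, the snippet below turns per-machine failure probabilities into a prioritized list of alerts. The machine names, the probabilities, and the 0.8 threshold are all made up for illustration; in practice the probabilities would come from the trained model and the threshold would be tuned to balance missed failures against false alarms.

```python
def build_alerts(predictions, threshold=0.8):
    """Return alert messages for machines at or above the threshold,
    ordered from highest to lowest predicted failure probability."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    alerts = []
    for machine_id, probability in ranked:
        if probability >= threshold:
            alerts.append(
                f"{machine_id}: failure probability {probability:.0%}, schedule inspection"
            )
    return alerts


# Hypothetical model output: machine id -> predicted probability of imminent failure.
predictions = {"press-01": 0.12, "lathe-07": 0.91, "pump-03": 0.84}

alerts = build_alerts(predictions)
for alert in alerts:
    print(alert)
```

The final decision stays with the maintenance team; the alert only carries the recommendation.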

Overall, Azure can provide your predictive maintenance solution with the following:

  • Scalability, as storage and processing power are easy to scale.
  • Availability and resilience, because you can provision resources as needed.
  • Management, through a variety of options including ARM, PowerShell, and management APIs.
  • Security, via IoT Hub keys, encryption, and much more.
  • Cost-effectiveness, as resources can be provisioned and discarded as necessary.

Read the Predictive Maintenance overview for more detailed information about creating your predictive maintenance solution on Azure.