Detect the time at which deviation occurs in time series data

Asked on Data Science, March 6, 2021

I am working on multivariate time series data. I have sensor data generated by a machine every time it is operated. The data set consists of machine_ID (machines of the same model), hours_operated, and measurements from various sensors. The machine starts to degrade after operating for a certain number of hours. I would like to find the number of hours after which there is a step change and performance starts to degrade.

I want to do this using a machine learning approach, preferably, and would like to plot a graph marking the deviation. Which ML techniques could be used for this?

Here is a plot of the 'sensor 2' values over a run to failure:

Snapshot of data

I have performed exploratory data analysis, through which I could identify the point at which the deviation occurs. Now I want to confirm this by running a model to detect the step change. In the figure above, the decline starts at around 100 hours and proceeds gradually. Is there any way I could find this point through models?

I would greatly appreciate any links or suggestions for dealing with this.

Thanks in advance.

3 Answers

AnomalyDetection is an open-source R package for detecting anomalies that is robust, from a statistical standpoint, in the presence of seasonality and an underlying trend.

A blog post that introduces the package can be found here, and a more formal paper can be found here.
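The package itself is written for R, but the core idea (decompose the series so seasonality and trend are removed, then test the residuals) carries over. Here is a minimal Python sketch of that idea, assuming your readings are in a pandas Series called `sensor2` indexed by hours operated; the name, the `period`, and the threshold are assumptions for illustration, not from the original post:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

def stl_anomalies(sensor2: pd.Series, period: int = 24, threshold: float = 3.0):
    """Flag points whose residual, after removing trend and seasonality, is extreme."""
    # Robust STL decomposition: trend + seasonal + residual
    result = STL(sensor2, period=period, robust=True).fit()
    resid = result.resid

    # Robust z-score of the residuals (median/MAD instead of mean/std)
    med = np.median(resid)
    mad = np.median(np.abs(resid - med))
    robust_z = 0.6745 * (resid - med) / mad

    # Hours whose readings deviate strongly from the trend + seasonal fit
    return sensor2.index[np.abs(robust_z) > threshold]
```

AnomalyDetection proper uses Seasonal Hybrid ESD, which is more careful about how many anomalies it reports; the sketch above only mimics the decompose-then-test structure.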

Answered by Brian Spiering on March 6, 2021

You could model the process with a Weibull distribution, which is common in survival analysis and reliability engineering. There has been work on monitoring the "health" of systems using it; examples are here and here.
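As a rough illustration of the idea, you could fit a Weibull distribution to the hours-to-degradation observed across your fleet and read off the reliability curve. A minimal sketch in Python with SciPy, using made-up failure hours (the data and variable names are placeholders):

```python
import numpy as np
from scipy import stats

# Hypothetical hours of operation until degradation, one value per machine
failure_hours = np.array([120., 135., 98., 160., 142., 110., 155., 128.])

# Two-parameter Weibull fit (location fixed at 0, as is usual in reliability work)
shape, loc, scale = stats.weibull_min.fit(failure_hours, floc=0)
print(f"shape (beta) = {shape:.2f}, scale (eta) = {scale:.1f} hours")

# Reliability (survival) function: probability a machine is still healthy after t hours
t = np.linspace(0, 200, 201)
reliability = stats.weibull_min.sf(t, shape, loc=loc, scale=scale)

# shape > 1 indicates an increasing hazard rate, i.e. wear-out / degradation over time
```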

Answered by Brian Spiering on March 6, 2021

I would try using Google's CausalImpact package. Your use case isn't causal inference exactly, but CausalImpact relies on Bayesian structural time series models (via the bsts package) and has good defaults that keep you from needing to dive into bsts immediately.

Basically, you fit a model to the first part of your data and then forecast the rest, and you see where the actual data deviate from the forecast. Using Bayesian models means you get error bounds, so you can attach a degree of confidence to the deviation. In your case, you would set the 'intervention' point at whatever timestamp you want to separate your modeling data from your forecasting data, then compare the forecast to the actual data (look up 'nowcasting').
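CausalImpact itself is an R package; a stripped-down Python sketch of the same mechanism (fit on the pre-period, forecast the post-period with prediction intervals, flag hours that escape them) could look like the following. The series name `sensor2`, the split point, and the ARIMA order are assumptions for illustration, and ARIMA here is only a simple stand-in for the bsts model CausalImpact uses:

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def hours_outside_forecast(sensor2: pd.Series, split_hour: int, alpha: float = 0.05):
    """Fit on the healthy pre-period, forecast the rest, return hours outside the interval."""
    pre = sensor2.loc[:split_hour]
    post = sensor2.loc[split_hour + 1:]

    # Fit a simple forecasting model on the pre-period only
    model = ARIMA(pre, order=(1, 1, 1)).fit()

    # Forecast the post-period with a (1 - alpha) prediction interval
    forecast = model.get_forecast(steps=len(post))
    interval = forecast.conf_int(alpha=alpha)
    lower, upper = interval.iloc[:, 0].values, interval.iloc[:, 1].values

    # Hours where the observed sensor value escapes the prediction interval
    outside = (post.values < lower) | (post.values > upper)
    return post.index[outside]

# Example usage: first hour at which the sensor leaves the forecast band
# deviating = hours_outside_forecast(sensor2, split_hour=80)
# print(deviating.min() if len(deviating) else "no deviation detected")
```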

Here is a tutorial to get you started, and here's an introductory video.

Answered by tom on March 6, 2021
