Detect the time at which deviation occurs in time series data

Asked on Data Science, March 6, 2021

I am working on multivariate time series data. I have sensor data generated by a machine every time it is operated. The data set consists of machine_ID (machines of the same model), hours_operated, and measurements from various sensors. The machine starts to degrade after operating for a certain number of hours. I would like to find the number of hours after which there is a step change and performance starts to degrade.

I want to do this using a machine learning approach, preferably, and would like to plot a graph marking the deviation. Which ML techniques could be used for this?

Here is a plot of the 'sensor 2' values over a run to failure:

Snapshot of data

I have performed exploratory data analysis, through which I could identify the point at which the deviation occurs. Now I want to confirm this by running a model to detect the step change. In the figure above, the decline starts at around 100 hours and proceeds gradually. Is there any way I could find this point through models?

I would greatly appreciate any links or suggestions for dealing with this.

Thanks in advance.

3 Answers

AnomalyDetection is an open-source R package for detecting anomalies that is robust, from a statistical standpoint, in the presence of seasonality and an underlying trend.

A blog post that introduces the package can be found here, and a more formal paper can be found here.
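The package itself is written for R, but the core idea (decompose the series so seasonality and trend are removed, then test the residuals) carries over. Here is a minimal Python sketch of that idea, assuming your readings are in a pandas Series called `sensor2` indexed by hours operated; the name, the `period`, and the threshold are assumptions for illustration, not from the original post:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

def stl_anomalies(sensor2: pd.Series, period: int = 24, threshold: float = 3.0):
    """Flag points whose residual, after removing trend and seasonality, is extreme."""
    # Robust STL decomposition: trend + seasonal + residual
    result = STL(sensor2, period=period, robust=True).fit()
    resid = result.resid

    # Robust z-score of the residuals (median/MAD instead of mean/std)
    med = np.median(resid)
    mad = np.median(np.abs(resid - med))
    robust_z = 0.6745 * (resid - med) / mad

    # Hours whose readings deviate strongly from the trend + seasonal fit
    return sensor2.index[np.abs(robust_z) > threshold]
```

AnomalyDetection proper uses Seasonal Hybrid ESD, which is more careful about how many anomalies it reports; the sketch above only mimics the decompose-then-test structure.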

Answered by Brian Spiering on March 6, 2021

You could model the process with a Weibull distribution, which is common in survival analysis and reliability engineering. There has been work on monitoring the "health" of systems using it; examples are here and here.
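As a rough illustration of the idea, you could fit a Weibull distribution to the hours-to-degradation observed across your fleet and read off the reliability curve. A minimal sketch in Python with SciPy, using made-up failure hours (the data and variable names are placeholders):

```python
import numpy as np
from scipy import stats

# Hypothetical hours of operation until degradation, one value per machine
failure_hours = np.array([120., 135., 98., 160., 142., 110., 155., 128.])

# Two-parameter Weibull fit (location fixed at 0, as is usual in reliability work)
shape, loc, scale = stats.weibull_min.fit(failure_hours, floc=0)
print(f"shape (beta) = {shape:.2f}, scale (eta) = {scale:.1f} hours")

# Reliability (survival) function: probability a machine is still healthy after t hours
t = np.linspace(0, 200, 201)
reliability = stats.weibull_min.sf(t, shape, loc=loc, scale=scale)

# shape > 1 indicates an increasing hazard rate, i.e. wear-out / degradation over time
```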

Answered by Brian Spiering on March 6, 2021

I would try using Google's CausalImpact package. Your use case isn't causal inference exactly, but CausalImpact relies on Bayesian structural time series models (via the bsts package) and has good defaults that keep you from needing to dive into bsts immediately.

Basically, you fit a model to the first part of your data and then forecast the rest, and you see where the actual data deviate from the forecast. Using Bayesian models means you get error bounds, so you can attach a degree of confidence to the deviation. In your case, you would set the 'intervention' point at whatever timestamp you want to separate your modeling data from your forecasting data, then compare the forecast to the actual data (look up 'nowcasting').
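CausalImpact itself is an R package; a stripped-down Python sketch of the same mechanism (fit on the pre-period, forecast the post-period with prediction intervals, flag hours that escape them) could look like the following. The series name `sensor2`, the split point, and the ARIMA order are assumptions for illustration, and ARIMA here is only a simple stand-in for the bsts model CausalImpact uses:

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def hours_outside_forecast(sensor2: pd.Series, split_hour: int, alpha: float = 0.05):
    """Fit on the healthy pre-period, forecast the rest, return hours outside the interval."""
    pre = sensor2.loc[:split_hour]
    post = sensor2.loc[split_hour + 1:]

    # Fit a simple forecasting model on the pre-period only
    model = ARIMA(pre, order=(1, 1, 1)).fit()

    # Forecast the post-period with a (1 - alpha) prediction interval
    forecast = model.get_forecast(steps=len(post))
    interval = forecast.conf_int(alpha=alpha)
    lower, upper = interval.iloc[:, 0].values, interval.iloc[:, 1].values

    # Hours where the observed sensor value escapes the prediction interval
    outside = (post.values < lower) | (post.values > upper)
    return post.index[outside]

# Example usage: first hour at which the sensor leaves the forecast band
# deviating = hours_outside_forecast(sensor2, split_hour=80)
# print(deviating.min() if len(deviating) else "no deviation detected")
```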

Here is a tutorial to get you started, and here's an introductory video.

Answered by tom on March 6, 2021
