TransWikia.com

Voting classifier using grid search for Time Series

Data Science Asked by Shan Khan on August 24, 2021

I have three models:

  1. ARIMA
  2. Auto ARIMA
  3. Double Exponential Smoothing

I would like to apply an ensemble method (a voting method) and let it learn weights for these three models.

I have checked the VotingClassifier in scikit-learn. It requires fit(X, y) to run.

A time series stored as a Series object doesn't have a y.

How do you apply a voting classifier and learn weights through grid search?

One Answer

In a time series problem, your input (x) and output (y) are, in the most basic case, the same variable, recorded at different points in time.

If you have T time points 1, 2, 3, ..., T, then x is an array of data points, with an index t to access each time point.

Typically y is simply your x array shifted forward in time (in the example below, by 1 time unit, so $y_t = x_{t-1}$ in subscript notation, or y[t] == x[t-1] in array notation):

  x  |   y
-----|-----
 0.1 |  NaN
 0.2 |  0.1
 0.3 |  0.2
 0.3 |  0.3
 0.4 |  0.3
 0.3 |  0.4
 0.5 |  0.3
 1.0 |  0.5

Pandas has a shift() method for a Series, which lets you shift your x column by a chosen number of periods and create a new y column from the shifted series. See https://stackoverflow.com/questions/10982089/how-to-shift-a-column-in-pandas-dataframe
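As a minimal sketch of that shift, the snippet below rebuilds the table above with pandas. The variable names (frame, supervised) are only illustrative, and dropping the NaN row is an assumption about how you would want to handle the first time point.

```python
import pandas as pd

# Values copied from the x column of the table above.
x = pd.Series([0.1, 0.2, 0.3, 0.3, 0.4, 0.3, 0.5, 1.0], name="x")

# shift(1) moves every value one step later in time, so y[t] == x[t-1].
frame = pd.DataFrame({"x": x, "y": x.shift(1)})

# The first row has no earlier value, so its y is NaN; drop it before fitting.
supervised = frame.dropna().reset_index(drop=True)
print(supervised)
```

After this step you have two aligned columns, which is the shape that a fit(X, y) call expects.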

You can add levels of complexity to this by including multiple time lags, as well as other variables, but this explains the basic principle of how to convert time series forecasting into a supervised learning problem.
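To illustrate the multiple-lag idea, here is a rough sketch that builds two lagged columns and passes them to a scikit-learn estimator through fit(X, y). The helper make_lagged_frame, the column names, and the choice of LinearRegression are all assumptions made for illustration. The sketch also assumes the lagged values act as inputs and the unshifted value as the target, which is a common framing for one-step-ahead forecasting.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

series = pd.Series([0.1, 0.2, 0.3, 0.3, 0.4, 0.3, 0.5, 1.0], name="value")


def make_lagged_frame(s, n_lags):
    """Hypothetical helper: columns lag_1..lag_n plus the unshifted value as 'target'."""
    cols = {f"lag_{k}": s.shift(k) for k in range(1, n_lags + 1)}
    cols["target"] = s
    # Rows without a full set of lags contain NaN and are dropped.
    return pd.DataFrame(cols).dropna()


lagged = make_lagged_frame(series, n_lags=2)
X = lagged[["lag_1", "lag_2"]]  # inputs: the two previous values
y = lagged["target"]            # output: the current value

# Any estimator with fit(X, y) can now be used; LinearRegression is only an example.
model = LinearRegression().fit(X, y)

# One-step-ahead forecast built from the two most recent observations.
latest = pd.DataFrame({"lag_1": [series.iloc[-1]], "lag_2": [series.iloc[-2]]})
forecast = model.predict(latest)[0]
print(forecast)
```

Other variables can be added as extra columns of X in the same way.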

Answered by ajrwhite on August 24, 2021
