Voting classifier using grid search for Time Series

Question

I have three models:

ARIMA
Auto ARIMA
Double Exponential Smoothing

I would like to combine them with an ensemble method (a voting method) and let the ensemble learn a weight for each of these three models.

I have checked the VotingClassifier in scikit-learn. It requires fit(X, y) to run.

A time series stored in a Series object doesn't have a separate y.

How do you apply a voting classifier and learn weights through grid search?

ajrwhite · Answer

In a time series problem, your input (x) and output (y) are, in the most basic case, the same variable, recorded at different points in time.

If you have T time points 1, 2, 3, ..., T, then we think of x as an array of data points, with an index t to access each time point.

Typically y will simply be your x array shifted forward in time (in the example below, by 1 time unit, so $y_t = x_{t-1}$ in vector notation or y[t] == x[t-1] in array notation):

   x |    y
-----|-----
 0.1 |  NaN
 0.2 |  0.1
 0.3 |  0.2
 0.3 |  0.3
 0.4 |  0.3
 0.3 |  0.4
 0.5 |  0.3
 1.0 |  0.5

pandas provides a shift() method for time series, which lets you shift your x column by any number of time steps and store the shifted series as a new y column. See https://stackoverflow.com/questions/10982089/how-to-shift-a-column-in-pandas-dataframe
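As a minimal sketch of that step, the snippet below rebuilds the table above with shift(). The DataFrame name `df` and the variable `supervised` are my own choices for illustration, and the values are simply the ones from the table.

```python
import pandas as pd

# Same values as the x column in the table above
df = pd.DataFrame({"x": [0.1, 0.2, 0.3, 0.3, 0.4, 0.3, 0.5, 1.0]})

# shift(1) moves every value down by one row, so y[t] == x[t-1]
df["y"] = df["x"].shift(1)

# The first row has no previous value (NaN), so drop it before
# passing the two columns to anything that expects fit(X, y)
supervised = df.dropna().reset_index(drop=True)

print(df)
print(supervised)
```

Once the NaN row is dropped, the two columns have the same length and can be handed to an estimator as a feature column and a target column.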

You can add levels of complexity to this by including multiple time lags, as well as other variables, but this explains the basic principle of how to convert time series forecasting into a supervised learning problem.
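To illustrate the multiple-lag idea, here is a rough sketch that builds several lagged columns at once. The helper `make_lagged_frame` and the column names `value`, `lag_1`, `lag_2`, ... are hypothetical names I have made up for this example, not part of pandas. I am also assuming the common forecasting setup in which the lagged columns are the inputs and the unshifted value is the output.

```python
import pandas as pd


def make_lagged_frame(series, n_lags):
    """Hypothetical helper: one column per lag, plus the unshifted value."""
    frame = pd.DataFrame({"value": series})
    for lag in range(1, n_lags + 1):
        # lag_k at row t holds the value observed at time t - k
        frame[f"lag_{lag}"] = series.shift(lag)
    # The first n_lags rows lack a full history, so drop them
    return frame.dropna().reset_index(drop=True)


series = pd.Series([0.1, 0.2, 0.3, 0.3, 0.4, 0.3, 0.5, 1.0])
frame = make_lagged_frame(series, n_lags=3)

X = frame[["lag_1", "lag_2", "lag_3"]]  # assumed inputs: the three previous values
y = frame["value"]                      # assumed output: the current value

print(frame)
```

Other explanatory variables can be added as extra columns of X in the same way, as long as each row only uses information available at that time point.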
