TransWikia.com

How to add previous predictions for new predictions in LSTM?

Data Science Asked by user145959 on January 29, 2021

I am trying to train a model on a long data sequence like this: [0.2 0.1 0.1 ..... 0.4 0.8]. I create input vectors X of length 60 and scalar numbers Y as labels (i.e. the LSTM reads the first 60 numbers as input (one row of X_train) and the 61st number as the output label (one row of y_train)).

model = Sequential()
model.add(LSTM(units = 50, return_sequences = True , input_shape = (X_train.shape[1], 1)))
model.add(Dense(units = 1))
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
model.fit(X_train, y_train, epochs = 100, batch_size = 32)

And:

X_train.shape = (1000,60,1) ,       y_train.shape = (1000,)
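The windowing described above can be sketched like this (using a synthetic series as a stand-in for the real data, with the same shapes):

```python
import numpy as np

# Illustrative stand-in for the real series; any 1-D float array works.
series = np.sin(np.linspace(0, 20, 1060))

window = 60
X, y = [], []
for i in range(len(series) - window):
    X.append(series[i:i + window])   # 60 past values as input
    y.append(series[i + window])     # the 61st value is the label
X = np.array(X).reshape(-1, window, 1)  # (samples, time steps, features)
y = np.array(y)

print(X.shape, y.shape)  # (1000, 60, 1) (1000,)
```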

It's OK up to here, but the problem is in the prediction part. What I am trying to do is create a (60*1) input vector and use it to predict the next number in my sequence, then append the newly predicted number to the sequence to predict the next one (the second predicted number), and so on.
For that goal, I created a new model, retrieved the weights from the previous model, and fed new_model a single (60*1) vector to predict the next number. Then I appended the predicted number to the sequence and shifted the input vector one position to the right (so the new prediction is used for the next step).

new_model = Sequential()
new_model.add(LSTM(units = 50, return_sequences = True , input_shape = (1 , 60 , 1 )))
new_model.add(Dense(1))
old_weights = model.get_weights()
new_model.set_weights(old_weights)
new_model.compile(optimizer = 'adam', loss = 'mean_squared_error')

inputs = []
for i in range(10):
    inputs = dataset_total[len(dataset_total) - 60:].values
    inputs = np.reshape(inputs, (1 , 60, 1))
    predicted = new_model.predict(inputs)
    inputs.append(predicted)

But what I get is this error:

ValueError: Input 0 is incompatible with layer lstm_61: expected
ndim=3, found ndim=4

I don’t know how to solve this problem!

There is also a similar question here, but without a relevant answer (for clarification):
How to Predict the future values of time horizon with Keras?

One Answer

For your problem, Keras is telling you that the input vector has 4 dimensions, which is invalid: an LSTM layer expects an input of shape (batch_size, num_time_steps, feature_space_dimension). Note that input_shape excludes the batch dimension, so it should be the 2-tuple (60, 1) rather than (1, 60, 1). Also check the shape of the input vector itself; you may be adding an extra dimension of size 1 with the reshape operation.
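A minimal sketch of the corrected feedback loop, under the assumptions above (input_shape as a 2-tuple, and return_sequences=False so Dense emits one value per window; the LSTM and Dense weight shapes are unchanged, so copying the trained weights with set_weights still works). The random window below is a stand-in for dataset_total[-60:].values:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Rebuild the network with a 2-tuple input_shape (batch dim is implicit)
# and return_sequences=False so Dense outputs one value per sequence.
new_model = Sequential([
    LSTM(50, input_shape=(60, 1)),
    Dense(1),
])
# new_model.set_weights(model.get_weights())  # copy the trained weights

# Roll the window forward, feeding each prediction back in.
window = np.random.rand(60)  # stand-in for dataset_total[-60:].values
preds = []
for _ in range(10):
    x = window.reshape(1, 60, 1)           # (batch, time steps, features)
    p = float(new_model.predict(x, verbose=0)[0, 0])
    preds.append(p)
    window = np.append(window[1:], p)      # drop oldest value, append prediction

print(len(preds))  # 10
```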

Another thing is the approach you are taking, which may be valid, but usually this problem is tackled with a specific type of model: sequence-to-sequence models, using an LSTM encoder-decoder architecture.

These models try to predict a sequence

$Y = [x_{t+1}, x_{t+2}, \dots, x_{t+N}]$

from a sequence of previous values

$X = [x_{t}, x_{t-1}, \dots, x_{t-M}]$

Note that the lengths $M$ and $N$ of the two sequences can be different.

In these models, the input sequence $X$ is fed into an LSTM encoder, which captures its dynamics in a new context feature vector $F$. This feature vector is then fed to the LSTM decoder as its initial state, and the prediction $\hat{y}_t$ of the cell for time $t$ is used as input to the cell for time $t+1$ to generate the predicted value $\hat{y}_{t+1}$.

One key difference is that seq2seq models should be more efficient than your approach, since they predict the $N$ future values in a single run of the model. It could be interesting to try this approach out and compare it with yours.
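A minimal Keras sketch of such an encoder-decoder (using the common RepeatVector simplification, in which the context vector is repeated as the decoder input instead of feeding predictions back step by step; the lengths M and N and the unit count are illustrative assumptions):

```python
from tensorflow.keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model

M, N = 60, 10  # input and output sequence lengths (assumed)

# Encoder: compress the input sequence X into a context vector F.
enc_in = Input(shape=(M, 1))
context = LSTM(50)(enc_in)

# Decoder: unroll the context for N steps and predict one value per step.
dec = RepeatVector(N)(context)
dec = LSTM(50, return_sequences=True)(dec)
out = TimeDistributed(Dense(1))(dec)

seq2seq = Model(enc_in, out)
seq2seq.compile(optimizer='adam', loss='mean_squared_error')

print(seq2seq.output_shape)  # (None, 10, 1)
```

Trained on windows of M past values against the N following values, a single predict call then yields all N future steps at once.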

You can visit a Keras tutorial for this type of model here.

Answered by ignatius on January 29, 2021
