
Questions about a multivariate timeseries forecasting model - keras

Data Science. Asked by wazzup on December 13, 2020

I have trouble understanding the model I'm trying to create.
I have a few questions, so I'll first explain my model and what I'm trying to do:

I have created sequences of data (input and output of the model) of 7 timesteps each, so the input is the values of the days of a certain week and the output is the values of the days of the following week (**so input1 of my model has an input shape of (7,1), same as the output**).
I have also prepared another input list with some extra features, like a holiday flag and weather conditions, for the **following week**, so the model's second input, input_2, has shape (7,7) per example.
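A minimal sketch of this windowing (the arrays values and features are hypothetical stand-ins for the real data):

import numpy as np

values = np.random.rand(365, 1)    # one target value per day
features = np.random.rand(365, 7)  # 7 extra features per day (holiday flag, weather, ...)

X1, X2, Y = [], [], []
for i in range(0, len(values) - 14 + 1, 7):
    X1.append(values[i:i + 7])          # this week's values   -> (7, 1)
    X2.append(features[i + 7:i + 14])   # next week's features -> (7, 7)
    Y.append(values[i + 7:i + 14])      # next week's values   -> (7, 1)

X1, X2, Y = np.array(X1), np.array(X2), np.array(Y)
print(X1.shape, X2.shape, Y.shape)  # (51, 7, 1) (51, 7, 7) (51, 7, 1)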

This is the full model summary:

[image: full model summary]

After the LSTM layers and a fully connected NN, I try to concatenate the two branches: basically, I want to concatenate input2 with the output of the dense_1 layers.
So I'm concatenating a (7,1)-shaped tensor with the second input, input_2, of shape (7,7).

My questions:

1- The outputs of my dense layers confuse me. I thought they would have shape (None, number of units), but they seem to be (None, 7, number of units); there is always this second dimension of 7, which I don't understand.

[image: dense layers]

2- For the concatenation part, since I'm joining a 7×7 input with a 7×1 input, I was expecting 49+7=56 units connected to the next dense layer, but the parameter count tells me that's not the case: it is just 14*120+120=1800.

[image: concat]
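Note that, in the code below, dense_1_3 has 7 units, so the tensor reaching concat_1 is (None, 7, 7); concatenate joins along the last axis by default, and a Dense layer only acts on that last axis, which is where 7+7=14 comes from. A minimal sketch of just this part (stand-in inputs, not the full model):

import tensorflow as tf

a = tf.keras.layers.Input(shape=(7, 7))   # stand-in for the dense_1_3 output
b = tf.keras.layers.Input(shape=(7, 7))   # stand-in for input_2
c = tf.keras.layers.concatenate([a, b])   # joins along the last axis -> (None, 7, 14)
d = tf.keras.layers.Dense(120)(c)         # acts on the last axis -> (None, 7, 120)
print(c.shape, d.shape)
print(tf.keras.Model([a, b], d).count_params())   # 14*120 + 120 = 1800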

3- I was thinking of having 7 units in the last layer (one for each day to predict), but I had to make it 1 so it would output (7,1) and work; there is clearly something I'm missing.

[image: 3rd question]

This is the code for the model definition:

import tensorflow as tf

# First branch: this week's values, shape (7, 1) per example.
input1 = tf.keras.layers.Input(shape=(7, 1), name="input_1")
x = tf.keras.layers.Conv1D(120, kernel_size=5, strides=1, activation="relu",
                           padding="causal")(input1)
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(210, return_sequences=True), name="LSTM_1")(x)
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(210, return_sequences=True, name="LSTM_2"))(x)
x = tf.keras.layers.Dense(60, activation="relu", name="dense_1_1")(x)
x = tf.keras.layers.Dense(30, activation="relu", name="dense_1_2")(x)
x = tf.keras.layers.Dense(7, name="dense_1_3")(x)

# Second branch: next week's extra features, shape (7, 7) per example,
# concatenated with the first branch along the last axis.
input2 = tf.keras.layers.Input(shape=(7, 7), name="input_2")
concat = tf.keras.layers.concatenate([x, input2], name="concat_1")
x = tf.keras.layers.Dense(120, activation="selu", name="dense_2_1")(concat)
x = tf.keras.layers.Dense(90, activation="selu", name="dense_2_2")(x)
x = tf.keras.layers.Dense(60, activation="selu", name="dense_2_3")(x)
output = tf.keras.layers.Dense(1)(x)

model = tf.keras.Model([input1, input2], output)

These are the dimensions of the inputs (input1 & input2) and the output:

[image: dimensions]
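As a quick sanity check of these shapes, one can push dummy arrays through the model (a small sketch, assuming the definition above):

import numpy as np

dummy1 = np.zeros((4, 7, 1))   # 4 example weeks of values
dummy2 = np.zeros((4, 7, 7))   # 4 example weeks of extra features
print(model([dummy1, dummy2]).shape)   # (4, 7, 1): one value per day of the week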

One Answer

From the Keras docs: https://keras.io/api/layers/core_layers/dense/

Input shape

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).

If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 1 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).

The input to dense_1_1 is (None, 7, 420) (the two 210-unit LSTM directions concatenated). The Dense layer is applied to every [:, i, :] slice of the input for i in 1..7, producing an output of shape (None, 7, 60).
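Concretely, a minimal sketch of that behavior with the shapes from this model:

import tensorflow as tf

x = tf.random.normal((32, 7, 420))   # (batch_size, d0, d1), as after LSTM_2
layer = tf.keras.layers.Dense(60)    # same width as dense_1_1
y = layer(x)
print(y.shape)             # (32, 7, 60): Dense maps only the last axis
print(layer.kernel.shape)  # (420, 60): one kernel shared across all 7 timesteps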

If you're expecting an output with shape (None, 60), you can reshape the input to (None, 420*7), but that would not make sense in this case, since you want to apply the Dense layer to each time slice of your input.
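For completeness, a sketch of that reshape route (again, usually not what you want for per-timestep predictions):

import tensorflow as tf

inp = tf.keras.layers.Input(shape=(7, 420))
flat = tf.keras.layers.Flatten()(inp)          # -> (None, 2940), i.e. (None, 420*7)
out = tf.keras.layers.Dense(60)(flat)          # -> (None, 60)
print(tf.keras.Model(inp, out).output_shape)   # (None, 60)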

Correct answer by vineet gundecha on December 13, 2020
