
LSTM Keras model architecture interpretation

Data Science: Asked by sai on January 24, 2021

I would appreciate it if anyone could correct my interpretation of the LSTM architecture in Keras.

For example, in this simple case:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 2)))
model.add(Dense(1))

My interpretation is that the 2 features get mapped to 32 cells (a sort of dense-layer connection), and each of these cells is unrolled over the 10 timesteps. Then, since return_sequences=False by default, the output is 32 values, one from each cell, which are in turn mapped to a single output neuron.
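
To make these shapes concrete, here is a minimal check (a sketch assuming the TensorFlow 2 Keras API; the dummy input x and the sub-model lstm_out are only for illustration):

import numpy as np
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 2)))
model.add(Dense(1))

x = np.random.rand(4, 10, 2)  # 4 samples, 10 timesteps, 2 features

# The LSTM layer emits (4, 32): 32 values per sample, taken from
# the last of the 10 unrolled timesteps (return_sequences=False).
lstm_out = Model(model.inputs, model.layers[0].output)
print(lstm_out.predict(x).shape)  # (4, 32)

# The Dense layer maps those 32 values to one scalar per sample.
print(model.predict(x).shape)     # (4, 1)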

So, schematically, something like the following, where

F -> feature matrix at one timestep

I -> identity matrix

for each of the 10 timesteps:

  • $F_{[1\times 2]} \times I_{[2\times 32]} \rightarrow$ a $[1\times 32]$ input to the 32 cells
  • $in_{[1\times 32]} \xrightarrow{\text{LSTM equations}}$ a $[1\times 32]$ output (passed to the Dense layer on the last timestep)
    • if last timestep:
      • $[1\times 32] \times [32\times 1]_{\text{trainable weights}} \rightarrow$ a single scalar value
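
The weight shapes behind this picture can be inspected directly; a minimal sketch (again assuming the TensorFlow 2 Keras API):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 2)))
model.add(Dense(1))

# Keras stores the four gate matrices (input, forget, cell, output)
# concatenated, hence the factor of 4 in the second dimension.
kernel, recurrent_kernel, bias = model.layers[0].get_weights()
print(kernel.shape)            # (2, 128): 2 features -> 4 * 32 units (trainable, not a fixed identity)
print(recurrent_kernel.shape)  # (32, 128): 32 units -> 4 * 32 units
print(bias.shape)              # (128,)
print(model.layers[1].get_weights()[0].shape)  # (32, 1): the trainable-weights map to the scalar output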

The second case I am trying to understand is when there are multiple LSTM layers, something like this:

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(10, 2)))  # return_sequences=True is required so the next LSTM receives a sequence
model.add(LSTM(64))
model.add(Dense(32))
model.add(Dense(1))

Focusing on what happens between the two LSTM layers, I assumed there would be an $I_{[32\times 64]}$ identity matrix, but what controls the timesteps of this second LSTM layer? I mean:

A) Is the output from the 64 cells produced each time the layer below processes its 10 timesteps, or

B) does it wait for a total of 100 timesteps, i.e. 10 samples (10 × 10), and then produce an output?

(I think A should be the right one, but please do comment on it.)
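
For reference, a small sketch (assuming the TensorFlow 2 Keras API) showing how the timesteps flow through the stacked model: with return_sequences=True the first LSTM emits an output at every one of its 10 timesteps, and the second LSTM steps through exactly those 10 outputs.

import numpy as np
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(10, 2)))
model.add(LSTM(64))
model.add(Dense(32))
model.add(Dense(1))

x = np.random.rand(4, 10, 2)  # 4 samples, 10 timesteps, 2 features

# First LSTM returns the full sequence: one 32-vector per timestep.
seq_out = Model(model.inputs, model.layers[0].output)
print(seq_out.predict(x).shape)  # (4, 10, 32)

# Second LSTM consumes those 10 timesteps one by one and, with the
# default return_sequences=False, returns only its last output.
second_out = Model(model.inputs, model.layers[1].output)
print(second_out.predict(x).shape)  # (4, 64)

print(model.predict(x).shape)  # (4, 1) after the two Dense layers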

I hope I was able to express this clearly; if not, please let me know.
