I have a dataset of 12 videos, each comprised of 179 frames. I applied ResNet-50 to the frames to extract features and received an array of shape (179, 7, 7, 2048). As far as I know,
179 = total number of frames
2048 = number of feature channels per frame
7×7 = spatial size of the feature map produced for each frame (the output of ResNet-50's last convolutional block, rather than a kernel/filter size)
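For reference, here is a minimal sketch of how such features could be extracted, assuming TensorFlow/Keras and frames already resized to 224×224 RGB (the random frames below are only placeholders):

import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input

# One video: 179 frames of shape (224, 224, 3) -- placeholder data.
frames = np.random.rand(179, 224, 224, 3).astype("float32")

# include_top=False, pooling=None keeps the final 7x7x2048 convolutional feature map.
backbone = ResNet50(weights="imagenet", include_top=False, pooling=None)

features = backbone.predict(preprocess_input(frames))
print(features.shape)  # (179, 7, 7, 2048)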
Now I have to train my model with a ConvLSTM by passing in the features extracted by ResNet-50. I know that the input shape for a ConvLSTM is either
batch_shape + (channels, conv_dim1, conv_dim2, conv_dim3)
or
batch_shape + (conv_dim1, conv_dim2, conv_dim3, channels)
So what should the input shape for the ConvLSTM be, and how can I feed the output of ResNet-50 into the ConvLSTM?
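For a channels_last ConvLSTM2D, the expected input is 5D: (samples, time, rows, cols, channels), so the 12 videos would be stacked into an array of shape (12, 179, 7, 7, 2048). A minimal Keras sketch under that assumption follows (the filter count, pooling layer, and num_classes head are hypothetical placeholders, not part of the original question):

import numpy as np
from tensorflow.keras import layers, models

num_videos, num_frames, rows, cols, channels = 12, 179, 7, 7, 2048
num_classes = 3  # hypothetical number of labels

model = models.Sequential([
    # channels_last ConvLSTM2D expects (time, rows, cols, channels) per sample.
    layers.Input(shape=(num_frames, rows, cols, channels)),
    layers.ConvLSTM2D(filters=64, kernel_size=(3, 3), padding="same",
                      return_sequences=False),
    layers.GlobalAveragePooling2D(),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

# With the extracted features stacked as X of shape (12, 179, 7, 7, 2048)
# and integer labels y of shape (12,):
# model.fit(X, y, batch_size=2, epochs=10)

In other words, the per-video ResNet-50 output (179, 7, 7, 2048) is used directly as one time sequence, and the batch dimension comes from stacking the 12 videos.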
Regards