Data Science Asked by A.SDR on March 10, 2021
I have the following sequential model:
# Imports assuming TensorFlow 2.x Keras; in_shp = [2, 128], dr = dropout rate,
# and len(classes) = 8, judging from the summary below.
from tensorflow.keras import models
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Dropout,
                                     Flatten, Reshape, ZeroPadding2D)

model = models.Sequential()
model.add(Reshape([1] + in_shp, input_shape=in_shp))
model.add(ZeroPadding2D((0, 2)))
model.add(Conv2D(256, (1, 3), padding='valid', activation='relu', name='conv1',
                 data_format='channels_first', kernel_initializer='glorot_uniform'))
model.add(Dropout(dr))
model.add(ZeroPadding2D((0, 2)))
model.add(Conv2D(80, (2, 3), padding='valid', activation='relu', name='conv2',
                 data_format='channels_first', kernel_initializer='glorot_uniform'))
model.add(Dropout(dr))
model.add(Flatten())
model.add(Dense(256, activation='relu', kernel_initializer='he_normal', name='dense1'))
model.add(Dropout(dr))
model.add(Dense(len(classes), kernel_initializer='he_normal', name='dense2'))
model.add(Activation('softmax'))
model.add(Reshape([len(classes)]))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
and I got the following summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
reshape_1 (Reshape)          (None, 1, 2, 128)         0
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 1, 6, 128)         0
_________________________________________________________________
conv1 (Conv2D)               (None, 256, 6, 126)       1024
_________________________________________________________________
dropout_1 (Dropout)          (None, 256, 6, 126)       0
_________________________________________________________________
zero_padding2d_2 (ZeroPaddin (None, 256, 10, 126)      0
_________________________________________________________________
conv2 (Conv2D)               (None, 80, 9, 124)        122960
_________________________________________________________________
dropout_2 (Dropout)          (None, 80, 9, 124)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 89280)             0
_________________________________________________________________
dense1 (Dense)               (None, 256)               22855936
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0
_________________________________________________________________
dense2 (Dense)               (None, 8)                 2056
_________________________________________________________________
activation_1 (Activation)    (None, 8)                 0
_________________________________________________________________
reshape_2 (Reshape)          (None, 8)                 0
=================================================================
Total params: 22,981,976
Trainable params: 22,981,976
Non-trainable params: 0
The model works fine, but I want to understand something about the conv1 layer. Why has the width been reduced from 128 to 126? I am confused: shouldn't it be the same as in the previous layer? The same thing happens at the conv2 layer, where the height and width decrease from (10, 126) to (9, 124). Could someone explain why?
In a convolution layer, the filter ((1, 3) and (2, 3) in your case) is slid across the input to produce the output (feature map), moving right and down by a parameter called the stride (not set in your code, so it defaults to 1). With padding='valid', no padding is added, so the filter must fit entirely inside the input at every output position, and each spatial dimension shrinks to input_size - kernel_size + 1. That is exactly what your summary shows: for conv1 the width goes from 128 to 128 - 3 + 1 = 126 (the height stays at 6 because the kernel height is 1), and for conv2, (10, 126) becomes (10 - 2 + 1, 126 - 3 + 1) = (9, 124). If you change it to padding='same', the input borders are padded with zeros so that the output keeps the same spatial size as the input.
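As a sanity check, the standard output-size formula reproduces every shape in the summary above. A minimal sketch in plain Python (the helper name conv_out_size is just illustrative):

def conv_out_size(n, kernel, stride=1, padding='valid'):
    """Output length along one spatial axis of a 2D convolution."""
    if padding == 'valid':
        return (n - kernel) // stride + 1
    return -(-n // stride)  # 'same': ceil(n / stride)

# conv1: (1, 3) kernel on a (6, 128) input
print(conv_out_size(6, 1), conv_out_size(128, 3))    # 6 126
# conv2: (2, 3) kernel on a (10, 126) input
print(conv_out_size(10, 2), conv_out_size(126, 3))   # 9 124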
Correct answer by Hunar on March 10, 2021
It is because of the kind of convolution you've used: a valid convolution. If you want the output width to stay at $128$, set the padding to same. Note that this applies to the deeper layers too; each of them can use either kind of padding.
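Concretely, here is a minimal sketch of the two paddings side by side (assuming TensorFlow/Keras is available; channels_last layout and a small filter count are used purely for illustration, unlike the channels_first model in the question):

import numpy as np
from tensorflow.keras.layers import Conv2D

x = np.zeros((1, 6, 128, 1), dtype='float32')  # (batch, height, width, channels)

# 'valid': the (1, 3) kernel must fit entirely inside the input,
# so the width shrinks to 128 - 3 + 1 = 126
print(Conv2D(8, (1, 3), padding='valid')(x).shape)  # (1, 6, 126, 8)

# 'same': the borders are zero-padded, so the width stays at 128
print(Conv2D(8, (1, 3), padding='same')(x).shape)   # (1, 6, 128, 8)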
Answered by Media on March 10, 2021