Artificial Intelligence Asked by Vesko Vujovic on December 25, 2021
I’m building a denoising autoencoder. I want to have the same input and output shape image.
This is my architecture:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_img = Input(shape=(IMG_HEIGHT, IMG_WIDTH, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='valid')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
# decodedSize = K.int_shape(decoded)[1:]
# x_size = K.int_shape(input_img)
# decoded = Reshape(decodedSize, input_shape=decodedSize)(decoded)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
My input shape is: 1169×827
This is Keras output:
Model: "model_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_7 (InputLayer) [(None, 1169, 827, 1)] 0
_________________________________________________________________
conv2d_30 (Conv2D) (None, 1169, 827, 32) 320
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 585, 414, 32) 0
_________________________________________________________________
conv2d_31 (Conv2D) (None, 585, 414, 64) 18496
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 293, 207, 64) 0
_________________________________________________________________
conv2d_32 (Conv2D) (None, 291, 205, 32) 18464
_________________________________________________________________
up_sampling2d_12 (UpSampling (None, 582, 410, 32) 0
_________________________________________________________________
conv2d_33 (Conv2D) (None, 582, 410, 32) 9248
_________________________________________________________________
up_sampling2d_13 (UpSampling (None, 1164, 820, 32) 0
_________________________________________________________________
conv2d_34 (Conv2D) (None, 1162, 818, 1) 289
=================================================================
How can I have the same input and output shape?
I don't know if this is the right way of doing it, but I solved the problem.
Building on the code above, I added:
import tensorflow as tf
from tensorflow.keras import backend as K

img_size = K.int_shape(input_img)[1:]
resized_image_tensor = tf.image.resize(decoded, list(img_size[:2]))
autoencoder = Model(input_img, resized_image_tensor)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
I used tf.image.resize to match the shape of the reconstructed image to that of the input image.
Hope it helps.
Answered by Vesko Vujovic on December 25, 2021
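To see exactly why the resize is needed, the shape drift can be traced with plain integer arithmetic, using the standard Keras shape rules: pooling with padding='same' gives ceil(n / stride), and a 3×3 conv with padding='valid' gives n − 2. (This is just a sketch; note that the printed summary implies the final conv actually ran with 'valid' padding, which is what the trace below assumes.)

```python
import math

def pool_same(n, stride=2):
    # MaxPooling2D with padding='same': output = ceil(n / stride)
    return math.ceil(n / stride)

def conv_valid(n, k=3):
    # Conv2D with padding='valid': output = n - k + 1
    return n - k + 1

h, w = 1169, 827
h, w = pool_same(h), pool_same(w)    # (585, 414)
h, w = pool_same(h), pool_same(w)    # (293, 207)
h, w = conv_valid(h), conv_valid(w)  # (291, 205)  the 'valid' conv after the bottleneck
h, w = 2 * h, 2 * w                  # (582, 410)  UpSampling2D
h, w = 2 * h, 2 * w                  # (1164, 820)
h, w = conv_valid(h), conv_valid(w)  # (1162, 818) matches the summary
print(h, w)                          # 1162 818
```

tf.image.resize(decoded, (1169, 827)) then stretches the 1162×818 reconstruction back to the input size.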
If you look at Keras' output, there are various steps which lose pixels:
Max pooling a dimension of odd size rounds up with padding='same' (1169 → 585), so the subsequent ×2 upsampling overshoots by one pixel (585 → 1170). A Conv2D with a 3×3 kernel and padding='valid' also loses 2 pixels per dimension; this doesn't happen in the downsampling steps because those convolutions use padding='same'.
Intuitively, padding the original images with enough border pixels to compensate for the pixel loss due to the various layers would be the simplest solution. At the moment I can't calculate how much it should be, but I suspect rounding up to a multiple of 4 should take care of the max pooling layers. For denoising, borders could be just copied from the outermost pixels, probably with some sort of low pass filtering to avoid artefacts.
Answered by Hans-Martin Mosner on December 25, 2021
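The border-padding idea above can be sketched with NumPy: pad each image up to a multiple of 4 by copying the outermost pixels (np.pad with mode='edge'), so that two poolings followed by two ×2 upsamplings round-trip exactly, then crop the network output back to the original size. This sketch assumes the decoder's convolutions are all switched to padding='same', so only the pooling/upsampling arithmetic matters; pad_to_multiple is a hypothetical helper, not part of the original code.

```python
import numpy as np

def pad_to_multiple(img, m=4):
    """Pad H and W up to the next multiple of m by repeating edge pixels."""
    h, w = img.shape[:2]
    ph = (-h) % m  # rows to add so height is a multiple of m
    pw = (-w) % m  # columns to add so width is a multiple of m
    return np.pad(img, ((0, ph), (0, pw), (0, 0)), mode='edge'), (h, w)

img = np.zeros((1169, 827, 1), dtype=np.float32)
padded, orig_hw = pad_to_multiple(img)
print(padded.shape)  # (1172, 828, 1)

# With all-'same' convs, two poolings halve exactly (the dims are multiples
# of 4) and two upsamplings double, so the decoder output matches `padded`:
h, w = padded.shape[:2]
assert (h // 2 // 2) * 4 == h and (w // 2 // 2) * 4 == w

# After prediction, crop the output back to the original size:
restored = padded[:orig_hw[0], :orig_hw[1], :]
print(restored.shape)  # (1169, 827, 1)
```

Padding with mode='edge' matches the suggestion above of copying the outermost pixels into the border.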