
Understanding convolutional pooling sizes (deep learning)

Data Science Asked by FullStack on May 9, 2021

I'm still trying to understand the code provided in this e-book on deep learning, but it doesn't explain where the n_in=40*4*4 comes from. The 40 is from the 40 feature maps in the previous layer, but where does the 4*4 come from?


>>> net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28), 
                      filter_shape=(20, 1, 5, 5), 
                      poolsize=(2, 2), 
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12), 
                      filter_shape=(40, 20, 5, 5), 
                      poolsize=(2, 2), 
                      activation_fn=ReLU),
        FullyConnectedLayer(
            n_in=40*4*4, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        FullyConnectedLayer(
            n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)], 
        mini_batch_size)
>>> net.SGD(expanded_training_data, 40, mini_batch_size, 0.03, 
            validation_data, test_data)

For instance, if I do a similar analysis in 1D as shown below, what should that n_in term be?

>>> net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 81, 1), 
                      filter_shape=(20, 1, 5, 1), 
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 1), 
                      filter_shape=(40, 20, 5, 1), 
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        FullyConnectedLayer(
            n_in=40*???, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        FullyConnectedLayer(
            n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)], 
        mini_batch_size)
>>> net.SGD(expanded_training_data, 40, mini_batch_size, 0.03, 
            validation_data, test_data)

Thanks!

One Answer

In the given example from the e-book, the number $4$ comes from $\frac{12-5+1}{2}$, where $12$ is the input image size ($12 \times 12$) of the second convolutional layer, $5$ is the filter size ($5 \times 5$) used in that layer, and $2$ is the pool size.
This is similar to how you get the number $12$ from the first convolutional layer: $12 = \frac{28-5+1}{2}$. It's well explained in your linked chapter.
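As a quick sketch of that arithmetic (this is just the size formula, not the book's network3.py code; the helper name is my own):

```python
def convpool_output(size, filter_size, pool_size):
    """Size along one dimension after a 'valid' convolution
    followed by non-overlapping pooling."""
    conv_out = size - filter_size + 1  # 'valid' convolution shrinks by filter_size - 1
    assert conv_out % pool_size == 0, "pool size must divide the conv output evenly"
    return conv_out // pool_size

# First layer: 28x28 input, 5x5 filter, 2x2 pool -> 12x12
print(convpool_output(28, 5, 2))  # 12
# Second layer: 12x12 input, 5x5 filter, 2x2 pool -> 4x4
print(convpool_output(12, 5, 2))  # 4
# The fully connected layer therefore sees 40 feature maps of 4x4: n_in = 40*4*4
```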

Regarding your "For instance" code, your 6th line is not correct:

ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 1),

The number 12 should be replaced by $\frac{81-5+1}{2} = 38.5$, which unfortunately is not an integer. You may want to change the filter_shape in the first layer to (6, 1) to make it work. In that case, your 6th line should be:

ConvPoolLayer(image_shape=(mini_batch_size, 20, 38, 1),

and your 11th line should be:

n_in=40*17*1, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
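To double-check those 1D numbers with the same $(size - filter + 1) / pool$ formula (assuming the filter_shape in the first layer has been changed to (6, 1) as suggested above):

```python
# First layer: length 81, filter (6, 1), pool (2, 1)
layer1 = (81 - 6 + 1) // 2   # -> 38, so the second image_shape uses 38
# Second layer: length 38, filter (5, 1), pool (2, 1)
layer2 = (38 - 5 + 1) // 2   # -> 17
print(layer1, layer2, 40 * layer2 * 1)  # 38 17 680
```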

Correct answer by user12075 on May 9, 2021
