Padding in Convolution Formula

Data Science Asked by dontloseyourgoalie on March 9, 2021

Why is it that the formula for each element in a convolution between an image $I$ and a $k \times k$ kernel $K$ is
$$ (I*K)_{ij}=\sum_{m=0}^{k-1}\sum_{n=0}^{k-1}I_{(i-m),(j-n)}K_{mn}=\sum_{m=0}^{k-1}\sum_{n=0}^{k-1}I_{(i+m),(j+n)}K_{-m,-n}$$

Since the two double sums are supposed to be equal, they should cover the same region of $I$. But the left formula covers the region $I[i-(k-1):i, j-(k-1):j]$, whereas the right formula covers the region $I[i:i+(k-1), j:j+(k-1)]$ (endpoints inclusive in both cases).
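To make the mismatch concrete, here is a minimal sketch in plain Python that lists which row indices of $I$ each sum touches. The helper names `left_rows` and `right_rows` and the values `i = 5`, `k = 3` are made up for illustration; the column indices behave the same way.

```python
def left_rows(i, k):
    # Rows touched by I_{(i-m), .} for m = 0, ..., k-1 (left formula)
    return sorted(i - m for m in range(k))


def right_rows(i, k):
    # Rows touched by I_{(i+m), .} for m = 0, ..., k-1 (right formula)
    return sorted(i + m for m in range(k))


i, k = 5, 3  # arbitrary example values
print(left_rows(i, k))   # [3, 4, 5]
print(right_rows(i, k))  # [5, 6, 7]
```

The two index sets only share the single row `i`, which is the discrepancy described above.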

Shouldn’t the formula instead be
$$ (I*K)_{ij}=\sum_{m=0}^{k-1}\sum_{n=0}^{k-1}I_{(i-m),(j-n)}K_{mn}=\sum_{m=0}^{k-1}\sum_{n=0}^{k-1}I_{(i-p+m),(j-p+n)}K_{-m,-n}$$

so that both sums cover the same region when the padding is $p=k-1$?
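As a sanity check of this proposal, here is a rough sketch in plain Python. It rests on an assumption that is not stated in the formula itself: $K_{-m,-n}$ is read as the entry of the flipped kernel, i.e. `K[p - m][p - n]` with `p = k - 1`. The function names and the test data are hypothetical.

```python
def conv_left(I, K, i, j):
    # Left formula: sum over I_{(i-m),(j-n)} * K_{mn}
    k = len(K)
    return sum(I[i - m][j - n] * K[m][n]
               for m in range(k) for n in range(k))


def conv_shifted(I, K, i, j):
    # Proposed formula with offset p = k - 1.
    # Assumption: K_{-m,-n} means the flipped kernel entry K[p-m][p-n].
    k = len(K)
    p = k - 1
    return sum(I[i - p + m][j - p + n] * K[p - m][p - n]
               for m in range(k) for n in range(k))


def conv_unshifted(I, K, i, j):
    # Right-hand side as originally written (no offset), same flip assumption.
    k = len(K)
    p = k - 1
    return sum(I[i + m][j + n] * K[p - m][p - n]
               for m in range(k) for n in range(k))


# Small 6x6 "image" and 3x3 kernel, chosen so that no index leaves the image.
I = [[6 * r + c for c in range(6)] for r in range(6)]
K = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
i, j = 3, 3

print(conv_left(I, K, i, j) == conv_shifted(I, K, i, j))    # True
print(conv_left(I, K, i, j) == conv_unshifted(I, K, i, j))  # False
```

Under that reading of $K_{-m,-n}$, the shifted sum reproduces the left formula on this example, while the unshifted one does not.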

Tags: backpropagation, convolution, convolutional neural network, neural network
