Understanding the vertical and horizontal stacks in the Conditional Gated PixelCNN paper

Question

I am confused about the importance of the vertical and horizontal stacks as a solution to the blind spot problem in the original PixelCNN architecture discussed in this paper. The idea of vertical and horizontal stacks was presented in this paper.

After some searching, I found this link, which explains the concept.
In the vertical stack section of that page, I still find that pixel f cannot see pixels c, d and e.

Any help is much appreciated!!

ricvo · Answer

I had the same problem when reading the paper; I believe this is not explicitly mentioned there.
I think that, in the vertical stack, the whole image is padded with a row of zeros at the top. This ensures that the output for any pixel in row i depends only on pixels in rows 0 through i-1. At the same time, you keep the advantage of a receptive field that grows as layers are composed, and the blind spot disappears (provided you stack enough convolutional layers).
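
To make this concrete, here is a rough sketch that tracks which relative pixel offsets a stack of 3x3 layers can reach. It is only my own illustration under simplifying assumptions, not code from the paper: every layer is assumed to use the same kernel support, the helper name receptive_field is made up, and the "pad on top" trick is modelled as shifting the vertical stack's field down by one row.

```python
def receptive_field(offsets, n_layers):
    """Return the (d_row, d_col) offsets reachable after n_layers identical layers.

    Hypothetical helper: `offsets` is the set of kernel positions a single
    layer is allowed to read, relative to the output pixel.
    """
    field = {(0, 0)}
    for _ in range(n_layers):
        field = {(r + dr, c + dc) for (r, c) in field for (dr, dc) in offsets}
    return field


# Original masked 3x3 convolution: the full row above plus the current row
# up to the centre. (Assumption: the centre is kept in every layer; the
# stricter first-layer mask is ignored here.)
masked_offsets = {(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)}

# Vertical stack: an unmasked 3-wide kernel over the row above and the
# current row.
vertical_offsets = {(dr, dc) for dr in (-1, 0) for dc in (-1, 0, 1)}

masked_field = receptive_field(masked_offsets, n_layers=5)

# Padding the top and cropping the bottom moves every dependency one row up,
# so the horizontal stack only receives information from rows strictly above.
vertical_field = {(r - 1, c) for (r, c) in receptive_field(vertical_offsets, n_layers=2)}

blind_pixel = (-1, 2)  # one row up, two columns to the right
print(blind_pixel in masked_field)    # masked convolutions never reach it
print(blind_pixel in vertical_field)  # the shifted vertical stack does
```

With the masked kernel, a pixel one row up can be at most one column to the right, no matter how many layers are stacked, which is the blind spot. The vertical stack widens by one column per layer in both directions, and the one-row shift keeps it strictly above the current row.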

Tobbey · Answer

I am very surprised not to find many more questions about this paper, which is now a few years old, apart from a few blog posts blindly copy-pasting the content of the article.
I'd be happy to get a clearer answer from someone who actually understands where the blind spot is removed.
