Data Science Asked on December 18, 2020
I am trying to debug a neural network and I am seeing gradients close to zero. How can I decide whether these gradients are vanishing or not? Is there some threshold for deciding, just by looking at the values? I am getting values on the order of four decimal places (0.0001), and in some cases five decimal places (0.00001). The network does not seem to be learning, since the weight histograms are also very similar across all epochs. I am using the ReLU activation and the Adam optimizer. What could cause a vanishing gradient with ReLU activation? If possible, please point me to some resources that might be helpful. Thanks in advance.
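For concreteness, here is a minimal sketch of how such per-epoch weight histograms and near-zero gradient values could be logged. It assumes a PyTorch model and TensorBoard; the model below is only a hypothetical stand-in.

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

# Hypothetical stand-in model with ReLU activations, as in the question.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
# Adam optimizer as in the question (used during training, not shown here).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
writer = SummaryWriter()

def log_epoch(epoch: int) -> None:
    for name, param in model.named_parameters():
        # Weight distributions: if these histograms barely change from one
        # epoch to the next, the layer is receiving almost no effective update.
        writer.add_histogram(f"weights/{name}", param.detach(), epoch)
        if param.grad is not None:
            # Mean absolute gradient: a value like 1e-4 or 1e-5 on its own is
            # not proof of vanishing; it has to be compared across layers.
            writer.add_scalar(f"grad_abs_mean/{name}",
                              param.grad.detach().abs().mean().item(), epoch)
```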
Vanishing gradients refer to the gradient updates applied to the weights (not the weight values themselves) across the layers after each training batch.
You should compare the magnitude of this update signal at the top-most layer with that at the lowest layer after a single batch; a much smaller signal in the lower layers indicates vanishing gradients, as in the sketch below.
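A minimal sketch of that check, assuming a PyTorch model (the model, shapes, and data below are hypothetical): run one batch, backpropagate, and compare the gradient norm of the lowest layer with that of the top-most layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical model: three linear layers with ReLU activations.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(),
                      nn.Linear(32, 1))
x, y = torch.randn(16, 64), torch.randn(16, 1)

# One training batch: forward pass, loss, backward pass.
loss = F.mse_loss(model(x), y)
loss.backward()

lowest = model[0].weight.grad.norm().item()    # input-side layer
topmost = model[-1].weight.grad.norm().item()  # output-side layer
print(f"lowest-layer grad norm:  {lowest:.2e}")
print(f"topmost-layer grad norm: {topmost:.2e}")

# If the lowest layer's norm is orders of magnitude smaller than the
# top-most layer's (and keeps shrinking over training), the gradient is
# vanishing; comparable magnitudes suggest the small absolute values are
# just a matter of scale rather than a vanishing-gradient problem.
```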
Answered by Brian Spiering on December 18, 2020