Data Science Asked on December 18, 2020
I am trying to debug a neural network and I am seeing gradients close to zero. How can I decide whether these gradients are vanishing or not? Is there some threshold for deciding, just by looking at the values? I am getting values on the order of four decimal places (0.0001), and in some cases five decimal places (0.00001). The network does not seem to be learning, since the weight histograms are also very similar across all epochs. I am using the ReLU activation and the Adam optimizer. What could cause a vanishing gradient with ReLU activation? If possible, please point me to some resources that might be helpful. Thanks in advance.
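For concreteness, here is a minimal sketch of how such per-epoch weight histograms and near-zero gradient values could be logged. It assumes a PyTorch model and TensorBoard; the model below is only a hypothetical stand-in.

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

# Hypothetical stand-in model with ReLU activations, as in the question.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
# Adam optimizer as in the question (used during training, not shown here).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
writer = SummaryWriter()

def log_epoch(epoch: int) -> None:
    for name, param in model.named_parameters():
        # Weight distributions: if these histograms barely change from one
        # epoch to the next, the layer is receiving almost no effective update.
        writer.add_histogram(f"weights/{name}", param.detach(), epoch)
        if param.grad is not None:
            # Mean absolute gradient: a value like 1e-4 or 1e-5 on its own is
            # not proof of vanishing; it has to be compared across layers.
            writer.add_scalar(f"grad_abs_mean/{name}",
                              param.grad.detach().abs().mean().item(), epoch)
```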
Vanishing gradients refer to the gradient updates applied to the weights (not the weight values themselves) across the layers after each training batch.
You should compare the magnitude of this update signal at the top-most layer with that at the lowest layer after a single batch; a much smaller signal in the lower layers indicates vanishing gradients, as in the sketch below.
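A minimal sketch of that check, assuming a PyTorch model (the model, shapes, and data below are hypothetical): run one batch, backpropagate, and compare the gradient norm of the lowest layer with that of the top-most layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical model: three linear layers with ReLU activations.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(),
                      nn.Linear(32, 1))
x, y = torch.randn(16, 64), torch.randn(16, 1)

# One training batch: forward pass, loss, backward pass.
loss = F.mse_loss(model(x), y)
loss.backward()

lowest = model[0].weight.grad.norm().item()    # input-side layer
topmost = model[-1].weight.grad.norm().item()  # output-side layer
print(f"lowest-layer grad norm:  {lowest:.2e}")
print(f"topmost-layer grad norm: {topmost:.2e}")

# If the lowest layer's norm is orders of magnitude smaller than the
# top-most layer's (and keeps shrinking over training), the gradient is
# vanishing; comparable magnitudes suggest the small absolute values are
# just a matter of scale rather than a vanishing-gradient problem.
```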
Answered by Brian Spiering on December 18, 2020