Data Science · Asked by Alexander Engelhardt on June 25, 2021
In the residual learning paper by He et al., there are a number of plots of training/test error versus backprop iteration. I’ve only ever seen smooth curves in such plots, but in this paper there are sudden jumps in improvement. In Fig. 4a of the paper, these occur at around 15e4 and 30e4 iterations.
What happened here? My intuition says that the optimization hit a plateau with a gradient close to zero and then very suddenly found a steep path downward, but is that a realistic shape for a cost function?
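To make the plateau intuition concrete, here is a minimal toy sketch, not the paper's setup: plain gradient descent on a made-up one-dimensional loss with a long, nearly flat shelf followed by a steep cliff. The loss function, learning rate, and step count are all invented for illustration. The resulting curve stays almost constant for thousands of steps and then drops sharply, which is the kind of shape the plateau hypothesis would predict:

```python
# Toy illustration only: a 1-D loss with a long plateau and a sudden drop.
# While x << 5 the gradient is nearly zero, so gradient descent barely moves;
# once x nears 5 the gradient grows and the loss falls off a cliff.
import numpy as np
import matplotlib.pyplot as plt

def loss(x):
    return 1.0 - np.tanh(x - 5.0)          # ~2 on the plateau, ~0 after the drop

def grad(x):
    return -(1.0 - np.tanh(x - 5.0) ** 2)  # derivative of loss; tiny while x << 5

x, lr = 0.0, 1.0                            # hypothetical start point and step size
history = []
for step in range(4000):
    history.append(loss(x))
    x -= lr * grad(x)                       # standard gradient-descent update

plt.plot(history)
plt.xlabel("iteration")
plt.ylabel("loss")
plt.title("Plateau followed by a sudden drop (toy 1-D example)")
plt.show()
```

So a flat-then-sudden-drop loss curve is at least mechanically possible under pure gradient descent; whether that is what actually produced the jumps in Fig. 4a is the question.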