Data Science Asked by Lucian cahil on May 2, 2021
1. Start with a default learning rate of 0.1.
2. Run gradient descent normally.
3. If the cost function ever returns a higher value than in the previous iteration, divide the learning rate by 10.
4. Repeat steps 2-3 until the learning rate falls below $10^{-6}$.
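The scheme in the question can be sketched as follows on a toy quadratic loss; the function and parameter names are illustrative, not from any particular library.

```python
def grad_descent_with_decay(w=0.0, lr=0.1, min_lr=1e-6, max_iters=10_000):
    # Toy loss f(w) = (w - 3)^2 with gradient 2(w - 3); stands in
    # for whatever cost function is being minimized.
    loss = lambda w: (w - 3.0) ** 2
    grad = lambda w: 2.0 * (w - 3.0)

    prev = loss(w)
    for _ in range(max_iters):
        w -= lr * grad(w)          # ordinary gradient descent step
        cur = loss(w)
        if cur > prev:             # cost increased: divide LR by 10
            lr /= 10.0
            if lr < min_lr:        # stop once LR drops below 1e-6
                break
        prev = cur
    return w, lr

w, lr = grad_descent_with_decay()  # converges to w ≈ 3 on this loss
```

On this well-behaved convex loss the cost never increases, so the learning rate is never decayed; the decay only kicks in when a step overshoots.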
Starting with a high value is not a good idea: since we do not know where training starts on the loss surface, a large learning rate can easily diverge. On the other hand, in the middle of training we need a higher LR to move quickly across saddle points (plateaus).
Both the start and the end of training should use a small value.
A well-known approach is the Cyclical Learning Rate (CLR), suggested by Leslie N. Smith in this paper Link
It follows a cyclic schedule, e.g. triangular, sinusoidal, etc.: the learning rate increases from a minimum bound and starts decreasing once it reaches a maximum bound.
The paper clearly explains:
- How to find the bounds
- How to decide the step size
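The triangular policy described above can be written as a small schedule function. This follows the formula given in Smith's paper; the bound values in the usage line are just example numbers, not recommendations.

```python
def triangular_clr(iteration, step_size, base_lr, max_lr):
    # Triangular cyclical learning rate (Smith, "Cyclical Learning
    # Rates for Training Neural Networks"): the LR ramps linearly
    # from base_lr up to max_lr over `step_size` iterations, then
    # linearly back down, and the cycle repeats.
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# Example: step_size=100, bounds 0.001 and 0.006 (illustrative values).
lr_at_start = triangular_clr(0, 100, 0.001, 0.006)    # minimum bound
lr_at_peak = triangular_clr(100, 100, 0.001, 0.006)   # maximum bound
```

Deep learning frameworks ship this schedule ready-made (e.g. `torch.optim.lr_scheduler.CyclicLR`), so in practice you rarely need to hand-roll it.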
Answered by 10xAI on May 2, 2021