Data Science Asked on February 11, 2021
Why does RMSProp converge faster than Momentum in many cases?
Momentum:
$$v_{dw} := \beta v_{dw} + (1-\beta)\,dW$$
$$W := W - \alpha v_{dw}$$
RMSProp:
$$S_{dw} := B \cdot S_{dw} + (1-B) \cdot (dW)^2$$
$$W := W - \alpha \frac{dW}{\sqrt{S_{dw}}}$$
Where $\alpha$ is the learning rate (e.g. 0.01) and $\beta$ is the momentum term (e.g. 0.9); $B$ plays a similar role to $\beta$.
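For concreteness, here is a minimal sketch of the two update rules in plain Python. The function names and the small `eps` added to the RMSProp denominator are assumptions made for illustration (the formulas above do not include `eps`; it only guards against division by zero).

```python
import math

def momentum_step(w, grad, v, alpha=0.01, beta=0.9):
    """One Momentum update. w, grad and v are equal-length lists of floats."""
    # v := beta * v + (1 - beta) * dW
    v = [beta * vi + (1 - beta) * gi for vi, gi in zip(v, grad)]
    # W := W - alpha * v
    w = [wi - alpha * vi for wi, vi in zip(w, v)]
    return w, v

def rmsprop_step(w, grad, s, alpha=0.01, B=0.9, eps=1e-8):
    """One RMSProp update. eps is an assumption, not part of the formulas above."""
    # S := B * S + (1 - B) * dW^2, element-wise
    s = [B * si + (1 - B) * gi * gi for si, gi in zip(s, grad)]
    # W := W - alpha * dW / sqrt(S), element-wise
    w = [wi - alpha * gi / (math.sqrt(si) + eps)
         for wi, gi, si in zip(w, grad, s)]
    return w, s

# Example: a single step from zero-initialised state
w_m, v = momentum_step([1.0], [2.0], [0.0])
w_r, s = rmsprop_step([1.0], [2.0], [0.0])
print(w_m, w_r)
```

Note that Momentum keeps one running average of the gradient, while RMSProp keeps one running average of the squared gradient per parameter and divides by its square root.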
From my point of view, both Momentum and RMSProp have a "tendency to keep moving". I can see how RMSProp will naturally accelerate on flat surfaces due to
$$\frac{1}{\sqrt{S_{dw}}}$$
when $S_{dw}$ is small, but is there another benefit that RMSProp provides?
The basic intuition is that you should not use the same learning rate for every dimension. For instance, the loss can have a steep slope in one direction and a shallow one in another, so you should not move at the same speed in both directions. Momentum adds acceleration: think of the gradient as your instantaneous velocity and the running average as your average velocity. In that sense momentum acts like viscosity, or a kind of friction. Near an optimum the gradients approach zero and the average becomes small, so your speed changes only slowly. Both methods have the $\alpha$ term, but what actually gets used is the running average, which is simply an average that is cheap to compute. Take a look at here and here for an analogy.
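To make the "different speed per dimension" point concrete, here is a small sketch on a toy quadratic whose two directions differ in slope by a factor of 10^4. The loss function, starting point and `eps` are assumptions chosen purely for illustration; only the first step (from zero-initialised averages) is compared.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * (100 * w[0]**2 + 0.01 * w[1]**2):
    # steep along w[0], nearly flat along w[1].
    return [100.0 * w[0], 0.01 * w[1]]

alpha, beta, B, eps = 0.01, 0.9, 0.9, 1e-8
w = [1.0, 1.0]
g = grad(w)

# First Momentum step from v = 0: alpha * (1 - beta) * dW
momentum_delta = [alpha * (1 - beta) * gi for gi in g]

# First RMSProp step from S = 0: alpha * dW / sqrt((1 - B) * dW^2)
rms_delta = [alpha * gi / (math.sqrt((1 - B) * gi * gi) + eps) for gi in g]

print("momentum step per dimension:", momentum_delta)
print("rmsprop step per dimension: ", rms_delta)
```

With Momentum the step sizes differ by the same factor as the gradients, so the flat direction barely moves. With RMSProp both first steps come out at about $\alpha/\sqrt{1-B}$, regardless of slope. On later steps the normalisation uses the running average rather than the current gradient alone, so the equalisation is only approximate.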
Answered by Media on February 11, 2021
Momentum is linear in the gradient and gives the update its speed.
RMSprop contributes an exponentially decaying average of past squared gradients.
In RMSprop, dividing by the root of this average diminishes the vertical (oscillating) movement: directions with large gradients get a large denominator and therefore smaller steps.
RMSprop uses this average to scale the update.
Adam combines RMSprop and Momentum, i.e. the speed and the average of the update. On average it speeds up movement in the directions where larger updates are needed.
All three are typically faster than Stochastic Gradient Descent without an exponentially weighted average. In the worst case use Momentum; don't settle for plain weight updates.
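As a rough sketch of how Adam combines the two ideas, the function below keeps a Momentum-style average `m` and an RMSprop-style average `v` and uses both in the update. The function name is hypothetical; the bias-correction terms and the default values (0.001, 0.9, 0.999, 1e-8) come from the standard Adam formulation rather than from anything stated in this answer.

```python
import math

def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count. All list arguments have equal length."""
    # Momentum part: running average of the gradient
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    # RMSprop part: running average of the squared gradient
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    # Bias correction compensates for the zero initialisation of m and v
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Step along the averaged gradient, scaled per dimension
    w = [wi - alpha * mh / (math.sqrt(vh) + eps)
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v

# Example: first step from zero-initialised state
w, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
print(w)
```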
Answered by Varun Bajpai on February 11, 2021