Data Science Asked on June 16, 2021
While using the "Two class neural network" module in Azure ML, I encountered the "Momentum" property. The documentation, which is not very clear, says:
For The momentum, type a value to apply during learning as a weight on
nodes from previous iterations.
That is still not very clear to me. Can someone please explain?
Momentum in neural networks is a variant of stochastic gradient descent: it replaces the raw gradient with a momentum term, an exponentially decaying aggregate of past gradients, as very well explained here.
It is also the common name given to the momentum factor, as in your case.
Maths
The momentum factor is a coefficient applied to an extra term in the weight update. The standard form (the one illustrated in the Visual Studio Magazine post the original image came from) is:

Δw(t) = -η ∂E/∂w + α Δw(t-1)

where η is the learning rate, ∂E/∂w is the current gradient of the error, α is the momentum factor, and Δw(t-1) is the weight update from the previous iteration.
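A minimal sketch of this update rule in plain Python (toy 1-D objective and hyperparameter values chosen for illustration, nothing Azure-specific):

```python
# Gradient descent with momentum on the toy objective f(w) = w**2.
# The velocity v accumulates an exponentially decaying sum of past updates:
# v(t) = alpha * v(t-1) - eta * grad(w), then w is moved by v.

def grad(w):
    # derivative of f(w) = w**2
    return 2.0 * w

eta = 0.1     # learning rate
alpha = 0.9   # momentum factor
w, v = 5.0, 0.0

for _ in range(200):
    v = alpha * v - eta * grad(w)  # momentum term: alpha times the previous update
    w = w + v                      # apply the accumulated update

print(w)  # ends up very close to the minimum at w = 0
```

With alpha = 0 this reduces to plain gradient descent; larger alpha makes each step remember more of the previous steps.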
Advantages
Among other benefits, momentum is known to speed up learning and to help avoid getting stuck in local minima.
Intuition behind
As is really nicely explained in this Quora post, the idea of momentum comes from physics:
Momentum is a physical property that enables an object with mass to continue along its trajectory even when an external opposing force is applied, which means overshooting. For example, if one speeds up a car and then suddenly hits the brakes, the car will skid and stop a short distance past the mark on the ground, overshooting it.
The same concept applies to neural networks: during training, the update direction tends to resist change when momentum is added to the update scheme. When the neural net approaches a shallow local minimum, it's like applying the brakes, but not enough to instantly change the update direction and magnitude. Hence neural nets trained this way will overshoot past smaller local minima and stop only in a deeper, global minimum.
Thus momentum in neural nets helps them escape local minima so that a deeper, more important minimum can be found. Too much momentum may create issues as well: unstable systems can produce oscillations that grow in magnitude, in which case one needs to add decay terms and so on. It's just physics applied to neural net training, or numerical optimization in general.
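The "too much momentum" point can be illustrated with a small sketch (a toy quadratic of my own choosing): with the same learning rate, a moderate momentum factor settles quickly at the minimum of f(x) = x², while a very large one keeps the iterate swinging around it for a long time.

```python
# Heavy-ball momentum on f(x) = x**2 (gradient 2x), same learning rate,
# two momentum factors: a moderate one converges quickly, an excessive
# one keeps oscillating around the minimum for many iterations.

def run(mu, steps=100, eta=0.1):
    x, v = 1.0, 0.0
    trail = []
    for _ in range(steps):
        v = mu * v - eta * 2.0 * x  # momentum update
        x = x + v
        trail.append(x)
    return trail

moderate = run(0.4)
excessive = run(0.99)

print(abs(moderate[-1]))                      # essentially at the minimum
print(max(abs(x) for x in excessive[-20:]))   # still swinging noticeably
```

The oscillation with mu = 0.99 does eventually decay, but only at a rate of roughly sqrt(mu) per step, which is why practitioners add decay or pick mu well below 1.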
In video
This video shows backpropagation runs for different momentum values.
Other interesting posts
How does the momentum term for backpropagation algorithm work?
Hope it helps.
Correct answer by etiennedm on June 16, 2021
As a non-formal and non-thorough definition, you can understand momentum in gradient descent as inertia.
So when you are going down the hill in the optimization problem, you just add "momentum" to the descent, and it helps with things like noise in the data, saddle points, and so on.
For a more thorough analysis see https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
This is not specific to Azure; it is common to all neural networks.
Answered by Carlos Mougan on June 16, 2021
Momentum is a technique to damp erratic movement. When the gradient is computed at every iteration, it can point in a totally different direction each time, so the steps trace a zigzag path, which makes training very slow.
To prevent this from happening, momentum stabilizes the movement. You can find more in the following article.
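The zigzag effect can be sketched on an ill-conditioned quadratic (a toy example of my own, not from the article): vanilla gradient descent bounces back and forth across the narrow valley of f(x, y) = 0.5·(x² + 25y²), while momentum damps the bouncing and reaches the minimum much faster.

```python
# Ill-conditioned quadratic: f(x, y) = 0.5 * (x**2 + 25 * y**2).
# Plain gradient descent zigzags along the steep y-direction; momentum damps it.

def grad(x, y):
    return x, 25.0 * y

def vanilla(steps=50, eta=0.0769):
    x, y = 1.0, 1.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - eta * gx, y - eta * gy
    return (x * x + y * y) ** 0.5  # distance from the minimum at (0, 0)

def with_momentum(steps=50, eta=0.111, mu=0.444):
    x, y = 1.0, 1.0
    vx, vy = 0.0, 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        vx, vy = mu * vx - eta * gx, mu * vy - eta * gy
        x, y = x + vx, y + vy
    return (x * x + y * y) ** 0.5

print(vanilla())        # still noticeably far from the minimum
print(with_momentum())  # essentially at the minimum
```

The step sizes here are close to the theoretically best choices for each method on this quadratic, so the comparison is fair: momentum wins because it averages out the sign-flipping y-component while accumulating speed along the shallow x-direction.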
Answered by Hazarapet Tunanyan on June 16, 2021