Cross Validated
Asked by White on January 1, 2022
My question boils down to this: which optimizer should I use to train my neural network?
I understand that the answer depends on the problem. However, it seems, for instance, that the Adam optimizer generally outperforms SGD, so perhaps there is some rationale behind the choice of optimizer (and its learning rate)?
It may not be mathematically rigorous, but perhaps some of you have come across a standard method for choosing an optimizer?
There is no exact science behind which optimiser to use in a model; it mostly comes down to heuristics and the type of model in use.
Adam is not generally better than SGD: this blog post reviews a toy experiment (dogs vs. cats) in which SGD turned out to be the better optimiser: https://shaoanlu.wordpress.com/2017/05/29/sgd-all-which-one-is-the-best-optimizer-dogs-vs-cats-toy-experiment/
Each optimiser has pros and cons, and no single optimiser works best on every problem.
This overview compares the different gradient-descent optimisers: https://ruder.io/optimizing-gradient-descent/
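In practice, a pragmatic approach is simply to try a few optimisers and learning rates and compare them on held-out data. Below is a minimal PyTorch sketch of such a comparison; the synthetic data, model architecture, and hyperparameter values are placeholder assumptions, not recommendations:

    import torch
    import torch.nn as nn

    # Placeholder synthetic regression task: y = 3x + 1 plus noise.
    torch.manual_seed(0)
    X = torch.randn(256, 1)
    y = 3 * X + 1 + 0.1 * torch.randn(256, 1)

    def make_model():
        # Small MLP; the architecture is an arbitrary stand-in.
        return nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

    def train(optimizer_name, lr, epochs=200):
        model = make_model()
        if optimizer_name == "sgd":
            opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        else:
            opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()
        return loss.item()

    # Try a few optimiser/learning-rate combinations on the same data.
    for name, lr in [("sgd", 0.1), ("sgd", 0.01), ("adam", 1e-3), ("adam", 1e-2)]:
        print(f"{name:5s} lr={lr:<6} final training loss: {train(name, lr):.5f}")

In a real experiment you would compare validation loss rather than training loss, and ideally average over several random seeds; which combination wins is problem dependent, which is exactly the point above.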
Answered by Vivek on January 1, 2022