Data Science Asked on December 16, 2021
I have a dataset with at least 70% of labels incorrect.
I’d expect the incorrect labels to cancel each other out while the true labels are still learned properly (given a very large dataset).
For example, if I have 300 samples saying a => -1 and 300 samples saying a => 1, the result for the input "a" will eventually be 0 (for a regression problem).
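As a quick check of that intuition, here is a minimal sketch that assumes a mean-squared-error loss (the question does not name the loss) and a model that outputs one constant value for the input "a". Under that assumption, the loss-minimizing prediction for 300 labels of -1 and 300 labels of 1 is their mean. The names `labels`, `mse`, and `candidates` are illustrative only.

```python
import numpy as np

# 300 samples labelled -1 and 300 labelled +1 for the same input "a"
labels = np.concatenate([-np.ones(300), np.ones(300)])

def mse(prediction, targets):
    """Mean squared error of one constant prediction against all targets."""
    return np.mean((prediction - targets) ** 2)

# Scan candidate predictions in [-1, 1] and keep the one with the lowest loss
candidates = np.linspace(-1.0, 1.0, 201)
losses = np.array([mse(c, labels) for c in candidates])
best = candidates[np.argmin(losses)]

print("best constant prediction:", best)
print("mean of the labels:", labels.mean())
```

With a different loss (for example, mean absolute error) the minimizer need not be the mean, so the cancellation argument depends on the loss being used.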
If I use Adam for the example above, won’t it affect the results for the inputs with noisy labels due to its adaptive nature? Won’t it be better to use SGD instead and decay the learning rate, or does Adam change its weights only at the end of every epoch?
Adam works the same way as SGD in this regard: it updates the weights after every iteration (that is, after every mini-batch), so by the end of an epoch many weight updates have already been applied.
Neither Adam nor SGD inherently does anything to counteract noisy labels; both simply try to find the parameters that minimize the loss function. I don't think anyone can say a priori whether Adam or SGD will work better for your problem.
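To make the per-iteration update concrete, here is a rough sketch in plain NumPy, not taken from any particular library. It fits a single scalar prediction `w` to the 300/300 conflicting labels from the question with an MSE loss, using either vanilla SGD or a hand-written Adam step with the usual default betas. The function `train`, the batch size, the learning rate, and the starting value are all assumptions chosen for illustration.

```python
import numpy as np

def train(optimizer, epochs=20, batch_size=20, lr=0.01, seed=0):
    """Fit one scalar prediction w to conflicting labels; return (w, number of updates)."""
    rng = np.random.default_rng(seed)
    labels = np.concatenate([-np.ones(300), np.ones(300)])
    w = 0.5                      # arbitrary starting prediction for input "a"
    m = v = 0.0                  # Adam's first and second moment estimates
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    step = 0
    for _ in range(epochs):
        rng.shuffle(labels)      # new mini-batch composition every epoch
        for start in range(0, len(labels), batch_size):
            batch = labels[start:start + batch_size]
            grad = 2.0 * (w - batch.mean())   # d/dw of the batch MSE
            step += 1                         # one weight update per mini-batch
            if optimizer == "sgd":
                w -= lr * grad
            else:  # "adam"
                m = beta1 * m + (1 - beta1) * grad
                v = beta2 * v + (1 - beta2) * grad ** 2
                m_hat = m / (1 - beta1 ** step)   # bias correction
                v_hat = v / (1 - beta2 ** step)
                w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, step

w_sgd, n_sgd = train("sgd")
w_adam, n_adam = train("adam")
print("SGD :", w_sgd, "after", n_sgd, "updates")
print("Adam:", w_adam, "after", n_adam, "updates")
```

In this toy setting both optimizers perform 30 updates per epoch and both end up hovering near 0, the mean of the labels. That says nothing about which one will behave better on a real model with 70% label noise; that still has to be checked empirically.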
Answered by Djib2011 on December 16, 2021