Data Science Asked by nmtp on June 16, 2021
I’m currently using an autoencoder CNN, built on the VGG-16 architecture, that was designed by someone else. I want to replicate their results on their dataset first, but I’m finding that:
- Validation losses diverge from training losses fairly early on (by around 10 epochs it already looks like it’s overfitting)
- At its best, the validation loss isn’t even close to being as low as the training loss
- In general, the accuracy is worse than that reported in their paper
I’m new to machine learning. Are there hyperparameters I should try changing, or other things I can tinker with, without modifying the architecture?
Are you in fact using the same architecture as they are? If not, that could be the problem.
Otherwise, are you using the same training protocol as they do, i.e. optimizer, learning rate, learning-rate schedule, batch size, preprocessing, weight initialization, and number of training epochs? Depending on the size of your model and the amount of training data, 10 epochs might not be enough to judge your model’s performance.
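As one example of a protocol detail that is easy to get wrong, here is a sketch of a step-decay learning-rate schedule in plain Python. The constants (initial rate, drop factor, drop interval) are placeholders, not values from the paper you are replicating; you would substitute whatever the authors report:

```python
def step_decay_lr(initial_lr, drop_factor, epochs_per_drop, epoch):
    """Step-decay schedule: multiply the learning rate by drop_factor
    every epochs_per_drop epochs. All constants are placeholders."""
    return initial_lr * (drop_factor ** (epoch // epochs_per_drop))

# Example: start at 0.1, halve every 10 epochs (hypothetical values)
print(step_decay_lr(0.1, 0.5, 10, 0))    # epoch 0  -> 0.1
print(step_decay_lr(0.1, 0.5, 10, 10))   # epoch 10 -> 0.05
print(step_decay_lr(0.1, 0.5, 10, 25))   # epoch 25 -> 0.025
```

If the original authors decayed the learning rate and you train at a fixed rate (or vice versa), training and validation curves can look quite different even with an identical architecture.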
Can you link the paper?
Answered by Tinu on June 16, 2021