
Do larger batches decrease the effective learning rate because of a technical artifact?

Asked on Data Science, August 18, 2021

I’m training a neural network for a classification task and experimenting with different batch sizes. I’m using the negative log-likelihood loss, averaged over the samples in each batch.

I realized that because I keep the number of epochs and the learning rate constant, and because I average the loss over the samples in the batch, I get slower convergence with larger batches simply because doubling the batch size halves the number of gradient updates per epoch, while the magnitude of each update stays roughly the same.
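For concreteness, here is a minimal sketch of the effect (assuming a PyTorch-style setup; the model, batch size, and seed below are placeholders, not my actual code). Averaging keeps the gradient magnitude roughly constant as the batch grows, whereas summing would scale it with the batch size:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy illustration (PyTorch assumed; model, sizes and seed are placeholders).
torch.manual_seed(0)
model = nn.Linear(10, 3)                      # tiny "classifier"
x = torch.randn(64, 10)                       # one batch of 64 samples
y = torch.randint(0, 3, (64,))
log_probs = F.log_softmax(model(x), dim=1)    # NLLLoss expects log-probabilities

# With reduction='mean', the gradient is the average of per-sample gradients,
# so its magnitude barely changes with batch size.
model.zero_grad()
F.nll_loss(log_probs, y, reduction='mean').backward(retain_graph=True)
grad_mean = model.weight.grad.norm().item()

# With reduction='sum', the gradient scales linearly with the batch size.
model.zero_grad()
F.nll_loss(log_probs, y, reduction='sum').backward()
grad_sum = model.weight.grad.norm().item()

print(f"mean-reduced gradient norm: {grad_mean:.4f}")
print(f"sum-reduced  gradient norm: {grad_sum:.4f}  (= 64 x the mean-reduced norm)")
```

So with the mean reduction, a larger batch gives me fewer updates per epoch without any compensating increase in step size.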

How can I correct for this artifact and study the real effect of batch size (and batch homogeneity) for my task, as is done here? Should I simply stop averaging and instead sum the loss over the samples in the batch?
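The two fixes I’m weighing would look roughly like this (again a PyTorch sketch; the model, base learning rate, and batch sizes are hypothetical placeholders):

```python
import torch
import torch.nn as nn

# Sketch of the two candidate fixes (PyTorch assumed; model, base_lr,
# base_batch_size and batch_size below are hypothetical placeholders).
model = nn.Linear(10, 3)
base_lr, base_batch_size, batch_size = 0.01, 32, 128

# Option A: keep the mean-reduced loss and scale the learning rate linearly
# with the batch size (the linear scaling rule of Goyal et al., 2017).
criterion_a = nn.NLLLoss(reduction='mean')
opt_a = torch.optim.SGD(model.parameters(),
                        lr=base_lr * batch_size / base_batch_size)

# Option B: sum the loss over the batch instead of averaging it. For plain
# SGD this multiplies every step by batch_size, so dividing the base rate
# by the reference batch size makes it match Option A exactly; the two are
# NOT interchangeable for adaptive optimizers such as Adam.
criterion_b = nn.NLLLoss(reduction='sum')
opt_b = torch.optim.SGD(model.parameters(), lr=base_lr / base_batch_size)
```

Either way, the total amount of parameter movement per epoch becomes comparable across batch sizes, so any remaining difference should reflect the batch size itself rather than the reduced number of updates.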
