Data Science Asked on August 26, 2021
When implementing a custom loss function, how can it be made invariant to the batch size? For example, say dice loss is being implemented. The formula for dice loss is:
$$
\begin{gathered}
\sum_{c} \left(1 - \text{DSC}_{c}\right) \\
c: \text{Classes},\hspace{5mm} \text{DSC}_{c}: \text{Dice score of class } c
\end{gathered}
$$
This formula does not clarify how to deal with the batch size. It is worth noting that, according to this formula, the value of the loss will increase if the batch size is increased. The obvious intuition would be to normalize the loss value using the batch size. I have two questions in this context.
1. Should the loss be normalized using the batch size?
2. Suppose the tensor at hand is 4D. The dice score / loss can be calculated for each image and each class, resulting in a 2D tensor like the one below, where $n$ is the number of classes and $m$ is the batch size.
$$
\begin{bmatrix}
l_{1, 1} & \dots & l_{1, n} \\
\vdots & \ddots & \vdots \\
l_{m, 1} & \dots & l_{m, n}
\end{bmatrix}
$$
Now the aggregated loss can be calculated as:
$$
\sum_{j=1}^{n}\frac{1}{m}\sum_{i=1}^{m} l_{i, j} = \frac{1}{m}\sum_{j=1}^{n}\sum_{i=1}^{m} l_{i, j}
$$
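As a small numerical check of the identity above, here is a minimal sketch in plain Python. The loss values and the function names `aggregate_mean_then_sum` and `aggregate_double_sum` are made up for illustration; only the $m \times n$ layout (rows are images, columns are classes) comes from the question.

```python
def aggregate_mean_then_sum(loss_matrix):
    """Left-hand side: average each class column over the batch, then sum over classes."""
    m = len(loss_matrix)        # batch size (number of rows)
    n = len(loss_matrix[0])     # number of classes (number of columns)
    return sum(sum(row[j] for row in loss_matrix) / m for j in range(n))


def aggregate_double_sum(loss_matrix):
    """Right-hand side: sum every entry, then divide once by the batch size."""
    m = len(loss_matrix)
    return sum(sum(row) for row in loss_matrix) / m


# Hypothetical per-image, per-class losses with m = 2 images and n = 3 classes.
losses = [
    [0.2, 0.5, 0.1],
    [0.4, 0.3, 0.3],
]

lhs = aggregate_mean_then_sum(losses)
rhs = aggregate_double_sum(losses)

# Repeating the same images doubles m but should leave the normalized value unchanged.
rhs_doubled_batch = aggregate_double_sum(losses + losses)

print(lhs, rhs, rhs_doubled_batch)
```

Up to floating-point rounding, the three printed values agree, which is the sense in which this aggregate does not grow with $m$.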
Alternatively, the loss can be computed across the whole batch from the beginning, generating a per-class loss in a 1D tensor with entries $l_{k}^{\prime}$. Accordingly, the normalized aggregated loss will be:
$$
\frac{1}{m}\sum_{k=1}^{n} l_{k}^{\prime}
$$
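To make the comparison concrete, here is a minimal NumPy sketch that evaluates both aggregates on a toy batch. It assumes a `(m, n, H, W)` tensor layout, binary masks, and a small smoothing constant `EPS`. None of these are specified in the question, and the function names are hypothetical.

```python
import numpy as np

EPS = 1e-7  # assumed smoothing term, only there to avoid division by zero


def dice_loss_per_image(pred, target):
    """Return an (m, n) array of 1 - DSC, one entry per image and class."""
    inter = (pred * target).sum(axis=(2, 3))
    denom = pred.sum(axis=(2, 3)) + target.sum(axis=(2, 3))
    return 1.0 - (2.0 * inter + EPS) / (denom + EPS)


def dice_loss_pooled(pred, target):
    """Return an (n,) array of 1 - DSC with pixels pooled over the whole batch."""
    inter = (pred * target).sum(axis=(0, 2, 3))
    denom = pred.sum(axis=(0, 2, 3)) + target.sum(axis=(0, 2, 3))
    return 1.0 - (2.0 * inter + EPS) / (denom + EPS)


def aggregate_per_image(pred, target):
    """Per-image approach: batch mean of each class loss, summed over classes."""
    return dice_loss_per_image(pred, target).mean(axis=0).sum()


def aggregate_pooled(pred, target):
    """Pooled approach as written in the question: sum of l'_k divided by m."""
    m = pred.shape[0]
    return dice_loss_pooled(pred, target).sum() / m


# Toy batch: m = 2 images, n = 2 classes, 2 x 2 pixels (made-up binary masks).
pred = np.array([
    [[[1, 1], [0, 0]], [[0, 0], [1, 1]]],
    [[[1, 1], [1, 1]], [[0, 0], [0, 0]]],
], dtype=float)
target = np.array([
    [[[1, 0], [0, 0]], [[0, 1], [1, 1]]],
    [[[1, 1], [1, 0]], [[0, 0], [0, 1]]],
], dtype=float)

per_image_value = aggregate_per_image(pred, target)
pooled_value = aggregate_pooled(pred, target)

# Repeat the same images to see how each aggregate reacts to a larger batch.
pred2 = np.concatenate([pred, pred])
target2 = np.concatenate([target, target])
per_image_value_2x = aggregate_per_image(pred2, target2)
pooled_value_2x = aggregate_pooled(pred2, target2)

print(per_image_value, pooled_value)
print(per_image_value_2x, pooled_value_2x)
```

On this toy batch the two aggregates give different values. Repeating the batch leaves the per-image aggregate unchanged, while the pooled aggregate is halved, because the pooled $l_{k}^{\prime}$ stay the same and $m$ doubles.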
Is there any fundamental difference between these two approaches? Which one is correct (if any)?