
How is the error calculated with multiple output neurons in a neural network?

Data Science Asked on May 10, 2021

Machine Learning books generally explain that the error calculated for a given sample $i$ is:

$e_i = y_i - \hat{y}_i$

Where $\hat{y}$ is the target output and $y$ is the actual output given by the network. So, a loss function $L$ is calculated:

$L = \frac{1}{2N}\sum^{N}_{i=1}(e_i)^2$
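A minimal NumPy sketch of this loss (array names are illustrative, not from the question):

```python
import numpy as np

y = np.array([2.0, 1.5, 3.0])      # actual network outputs, one per sample
y_hat = np.array([2.5, 1.0, 3.0])  # target outputs

e = y - y_hat                      # per-sample error e_i
L = np.sum(e ** 2) / (2 * len(y))  # L = 1/(2N) * sum(e_i^2)
print(L)                           # 0.08333...
```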

The above scenario is explained for a binary classification/regression problem. Now, let's assume an MLP network with $m$ neurons in the output layer for a multiclass classification problem (generally one neuron per class).

What changes in the equations above? Since we now have multiple outputs, shouldn't both $e_i$ and $y_i$ be vectors?
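To make the question concrete, here is a hedged sketch (shapes and names assumed) of what the same MSE looks like when each sample produces a vector of $m$ outputs: $e_i$ becomes a vector and its squared norm replaces $(e_i)^2$:

```python
import numpy as np

N, m = 4, 3                    # 4 samples, 3 output neurons
rng = np.random.default_rng(0)
y = rng.random((N, m))         # network outputs, shape (N, m)
y_hat = rng.random((N, m))     # targets, shape (N, m)

e = y - y_hat                  # error vectors e_i, shape (N, m)
L = np.sum(e ** 2) / (2 * N)   # squared norm of each e_i, summed over samples
print(L)
```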

One Answer

You are mixing various concepts:

  • $L = \frac{1}{2N}\sum^{N}_{i=1}(e_i)^2$ (MSE) is used only for regression problems, not for binary classification, because MSE fits well when your target distribution is normal
  • You can use the latter formula for binary classification, but it will work really badly because your target data distribution is a Bernoulli, not a normal. Remember that the choice of loss function implies a prior assumption about the target data distribution. For this reason the right formula is binary crossentropy (aka the negative log likelihood of a Bernoulli) $$ L = - \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right] $$
  • For a multiclass classification problem there is a generalization of binary crossentropy called categorical crossentropy. If $\hat{y}_i$ is a vector of $C$ elements, one for each class, and the true class $y_i$ is encoded as an integer (e.g. 0, 1, 2, ...), then the loss is $$ L = - \sum_i \log(\hat{y}_i[y_i]) $$
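A hedged NumPy sketch of both losses from the bullets above (variable names are illustrative; a small epsilon guards against `log(0)`):

```python
import numpy as np

eps = 1e-12  # guard against log(0)

# Binary crossentropy: y in {0, 1}, y_hat = predicted probability of class 1
y = np.array([1, 0, 1])
y_hat = np.array([0.9, 0.2, 0.7])
bce = -np.sum(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))

# Categorical crossentropy: each row of y_hat_cat is a probability vector
# over C classes; y_true holds the integer class index of each sample
y_true = np.array([0, 2])
y_hat_cat = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.3, 0.6]])
cce = -np.sum(np.log(y_hat_cat[np.arange(len(y_true)), y_true] + eps))

print(bce, cce)  # bce ≈ 0.685, cce ≈ 0.867
```

Note that categorical crossentropy picks out, for each sample, only the predicted probability of the true class, which is exactly the $\hat{y}_i[y_i]$ term in the formula.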

Answered by Mikedev on May 10, 2021
