Data Science Asked by Nagendra Prasad on September 2, 2021
In the documentation it is mentioned that y_pred needs to be in the range [-inf, inf] when from_logits=True. I truly don't understand what this means, since probabilities need to be in the range 0 to 1! Can someone please explain in simple words the effect of using from_logits=True?
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
The from_logits=True attribute informs the loss function that the output values generated by the model are not normalized, a.k.a. logits. In other words, the softmax function has not been applied to them to produce a probability distribution. Therefore, the output layer in this case has no softmax activation function:
out = tf.keras.layers.Dense(n_units) # <-- linear activation function
The softmax function is then applied automatically to the output values by the loss function itself. Therefore, this makes no difference from the scenario where you use from_logits=False (the default) together with a softmax activation on the last layer; however, in some cases it can help with numerical stability during training. You may also find this and this answer relevant and useful regarding numerical stability when from_logits=True.
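Not from the original answer, but a minimal NumPy sketch of why the logits path can be more stable: with from_logits=True the loss can compute the log-softmax directly via the log-sum-exp trick (shifting by the max logit), instead of exponentiating raw logits first and then taking a log, which overflows for large logits. The 3-class logits and the label here are made-up illustration values.

```python
import numpy as np

label = 0  # hypothetical integer class label

# Moderate logits: the naive path (softmax, then -log of the picked
# probability) and the logits path (log-softmax via log-sum-exp) agree.
logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()
naive_loss = -np.log(probs[label])

shifted = logits - logits.max()
log_probs = shifted - np.log(np.exp(shifted).sum())
stable_loss = -log_probs[label]  # numerically equal to naive_loss

# Extreme logits: np.exp(1000.0) overflows to inf, so the naive path
# produces inf/inf = nan probabilities and a nan loss.
big = np.array([1000.0, 0.0, -1000.0])
with np.errstate(over='ignore', invalid='ignore'):
    bad_probs = np.exp(big) / np.exp(big).sum()
    naive_big = -np.log(bad_probs[label])  # nan

# The shifted log-sum-exp path never exponentiates anything > 0,
# so it stays finite for the same logits.
shifted_big = big - big.max()
stable_big = -(shifted_big - np.log(np.exp(shifted_big).sum()))[label]
```

This mirrors the general idea behind folding the softmax into the loss; the exact implementation inside TensorFlow's cross-entropy kernels may differ in detail.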
Correct answer by today on September 2, 2021