Data Science Asked by Nagendra Prasad on September 2, 2021
In the documentation it is mentioned that y_pred needs to be in the range [-inf, inf] when from_logits=True. I truly don't understand what this means, since probabilities need to be in the range 0 to 1! Can someone please explain in simple words the effect of using from_logits=True?
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
The from_logits=True attribute informs the loss function that the output values generated by the model are not normalized, a.k.a. logits. In other words, the softmax function has not been applied to them to produce a probability distribution. Therefore, the output layer in this case has no softmax activation function:
out = tf.keras.layers.Dense(n_units) # <-- linear activation function
The softmax function is then applied automatically to the output values by the loss function itself. Therefore, this makes no difference from the scenario where you use from_logits=False (the default) together with a softmax activation on the last layer; however, in some cases it can help with numerical stability during training. You may also find this and this answer relevant and useful regarding numerical stability when from_logits=True.
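Not from the original answer, but a minimal NumPy sketch of why the logits path can be more stable: with from_logits=True the loss can compute the log-softmax directly via the log-sum-exp trick (shifting by the max logit), instead of exponentiating raw logits first and then taking a log, which overflows for large logits. The 3-class logits and the label here are made-up illustration values.

```python
import numpy as np

label = 0  # hypothetical integer class label

# Moderate logits: the naive path (softmax, then -log of the picked
# probability) and the logits path (log-softmax via log-sum-exp) agree.
logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()
naive_loss = -np.log(probs[label])

shifted = logits - logits.max()
log_probs = shifted - np.log(np.exp(shifted).sum())
stable_loss = -log_probs[label]  # numerically equal to naive_loss

# Extreme logits: np.exp(1000.0) overflows to inf, so the naive path
# produces inf/inf = nan probabilities and a nan loss.
big = np.array([1000.0, 0.0, -1000.0])
with np.errstate(over='ignore', invalid='ignore'):
    bad_probs = np.exp(big) / np.exp(big).sum()
    naive_big = -np.log(bad_probs[label])  # nan

# The shifted log-sum-exp path never exponentiates anything > 0,
# so it stays finite for the same logits.
shifted_big = big - big.max()
stable_big = -(shifted_big - np.log(np.exp(shifted_big).sum()))[label]
```

This mirrors the general idea behind folding the softmax into the loss; the exact implementation inside TensorFlow's cross-entropy kernels may differ in detail.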
Correct answer by today on September 2, 2021