Data Science Asked by user134132523 on February 6, 2021
I am building an LSTM with Keras, which has an activation parameter in the layer. I have read that the scaling of the output data should match the activation function's output range.
For example, tanh activation outputs values between -1 and 1, so the output training (and testing) data should be scaled to values between -1 and 1. Likewise, if the activation function is a sigmoid, the output data should be scaled to values between 0 and 1.
Does this hold for all activation functions? If I use ReLU as the activation in my layers, what should the output data be rescaled to?
What you read holds true for the neurons of the output layer, not for the hidden layers!
Hence, it is true that if you use tanh in the output layer, the data labels need to lie within [-1, 1], whereas they need to lie within [0, 1] for sigmoid.
As for your concern with ReLU: use it in the output layer only if you know that the labels are non-negative. If you use ReLU in the hidden layers, the scaling of the labels doesn't depend on ReLU but rather on the activation function used in the output layer.
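A minimal sketch of this idea, assuming a single-output regression with scikit-learn's MinMaxScaler and a small Keras LSTM (the layer sizes and the random data are placeholders, not from the question):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Placeholder data: 100 sequences of length 10 with 1 feature each
X = np.random.rand(100, 10, 1)
y = np.random.rand(100, 1) * 50 - 10   # targets well outside [-1, 1]

# Scale the targets to [-1, 1] to match the tanh output activation
y_scaler = MinMaxScaler(feature_range=(-1, 1))
y_scaled = y_scaler.fit_transform(y)

model = Sequential([
    LSTM(32, input_shape=(10, 1)),   # hidden-layer activation does not constrain the labels
    Dense(1, activation="tanh"),     # output activation is what the label range must match
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y_scaled, epochs=5, verbose=0)

# Predictions come out in [-1, 1]; invert the scaling to recover the original units
y_pred = y_scaler.inverse_transform(model.predict(X))
```

With a ReLU output layer you would instead only need non-negative labels (e.g. `MinMaxScaler(feature_range=(0, 1))`), and the hidden-layer activations could stay whatever they are.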
Answered by user1825567 on February 6, 2021