Activation function between LSTM layers
Data Science, asked by Lauramvp on February 4, 2021
In the linked question, whether activation functions are required between LSTM layers was answered as follows: since an LSTM unit already contains multiple non-linear activation functions internally, it is not necessary to add a further (recurrent) activation function.
My question:
Is there a specific reason why Keras by default uses activation="tanh" and recurrent_activation="sigmoid" for LSTM layers if those activations are not strictly necessary? For a Dense layer the default activation is None, so Keras could just as well have used None as the default for LSTM units, right? Does Keras choose these defaults for a particular reason? Also, many tutorials and blogs use ReLU (without explaining why), and I have not come across one that specifies None as the (recurrent) activation. Why is ReLU used so often when the outputs of the LSTM unit have already passed through a non-linearity?
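For concreteness, here is a minimal sketch of the defaults being asked about, with an explicit override shown as well; the layer sizes and input shape are arbitrary illustrative choices, not anything from the question:

```python
import tensorflow as tf

# LSTM defaults: activation="tanh" (candidate/cell output), recurrent_activation="sigmoid" (gates).
# Dense defaults to activation=None (linear), for comparison.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),              # (timesteps, features); example shape only
    tf.keras.layers.LSTM(32, return_sequences=True),   # same as activation="tanh", recurrent_activation="sigmoid"
    tf.keras.layers.LSTM(32),                           # second stacked LSTM, same defaults
    tf.keras.layers.Dense(1),                            # Dense with its default linear activation
])

# Overriding the default output activation explicitly:
lstm_relu = tf.keras.layers.LSTM(32, activation="relu")    # ReLU instead of tanh
lstm_linear = tf.keras.layers.LSTM(32, activation=None)    # no non-linearity in place of tanh
```

Note that recurrent_activation controls the input, forget, and output gates, while activation is applied to the candidate values and the cell output, which is why the two parameters have different defaults.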