Data Science Asked by rahul tomar on June 12, 2021
Let us assume we have a GRU network containing $H$ layers to process a training dataset with $K$ tuples and $I$ features, with $H_i$ nodes in layer $i$.
I have a basic idea of how the complexity of an algorithm is calculated. However, many factors affect the performance of a GRU network: the number of layers, the amount of training data (which needs to be large), the number of units in each layer, the number of epochs, possibly regularization techniques, and training with back-propagation through time. With all of these in play, I am confused. I have found interesting answers on the complexity of neural networks here: https://ai.stackexchange.com/questions/5728/what-is-the-time-complexity-for-training-a-neural-network-using-back-propagation, and on bi-directional recurrent neural networks here: "What is the complexity of a bidirectional recurrent neural network?", but they were not enough to clear my doubt.
I am aware that back-propagation through time is used to train recurrent neural networks, but I do not understand how it works for the bi-directional versions of these networks.
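To make the setup concrete, here is a rough sketch of how the dominant cost of one forward pass could be counted for a stacked GRU. It rests on several assumptions that are not part of the question: each of the $K$ tuples is treated as a sequence of `seq_len` time steps, a standard GRU cell is used (update gate, reset gate, candidate state), only matrix multiplications are counted (biases and elementwise operations are ignored), and in the bi-directional case the outputs of both directions are concatenated before the next layer. The function names are hypothetical.

```python
def gru_layer_macs(input_dim, hidden_dim):
    """Multiply-accumulates for one GRU layer, one time step, one sample."""
    # A standard GRU cell has three blocks (update gate, reset gate,
    # candidate state). Each block applies an input matrix of shape
    # (hidden_dim x input_dim) and a recurrent matrix of shape
    # (hidden_dim x hidden_dim). Biases and elementwise ops are ignored.
    return 3 * hidden_dim * (input_dim + hidden_dim)


def gru_forward_macs(num_features, hidden_sizes, seq_len, num_samples,
                     bidirectional=False):
    """Approximate multiply-accumulates for one forward pass over the data.

    num_features -- I in the question
    hidden_sizes -- [H_1, ..., H_H], nodes per layer
    seq_len      -- time steps per sample (assumption, not in the question)
    num_samples  -- K in the question
    """
    directions = 2 if bidirectional else 1
    per_step = 0
    in_dim = num_features
    for hidden_dim in hidden_sizes:
        per_step += directions * gru_layer_macs(in_dim, hidden_dim)
        # Assumption: the next layer receives the concatenated outputs
        # of all directions of the current layer.
        in_dim = directions * hidden_dim
    return per_step * seq_len * num_samples


if __name__ == "__main__":
    # Illustrative numbers only: I = 10, two layers of 32 nodes,
    # 20 time steps, K = 1000.
    print(gru_forward_macs(10, [32, 32], 20, 1000))
    print(gru_forward_macs(10, [32, 32], 20, 1000, bidirectional=True))
```

Under these assumptions the forward cost per epoch grows as $O\big(K \cdot T \cdot \sum_i H_i (H_i + H_{i-1})\big)$ with $H_0 = I$ and $T$ the sequence length. Back-propagation through time is usually taken to cost a constant factor more than the forward pass, so the same order would apply to training, multiplied by the number of epochs.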
So, I was hoping someone could help me with how to: