Artificial Intelligence Asked by Johncowk on August 24, 2021
I am trying to design a model based on LSTM cells to do time-series prediction. The output value is an integer in [0, 13]. I have noticed that one-hot encoding it and using a cross-entropy loss gives better results than an MSE loss.
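For concreteness, here is a minimal sketch of the kind of setup described above (the layer sizes, sequence length, and names are illustrative assumptions, not the actual code):

```python
import torch
import torch.nn as nn

NUM_CLASSES = 14  # integer targets in [0, 13]

class LSTMClassifier(nn.Module):
    def __init__(self, hidden_size=64):
        super().__init__()
        # Inputs are one-hot vectors, so input_size equals the number of classes.
        self.lstm = nn.LSTM(input_size=NUM_CLASSES, hidden_size=hidden_size,
                            batch_first=True)
        self.fc = nn.Linear(hidden_size, NUM_CLASSES)

    def forward(self, x):              # x: (batch, seq_len, NUM_CLASSES)
        out, _ = self.lstm(x)          # out: (batch, seq_len, hidden_size)
        return self.fc(out[:, -1, :])  # logits for the last step: (batch, NUM_CLASSES)

model = LSTMClassifier()
criterion = nn.CrossEntropyLoss()      # expects logits (N, C) and class-index targets (N)

x = nn.functional.one_hot(torch.randint(0, NUM_CLASSES, (8, 20)),
                          num_classes=NUM_CLASSES).float()   # (8, 20, 14) one-hot inputs
y = torch.randint(0, NUM_CLASSES, (8,))                      # targets as class indices
loss = criterion(model(x), y)
loss.backward()
```

Note that nn.CrossEntropyLoss takes raw logits as input and, in its classic usage, integer class indices as targets rather than one-hot vectors; only the input sequence needs to be one-hot encoded.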
Here is my problem: no matter how deep I make the network or how many fully connected layers I add, I always obtain pretty much the same behavior. Changing the optimizer doesn't really help either.
I really do not understand why, since I have one-hot encoded the input and the output. Here is an example of the results of a typical training phase, with the total loss:
Do you have any tips/ideas as to how I could improve this, or what could have gone wrong? I am a bit of a beginner in ML, so I might have missed something. I can also include the code (in PyTorch) if necessary.
I found the issue; I should have done more unit testing. When computing the batch loss before backpropagation, one of the dimensions of the "prediction" tensor did not correspond to the "truth" tensor. The shapes matched, but the contents were not what they were supposed to be. This is due to how the NLL loss is implemented in PyTorch, which I was not aware of.
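For reference, here is a sketch of the kind of shape pitfall this describes, assuming per-time-step predictions (the exact bug in the original code may differ): nn.NLLLoss expects log-probabilities of shape (batch, classes, seq_len) and integer targets of shape (batch, seq_len), and reshaping a (batch, seq_len, classes) output with view() instead of permute() produces matching shapes but scrambled contents.

```python
import torch
import torch.nn as nn

batch, seq_len, num_classes = 8, 20, 14
log_probs = torch.log_softmax(torch.randn(batch, seq_len, num_classes), dim=-1)
targets = torch.randint(0, num_classes, (batch, seq_len))
criterion = nn.NLLLoss()

# Correct: move the class dimension into position 1 with permute().
loss_ok = criterion(log_probs.permute(0, 2, 1), targets)

# Silent bug: view() also yields a (batch, classes, seq_len) tensor, so the
# shapes match and no error is raised, but the memory is reinterpreted and the
# scores no longer line up with the right class / time step.
loss_bad = criterion(log_probs.view(batch, num_classes, seq_len), targets)
```

Because no exception is raised, a small unit test that checks the loss of a hand-computed example is a good way to catch this kind of silent mismatch.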
Answered by Johncowk on August 24, 2021