Data Science Asked by Tknoobs on June 23, 2021
Hello, I am trying to understand LSTMs but have a few problems:
What is the input? Since an LSTM is seq2seq, I would think it is a sequence of words, but a Codecademy lesson mentions that each sentence is represented as a matrix made up of vectors containing 1s and 0s, one per timestep. For example, in the sentence "I like Bobo", "like" = [0, 1, 0]. So what is the input: the matrix or the sequence of words?
What is passed to the next LSTM cell after a previous prediction was wrong? Since the wrong prediction is recorded in the hidden state, how does the network know whether previous predictions were wrong? Or does it know at all when predicting the next step?
I am excited for the answers,
love Phiona.
The input of an LSTM is a sequence of vectors. In your case, each of these vectors represents a word encoded as a one-hot vector. One-hot encoding is a way to express a discrete element (e.g. a word) numerically. Each one-hot vector has length $d$, where $d$ is the total number of words we can represent, and all positions in the vector are 0 except the position associated with the represented word, which contains a 1. Stacking these vectors, one per timestep, gives the matrix from the lesson, so the matrix and the sequence of words are two views of the same input.
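To make the encoding concrete, here is a minimal sketch in plain Python. The helper names `build_vocab` and `one_hot` are made up for this illustration, and the vocabulary is assumed to contain only the three words of the example sentence, so $d = 3$:

```python
def build_vocab(sentence):
    """Map each distinct word to an index, in order of first appearance."""
    vocab = {}
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a vector of length d = len(vocab) with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec


sentence = "I like Bobo"
vocab = build_vocab(sentence)

# One vector per timestep; the list of vectors is the "matrix" for the sentence.
sequence = [one_hot(word, vocab) for word in sentence.split()]

for word, vec in zip(sentence.split(), sequence):
    print(word, vec)
```

With this toy vocabulary, "like" comes out as [0, 1, 0], matching the example in the question. A real vocabulary would be built from the whole training corpus, so $d$ would be much larger.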
The hidden state passed to the next LSTM cell is not the final prediction itself, but the dense numerical vector we obtain before computing that prediction.
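The following is a rough sketch of a single LSTM step in plain Python, using the standard gate equations, to show what is handed from one timestep to the next. The function names (`init_params`, `lstm_step`), the hidden size of 4, and the random untrained weights are all assumptions made for this illustration; a real model would use a library implementation and learned weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Compute W @ v + b for a list-of-lists matrix W and list vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def init_params(input_size, hidden_size, seed=0):
    """Random, untrained weights for the four gates (illustration only)."""
    rng = random.Random(seed)
    params = {}
    for gate in ("i", "f", "o", "g"):
        W = [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        b = [0.0] * hidden_size
        params[gate] = (W, b)
    return params


def lstm_step(x, h_prev, c_prev, params):
    """One timestep: takes the input vector and the previous hidden/cell state."""
    z = x + h_prev  # concatenate input and previous hidden state
    i = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate cell values
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# One-hot vectors for "I like Bobo" with a three-word vocabulary.
sequence = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
hidden_size = 4
params = init_params(input_size=3, hidden_size=hidden_size)

h = [0.0] * hidden_size  # initial hidden state
c = [0.0] * hidden_size  # initial cell state
states = []
for x in sequence:
    h, c = lstm_step(x, h, c, params)  # h and c are what the next step receives
    states.append(h)

print(states[-1])
```

Note that `lstm_step` only ever receives the current input and the previous `h` and `c`, all real-valued vectors. Nothing in the loop tells it whether an earlier prediction was right; any prediction would be computed from `h` by a separate output layer, and the comparison with the correct answer happens in the loss during training.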
Correct answer by noe on June 23, 2021