How can ANN handle varied sized inputs?

Question

I have a dataset with a message (string) and an associated mood. I am trying to use an ANN to predict one of the 6 moods using the encoded inputs.
This is how my X_train looks like:
array([list([1, 60, 2]),
   list([1, 6278, 14, 9137, 334, 9137, 8549, 1380, 7]),
   list([5, 107, 1, 2, 156]), ..., list([1, 2, 220, 41]),
   list([1, 2, 79, 137, 422, 877, 5, 230, 621, 18]),
   list([1, 11, 66, 1, 2, 9137, 175, 1, 6278, 5624, 1520])],
  dtype=object)

Since every array has a different length, it's not being accepted. What can I do about it?
PS: The encoded values were generated using keras.preprocessing.Tokenizer()

ddaedalus · Answer

One way is to encode your input in a fixed size dimension. That is, you can use an RNN, like an LSTM, with padding and its output should be the input of an ANN.

Jakub · Answer

I am not sure how well it will perform, but what about padding the shorter messages with special characters (let's say zeros) so that they are as long as the longest message?
However, some kind of embedding would definitely be better (also for the sake of the target - predicting the sentiment) if you have enough data.

How can ANN handle varied sized inputs?

2 Answers

Add your own answers!

Ask a Question