Multi-class text classification with only one sample per class

Question

I have a dataset of texts, each labelled with an ID number. For each new incoming text, I would like to predict the best-matching ID. I am not sure that multi-class text classification is the right approach, since most IDs have only one text, which would also leave me without a test set. Can up-sampling help? Or is there an approach other than classification for this kind of problem?
The dataset looks like this:
id1 'text1', id2 'text2', id3 'text3', id3 'text4', id3 'text5', id4 'text6', ..., id200 'text170'
I would appreciate any guidance on the best approach for this problem.

user31264 · Answer

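Siamese networks may be useful in your case. Instead of learning one class per ID, they learn a similarity function between pairs of texts, so a new text can be assigned the ID of its most similar known text even when that ID has only one example.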
http://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
https://en.wikipedia.org/wiki/Siamese_neural_network
https://link.springer.com/protocol/10.1007%2F978-1-0716-0826-5_3
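
To make the matching idea concrete, here is a minimal sketch of the lookup step only. It is not a Siamese network: a fixed TF-IDF bag-of-words vector stands in for the encoder that a Siamese network would learn, and the texts, IDs, and function names (`build_idf`, `embed`, `predict_id`) are hypothetical, chosen just for illustration. The point is that prediction becomes "find the nearest labelled text", which works with a single example per ID.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption; swap in a real one).
    return text.lower().split()


def build_idf(texts):
    # Smoothed inverse document frequency over the labelled texts.
    n = len(texts)
    doc_freq = Counter()
    for text in texts:
        doc_freq.update(set(tokenize(text)))
    return {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def embed(text, idf):
    # L2-normalised TF-IDF vector as a sparse dict; unknown tokens are dropped.
    # A trained Siamese encoder would replace this function.
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: cnt * idf[tok] for tok, cnt in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {tok: w / norm for tok, w in vec.items()} if norm else {}


def cosine(a, b):
    # Dot product of two normalised sparse vectors equals cosine similarity.
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def predict_id(new_text, labelled, idf):
    # Return (id, score) of the most similar labelled text,
    # or (None, 0.0) if the new text shares no known token.
    query = embed(new_text, idf)
    best_id, best_score = None, 0.0
    for text_id, text in labelled:
        score = cosine(query, embed(text, idf))
        if score > best_score:
            best_id, best_score = text_id, score
    return best_id, best_score


# Hypothetical data in the same shape as the question: most IDs have one text.
labelled = [
    ("id1", "invoice payment overdue reminder"),
    ("id2", "password reset request for account"),
    ("id3", "shipping delay for order"),
    ("id3", "order delivery is late"),
]
idf = build_idf([text for _, text in labelled])

print(predict_id("please reset my account password", labelled, idf))
print(predict_id("my order is late", labelled, idf))
```

The returned score can also be thresholded to reject texts that match no known ID. For a rough evaluation without a separate test set, one option is to hold out one text at a time from the IDs that have several texts (such as id3 in the question) and check whether the remaining texts still retrieve the right ID.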
