Data Science Asked by user1845926 on December 5, 2020
Is there any way I can map the topics generated by LDA back to the list of documents and identify which topic each document belongs to?
I am interested in clustering documents using unsupervised learning and segregating them into appropriate clusters.
Any link, code example, or paper would be greatly appreciated.
After training your LDA topic model, you can pass documents to it and it will return, for each document, a distribution over the predefined number of topics. In gensim (Python), this would look something like this:
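# Convert the tokenized document into a bag-of-words vector
ques_vec = dictionary.doc2bow(tokenized_document)
# Infer the topic distribution: a list of (topic_id, probability) pairs
topic_vec = ldamodel[ques_vec]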
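To assign each document to a single cluster, one common approach is to take the topic with the highest probability in `topic_vec`. The sketch below assumes the output has gensim's usual shape, a list of `(topic_id, probability)` pairs. The helper names `dominant_topic` and `group_by_topic` and the sample numbers are hypothetical, made up for illustration rather than taken from a real model.

```python
from collections import defaultdict

def dominant_topic(topic_vec):
    """Return the topic_id with the highest probability, or None if empty."""
    if not topic_vec:
        return None
    return max(topic_vec, key=lambda pair: pair[1])[0]

def group_by_topic(doc_topic_vecs):
    """Map each dominant topic_id to the list of document ids assigned to it."""
    clusters = defaultdict(list)
    for doc_id, topic_vec in doc_topic_vecs.items():
        clusters[dominant_topic(topic_vec)].append(doc_id)
    return dict(clusters)

# Made-up distributions standing in for ldamodel[dictionary.doc2bow(tokens)]
doc_topic_vecs = {
    "doc_a": [(0, 0.82), (2, 0.18)],
    "doc_b": [(1, 0.55), (2, 0.45)],
    "doc_c": [(0, 0.61), (1, 0.39)],
}
clusters = group_by_topic(doc_topic_vecs)
print(clusters)
```

Keep in mind that LDA is a mixed-membership model, so a document like `doc_b` above really sits between two topics. Picking only the top topic is a simplification; if that matters, keep the full distribution or apply a probability threshold instead.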
At this point you will not know what each topic (class) means, because the topics are the result of unsupervised learning. To find out what each topic in your LDA model represents, look at the trained parameters like this:
words = ldamodel.show_topic(topic_number, topn=200)
If you print that, you'll see the top 200 words that make up that topic, each paired with its weight. Based on the meaning of those words, you give each topic an appropriate class name.
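Reading 200 words per topic can be tedious, so it may help to first reduce each topic to its few highest-weighted words. The sketch below assumes the `(word, weight)` pair format that `show_topic` returns; the helper name `topic_label` and the sample words and weights are hypothetical, made up for illustration.

```python
def topic_label(word_weights, n=5):
    """Join the n highest-weighted words into a short label for manual review."""
    ranked = sorted(word_weights, key=lambda pair: pair[1], reverse=True)
    return ", ".join(word for word, _ in ranked[:n])

# Made-up (word, weight) pairs standing in for ldamodel.show_topic(topic_number)
words = [("goal", 0.041), ("match", 0.035), ("team", 0.052), ("league", 0.020)]
label = topic_label(words, n=3)
print(label)
```

The resulting string is only a reading aid; the final class name is still a human judgment based on what the words have in common.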
Answered by Sid on December 5, 2020