Extract document-topic vectors from an LDA model

Data Science Asked by Syrinebh on September 4, 2021

How can I extract the document-topic matrix from an LDA model and use it as input features for an SVM classifier? I am using Gensim for the implementation.

One Answer

I've done this before in Gensim; hopefully this helps:

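# Build one dense topic-probability vector per training document.
num_topics = lda_train.num_topics
train_vecs = []
for bow in train_corpus:
    # minimum_probability=0.0 asks Gensim to keep low-probability topics too
    doc_topics = dict(lda_train.get_document_topics(bow, minimum_probability=0.0))
    # Index by topic id so every vector has the same length and ordering
    topic_vec = [doc_topics.get(topic_id, 0.0) for topic_id in range(num_topics)]
    train_vecs.append(topic_vec)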

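The above gives, for every document, a vector of topic probabilities with one entry per topic (20 here, assuming the model was trained with 20 topics). These are not the "top" topics in any ranked sense: get_document_topics returns (topic_id, probability) pairs ordered by topic id, and minimum_probability=0.0 keeps the low-probability topics as well. The resulting list of vectors is the document-topic matrix, which you can pass to the classifier as its feature matrix.

'train_corpus' is the bag-of-words corpus. You get it by doing something like this in Gensim once you have your tokenized texts with bigrams applied (the 'bigram' object, produced with the 'Phrases' Gensim model class) and a Gensim Dictionary ('id2word'):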

train_corpus = [id2word.doc2bow(text) for text in bigram]
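The conversion from Gensim's sparse output to fixed-length feature rows can be sketched without Gensim at all. In the sketch below, the helper name to_dense and the sample data are hypothetical, made up for illustration; the only assumption about Gensim is that get_document_topics yields (topic_id, probability) pairs and may omit topics whose probability is negligible.

```python
from typing import List, Sequence, Tuple


def to_dense(doc_topics: Sequence[Tuple[int, float]], num_topics: int) -> List[float]:
    """Turn sparse (topic_id, probability) pairs into a fixed-length vector.

    Topics missing from doc_topics are filled with 0.0, so every document
    ends up with exactly num_topics features in the same order.
    """
    vec = [0.0] * num_topics
    for topic_id, prob in doc_topics:
        vec[topic_id] = float(prob)
    return vec


# Hypothetical data shaped like get_document_topics output for two documents
# and a 3-topic model; real values would come from the trained LDA model.
sparse_docs = [
    [(0, 0.7), (2, 0.3)],  # topic 1 was omitted for this document
    [(1, 1.0)],
]

# One row per document, one column per topic: the document-topic matrix.
train_vecs = [to_dense(doc, num_topics=3) for doc in sparse_docs]
```

A list of rows like train_vecs, together with one label per document, should be usable as the X argument of scikit-learn's SVC().fit(X, y). Test documents would need to go through the same doc2bow and get_document_topics steps with the same trained model.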

Answered by Marc Kelechava on September 4, 2021
