Reputation: 1456
For reference, I already looked at the following questions:
I would like the LDA model I trained with Gensim to classify a sentence under one of the topics that the model creates. Something along the lines of
lda = models.LdaModel(corpus=corpus, id2word=id2word, num_topics=7, passes=20)
lda.print_topics()
for line in document:  # where each line in the document is its own sentence for simplicity
    print('Sentence: ', line)
    topic = lda.parse(line)  # where the classification would occur
    print('Topic: ', topic)
I know gensim does not have a parse function, but how would one go about accomplishing this? Here is the documentation that I've been following but I haven't gotten anywhere with it:
Thanks in advance.
edit: More documentation- https://radimrehurek.com/gensim/models/ldamodel.html
Upvotes: 2
Views: 1337
Reputation: 498
Let me get your problem right: you want to train an LDA model on some documents and retrieve 7 topics. Then you want to classify new documents into one (or more?) of these topics, meaning you want to infer topic distributions for new, unseen documents.
If so, the gensim documentation provides answers.
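from gensim import models

# corpus, id2word and document are assumed to exist, as in the question
lda = models.LdaModel(corpus=corpus, id2word=id2word, num_topics=7, passes=20)
lda.print_topics()

# Each line of the document is treated as its own sentence for simplicity
for count, line in enumerate(document, start=1):
    print('\nSentence: ', line)
    # Convert the sentence to the bag-of-words format the model was trained on
    line_bow = id2word.doc2bow(line.split())
    # Infer the topic distribution as a list of (topic_id, probability) pairs
    doc_lda = lda[line_bow]
    # Pick the pair with the highest probability, not the highest topic id
    topic_id, prob = max(doc_lda, key=lambda pair: pair[1])
    print('\nLine ' + str(count) + ' assigned to Topic ' + str(topic_id) + ' with ' + str(round(prob * 100, 2)) + '% probability!')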
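
One detail worth spelling out: lda[line_bow] returns a list of (topic_id, probability) pairs, and Python compares tuples element by element, so a plain max() over that list picks the largest topic id, not the most probable topic. Here is a minimal sketch of selecting the dominant topic by probability. It uses only the standard library; the helper name dominant_topic and the numbers in example are made up for illustration and do not come from a real model.

```python
from operator import itemgetter


def dominant_topic(doc_topics):
    """Return the (topic_id, probability) pair with the highest probability.

    doc_topics is a list of (topic_id, probability) pairs, the shape that
    indexing a gensim LDA model with a bag-of-words vector produces.
    Returns None for an empty list.
    """
    if not doc_topics:
        return None
    # Compare on the second element (the probability), not the topic id
    return max(doc_topics, key=itemgetter(1))


# Hypothetical distribution, invented for this example
example = [(0, 0.05), (2, 0.81), (5, 0.14)]

topic_id, prob = dominant_topic(example)
print(f"Topic {topic_id} with {prob:.2%} probability")  # Topic 2 with 81.00% probability

# A plain max() compares the topic id first and picks the wrong pair
print(max(example))  # (5, 0.14)
```

The empty-list guard is there in case no topic is reported for a sentence, for instance if every topic falls below the model's minimum probability threshold.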
Upvotes: 1