Reputation: 25
I am training my ldamodel using gensim, and predicting using a test corpus like this: ldamodel[doc_term_matrix_test]. It works just fine, but I don't understand how the prediction is actually done using the trained model (what ldamodel[doc_term_matrix_test] is doing).
Here is the code:
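import gensim
from gensim import corpora

# train and test are assumed to be lists of tokenized documents (lists of strings)
dictionary2 = corpora.Dictionary(test)
dictionary = corpora.Dictionary(train)
dictionary.merge_with(dictionary2)  # add tokens from the test dictionary to the training one
doc_term_matrix2 = [dictionary.doc2bow(doc) for doc in test]
doc_term_matrix = [dictionary.doc2bow(doc) for doc in train]
Lda = gensim.models.ldamodel.LdaModel
ldamodel = Lda(doc_term_matrix, num_topics=2, id2word=dictionary,
               random_state=100, iterations=50, passes=1)
# each item of ldamodel[doc_term_matrix2] is one test document's list of (topic_id, probability) pairs
topics = sorted(ldamodel[doc_term_matrix2], key=lambda x: x[1], reverse=True)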
Upvotes: 2
Views: 2089
Reputation: 711
To quote from gensim docs about ldamodel:
This module allows both LDA model estimation from a training corpus and inference of topic distribution on new, unseen documents.
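So apparently, what your code does is not quite "prediction" but rather inference. That is, your trained LDA model yields, for every test document T, an estimate of the topic distribution of T. The topics learned during training stay fixed; only the topic proportions of the new document are estimated.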
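To make "inference" a bit more concrete, here is a rough sketch of the idea in plain Python. It is not gensim's implementation: gensim uses variational Bayes with a Dirichlet prior, while this sketch swaps in plain EM over the mixture weights of a single document. The function name infer_topic_mixture and the toy topic_word table are made up for illustration; the only thing taken from the question is the bag-of-words format returned by doc2bow.

```python
# Simplified illustration only, NOT gensim's algorithm.
# The topics (word distributions) are held fixed, as after training;
# only the topic proportions of one new document are estimated, via plain EM.

def infer_topic_mixture(bow, topic_word, n_iter=50):
    """Estimate one document's topic proportions with the topics held fixed.

    bow        -- list of (word_id, count) pairs, like the output of doc2bow
    topic_word -- topic_word[k][w] = P(word w | topic k), assumed already learned
    """
    num_topics = len(topic_word)
    theta = [1.0 / num_topics] * num_topics  # start from a uniform mixture
    for _ in range(n_iter):
        expected = [0.0] * num_topics
        for word_id, count in bow:
            # E-step: how much each topic is "responsible" for this word
            weights = [theta[k] * topic_word[k][word_id] for k in range(num_topics)]
            norm = sum(weights)
            if norm == 0:
                continue  # word has zero probability under every topic: skip it
            for k in range(num_topics):
                expected[k] += count * weights[k] / norm
        total = sum(expected)
        if total == 0:
            return theta  # nothing usable in the document: keep current estimate
        # M-step: turn expected topic counts into proportions
        theta = [value / total for value in expected]
    return theta


# Hypothetical "trained" model: 2 topics over a 4-word vocabulary.
topic_word = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 favours words 0 and 1
    [0.05, 0.05, 0.45, 0.45],  # topic 1 favours words 2 and 3
]

# Hypothetical unseen document in doc2bow format: (word_id, count) pairs.
new_doc = [(0, 3), (1, 2), (2, 1)]

theta = infer_topic_mixture(new_doc, topic_word)
print(theta)  # topic proportions for new_doc; topic 0 should dominate
```

The point is only that nothing in topic_word changes while the new document is processed, which is why this is called inference on an unseen document rather than further training.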
Upvotes: 2