Reputation: 1642
Suppose I have generated a latent Dirichlet allocation model of Corpus1
using the basic command:
ldamodel = gensim.models.ldamodel.LdaModel(corpus1, num_topics=25, id2word = dictionary, passes=50, minimum_probability=0)
My question is: how can I classify new documents from, say, Corpus2?
I am trying to use the following command:
print(ldamodel[Corpus2[1]])
to obtain the topic distribution for the first document, but I get the following error:
ValueError: not enough values to unpack (expected 2, got 1)
I am confused about what type the object Corpus2
should be. Any suggestions on where to find more information or a tutorial are more than welcome.
Upvotes: 1
Views: 823
Reputation: 26
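I faced a similar issue. Make sure Corpus2 has the same representation as corpus1. By the looks of it, I'm guessing Corpus2[1] is a plain list of the words appearing in a document, whereas the model expects each document as a list of two-element tuples, which would explain the unpacking error. Vectorize it with the same dictionary you used for corpus1 (dictionary.doc2bow), so that each document becomes a list of (word_id, count) pairs. If corpus1 was tf-idf transformed before training, apply the same tf-idf transformation so the pairs are (word_id, tf-idf value), and then feed the result to the model.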
Upvotes: 1