
python gensim corpus

Sanket

Reputation: 31

Understanding how words are stored in the dictionary of a gensim corpus after using "gensim.corpora.Dictionary(TEXT)"

After converting a list of text documents to a gensim corpora dictionary, and then to a bag-of-words model, using:

dictionary = gensim.corpora.Dictionary(docs) # docs is a list of tokenized documents (each a list of words)
corpus = [dictionary.doc2bow(doc) for doc in docs]

We can find the index of particular words in the dictionary using:

dictionary.doc2idx(["righteous","height"])

Is there any way to find the word stored in the dictionary at a particular index?

Upvotes: 3

Views: 4076

Answers (1)

aneesh joshi

Reputation: 583

TL;DR:

dictionary.get(index_of_word)

Example:

import gensim

docs = [['hello', 'world'], ['i', 'am', 'groot']]

dictionary = gensim.corpora.Dictionary(docs) # docs is a list of tokenized documents (each a list of words)
corpus = [dictionary.doc2bow(doc) for doc in docs]

print(dictionary.get(0))
print(dictionary.get(3))

Output:

hello
groot
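To see why index 3 maps to 'groot', here is a rough pure-Python sketch of the two mappings involved. This is a simplified stand-in, not gensim's actual code: the helper build_token2id is hypothetical, and it assumes that new words in each document receive ids in sorted order, which reproduces the ids in the example above. In gensim itself the word-to-id mapping is exposed as dictionary.token2id, and dictionary[index] is an alternative to dictionary.get(index) that raises KeyError for an unknown id instead of returning None.

```python
def build_token2id(docs):
    """Hypothetical helper: assign integer ids to tokens, document by document.

    Assumption: unseen tokens in each document get ids in sorted order,
    which matches the ids seen in the example above.
    """
    token2id = {}
    for doc in docs:
        for token in sorted(set(doc)):
            if token not in token2id:
                token2id[token] = len(token2id)
    return token2id


docs = [['hello', 'world'], ['i', 'am', 'groot']]

# Forward mapping: word -> id
token2id = build_token2id(docs)

# Reverse mapping: id -> word, which is what an index lookup needs
id2token = {idx: token for token, idx in token2id.items()}

print(token2id)          # word -> id
print(id2token.get(0))   # hello
print(id2token.get(3))   # groot
print(id2token.get(99))  # None, since no word has this id
```

The reverse dictionary is built once from the forward one, so looking up a word by its index is a plain dictionary access.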

Hope that helps!

Upvotes: 5
