Reputation: 438
I am trying to train a Doc2Vec model using gensim.
The dataset I am using is the 20 newsgroups dataset [1], which is included in sklearn's datasets module.
I have used the example in the gensim documentation to create the model.
docs = newsgroups_train['data']
enumerated_docs = enumerate(docs)
documents = [TaggedDocument(doc.split(), i) for i, doc in enumerated_docs]
model = Doc2Vec(documents, vector_size=20, window=2, min_count=30, workers=4)
I checked every line of code, and everything seems to work up to the line that initializes the model.
I get a type error:
TypeError: 'int' object is not iterable
[1] https://scikit-learn.org/0.19/datasets/twenty_newsgroups.html
Upvotes: 0
Views: 255
Reputation: 165
enumerate returns an integer counter together with each value in the list. So, in your third line of code, i is an integer. However, the second parameter of TaggedDocument (the tags) should be an iterable, such as a list.
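A minimal sketch of the fix, assuming a toy two-document corpus in place of newsgroups_train['data']: wrapping the index in a list makes the tags iterable. So that the snippet runs without gensim installed, it uses a stand-in namedtuple with the same two fields (words, tags) as gensim's TaggedDocument; with gensim available, the import shown in the comment should be used instead.

```python
from collections import namedtuple

# Stand-in mirroring gensim's TaggedDocument (a namedtuple with the fields
# `words` and `tags`), so this sketch runs without gensim installed.
# With gensim available, use instead:
#   from gensim.models.doc2vec import TaggedDocument
TaggedDocument = namedtuple("TaggedDocument", "words tags")

# Hypothetical toy corpus standing in for newsgroups_train['data'].
docs = ["the quick brown fox", "jumps over the lazy dog"]

# Wrap each integer index in a list so that `tags` is iterable.
documents = [TaggedDocument(doc.split(), [i]) for i, doc in enumerate(docs)]

# Iterating over the tags now works; with a bare int it raises
# "TypeError: 'int' object is not iterable".
all_tags = [tag for document in documents for tag in document.tags]
print(all_tags)
```

The resulting list can then be passed to Doc2Vec exactly as in the question.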
Upvotes: 1