mommomonthewind

Reputation: 4640

What does the size parameter in gensim Doc2Vec represent?

The Doc2Vec function has a parameter called size.

I understand that size is the dimension of the output vector, and that size=400 will capture the content better than size=100.

However, I do not understand what size actually stands for. Does it mean how far Doc2Vec looks from a word in order to predict the next word? Or does it mean something else?

Thanks a lot.

Upvotes: 1

Views: 776

Answers (1)

gojomo

Reputation: 54173

size is the number of dimensions in the created vectors. So size=100 means each document (actually, document-tag) receives a 100-dimensional vector from training.
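To make the "one vector per document-tag" idea concrete, here is a toy sketch in plain Python. It is not gensim's implementation: init_doc_vectors and the tag names are hypothetical, and the vectors are only randomly initialized, not trained. It just shows that size fixes the length of each vector and nothing else. (How far the model looks around a word is controlled by a separate parameter, window. Also note that recent gensim releases spell this parameter vector_size.)

```python
import random

def init_doc_vectors(tags, size, seed=0):
    """Hypothetical helper: give every document tag a vector of `size` floats."""
    rng = random.Random(seed)
    # Small random starting values; real training would then adjust them.
    return {tag: [(rng.random() - 0.5) / size for _ in range(size)] for tag in tags}

tags = ["doc_0", "doc_1", "doc_2"]  # made-up document tags
vectors_100 = init_doc_vectors(tags, size=100)
vectors_400 = init_doc_vectors(tags, size=400)

print(len(vectors_100["doc_0"]))  # 100
print(len(vectors_400["doc_0"]))  # 400
```

Whatever the documents contain, every tag ends up with a vector of exactly size numbers.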

More dimensions aren't always better: they mean slower training and a larger model. And if you're working on a small dataset, too many dimensions risk overfitting, which prevents the model from representing generalizable patterns in the data.
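As a rough back-of-the-envelope illustration of the "larger model" point, the sketch below estimates how much memory the vector tables alone would need. The corpus numbers are made up, vector_table_bytes is a hypothetical helper, and it assumes 4-byte float32 storage. A real model also holds other arrays, so treat this as a lower bound rather than an exact figure.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each vector component is a 32-bit float

def vector_table_bytes(n_rows, size):
    """Bytes needed for n_rows vectors of `size` float32 values each."""
    return n_rows * size * BYTES_PER_FLOAT32

n_doc_tags = 1_000_000  # hypothetical number of document tags
vocab_size = 200_000    # hypothetical vocabulary size

for size in (100, 400):
    doc_mb = vector_table_bytes(n_doc_tags, size) / 1e6
    word_mb = vector_table_bytes(vocab_size, size) / 1e6
    print(f"size={size}: doc vectors ~{doc_mb:.0f} MB, word vectors ~{word_mb:.0f} MB")
```

The cost grows linearly with size, so going from 100 to 400 dimensions quadruples these tables.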

Upvotes: 1
