Deshwal

Reputation: 4162

How to load pre trained FastText Word Embeddings using Gensim?

I downloaded word embeddings from this link. I want to load them in Gensim, but I have not been able to. I have tried many resources and none of them work. I am using Gensim version 4.1.

I have tried

gensim.models.fasttext.load_facebook_model('/home/admin1/embeddings/crawl-300d-2M.vec')
gensim.models.fasttext.load_facebook_vectors('/home/admin1/embeddings/crawl-300d-2M.vec')

and I get

NotImplementedError: Supervised fastText models are not supported

I then tried to load it using FastText.load('/home/admin1/embeddings/crawl-300d-2M.vec'), but that raised UnpicklingError: could not find MARK.

Also, using

Upvotes: 1

Views: 1114

Answers (1)

gojomo

Reputation: 54208

Per the NotImplementedError, those are the one kind of full Facebook FastText model, the kind trained in -supervised mode, that Gensim does not support.

So sadly, the answer to "How do you load these?" is "you don't".

The .vec files contain just the full-word vectors in a plain-text format – no subword info for synthesizing out-of-vocabulary (OOV) vectors, and no supervised-classification output features. Those can be loaded into a KeyedVectors model:

from gensim.models import KeyedVectors
kv_model = KeyedVectors.load_word2vec_format('crawl-300d-2M.vec')

Upvotes: 3
