Patel Sanni

Reputation: 1

How can I get a vector from the output matrix in FastText?

In this study, the authors found that Word2Vec generates two kinds of embeddings (IN and OUT).

https://arxiv.org/abs/1602.01137

With gensim's Word2Vec, you can easily get the OUT vectors through the syn1 attribute. In gensim's FastText, syn1 also exists, but because FastText is subword-based, it's not possible to get a word's vector from the output matrix simply by matching indexes. Do you know of any workaround for calculating a word vector from the output matrix?

Upvotes: 0

Views: 1176

Answers (1)

gojomo

Reputation: 54203

In FastText, the vector for a word is the combination of:

  • the full-word vector, if it exists; and
  • all the subword vectors
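As a rough illustration of that combination, here is a minimal pure-Python sketch. It is not gensim's or FastText's actual code: `char_ngrams`, `compose_vector`, and the two lookup dicts are hypothetical names made up for this example, the n-gram lengths 3 to 6 are assumed defaults, and real FastText hashes n-grams into a fixed number of buckets instead of looking them up by string. The sketch averages the full-word vector (when present) together with the subword vectors, which is how I understand the original FastText approach.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word` wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def compose_vector(word, word_vectors, ngram_vectors, min_n=3, max_n=6):
    """Average the full-word vector (if any) with all known subword vectors.

    `word_vectors` and `ngram_vectors` are hypothetical dicts mapping
    strings to lists of floats; they stand in for the trained matrices.
    """
    parts = []
    if word in word_vectors:
        # The full-word vector contributes only for in-vocabulary words.
        parts.append(word_vectors[word])
    for gram in char_ngrams(word, min_n, max_n):
        if gram in ngram_vectors:
            parts.append(ngram_vectors[gram])
    if not parts:
        raise KeyError(f"no full-word or subword vectors found for {word!r}")
    dim = len(parts[0])
    return [sum(p[k] for p in parts) / len(parts) for k in range(dim)]
```

For an out-of-vocabulary word, `word_vectors` has no entry, so the result is built from the subword vectors alone.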

You can view the gensim method that returns a vector, composed from subwords if necessary, at:

https://github.com/RaRe-Technologies/gensim/blob/2ccc82bf50bcfbee44932c160db076a873cf893e/gensim/models/keyedvectors.py#L1970

(I think this method might have a bug compared with the original FastText approach: perhaps it should also add the subword vectors to the whole-word vector, even when a whole-word vector is available.)

Upvotes: 3
