Eva Agustine

Reputation: 15

How to tune fastText parameters for OOV words?

I have heard that fastText generates vectors for OOV (out-of-vocabulary) words from their character n-grams. Is this built into the fastText architecture automatically, or do I need to tune specific parameters for it, similar to oov_token in the Keras tokenizer? I have looked for the relevant fastText parameters but couldn't find any.

If anyone knows and is willing to share, I would really appreciate it.

Thank you.

Upvotes: 0

Views: 662

Answers (1)

Vector generation for OOV words is integrated into fastText (at least in the original implementation by Facebook).

To generate these vectors, fastText uses subword n-grams. To learn more, you can read this thread and this visual guide.

For this reason, the parameters that most influence the creation of vectors for OOV words are the following:

  • minn (min length of char ngram)
  • maxn (max length of char ngram)
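To make the effect of these two parameters concrete, here is a minimal pure-Python sketch of how a word is split into character n-grams. It is an illustration, not fastText's actual code: the function name char_ngrams is made up for this example, and it leaves out details of the real implementation, such as hashing the n-grams into buckets. The one assumption carried over from fastText is that the word is wrapped in the boundary markers < and > before the n-grams are extracted.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Return the character n-grams of `word` with lengths minn..maxn.

    Hypothetical helper for illustration only; it mimics the idea behind
    fastText's subword extraction, not its exact implementation.
    """
    # Boundary markers let the model tell prefixes and suffixes apart
    # from n-grams that occur in the middle of a word.
    wrapped = f"<{word}>"
    ngrams = []
    for start in range(len(wrapped)):
        for n in range(minn, maxn + 1):
            end = start + n
            if end > len(wrapped):
                break  # longer n-grams will not fit from this start either
            ngrams.append(wrapped[start:end])
    return ngrams


print(char_ngrams("where", minn=3, maxn=3))
# ['<wh', 'whe', 'her', 'ere', 're>']
```

An OOV word gets its vector by combining (averaging) the vectors of n-grams like these, so a wider minn/maxn range gives the model more subword pieces to draw on. With the official Python bindings, the same parameters are passed at training time, for example fasttext.train_unsupervised("data.txt", minn=3, maxn=6), where data.txt is a placeholder file name. If maxn is 0, no subwords are used, and the model has nothing to build an OOV vector from.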

For more information about fastText options/parameters, see the official documentation.

Upvotes: 1
