Reputation: 69
I am trying to calculate word embeddings with fastText for the following sentence:
a = 'We are pencil in the hands'
I don't have any pretrained model, so how do I go about it?
Upvotes: 0
Views: 530
Reputation: 11213
You need a table of trained embeddings.
You can download pre-trained embeddings from the FastText website and use the code they provide for loading the embeddings. You don't even need to install FastText for that:
import io

def load_vectors(fname):
    """Load word vectors from a FastText .vec text file into a dict."""
    fin = io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore')
    # The first line holds the vocabulary size and the vector dimension
    n, d = map(int, fin.readline().split())
    data = {}
    for line in fin:
        tokens = line.rstrip().split(' ')
        # map() is a lazy iterator in Python 3; materialize it as a list
        data[tokens[0]] = list(map(float, tokens[1:]))
    return data
Then you just pick up the vectors from the dictionary.
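For example, once load_vectors has returned a dictionary, looking up the words of your sentence is a plain dict access (the tiny 2-dimensional table below is made up purely for illustration), and averaging the word vectors is a common baseline for a sentence embedding:

```python
# A tiny made-up embedding table standing in for load_vectors(...)
vectors = {
    'we': [0.1, 0.2], 'are': [0.0, 0.3], 'pencil': [0.5, 0.1],
    'in': [0.2, 0.2], 'the': [0.1, 0.0], 'hands': [0.4, 0.3],
}

sentence = 'We are pencil in the hands'
words = sentence.lower().split()

# Look up each word; skip words missing from the table
word_vecs = [vectors[w] for w in words if w in vectors]

# Element-wise average of the word vectors as a simple sentence embedding
dim = len(next(iter(vectors.values())))
sentence_vec = [sum(v[i] for v in word_vecs) / len(word_vecs)
                for i in range(dim)]
```

With a real pre-trained table the vectors are 300-dimensional, but the lookup works the same way.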
Alternatively, you can train fastText yourself on your own text data by following a tutorial. A reasonable minimum dataset size for training word embeddings is hundreds of thousands of words.
Upvotes: 1