Erich

Reputation: 21

Converting KeyedVectors to a tsv file

I am trying to convert a gensim KeyedVectors word2vec object to a tsv file. Here is my code:

from gensim.models import KeyedVectors
wv_embeddings = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True, limit=100000)

Should I loop through each of the embeddings and write them to a tsv file, or is there a better way?

Upvotes: 1

Views: 1029

Answers (1)

Stanislas Morbieu

Reputation: 1827

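load_word2vec_format already returns a KeyedVectors object, so there is no .wv attribute to go through. In gensim 4.x the vocabulary is the list wv_embeddings.index_to_key (in gensim 3.x, use wv_embeddings.vocab.keys() instead), and wv_embeddings.get_vector(word) returns the vector for a word. The tsv can be written with the standard csv module:

import csv

# newline='' lets the csv module control line endings itself
with open('wv_embeddings.tsv', 'w', newline='', encoding='utf-8') as tsvfile:
    writer = csv.writer(tsvfile, delimiter='\t')
    # gensim 4.x; on gensim 3.x use wv_embeddings.vocab.keys()
    words = wv_embeddings.index_to_key
    for word in words:
        # Convert the numpy array to a plain list of floats
        vector = wv_embeddings.get_vector(word).tolist()
        row = [word] + vector
        writer.writerow(row)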
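
To try the loop without downloading the GoogleNews file, the same idea can be wrapped in a small helper and run on stand-in data. This is only a sketch: write_embeddings_tsv, toy_vectors and toy_embeddings.tsv are hypothetical names made up for the example, and the helper just assumes it is given an iterable of words and a function that returns a sequence of numbers for each word.

```python
import csv

def write_embeddings_tsv(words, get_vector, path):
    """Write one tab-separated row per word: the word, then its vector components."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as tsvfile:
        writer = csv.writer(tsvfile, delimiter="\t")
        for word in words:
            # list() turns whatever sequence get_vector returns into a plain list
            writer.writerow([word] + list(get_vector(word)))
            count += 1
    return count

# Stand-in data (hypothetical) so the sketch runs without gensim.
toy_vectors = {"king": [0.1, 0.2, 0.3], "queen": [0.4, 0.5, 0.6]}
rows_written = write_embeddings_tsv(toy_vectors, toy_vectors.get, "toy_embeddings.tsv")
```

With the real model, and assuming gensim 4.x, the call would presumably be write_embeddings_tsv(wv_embeddings.index_to_key, wv_embeddings.get_vector, 'wv_embeddings.tsv').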

Upvotes: 1
