How to store text embeddings of varying size for an RNN in Python?

I'm training a text classifier using word2vec and an RNN (PyTorch).

I would like to embed all of my instances of text with varying lengths via word2vec, and store them in a csv file.

I'm considering storing them as strings, but I'm not convinced this is a good solution.

What's a convenient way to store the embeddings?

Upvotes: 0

Answers (1)

Idioteche_fish

Reputation: 91

Rather than writing a csv file, you can store the data in pickle format, which keeps Python objects such as arrays of varying length intact. Pandas provides an easy way to write and read pickle files. If you have a dataframe with the columns ['text', 'word2vec_embedding'], you can store it as a pickle with

df.to_pickle(filepath)

and load it back as a dataframe with

df = pd.read_pickle(filepath)
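
For illustration, here is a minimal sketch of the full round trip. The texts, the embedding dimension of 4, the file name and the random vectors standing in for real word2vec output are all assumptions made for the example, not part of the question.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data: random vectors stand in for real word2vec output.
rng = np.random.default_rng(0)
embedding_dim = 4
texts = ["a short text", "a somewhat longer piece of text"]

# One (num_tokens, embedding_dim) array per text; num_tokens varies per row.
embeddings = [
    rng.normal(size=(len(t.split()), embedding_dim)).astype(np.float32)
    for t in texts
]

df = pd.DataFrame({"text": texts, "word2vec_embedding": embeddings})

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    filepath = os.path.join(tmp_dir, "embeddings.pkl")
    df.to_pickle(filepath)
    restored = pd.read_pickle(filepath)

# Each row keeps its own array shape after loading.
shapes = [e.shape for e in restored["word2vec_embedding"]]
print(shapes)  # [(3, 4), (6, 4)]
```

The arrays come back as numpy arrays with their original shapes and dtype, so no string parsing is needed before feeding them to the model. Note that pickle files should only be loaded from sources you trust.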

Documentation for pandas.DataFrame.to_pickle

Upvotes: 1
