mHelpMe

Reputation: 6668

Unpickling error: pickle data was truncated - better way to save a large dataframe

I have quite a large dataframe that I need to save. The file is approximately 300 MB when I save it using pickle.

I read about some other ways of saving large dataframes. I am now using bz2.BZ2File, and I can see the file is only 50 MB. However, when I try to load the data, I get the following error:

UnpicklingError: pickle data was truncated

Is there a better way for saving a large dataframe?
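
The post does not show the code used to save and load the file, so the following is only a minimal sketch of the bz2 plus pickle approach it describes. The dataframe contents and the file name frame.pkl.bz2 are made up for illustration. One frequent cause of a truncated pickle, though not confirmed for this case, is reading a compressed file whose write handle was never closed; the `with` blocks below close each handle before the next step runs.

```python
import bz2
import os
import pickle
import tempfile

import pandas as pd

# Small stand-in dataframe (hypothetical data, not the asker's).
df = pd.DataFrame({"id": range(1000), "value": [i * 0.5 for i in range(1000)]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frame.pkl.bz2")  # hypothetical file name

    # Write: leaving the with block flushes and closes the compressed stream.
    with bz2.BZ2File(path, "wb") as fh:
        pickle.dump(df, fh, protocol=pickle.HIGHEST_PROTOCOL)

    # Read: only open the file after the writer has been closed.
    with bz2.BZ2File(path, "rb") as fh:
        loaded = pickle.load(fh)

same = loaded.equals(df)
```

If `same` is true, the round trip preserved both the values and the dtypes.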

Upvotes: 1

Views: 312

Answers (1)

Tristan

Reputation: 2088

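Saving the dataframe as a CSV file can help. A dataframe holds more than just the data (for example, its index and column dtypes), and pickling serializes all of that into a byte stream, which can take up space that a CSV would not. The trade-off is that a CSV does not store dtypes, so they are inferred again when the file is read.

Note that the to_csv method also supports compression. With compression='infer', the format is chosen from the file extension. For example, to save as a zip archive: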

df.to_csv('filename.zip', compression='infer')
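
As a rough sketch of the full round trip, the file written this way can be read back with read_csv, which infers the compression from the extension in the same way. The dataframe contents here are made up for illustration, and index=False is an added choice so the index is not written as an extra column.

```python
import os
import tempfile

import pandas as pd

# Small stand-in dataframe (hypothetical data).
df = pd.DataFrame({"id": range(1000), "value": [i * 0.5 for i in range(1000)]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filename.zip")

    # The .zip extension tells pandas to write a zip archive.
    df.to_csv(path, index=False, compression="infer")

    # read_csv applies the same inference when loading.
    restored = pd.read_csv(path, compression="infer")
```

Columns such as dates or categoricals come back as plain strings or numbers unless read_csv is given options like parse_dates or dtype.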

Upvotes: 1
