Reputation: 6668
I have quite a large dataframe that I need to save. The size is approx 300 MB when I save it using pickle.
I read about some other ways of saving large dataframes. I am using bz2.BZ2File and I can see the file is now only 50 MB. However, when I try to load the data I get the following error:
UnpicklingError: pickle data was truncated
Is there a better way for saving a large dataframe?
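For reference, a minimal sketch of the bz2 + pickle pattern (the exact code in use isn't shown; this assumes pickle.dump into a BZ2File). Note that the context manager guarantees the compressed stream is flushed and closed, which is a common cause of "pickle data was truncated" when it is skipped:

```python
import bz2
import pickle

import pandas as pd

df = pd.DataFrame({"a": range(1000), "b": ["x"] * 1000})

# Writing: closing the BZ2File flushes the compressor, so the
# pickle on disk is complete rather than truncated.
with bz2.BZ2File("df.pkl.bz2", "wb") as f:
    pickle.dump(df, f)

# Reading back through the same wrapper.
with bz2.BZ2File("df.pkl.bz2", "rb") as f:
    restored = pickle.load(f)
```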
Upvotes: 1
Views: 312
Reputation: 2088
Saving the dataframe as a csv file can help. A dataframe contains more than just the data (index, dtypes and other metadata), so pickling serializes the whole object, which can take up more space than a plain csv of the values.
Notice that the method to_csv even supports compression, e.g. to save as a zip:
df.to_csv('filename.zip', compression='infer')
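A round-trip sketch of this approach (the dataframe and filename here are placeholders): with compression='infer', both to_csv and read_csv pick the compression format from the .zip extension:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# 'infer' chooses zip compression from the .zip extension.
df.to_csv("filename.zip", compression="infer", index=False)

# read_csv likewise infers the compression when loading.
restored = pd.read_csv("filename.zip")
```

Keep in mind that csv stores only the values, so dtypes and the index are not preserved exactly (hence index=False above).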
Upvotes: 1