matlabit

Reputation: 858

Load a dataframe to memory python

I have a large file that I need to load into a dataframe, and I will need to work on it for a while. Is there a way of keeping it loaded in memory, so that if my script fails I will not need to load it again?

Upvotes: 1

Views: 3529

Answers (1)

Stefan

Reputation: 42905

Here's an example of how one can keep variables in memory between runs.
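One way to get this effect, sketched under the assumption that you work from a long-lived interpreter session (IPython, Jupyter, or `python -i`) instead of restarting the script each time: load the data once, cache it in that session, and run the code that may fail inside a wrapper that catches the exception. The names `_CACHE`, `load_once`, and `run_step` are hypothetical helpers for illustration, not part of pandas.

```python
import traceback

# Module-level cache: it lives as long as the interpreter session does.
_CACHE = {}


def load_once(name, loader):
    """Return the cached object for `name`, calling `loader()` only on first use."""
    if name not in _CACHE:
        _CACHE[name] = loader()
    return _CACHE[name]


def run_step(func, data):
    """Run `func(data)`; on failure, print the traceback and return None."""
    try:
        return func(data)
    except Exception:
        # The exception is reported but not re-raised, so the session
        # (and everything stored in _CACHE) stays alive.
        traceback.print_exc()
        return None
```

In a session this might be used as `df = load_once("big", lambda: pd.read_csv("big_file.csv"))` followed by `run_step(my_analysis, df)`, where `big_file.csv` and `my_analysis` are placeholders. If `my_analysis` raises, you can fix it and call `run_step` again without reloading the file.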

For persistent storage beyond RAM, I would recommend looking into HDF5. It's fast, simple, and supports queries if necessary (see docs).

pandas provides read_hdf() and DataFrame.to_hdf(), which work much like read_csv() and to_csv(), but reading from HDF5 is typically significantly faster than parsing CSV.

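A simple illustration of storage and retrieval, including a query (adapted from the docs), would be:

import pandas as pd

df = pd.DataFrame(dict(A=list(range(5)), B=list(range(5))))
# append=True writes the queryable "table" format; rerunning appends the rows again
df.to_hdf('store_tl.h5', key='table', append=True)
pd.read_hdf('store_tl.h5', 'table', where=['index>2'])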
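Applied to the original question, the same calls can serve as an on-disk cache: parse the slow source file once, write it to HDF5, and read the HDF5 copy on later runs. The following is a minimal sketch that assumes the source is a CSV file and that PyTables (the `tables` package pandas uses for HDF5) is installed; `load_cached`, its parameters, and the default key `"data"` are hypothetical names.

```python
import os

import pandas as pd


def load_cached(csv_path, h5_path, key="data"):
    """Read from the HDF5 cache if it exists; otherwise parse the CSV and build it."""
    if os.path.exists(h5_path):
        return pd.read_hdf(h5_path, key=key)
    df = pd.read_csv(csv_path)
    # mode="w" creates the cache file from scratch.
    df.to_hdf(h5_path, key=key, mode="w")
    return df
```

The sketch does not check whether the CSV has changed since the cache was written, so delete the `.h5` file whenever the source data is updated.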

Upvotes: 1
