Reputation: 6587
I have a CSV file of around 800 MB that I'm trying to load into a DataFrame via pandas, but I keep getting a memory error. I need to load it so I can join it to another, smaller DataFrame.
Why am I getting a memory error even though I'm running 64-bit Windows with 64-bit Python 3.4, and have over 8 GB of RAM and plenty of hard-disk space? Is this a bug in pandas? How can I solve this memory issue?
Upvotes: 1
Views: 2964
Reputation: 210812
Reading your CSV in chunks might help:

import pandas as pd

# read the CSV in chunks of 100,000 rows, then concatenate the
# resulting iterator of DataFrames into a single frame
chunk_size = 10**5
df = pd.concat(pd.read_csv(filename, chunksize=chunk_size),
               ignore_index=False)
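Since the goal is to join against a smaller DataFrame, you could also merge each chunk as it is read, so the full 800 MB frame never has to sit in memory at once. A minimal sketch, assuming the smaller frame is called small_df and the join key is a column named 'key' (both names are placeholders, not from the question):

import pandas as pd

pieces = []
for chunk in pd.read_csv(filename, chunksize=10**5):
    # inner-join each chunk against the small frame; rows without a match
    # are dropped, so the combined result is usually much smaller than
    # the raw CSV
    pieces.append(chunk.merge(small_df, on='key', how='inner'))
df = pd.concat(pieces, ignore_index=True)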
Upvotes: 1