Reputation: 250
I am trying to load a CSV file (around 250 MB) as a DataFrame with pandas. In my first try I used the typical read_csv command, but I got a memory error. I then tried the chunk-based approach mentioned in Large, persistent DataFrame in pandas:
x=pd.read_csv('myfile.csv', iterator=True, chunksize=1000)
xx=pd.concat([chunk for chunk in x], ignore_index=True)
but when I tried to concatenate, I received the following error: Exception: "All objects passed were None". In fact, I cannot access the chunks at all.
I am using WinPython 3.3.2.1 (32-bit) with pandas 0.11.0.
Upvotes: 4
Views: 4893
Reputation: 1975
I'm late, but the actual problem with the posted code is that pd.concat([chunk for chunk in x])
effectively cancels any benefit of chunking: it reassembles all the chunks into one big DataFrame anyway,
and the concatenation itself probably needs roughly twice that memory temporarily.
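As a rough sketch of the memory-friendly pattern (the column name 'value' is made up for illustration; 'myfile.csv' is from the question): process each chunk as it arrives and keep only a small per-chunk result, instead of rebuilding the full DataFrame.

import pandas as pd

totals = []
# read_csv with chunksize yields one DataFrame chunk at a time,
# so only one chunk is held in memory at any moment.
for chunk in pd.read_csv('myfile.csv', chunksize=100000):
    # Keep only a small aggregate per chunk (here: the sum of a hypothetical 'value' column).
    totals.append(chunk['value'].sum())

print(sum(totals))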
Upvotes: 0
Reputation: 11232
I suggest that you install the 64-bit version of WinPython. Then you should be able to load a 250 MB file without problems.
Upvotes: 2