user2082695

Reputation: 250

Loading big CSV file with pandas

I am trying to load a CSV file (around 250 MB) as a dataframe with pandas. On my first try I used the usual read_csv command, but I received a memory error. I then tried the chunked approach mentioned in Large, persistent DataFrame in pandas:

import pandas as pd

x = pd.read_csv('myfile.csv', iterator=True, chunksize=1000)
xx = pd.concat([chunk for chunk in x], ignore_index=True)

but when I tried to concatenate, I received the following error: Exception: "All objects passed were None". In fact, I cannot access the chunks at all.

I am using WinPython 3.3.2.1 (32-bit) with pandas 0.11.0.

Upvotes: 4

Views: 4893

Answers (2)

Norman

Reputation: 1975

I'm late, but the actual problem with the posted code is that pd.concat([chunk for chunk in x]) cancels any benefit of chunking: it concatenates all the chunks back into one big DataFrame, which temporarily can require roughly twice the memory.
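If the file does not fit in memory as a whole, the chunks have to be processed one at a time and discarded, rather than re-assembled. A minimal sketch, assuming the goal is a column mean over a hypothetical numeric column named 'value':

import pandas as pd

total = 0.0
rows = 0
# Consume one 1000-row chunk at a time; only the running
# totals are kept in memory, never the whole file.
for chunk in pd.read_csv('myfile.csv', chunksize=1000):
    total += chunk['value'].sum()  # 'value' is a hypothetical column name
    rows += len(chunk)

print(total / rows)  # mean of the column over the entire file

Any reduction that can be computed incrementally (sums, counts, min/max, group totals) works this way; only operations that genuinely need all rows at once force you back to a full concat.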

Upvotes: 0

w-m

Reputation: 11232

I suggest that you install the 64-bit version of WinPython. Then you should be able to load a 250 MB file without problems.

Upvotes: 2
