CIsForCookies

Reputation: 12847

How to read only a chunk of a CSV file quickly?

I'm following this answer on how to read only a chunk of a CSV file with pandas.

The suggestion to use pd.read_csv('./input/test.csv', iterator=True, chunksize=1000) works well, but it returns a <class 'pandas.io.parsers.TextFileReader'>. I'm converting that to a DataFrame with pd.concat(pd.read_csv('./input/test.csv', iterator=True, chunksize=25)), but that takes as much time as reading the whole file in the first place!

Any suggestions on how to read only a chunk of the file fast?

Upvotes: 2

Views: 1401

Answers (1)

piRSquared

Reputation: 294556

pd.read_csv('./input/test.csv', iterator=True, chunksize=1000) returns an iterator that yields one DataFrame per chunk. You can use the built-in next function to grab the next chunk without reading the rest of the file:

import pandas as pd

reader = pd.read_csv('./input/test.csv', iterator=True, chunksize=1000)

# DataFrame holding the first 1000 rows (fewer if the file is shorter)
next(reader)
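To make that concrete, here is a minimal, self-contained sketch. The in-memory csv_text and the chunk size of 4 are made up for illustration and stand in for './input/test.csv'. It also shows the nrows parameter of read_csv, which may be simpler if you only ever need the first N rows.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for './input/test.csv' (10 data rows)
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

# With chunksize set, read_csv returns a reader instead of a DataFrame
reader = pd.read_csv(io.StringIO(csv_text), chunksize=4)

first_chunk = next(reader)   # DataFrame with data rows 0-3
second_chunk = next(reader)  # DataFrame with data rows 4-7
reader.close()               # stop here; the remaining rows are never parsed

# Alternative when only the first N rows are needed: no iterator at all
head = pd.read_csv(io.StringIO(csv_text), nrows=4)
```

Calling pd.concat on the reader, as in the question, consumes every chunk, which is why it costs as much as reading the whole file.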

This is often used in a for loop for processing one chunk at a time.

for df in pd.read_csv('./input/test.csv', iterator=True, chunksize=1000):
    pass  # process each chunk here; df is a DataFrame of up to 1000 rows
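As a rough sketch of what the loop body might do, the example below keeps a running total so that only one chunk is in memory at a time. The column names a and b, the in-memory csv_text, and the chunk size are assumptions for illustration, not taken from your file.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for './input/test.csv' (10 data rows)
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

total_b = 0
rows_seen = 0

# Each iteration parses only the next 4 rows and yields them as a DataFrame
for df in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    total_b += int(df["b"].sum())  # aggregate, then let the chunk go
    rows_seen += len(df)
```

If you only need the chunks up to some condition, you can break out of the loop early and the rest of the file is left unparsed.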

Upvotes: 5
