Reputation: 12847
I'm using this answer on how to read only a chunk of a CSV file with pandas.
The suggestion to use pd.read_csv('./input/test.csv', iterator=True, chunksize=1000)
works well, but it returns a <class 'pandas.io.parsers.TextFileReader'>,
so I'm converting it to a DataFrame with pd.concat(pd.read_csv('./input/test.csv', iterator=True, chunksize=25)),
but that takes as much time as reading the whole file in the first place!
Any suggestions on how to read only a chunk of the file quickly?
Upvotes: 2
Views: 1401
Reputation: 294556
pd.read_csv('./input/test.csv', iterator=True, chunksize=1000)
returns an iterator of DataFrames. You can use the built-in next
function to grab the next chunk:
reader = pd.read_csv('./input/test.csv', iterator=True, chunksize=1000)
next(reader)
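To make the difference from pd.concat concrete, here is a minimal sketch. It assumes a small in-memory CSV (built with io.StringIO) standing in for './input/test.csv'; the column names a and b are made up for illustration. Calling next(reader) parses only one chunk, whereas pd.concat over the reader consumes every chunk and therefore reads the whole file. The nrows parameter is not part of the answer above; it is shown as another read_csv option when only the first rows are needed.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for './input/test.csv'.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

# chunksize makes read_csv return a TextFileReader instead of a DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=4)

first = next(reader)         # parses only the first 4 rows
second = reader.get_chunk()  # parses the following 4 rows

# Alternative (not from the answer): read just the first rows directly.
head = pd.read_csv(io.StringIO(csv_text), nrows=4)

print(type(first).__name__, len(first), len(second), len(head))
```

Each call returns an ordinary DataFrame, so no pd.concat is needed unless several chunks really have to be combined.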
The reader is often used in a for loop to process one chunk at a time:
for df in pd.read_csv('./input/test.csv', iterator=True, chunksize=1000):
    pass
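As a sketch of what the loop body might do in place of pass, the example below keeps a running total per chunk so that only one chunk is in memory at a time. The in-memory CSV and the column name b are assumptions for illustration, not part of the original file.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for './input/test.csv'.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

rows_seen = 0
total_b = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    # Each chunk is a regular DataFrame with at most 4 rows.
    rows_seen += len(chunk)
    total_b += int(chunk["b"].sum())

print(rows_seen, total_b)
```

Accumulating a result this way avoids building the full DataFrame, which is the step that made pd.concat as slow as a plain read.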
Upvotes: 5