Reputation: 21676
In version 0.16.1, the chunksize argument was available.
See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html
But in the latest version it is no longer available.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html
What was the reason that it was removed?
Also, how should I process an Excel file in chunks in the latest version?
I used to do the following:
import pandas as pd

excel = pd.ExcelFile("test.xlsx")
for sheet in excel.sheet_names:
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        # process chunk
        pass
Upvotes: 0
Views: 2062
Reputation: 21676
As EdChum explained in a comment, this feature was removed in 0.17.0. Chris gave the following reason for the removal:
there's no super-compelling reason; the main idea was to match up with api of to_excel, i.e. the "ExcelFileWrapper" (ExcelFile, ExcelWriter) doesn't have any pandas-specific functionality, instead you pass it into the io functions (read_excel, to_excel).
I did update the docs to cover that specific example. edit: although it may be hard to see in the diff - rendered below.
Source: https://github.com/pandas-dev/pandas/pull/11198
I still wonder whether there is an alternative way to read an Excel file in chunks.
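One possible workaround, offered only as a rough sketch and not as the approach the pandas developers recommend, is to stream rows with openpyxl in read-only mode and build a DataFrame per batch. The helper names `iter_chunks` and `read_excel_in_chunks` are hypothetical, and the sketch assumes openpyxl is installed, the file is .xlsx, and the first row of each sheet is the header:

```python
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield lists of at most `chunksize` items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


def read_excel_in_chunks(path, sheet_name, chunksize=1000):
    """Yield DataFrames of up to `chunksize` rows from one sheet.

    Hypothetical helper: assumes the first row holds the column names.
    """
    # Imported lazily so iter_chunks stays usable without these packages.
    import pandas as pd
    from openpyxl import load_workbook

    # read_only=True lets openpyxl stream rows instead of loading the sheet.
    wb = load_workbook(path, read_only=True)
    try:
        rows = wb[sheet_name].iter_rows(values_only=True)
        header = next(rows, None)
        if header is None:  # empty sheet
            return
        for chunk in iter_chunks(rows, chunksize):
            yield pd.DataFrame(chunk, columns=list(header))
    finally:
        wb.close()
```

Usage would then look much like the old loop: `for chunk in read_excel_in_chunks("test.xlsx", sheet, 1000): ...`. Column dtypes are inferred per chunk here, so they may differ between chunks.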
Upvotes: 1