Reputation: 871
I know the argument usecols
in pandas.read_excel()
allows you to select specific columns.
Say, I read an Excel file in with pandas.read_excel()
. My excel spreadsheet has 1161 rows. I want to keep the 1st row (with index 0), and skip rows 2:337. Seems like the argument skiprows
works only when 0 indexing is involved. I tried several different ways but my code always produces an output where all my 1161 rows are read rather than only after the 337th row on. Such as this:
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
sheet_name = "Sheet1",
skiprows = "336",
usecols = "H:BD")
Here is another attempt:
documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
sheet_name = "Sheet1",
skiprows = "1:336",
usecols = "H:BD")
I would like the dataframe to exclude rows 2 through 337 in the original Excel import.
Upvotes: 25
Views: 72912
Reputation: 23241
You can also pass a function to skiprows=
. For example, to skip the first 336 rows (after the header row):
df = pd.read_excel('Book1.xlsx', sheet_name='Sheet1', skiprows=lambda x: 1<=x<=336)
Upvotes: 0
Reputation: 164753
As per the documentation for pandas.read_excel
, skiprows
must be list-like.
Try this instead to exclude rows 1 to 336 inclusive:
df = pd.read_excel("file.xlsx",
sheet_name = "Sheet1",
skiprows = range(1, 337),
usecols = "H:BD")
Note: range
constructor is considered list
-like for this purpose, so no explicit list conversion is necessary.
Upvotes: 37