Prince Kumar
Prince Kumar

Reputation: 23

How do I exclude header and footer from an excel file in pandas

df=pd.read_excel(filename)

What attributes should I add to exclude header and footer?

Upvotes: 2

Views: 17773

Answers (3)

Jaskaran Singh
Jaskaran Singh

Reputation: 561

Head over here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html

you will find

pandas.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=False, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, comment=None, skipfooter=0, convert_float=True, mangle_dupe_cols=True, **kwds)

Out of which ones which are useful to you are:

pandas.read_excel(skiprows=None, skipfooter=0)

you can specify the value in integer to skiprows=1 to skip header and skipfooter=1 to skip footer you can add as many rows are you want to skip

Hope it helps

Upvotes: 8

Aditya Chaturvedi
Aditya Chaturvedi

Reputation: 151

This is how I achieved it:

energy = pd.read_excel('Energy Indicators.xls', index_col=None, header=None, footer=None)
energy = energy[18:245].reset_index()

I used this for data cleaning in my assignment where I was required to remove header and footer.

So I first imported raw data from excel using

energy = pd.read_excel('Energy Indicators.xls', index_col=None, header=None, footer=None)

Now since the row from which actual required data started is 18 and row where footer started is 245 I used this

energy = energy[18:245].reset_index()

I used reset_index() because after removing rows my index was messed up so to reset it.

You might want to remove that extra column named 'index' which is created by reset_index()

Upvotes: 5

Mihir Patel
Mihir Patel

Reputation: 1

you should create a sun dataframe from previous data frame which doesn't contain the header and footer attribute.

for example,

if you excel file contains 500 rows, in which first three rows are header and last 10 rows are footers then you can do as below:

df = df[4:491]

Upvotes: 0

Related Questions