nonegiven72
nonegiven72

Reputation: 427

Pandas read_excel, 1 row header in first sheet but not other sheets

There is a title in row 1 of the first sheet but not in other sheets. Is there an easy way to account for this?

I'm pretty sure I can solve this in a roundabout way (uploading first sheet in one read_excel, then getting the rest of the sheets in a separate read_excel).

Assume 1 workbook and 2 sheets:

all_sheets = pd.read_excel('test.xls', sheetname=None, header=0 )

In [149]: sheet1 = all_sheets["Sheet1"]
Out[149]: sheet1
  This great excel workbook was created on :2016-04-01
0             Column_name1        Column_name2        Column_name3
1               Gary                   23                Paid
2               John                   26                Paid

In [149]: sheet2 = all_sheets["Sheet2"]
Out[149]: sheet2
             Column_name1        Column_name2        Column_name3
0               Patrick                35                Paid
1               Josh                   23                Paid
2               Justin                 26                Paid

Upvotes: 0

Views: 16464

Answers (2)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210972

Assuming that you want to use headers from the first sheet and merge data from all sheets into one data frame:

import collections
import pandas as pd

fn = r'test.xls'
first_sheet_name = 'Sheet1'

df_dict = pd.read_excel(fn, sheetname=None, header=None)
# sort 
df_dict = collections.OrderedDict(sorted(df_dict.items()))

df = pd.concat(df_dict.values())
df.columns = df.iloc[0]
df = df.ix[df.index > 0]

Upvotes: 1

robroc
robroc

Reputation: 1212

You might have to load your sheets separately and use skiprows to skip the first row in the first sheet. Not sure you can treat sheets separately in one go.

sheet1 = pd.read_excel('test.xls', sheetname="Sheet1", header=0, skiprows=[0] )

Upvotes: 2

Related Questions