Reputation: 6270
I have one excel file with several identical structured sheets on it (same headers and number of columns) (sheetsname: 01,02,...,12).
How can I get this into one dataframe?
Right now I would load it all seperate with:
df1 = pd.read_excel('path.xls', sheet_name='01')
df2 = pd.read_excel('path.xls', sheet_name='02')
...
and would then concentate it. What is the most pythonic way to do it and get directly one dataframe with all the sheets? Also assumping I do not know every sheetname in advance.
Upvotes: 2
Views: 64
Reputation: 28719
read the file as:
collection = pd.read_excel('path.xls', sheet_name=None)
combined = pd.concat([value.assign(sheet_source=key)
for key,value in collection.items()],
ignore_index=True)
sheet_name = None ensures all the sheets are read in.
collection is a dictionary, with the sheet_name as key, and the actual data as the values.
combined uses the pandas concat method to get you one dataframe.
I added the extra column sheet_source, in case you need to track where the data for each row comes from.
You can read more about it on the pandas doco
Upvotes: 3
Reputation: 1320
you can use:
df_final = pd.concat([pd.read_excel('path.xls', sheet_name="{:02d}".format(sheet)) for sheet in range(12)], axis=0)
Upvotes: 0