I have an Excel workbook that contains 20+ sheets with various sheet names, e.g. sheet1, abc, xyz, etc. I would like to import all sheets into one DataFrame and use the sheet name as the first column and the row number of each sheet as the second column. The expected result will be like this: Sheet Row Column1 Column2 ...ColumnN sheet1 1 data data data sheet1 2 data data data sheet1 3 data data data sheet1 4 data data data abc 1 data data data abc 2 data data data abc 3 data data data xyz 1 data data data xyz 2 data data data ... ... data data data

Reputation: 21

How to combine multiple excel worksheets into a single Pandas Dataframe with sheet name as the first column and row number as the second column

I have an Excel workbook that contains 20+ sheets with various sheet names, e.g. sheet1, abc, xyz, etc. I would like to import all sheets into one DataFrame and use the sheet name as the first column and the row number of each sheet as the second column.

The expected result will be like this:

Sheet	Row	Column1	Column2	...ColumnN
sheet1	1	data	data	data
sheet1	2	data	data	data
sheet1	3	data	data	data
sheet1	4	data	data	data
abc	1	data	data	data
abc	2	data	data	data
abc	3	data	data	data
xyz	1	data	data	data
xyz	2	data	data	data
...	...	data	data	data

Upvotes: 0

Answers (1)

Thirasak Phromphimon

Reputation: 21

I've revised the code from asongtoruin, answered Jun 14 '17 at 15:46, Python Loop through Excel sheets, place into one df

import pandas as pd
pd.set_option("display.max_rows", None, "display.max_columns", None) #T: Print an entire pandas DataFrame. Source: Kite, retrieved on 29 Sep 2021, https://www.kite.com/python/answers/how-to-print-an-entire-pandas-dataframe-in-python

vstPathFile = r"D:\.......\filename.xlsm"

sheets_dict = pd.read_excel(vstPathFile, sheet_name=None, header=None)

full_table = pd.DataFrame()
for name, sheet in sheets_dict.items():
    sheet.insert(loc=0, column="Sheet", value=name, allow_duplicates=True)
    sheet.insert(loc=1, column="Row", value=sheet.index+1, allow_duplicates=True)
    full_table = full_table.append(sheet)

full_table.reset_index(inplace=True, drop=True)

Upvotes: 0

How to combine multiple excel worksheets into a single Pandas Dataframe with sheet name as the first column and row number as the second column

Answers (1)

Related Questions