Thirasak Phromphimon
Thirasak Phromphimon

Reputation: 21

How to combine multiple excel worksheets into a single Pandas Dataframe with sheet name as the first column and row number as the second column

I have an Excel workbook that contains 20+ sheets with various sheet names, e.g. sheet1, abc, xyz, etc. I would like to import all sheets into one DataFrame and use the sheet name as the first column and the row number of each sheet as the second column.

The expected result will be like this:

Sheet Row Column1 Column2 ...ColumnN
sheet1 1 data data data
sheet1 2 data data data
sheet1 3 data data data
sheet1 4 data data data
abc 1 data data data
abc 2 data data data
abc 3 data data data
xyz 1 data data data
xyz 2 data data data
... ... data data data

Upvotes: 0

Views: 1472

Answers (1)

Thirasak Phromphimon
Thirasak Phromphimon

Reputation: 21

I've revised the code from asongtoruin, answered Jun 14 '17 at 15:46, Python Loop through Excel sheets, place into one df

import pandas as pd
pd.set_option("display.max_rows", None, "display.max_columns", None) #T: Print an entire pandas DataFrame. Source: Kite, retrieved on 29 Sep 2021, https://www.kite.com/python/answers/how-to-print-an-entire-pandas-dataframe-in-python

vstPathFile = r"D:\.......\filename.xlsm"

sheets_dict = pd.read_excel(vstPathFile, sheet_name=None, header=None)

full_table = pd.DataFrame()
for name, sheet in sheets_dict.items():
    sheet.insert(loc=0, column="Sheet", value=name, allow_duplicates=True)
    sheet.insert(loc=1, column="Row", value=sheet.index+1, allow_duplicates=True)
    full_table = full_table.append(sheet)

full_table.reset_index(inplace=True, drop=True)

Upvotes: 0

Related Questions