Reputation: 21
I have an Excel workbook that contains 20+ sheets with various sheet names, e.g. sheet1, abc, xyz, etc. I would like to import all sheets into one DataFrame and use the sheet name as the first column and the row number of each sheet as the second column.
The expected result will be like this:
Sheet | Row | Column1 | Column2 | ...ColumnN |
---|---|---|---|---|
sheet1 | 1 | data | data | data |
sheet1 | 2 | data | data | data |
sheet1 | 3 | data | data | data |
sheet1 | 4 | data | data | data |
abc | 1 | data | data | data |
abc | 2 | data | data | data |
abc | 3 | data | data | data |
xyz | 1 | data | data | data |
xyz | 2 | data | data | data |
... | ... | data | data | data |
Upvotes: 0
Views: 1472
Reputation: 21
I've revised the code from asongtoruin, answered Jun 14 '17 at 15:46, Python Loop through Excel sheets, place into one df
import pandas as pd
pd.set_option("display.max_rows", None, "display.max_columns", None) #T: Print an entire pandas DataFrame. Source: Kite, retrieved on 29 Sep 2021, https://www.kite.com/python/answers/how-to-print-an-entire-pandas-dataframe-in-python
vstPathFile = r"D:\.......\filename.xlsm"
sheets_dict = pd.read_excel(vstPathFile, sheet_name=None, header=None)
full_table = pd.DataFrame()
for name, sheet in sheets_dict.items():
sheet.insert(loc=0, column="Sheet", value=name, allow_duplicates=True)
sheet.insert(loc=1, column="Row", value=sheet.index+1, allow_duplicates=True)
full_table = full_table.append(sheet)
full_table.reset_index(inplace=True, drop=True)
Upvotes: 0