Fede
Fede

Reputation: 23

Loop in order to create several DataFrames for each sheet in an Excel file

I have an Excel file with 36 sheets. I would like to iterate over each sheet of the Excel file and create a DataFrame for each sheet in the file. Also, I would like the DataFrames to be named with the name of the sheet. I am a newbie in python and in this site, I saw that there is a similar topic for this question but it doesn't answer completely to my question.

This is the for loop I wrote so far, but it is not doing what I have in my mind. With the last record of this code, I am obtaining a dataframe called sheet1 with the last data inside the last sheet in the excel file. What I would like to obtain is to have 36 different dataframe for each sheet in the excel, and if possible each dataframe has to be named with the name of each spreadsheet (I think that the dictionary called "dict_of_sheet" contain all the 36 table that I want to upload in python from excel.

Would you help me to solve this?

Thanks a lot.

multi_sheet_file = pd.ExcelFile("pathfile_Name")
excel_sheet_names = multi_sheet_file.sheet_names

dict_of_sheets = {}
for sheet in excel_sheet_names:
    dict_of_sheets[sheet] = pd.read_excel(multi_sheet_file, sheet_name=sheet)

sheet1=dict_of_sheets[sheet]

Upvotes: 2

Views: 1701

Answers (1)

Amit Amola
Amit Amola

Reputation: 2510

You can make use of exec() for this. exec() function is used for the dynamic execution of Python program which can either be a string or object code.

You can use xlrd library to get the sheet names too. You can use pandas libary too for the sheet names(I didn't look around, there definitely might be a way of doing that).

import xlrd

filename='try.xlsx'
xls = xlrd.open_workbook(filename, on_demand=True)
sheet_names=xls.sheet_names()

print(sheet_names)

Output:

['see1', 'see2', 'Sheet3']

Now that you've got sheet names, you can now run loop over them and use exec to create dataframes of same name:

for name in sheet_names:
    exec(f"{name}=pd.read_excel('{filename}', sheet_name='{name}')")

This creates dataframes with filenames as the see1, see2 and Sheet3.

print(see1)

Output:

   Col1  COl2
0     1     2
1     2     3
2     3     4
3     4     4

Hope this is what you need.

NOTE: In case your sheet name is just numbers, then it won't be possible to name a variable as just a number, so you might have to assign it a new name.

So just for the OP's case, here's a solution:

for name in sheet_names:
    if name.isdigit():
        exec(f"Sheet_name{name}=pd.read_excel('{filename}', sheet_name='{name}')")

    else:
        exec(f"{name}=pd.read_excel('{filename}', sheet_name='{name}')")

So what this code will do is, if you have any sheet name which is just numeric, it will create the variable name as, Sheet_name{the numeric}.

So in my case, I had sheet names as: ['Sheet1', '245', 'Sheet3'] and I finally get the second variable as a dataframe as below:

print(Sheet_name245)

Output:

   Col1  Col2
0     1     4
1     2     5
2     3     6

Hope this helps with your case.

NOTE2: The case where the sheet name has a decimal in it and not just integer as a number, then the above code will stop, since a decimal can't be used in a variable name either. So here's a workaround:

for name in sheet_names:
    if name.isdigit():
        exec(f"Sheet_name{name}=pd.read_excel('{filename}', sheet_name='{name}')")

    elif '.' in name:
        temp_name=name.replace('.', '_')
        exec(f"Sheet_name{temp_name}=pd.read_excel('{filename}', sheet_name='{name}')")        

    else:
        exec(f"{name}=pd.read_excel('{filename}', sheet_name='{name}')")

So now we will get filename for 245.63 as Sheet_name245_63. I hope now your issue is resolved.

Upvotes: 2

Related Questions