Reputation: 3657
I have an Excel file with several sheets. I would like to iterate over each sheet of the Excel file and create a DataFrame for each sheet in the file. Also, I would like the DataFrames to be named "hfi_(name of the sheet)". This is the for loop I wrote so far, but it is not running due to a syntax error. Would you help me solve this?
for sheet in hfi_file:
hfi_file=pd.read_excel("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx",sheet_name=sheet)
return str("hfi"+sheet)=pd.DataFrame(hfi_file)
Upvotes: 1
Views: 3267
Reputation: 1
# Read the excel file
filename = 'yourpath/yourfile.xlsx'
df_raw = pd.read_excel(filename)
# iterate through the sheets in the file
for sheet in df_raw.sheet_names:
sheet = pd.read_excel(filename, sheet_name=sheet)
Upvotes: 0
Reputation: 781
Here is the easiest way I can share with you:
# read the sheet file
import pandas as pd
sheets = pd.ExcelFile('sheet_filename.xlsx')
# save each sheet to a dataframe
df1 = pd.read_excel(sheets, 'Sheet1')
df2 = pd.read_excel(sheets, 'Sheet2')
Hope this helped you!
Upvotes: 0
Reputation: 1615
I'm clearly slow on the draw, but here's another answer:
import pandas as pd
# importing these so sessionInfo comes up for them
import openpyxl
import xlrd
file = 'output.xlsx'
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'sheet2-col1': [0, 1], 'sheet2-col2': [30, 40]})
writer = pd.ExcelWriter(file)
df1.to_excel(writer, 'Sheet1')
df2.to_excel(writer, 'Sheet2')
writer.save()
reader = pd.ExcelFile(file)
dict = {"hfi" + i : pd.read_excel(file, sheet_name=i) for i in reader.sheet_names}
print(dict)
#> {'hfiSheet1': col1 col2
#> 0 1 3
#> 1 2 4, 'hfiSheet2': sheet2-col1 sheet2-col2
#> 0 0 30
#> 1 1 40}
Created on 2018-08-30 by the reprexpy package
import reprexpy
print(reprexpy.SessionInfo())
#> Session info --------------------------------------------------------------------
#> Platform: Darwin-17.7.0-x86_64-i386-64bit (64-bit)
#> Python: 3.6
#> Date: 2018-08-30
#> Packages ------------------------------------------------------------------------
#> openpyxl==2.5.6
#> pandas==0.23.4
#> reprexpy==0.1.1
#> xlrd==1.1.0
Upvotes: 0
Reputation: 5145
You need to import the Excel file first, then read each sheet by name
multi_sheet_file = pd.ExcelFile("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx")
# Gets the sheet names in the file
excel_sheet_names = multi_sheet_file.sheet_names
dict_of_sheets = {}
for sheet in excel_sheet_names:
dict_of_sheets["hfi" + sheet] = pd.read_excel(multi_sheet_file, sheetname=sheet')
Upvotes: 1
Reputation: 323226
I will recommend store those dataframes into a dict
d={"hfi" + sheet:pd.DataFrame(pd.read_excel("/Users/glosophy/Desktop/hfi/hfi_2016.xlsx", sheet_name=sheet)) for sheet in hfi_file}
Then you can using .get
to access the data
d.get('your dict key')
Upvotes: 0