Reputation: 217
I am very new to Excel/CSV operations in Python.
I have 1k+ .xlsx files. Each Excel file has 2 sheets, sheet1.csv and sheet2.csv (the sheet names are identical in all the .xlsx files). From all the Excel files, I have to collect sheet1.csv and create a new Excel file containing the data from all of those sheets.
sheet1.csv from one Excel file looks like this:
Name Age Gender State City
N1 22 M Cal ABC
N2 32 M NC Dur
sheet1.csv from another Excel file looks like this:
Name Age Gender State City
N3 26 F KLM PQR
N4 23 M SC STM
I am looking for output like the following, including an empty row between each sheet's data:
Name Age Gender State City
N1 22 M Cal ABC
N2 32 M NC Dur

N3 26 F KLM PQR
N4 23 M SC STM
I tried this, but I don't understand how to proceed from here:
for excel in glob.glob(os.path.join(os.getcwd(), "*.xlsx")):
    df = pd.read_excel(excel, sheet_name=0)  # sheet1.csv is the first sheet in every Excel file
    df.to_excel('Final.xlsx')
This code only keeps the data from the last Excel file. Please help me fix it for the problem described above.
Upvotes: 1
Views: 441
Reputation: 1631
You can do the following to concatenate the first sheet of every Excel file in the given path:
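import glob
import os
import pandas as pd

allFiles = glob.glob(os.path.join(os.getcwd(), "*.xlsx"))
list_ = []
for file_ in allFiles:
    # read only the first sheet of each workbook
    df = pd.read_excel(file_, sheet_name=0)
    list_.append(df)
# concatenate once, after the loop has collected every sheet
frame = pd.concat(list_)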
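The concatenation step can be tried without any files by typing in the two sample sheets from the question by hand. This is only a sketch: the frames `first` and `second` stand in for what `pd.read_excel` would return, and `ignore_index=True` is my own addition so that the result gets a continuous 0..n-1 index.

```python
import pandas as pd

columns = ["Name", "Age", "Gender", "State", "City"]

# Hand-typed stand-ins for the first sheet of two workbooks.
first = pd.DataFrame(
    [["N1", 22, "M", "Cal", "ABC"], ["N2", 32, "M", "NC", "Dur"]],
    columns=columns,
)
second = pd.DataFrame(
    [["N3", 26, "F", "KLM", "PQR"], ["N4", 23, "M", "SC", "STM"]],
    columns=columns,
)

# Stack the rows of both frames under a single header.
frame = pd.concat([first, second], ignore_index=True)
print(frame)
```

Calling `frame.to_excel('Final.xlsx', index=False)` afterwards would write the combined rows to one sheet; that step needs an Excel writer engine such as openpyxl to be installed.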
Upvotes: 1
Reputation: 863541
It is possible, but integer columns will be converted to floats:
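import glob
import os
import numpy as np
import pandas as pd

dfs = []
for excel in glob.glob(os.path.join(os.getcwd(), "*.xlsx")):
    df = pd.read_excel(excel, sheet_name=0)  # sheet1.csv is the first sheet in every Excel file
    # if an empty row is really needed: one all-NaN row with the same columns
    # (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
    empty = pd.DataFrame([[np.nan] * len(df.columns)], columns=df.columns)
    dfs.append(pd.concat([df, empty], ignore_index=True))
df = pd.concat(dfs, ignore_index=True)
df.to_excel('Final.xlsx', index=False)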
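To see the float conversion for yourself, and one possible way around it, here is a small in-memory sketch. The helper name `with_blank_row` is hypothetical, and it assumes the frame has the default 0..n-1 index (which is what `pd.read_excel` gives when no `index_col` is passed). Casting a column to pandas' nullable `Int64` dtype first is my suggestion, not part of the code above; it lets the blank row be stored as a missing value without forcing the column to float.

```python
import pandas as pd


def with_blank_row(df):
    """Return a copy of df with one all-missing row appended (hypothetical helper)."""
    # Reindexing to one label past the end creates a row filled with missing values.
    return df.reindex(range(len(df) + 1))


sheet = pd.DataFrame({"Name": ["N1", "N2"], "Age": [22, 32]})

plain = with_blank_row(sheet)
print(plain["Age"].dtype)     # float64: NaN cannot be stored in an int64 column

nullable = with_blank_row(sheet.astype({"Age": "Int64"}))
print(nullable["Age"].dtype)  # Int64: the blank row is stored as <NA>
```

If the ages need to stay whole numbers in the final file, something like `df.convert_dtypes()` or an explicit `astype` before adding the blank row may be worth trying on a couple of your workbooks first.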
Upvotes: 1