Reputation: 1345
I am trying to read multiple Excel files. Each time a file is read, I would like to append its contents to those of the files already read. At the end, I should end up with one dataframe that holds the content of all the Excel files.
How can I do that in a for loop?
Here is my attempt:
for i in range(1, 10):
    temp = pd.read_excel(path[i])
    temp_final = temp
The idea here is to have temp_final containing the content of all excel files. Something similar to temp_final=[excelfile1, excelfile2]
pd.concat(temp_final)
I would welcome any ideas on how I can finish this for loop. Many thanks.
Upvotes: 1
Views: 3423
Reputation: 107
I have around 1000 Excel files located in one folder:
C:/BD/KEN
All the files follow the naming format:
'Ken <#> dated .xlsx'
I needed to read the table on the first sheet of every file and then merge them all into one dataframe for further manipulation, so that I end up with ONE BIG Excel file to work with:
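import os
import pandas as pd

# list of <#> series of Excel files (around 1000 files total)
names = ['1125', '1126', '1127']
# column names
ColNames = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
# collect one dataframe per file and concatenate once at the end
frames = []
for x, y, z in os.walk('C:/BD/KEN'):
    for i in z:
        parts = i.split()
        if len(parts) > 1 and parts[1] in names:
            print(i)
            try:
                # read_excel reads the first sheet by default
                frames.append(pd.read_excel(os.path.join(x, i)))
            except Exception as err:
                print('ALERT', i, err)
# fall back to an empty dataframe if no file matched
df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame(columns=ColNames)
df.to_excel('C:/BD/TOTAL.xlsx', index=False)
print('DONE')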
os.walk yields tuples of (folder path, list of subfolder names, list of file names),
so 'z' is the list of file names and each 'i' is one file name as a str.
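The file-selection step can also be pulled out on its own, which makes it easier to check which files will be read before opening any of them. This is only a rough sketch: the helper name matching_files is made up for illustration, and it assumes the 'Ken <#> dated .xlsx' naming described above, where the second whitespace-separated token is the series number.

```python
import os


def matching_files(root, names):
    """Return sorted paths under `root` whose second filename token is in `names`."""
    wanted = set(names)
    matches = []
    # os.walk yields (folder path, subfolder names, file names) for each folder.
    for folder, _subfolders, files in os.walk(root):
        for filename in files:
            parts = filename.split()
            # Skip files that do not follow the 'Ken <#> dated ...' pattern.
            if len(parts) > 1 and parts[1] in wanted:
                matches.append(os.path.join(folder, filename))
    return sorted(matches)
```

Something like matching_files('C:/BD/KEN', ['1125', '1126', '1127']) would then give the list of paths to hand to pd.read_excel.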
Upvotes: 0
Reputation: 164673
My advice is not to continually append to an existing dataframe, because every append builds a new dataframe and copies all the rows collected so far.
It is much more efficient to read your dataframes into a list, then concatenate them in one call:
dfs = [pd.read_excel(path[i]) for i in range(1, 10)]
df = pd.concat(dfs, ignore_index=True)
Alternative syntax:
dfs = list(map(pd.read_excel, path[1:10]))
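If you want this as a reusable function, a minimal sketch could look like the following. The name combine_excel and the reader parameter are my own additions for illustration (reader just defaults to pd.read_excel so that the function can be tried out without real Excel files); they are not part of pandas.

```python
import pandas as pd


def combine_excel(paths, reader=pd.read_excel):
    """Read every file in `paths` and return one concatenated dataframe."""
    # Build the full list first, so pd.concat is called only once.
    frames = [reader(p) for p in paths]
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame.
        return pd.DataFrame()
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)
```

With the question's variables, df = combine_excel(path[1:10]) should give the same result as the two lines above.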
Upvotes: 2
Reputation: 1345
I thought about this answer.
temp = pd.read_excel(path[0])
for i in range(1, 2):
    print(i)
    temp1 = pd.read_excel(path[i])
    temp = temp.append(temp1)
Does it make sense to write the for loop that way?
Upvotes: 0