Reputation: 55022
I have multiple excel files in the following format:
ID | Name | Prop1 | Prop2 | User
Data from excel1:
ID | Name | Prop1 | Prop2 | Prop3 | User
1 | test | | | | John
Data from Excel2:
ID | Name | Prop1 | Prop2 | Prop3 | User
1 | test | a | b | | John
Data from Excel3:
ID | Name | Prop1 | Prop2 | Prop3 | User
1 | test | | | c | John
What I want to do is to combine these cells.
Desired output:
ID | Name | Prop1 | Prop2 | Prop3 | User
1 | test | a | b | c | John
If the cell is empty on a file and the other file has value in it, I would like to replace it.
is there any easy way to accomplish this?
Thanks.
Upvotes: 1
Views: 101
Reputation: 862406
You can create list of all DataFrames by glob
, for final df
need combine_first
with reduce:
import glob
from functools import reduce
files = glob.glob('files/*.xlsx')
dfs = [pd.read_excel(fp).set_index(['ID','Name','User']) for fp in files]
df1 = reduce(lambda l,r: pd.DataFrame.combine_first(l,r), dfs)
print (df1)
Prop1 Prop2 Prop3
ID Name User
1 test John a b c
EDIT: If dont need combine files with NaN
s solution is simplier:
import glob
files = glob.glob('files/*.xlsx')
df = pd.concat([pd.read_excel(fp) for fp in files],ignore_index=True)
Upvotes: 4
Reputation: 20302
This should do what you want.
import pandas as pd
import numpy as np
import glob
glob.glob("C:/your_path_here/*.xlsx")
all_data = pd.DataFrame()
for f in glob.glob("C:/your_path_here/*.xlsx"):
df = pd.read_excel(f)
all_data = all_data.append(df,ignore_index=True)
print(all_data)
Upvotes: 0
Reputation: 1340
try below:
df1 = pd.read_excel('Excel1.xlsx', sheetname='Sheet1');
df2= pd.read_excel('Excel2.xlsx', sheetname='Sheet1');
df3 = pd.read_excel('Excel3.xlsx', sheetname='Sheet1')
mylist = [df1,df2,d3]
df = pd.merge(df1, df2, on=['ID','USER'])
df = pd.merge(df, df3, on=['ID','USER'])
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
I haven't got the chance to test this , but it should work .
Upvotes: 1