Reputation: 127
I have 3 DataFrames with the same format and columns, and each one has a column Criterion 1. Each DataFrame is filled in by a different person. I want to combine the non-NaN values of this column across the 3 DataFrames and store the result in an Excel file so that NaNs appear as blank cells. How can I do this in Python? I would also like to know whether any row has been filled in by at least two people.
DataFrame 1
ID Criterion 1
1021 10
1022 NaN
1023 NaN
1024 NaN
1025 NaN
DataFrame 2
ID Criterion 1
1021 NaN
1022 15
1023 NaN
1024 NaN
1025 NaN
DataFrame 3
ID Criterion 1
1021 NaN
1022 NaN
1023 NaN
1024 NaN
1025 34
Joint DataFrame
ID Criterion 1
1021 10
1022 15
1023 NaN
1024 NaN
1025 34
Excel File
ID Criterion 1
1021 10
1022 15
1023
1024
1025 34
Upvotes: 0
Views: 426
Reputation: 862511
The idea is to raise an error in a custom function if there are 2 or more values for the same ID in the Criterion 1 column:
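import pandas as pd

def func(dfs):
    # Index every frame by ID so rows align across frames
    dfs = [x.set_index('ID') for x in dfs]
    # One 'Criterion 1' column per frame, side by side
    df = pd.concat([x['Criterion 1'] for x in dfs], axis=1)
    # True for IDs with more than one non-NaN value
    m = df.count(axis=1).gt(1)
    L = m.index[m].astype(str)
    if m.any():
        raise ValueError(f'Multiple values found for ID(s) {", ".join(L)}')
    # First non-NaN value per ID across the same-named columns
    return df.T.groupby(level=0).first().T

df = func([df1, df2, df3])
print(df)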
Criterion 1
ID
1021 10.0
1022 15.0
1023 NaN
1024 NaN
1025 34.0
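If you would rather get a list of the IDs that were filled in by more than one person instead of an exception, something like the following sketch may help. It rebuilds the sample data from the question so it can be run on its own; the name combine_and_report and the choice to keep the first non-NaN value (in the order the frames are passed) are assumptions for illustration, not part of the code above.

```python
import numpy as np
import pandas as pd

ids = [1021, 1022, 1023, 1024, 1025]
nan = np.nan

# Sample frames from the question (one non-NaN entry each).
df1 = pd.DataFrame({"ID": ids, "Criterion 1": [10, nan, nan, nan, nan]})
df2 = pd.DataFrame({"ID": ids, "Criterion 1": [nan, 15, nan, nan, nan]})
df3 = pd.DataFrame({"ID": ids, "Criterion 1": [nan, nan, nan, nan, 34]})


def combine_and_report(dfs, col="Criterion 1"):
    """Return (combined frame, list of IDs filled in by two or more people)."""
    # One column per person, aligned on ID; keys give the columns distinct names.
    wide = pd.concat(
        [d.set_index("ID")[col] for d in dfs],
        axis=1,
        keys=range(len(dfs)),
    )
    # Number of non-NaN entries per ID.
    filled = wide.count(axis=1)
    conflicts = filled.index[filled > 1].tolist()
    # First non-NaN value per row, left to right; stays NaN if nobody filled it.
    combined = wide.bfill(axis=1).iloc[:, 0].rename(col).to_frame()
    return combined, conflicts


combined, conflicts = combine_and_report([df1, df2, df3])
print(combined)
print("IDs with multiple entries:", conflicts)
```

With the sample data no ID is filled in twice, so the conflict list is empty. If two people do fill in the same ID, that ID shows up in the list and the combined frame keeps the value from whichever frame comes first.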
Finally, write this DataFrame to an Excel file, where file is the output path. NaN values are written as blank cells by default:
df.to_excel(file)
Upvotes: 1