Amin

Reputation: 127

Using python to merge multiple columns with non-NaN values

I have 3 DataFrames with the same format and columns, and each has a column "Criterion 1". Each DataFrame is filled in by a different person. I want to combine the non-NaN values of this column across the 3 DataFrames and store the result in an Excel file such that NaNs appear as blank cells. How can I do this in Python? I would also like to know whether any row has been filled in by two or more people.

DataFrame 1

ID       Criterion 1
1021     10
1022     NaN
1023     NaN
1024     NaN
1025     NaN

DataFrame 2

ID       Criterion 1
1021     NaN
1022     15
1023     NaN
1024     NaN
1025     NaN

DataFrame 3

ID       Criterion 1
1021     NaN
1022     NaN
1023     NaN
1024     NaN
1025     34

Joint DataFrame

ID       Criterion 1
1021     10
1022     15
1023     NaN
1024     NaN
1025     34

Excel File

ID       Criterion 1
1021     10
1022     15
1023     
1024     
1025     34

Upvotes: 0

Views: 426

Answers (1)

jezrael

Reputation: 862511

The idea is to raise an error in a custom function if there are 2 or more values for the same ID in the Criterion 1 column:

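import pandas as pd

def func(dfs):
    # Index every frame by ID so rows align across frames
    dfs = [x.set_index('ID') for x in dfs]

    # One 'Criterion 1' column per input frame, side by side
    df = pd.concat([x['Criterion 1'] for x in dfs], axis=1)
    # Flag IDs that have more than one non-NaN value
    m = df.count(axis=1).gt(1)
    L = m.index[m].astype(str)
    if m.any():
        raise ValueError(f'Multiple values found for ID(s): {", ".join(L)}')

    # Take the first non-NaN value in each row
    return df.bfill(axis=1).iloc[:, 0].to_frame('Criterion 1')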


df = func([df1, df2, df3])
print(df)
      Criterion 1
ID               
1021         10.0
1022         15.0
1023          NaN
1024          NaN
1025         34.0
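
If you would rather list the rows filled in by two or more people than raise an error, the same count can be returned as a list of IDs. This is only a minimal sketch: the helper name find_overlaps and the frame df2_conflict are made up for illustration, and the sample frames are rebuilt from the tables in the question.

```python
import numpy as np
import pandas as pd

# Sample frames mirroring the question's data
ids = [1021, 1022, 1023, 1024, 1025]
df1 = pd.DataFrame({"ID": ids, "Criterion 1": [10, np.nan, np.nan, np.nan, np.nan]})
df2 = pd.DataFrame({"ID": ids, "Criterion 1": [np.nan, 15, np.nan, np.nan, np.nan]})
df3 = pd.DataFrame({"ID": ids, "Criterion 1": [np.nan, np.nan, np.nan, np.nan, 34]})


def find_overlaps(dfs, col="Criterion 1"):
    """Hypothetical helper: IDs for which more than one frame has a non-NaN value."""
    # One column per input frame, labelled 0, 1, 2, ... via keys
    wide = pd.concat(
        [x.set_index("ID")[col] for x in dfs],
        axis=1,
        keys=list(range(len(dfs))),
    )
    filled = wide.count(axis=1)  # number of non-NaN entries per ID
    return filled.index[filled > 1].tolist()


print(find_overlaps([df1, df2, df3]))  # [] because no ID is filled twice

# Simulate a second person also filling in ID 1021 (made-up value)
df2_conflict = df2.copy()
df2_conflict.loc[df2_conflict["ID"] == 1021, "Criterion 1"] = 12
print(find_overlaps([df1, df2_conflict, df3]))  # [1021]
```

An empty list means no row was filled in by more than one person, so the merged column is unambiguous.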

Finally, write the DataFrame to an Excel file; NaN values are written as empty cells by default:

df.to_excel(file)

Upvotes: 1
