Reputation: 91
Hihi, i wonder if it is possible to join 2 tables - and add columns value with same column name? for example: df1 join df2 by ['Name']
for common columns, would want to combine and add columns to 1 instead of creating 2 columns with Value_1_x & Value_1_y
##expected output like df_final['Value_1'] = (df1['Value_1]' + df2['Value_1'])
for uncommon columns, want to keep the column with original name and value
##expected output like df_final['Value_2'] = df2['Value_2']
since i'm not sure the future df1 & df2 dimension, and it is not always named Value_1 & Value_2, so i wonder if it is possible to process the "adding" while in joining.
df1
Name Category Value_1
Boo A 1
Boo B 2
Boo C 3
df2
Name Type Value_1 Value_2
Boo x 0 Null
Boo y 1 4
Expected Output
Name Category Type Value_1 Value_2
Boo A x 1 Null
Boo A y 2 4
Boo B x 2 Null
Boo B y 3 4
Boo C x 3 Null
Boo C y 4 4
Upvotes: 2
Views: 434
Reputation: 862481
Unfortunately in join or merge is not possible use sum
, possible solution is processing after merging all duplicated numeric columns - it means columns ends with _
with columns without _
, specified in parameter suffixes
:
df = df1.merge(df2, on='Name', suffixes=('_', ''))
#get only numeric df
df1 = df.select_dtypes(np.number)
#filter numeric columns ends with '_'
for c in df1.columns[df1.columns.str.endswith('_')]:
#pop add column and delete
df[c.rstrip('_')] += df.pop(c)
print (df)
Name Category Type Value_1 Value_2
0 Boo A x 1 Null
1 Boo A y 2 4
2 Boo B x 2 Null
3 Boo B y 3 4
4 Boo C x 3 Null
5 Boo C y 4 4
Upvotes: 1