Reputation: 543
Following are my dataframes:
df1:
pri sec0 sec1 sec2
ACL EMR DFG XHD
ABC MKB JKL KLF
XYZ LMN SDF GHY
df2:
name loc
ACL 12
EMR 23
DFG 431
XHD 48
ABC 55
MKB 699
JKL 70
KLF 82
XYZ 93
LMN 10
SDF 235
GHY 53
I'm trying to join each of the pri,sec0,sec1... columns of df1 with name column of df2 and want to append loc column to each of the matched elements of pri,sec0,sec1...(as pri_loc,sec0_loc... respectively)
following is the required output:
pri pri_loc sec0 sec0_loc sec1 sec1_loc sec2 sec2_loc
ACL 12 EMR 23 DFG 431 XHD 48
ABC 55 MKB 699 JKL 70 KLF 82
XYZ 93 LMN 10 SDF 235 GHY 53
Can I get suggestions on how to achieve this..?
Upvotes: 1
Views: 9817
Reputation: 863166
You can join
new DataFrame
created by replace
all values by Series
created by set_index
, then add add_suffix
for change columns names and last sort_index
for sorting columns:
df=df1.join(df1.replace(df2.set_index('name')['loc']).add_suffix('_loc')).sort_index(axis=1)
print (df)
pri pri_loc sec0 sec0_loc sec1 sec1_loc sec2 sec2_loc
0 ACL 12 EMR 23 DFG 431 XHD 48
1 ABC 55 MKB 699 JKL 70 KLF 82
2 XYZ 93 LMN 10 SDF 235 GHY 53
Detail:
print (df1.replace(df2.set_index('name')['loc']).add_suffix('_loc'))
pri_loc sec0_loc sec1_loc sec2_loc
0 12 23 431 48
1 55 699 70 82
2 93 10 235 53
EDIT:
if I have one more column say
ternary
indf1
and I do not want to join it with name of df2 but I want to reatin as it is in output
Then use set_index
first and last reset_index
:
df1 = df1.set_index('ternary')
df = (df1.join(df1.replace(df2.set_index('name')['loc']).add_suffix('_loc'))
.sort_index(axis=1)
.reset_index())
print (df)
Upvotes: 1