Reputation: 1553
I have two DataFrames. The first is:
df1
col_one col_two
ABBC1 (1, 2, 3)
DFFG2 (3, 5, 1)
JJKS3 (5, 2, 5)
df2
col_1
operate ABBC1 1 to 2, JJKS3 3 to 5
operate JJKS3, FOM
Desired output df2:
col_1 col_2
operate ABBC1, to 2 (1, 2, 3)
operate JJKS3, FOM (5, 2, 5)
I have tried multiple ways but the closest I get is:
for values, map_col in df1[['col_one', 'col_two']].values:
    for val in df2['col_1']:
        if ("%s" %values) in df2['col_1'] :
            df2['col_2'] = "%s" %(map_col,)
I thought this would work, but every row ends up with exactly the same value.
Any help is welcomed. Thanks
Upvotes: 0
Views: 501
Reputation: 863301
You can create a new column with str.extract and then merge on it:
pat = '|'.join(r"\b{}\b".format(x) for x in df1['col_one'].unique())
df2['col_one'] = df2['col_1'].str.extract('(' + pat + ')')
print (df2)
col_1 col_one
0 operate ABBC1, to 2 ABBC1
1 operate JJKS3, FOM JJKS3
df = df1.merge(df2, on='col_one')
print (df)
col_one col_two col_1
0 ABBC1 (1, 2, 3) operate ABBC1, to 2
1 JJKS3 (5, 2, 5) operate JJKS3, FOM
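For reference, here is a self-contained sketch of the same extract-and-merge idea that produces the two-column layout asked for in the question. The sample data is rebuilt from the question, and a few details are my own assumptions rather than part of the answer above: re.escape on the keys (in case a key contains regex metacharacters), expand=False, a left merge so that every row of df2 is kept, and the rename of col_two to col_2.

```python
import re

import pandas as pd

# Sample data rebuilt from the question.
df1 = pd.DataFrame({
    "col_one": ["ABBC1", "DFFG2", "JJKS3"],
    "col_two": [(1, 2, 3), (3, 5, 1), (5, 2, 5)],
})
df2 = pd.DataFrame({"col_1": ["operate ABBC1, to 2", "operate JJKS3, FOM"]})

# Assumption: escape each key so regex metacharacters are matched literally.
pat = "|".join(r"\b{}\b".format(re.escape(x)) for x in df1["col_one"].unique())

# expand=False returns a Series with the first match per row (NaN if none).
df2["col_one"] = df2["col_1"].str.extract("(" + pat + ")", expand=False)

# how="left" keeps every row of df2, in df2's order, even without a match.
result = (
    df2.merge(df1, on="col_one", how="left")
       .rename(columns={"col_two": "col_2"})
       [["col_1", "col_2"]]
)
print(result)
```

Rows of df2 that match no key would get NaN in col_2 with this variant instead of being dropped, which may or may not be what you want.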
EDIT:
If a row can match multiple values, use str.findall and create a new DataFrame:
pat = '|'.join(r"\b{}\b".format(x) for x in df1['col_one'].unique())
s = df2['col_1'].str.findall('(' + pat + ')')
print (s)
0 [ABBC1, JJKS3]
1 [JJKS3]
Name: col_1, dtype: object
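import numpy as np
import pandas as pd

# Repeat each original row once per match, then flatten the lists of matches
lens = s.str.len()
a = np.repeat(df2['col_1'], lens)
b = np.concatenate(s)
df2 = pd.DataFrame({'col_1':a, 'col_one':b})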
print (df2)
col_1 col_one
0 operate ABBC1, to 2 JJKS3 3 to 5 ABBC1
0 operate ABBC1, to 2 JJKS3 3 to 5 JJKS3
1 operate JJKS3, FOM JJKS3
df = df1.merge(df2, on='col_one')
print (df)
col_one col_two col_1
0 ABBC1 (1, 2, 3) operate ABBC1, to 2 JJKS3 3 to 5
1 JJKS3 (5, 2, 5) operate ABBC1, to 2 JJKS3 3 to 5
2 JJKS3 (5, 2, 5) operate JJKS3, FOM
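On pandas 0.25 or later, the repeat-and-concatenate step can probably be replaced with DataFrame.explode. The following is a minimal sketch under that assumption, not the method used above. The sample data is rebuilt from the question, and re.escape and the left merge are my additions.

```python
import re

import pandas as pd

# Sample data rebuilt from the question; the first row mentions two keys.
df1 = pd.DataFrame({
    "col_one": ["ABBC1", "DFFG2", "JJKS3"],
    "col_two": [(1, 2, 3), (3, 5, 1), (5, 2, 5)],
})
df2 = pd.DataFrame({
    "col_1": ["operate ABBC1, to 2 JJKS3 3 to 5", "operate JJKS3, FOM"],
})

# Assumption: escape each key so regex metacharacters are matched literally.
pat = "|".join(r"\b{}\b".format(re.escape(x)) for x in df1["col_one"].unique())

# findall returns a list with every match in each row.
matches = df2.assign(col_one=df2["col_1"].str.findall(pat))

# explode gives each list element its own row and repeats col_1 alongside it.
exploded = matches.explode("col_one")

# Left merge keeps the row order of the exploded frame.
result = exploded.merge(df1, on="col_one", how="left")
print(result)
```

Unlike the merge above, this keeps the rows in df2's order, with one output row per (text, key) pair.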
Upvotes: 1