Reputation: 59
I'm trying to do the following:
Given a row in df1, if str(row['code']) is in any rows for df2['code'], then I would like all those rows in df2['lamer_url_1'] and df2['shopee_url_1'] to take the corresponding values as from df1. Then carry on with the next row for df1['code']...
'''
==============
Initial Tables:
df1
code lamer_url_1 shopee_url_1
0 L61B18H089 b a
1 L61S19H014 e d
2 L61S19H015 z y
df2
code lamer_url_1 shopee_url_1 lamer_url_2 shopee_url_2
0 L61B18H089-F1424 NaN NaN NaN NaN
1 L61S19H014-S1500 NaN NaN NaN NaN
2 L61B18H089-F1424 NaN NaN NaN NaN
==============
Expected output:
df2
code lamer_url_1 shopee_url_1 lamer_url_2 shopee_url_2
0 L61B18H089-F1424 b a NaN NaN
1 L61S19H014-S1500 e d NaN NaN
2 L61B18H089-F1424 b a NaN NaN
'''
Upvotes: 1
Views: 131
Reputation: 1730
I assumed that common part of "code" from "df2" are chars before "-". I also assumed that from "df1" we want 'lamer_url_1', 'shopee_url_1' and from "df2" we want 'lamer_url_2', 'shopee_url_2' (correct me in comment if I am wrong so I can polish code):
df1.set_index(df1['code'], inplace=True)
df2.set_index(df2['code'].apply(lambda x: x.split('-')[0]), inplace=True)
df2.index.names = ['code_join']
df3 = pd.merge(df2[['code', 'lamer_url_2', 'shopee_url_2']],
df1[['lamer_url_1', 'shopee_url_1']],
left_index=True, right_index=True)
Upvotes: 1