Anh Quoc Vo
Anh Quoc Vo

Reputation: 59

Check whether a dataframe cell contains value that is in another dataframe's cell

I'm trying to do the following:

Given a row in df1, if str(row['code']) is in any rows for df2['code'], then I would like all those rows in df2['lamer_url_1'] and df2['shopee_url_1'] to take the corresponding values as from df1. Then carry on with the next row for df1['code']...

'''

==============

Initial Tables:

df1

     code                  lamer_url_1                 shopee_url_1

0  L61B18H089                       b                            a

1  L61S19H014                       e                            d

2  L61S19H015                       z                            y

df2

  code             lamer_url_1   shopee_url_1   lamer_url_2  shopee_url_2

0 L61B18H089-F1424         NaN           NaN          NaN           NaN

1 L61S19H014-S1500         NaN           NaN          NaN           NaN

2 L61B18H089-F1424         NaN           NaN          NaN           NaN

==============

Expected output:

df2

   code              lamer_url_1  shopee_url_1  lamer_url_2  shopee_url_2
0  L61B18H089-F1424           b             a          NaN           NaN

1  L61S19H014-S1500           e             d          NaN           NaN

2  L61B18H089-F1424           b             a          NaN           NaN

'''

Upvotes: 1

Views: 131

Answers (1)

Hubert Dudek
Hubert Dudek

Reputation: 1730

I assumed that common part of "code" from "df2" are chars before "-". I also assumed that from "df1" we want 'lamer_url_1', 'shopee_url_1' and from "df2" we want 'lamer_url_2', 'shopee_url_2' (correct me in comment if I am wrong so I can polish code):

df1.set_index(df1['code'], inplace=True)
df2.set_index(df2['code'].apply(lambda x: x.split('-')[0]), inplace=True)
df2.index.names = ['code_join']

df3 = pd.merge(df2[['code', 'lamer_url_2', 'shopee_url_2']],
               df1[['lamer_url_1', 'shopee_url_1']],
               left_index=True, right_index=True)

Upvotes: 1

Related Questions