Reputation: 951
If I have 2 panda df, I want to let A left join B on 2 conditions, 1) A.id = B.id 2) A.x in B.x_set. Can anyone help me on this? :
A:
id x
1 a
2 b
3 c
B:
id x_set detail
1 a,b,c x
1 d y
2 a,c z
2 d m
2 b n
3 a i
3 b,c j
The final table should be like this:
id x detail
1 a x
2 b n
3 c j
Upvotes: 1
Views: 1592
Reputation: 3752
If using pandas==0.25
, you can:
pd.merge
B['x_set'] = B['x_set'].apply(lambda x: x.split(','))
B = B.explode('x_set')
A.merge(B, left_on=['id','x'], right_on=['id','x_set'])
Out[11]:
id x x_set detail
0 1 a a x
1 2 b b n
2 3 c c j
If pandas<0.25
:
id
and detail
using pd.Series.repeat
B['x_set'] = B['x_set'].apply(lambda x: x.split(','))
len_set = B['x_set'].apply(len).values
values = B['x_set'].values.flatten().tolist()
flat_results = [item for sublist in values for item in sublist]
new_B = pd.DataFrame(flat_results, columns=['x'])
new_B['id'] = B['id'].repeat(len_set).values
new_B['detail'] = B['detail'].repeat(len_set).values
A.merge(new_B, on=['id','x'])
Out[32]:
id x detail
0 1 a x
1 2 b n
2 3 c j
Upvotes: 2