Reputation: 1139
I have left and right table and I need to merge FileStamp values from both in this manner: take all values from left table and from right table missing in left table, join it by 'date':
import pandas as pd
left = pd.DataFrame({'FileStamp': ['T101', 'T102', 'T103', 'T104'], 'date': [20180101, 20180102, 20180103, 20180104]})
right = pd.DataFrame({'FileStamp': ['T501', 'T502'], 'date': [20180104, 20180105]})
Something like
result = pd.merge(left, right, how='outer', on='date')
but 'outer' is not good idea.
Desired output should look like
FileStamp_x date FileStamp_y
0 T101 20180101 NaN
1 T102 20180102 NaN
2 T103 20180103 NaN
3 T104 20180104 NaN
4 NaN 20180105 T502
Is there any simple way how to achieve desired output?
Upvotes: 1
Views: 1121
Reputation: 863146
Use filtering by isin
before merge
:
r = right[~right['date'].isin(left['date'])]
print (r)
FileStamp date
1 T502 20180105
result = pd.merge(left, r, how='outer', on='date')
print (result)
FileStamp_x date FileStamp_y
0 T101 20180101 NaN
1 T102 20180102 NaN
2 T103 20180103 NaN
3 T104 20180104 NaN
4 NaN 20180105 T502
Upvotes: 3
Reputation: 28367
You can adjust the values after the merge
:
result = pd.merge(left, right, how='outer', on='date')
result['FileStamp_y'] = np.where(result['FileStamp_x'].isnull(), result['FileStamp_y'], np.nan)
Result:
FileStamp_x date FileStamp_y
0 T101 20180101 NaN
1 T102 20180102 NaN
2 T103 20180103 NaN
3 T104 20180104 NaN
4 NaN 20180105 T502
Upvotes: 0