Reputation: 2024
I am trying to merge two pandas DataFrames with one of many relationship.
import pandas as pd
df1 = pd.DataFrame({'name': ['AA', 'BB', 'CC'],
'col1': [1, 2, 3],
'col2': [1, 2, 3] })
df2 = pd.DataFrame({'name': ['AA', 'AA', 'BB'],
'col1': [1, 2, 3],
'col2': [1, 2, 3] })
df_merged = pd.merge(
df1,
df2,
left_on = 'name',
right_on = 'name',
how = "inner"
)
Two questions.
pd.merge
without inserting new rows in df1
? Shape of df1
needs not change. name
is unique in df1
.For rows with one-to-many relationship, I'd like join the first row from df2
.
col1.x, col2.x, col1.y, col2.y
? I'd like only copy of those columns.Upvotes: 0
Views: 2503
Reputation: 26676
Use left join and drop duplicates
df1.merge(df2, how='left', on='name').drop_duplicates(subset='name',keep='first')
Upvotes: 1