Reputation: 28139
If I have two lists or data frames (pandas) in Python, how do I merge / match / join them?
For example:
List / DF 1:
Table_Name Table_Alias
tab_1 t1
tab_2 t2
tab_3 t3
List / DF 2:
Table_Alias Variable_Name
t1 Owner
t1 Owner_Id
t2 Purchase_date
t3 Maintenance_cost
Desired Result:
Table_Name Table_Alias Variable_Name
tab_1 t1 Owner
tab_1 t1 Owner_Id
tab_2 t2 Purchase_date
tab_3 t3 Maintenance_cost
NOTE : If I was doing this in R, I'd use something like:
df3 <- merge(df1, df2, by = 'Table_Alias', all.y = T)
What's the best way to do this in Python?
Upvotes: 1
Views: 199
Reputation: 3690
I would simply use pd.merge(df1, df2, how='outer', on='talias'):
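import pandas as pd
# Lookup table: one row per table alias
df1 = pd.DataFrame({"table_name": ['tab1', 'tab2', 'tab3'], "talias": ['t1', 't2', 't3']})
# Variables, keyed by the same alias column
df2 = pd.DataFrame({"talias": ['t1', 't1', 't2', 't3'], "vname": ['Owner', 'Owner_Id', 'Purchase_date', 'Maintenance_cost']})
pd.merge(df1, df2, how='outer', on='talias')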
Out:
table_name talias vname
0 tab1 t1 Owner
1 tab1 t1 Owner_Id
2 tab2 t2 Purchase_date
3 tab3 t3 Maintenance_cost
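For comparison with the R call in the question: all.y = T keeps every row of the second frame, which corresponds to how='right' in pandas rather than how='outer'. A minimal sketch, assuming the column names from the question; the extra t4 row is hypothetical and is only there to make the difference visible:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Table_Name": ["tab_1", "tab_2", "tab_3"],
    "Table_Alias": ["t1", "t2", "t3"],
})
# The t4 row is a made-up addition: it has no match in df1.
df2 = pd.DataFrame({
    "Table_Alias": ["t1", "t1", "t2", "t3", "t4"],
    "Variable_Name": ["Owner", "Owner_Id", "Purchase_date",
                      "Maintenance_cost", "Unmatched"],
})

# how='right' keeps every row of df2, like all.y = T in R's merge().
right = pd.merge(df1, df2, how="right", on="Table_Alias")

# how='inner' keeps only aliases that appear in both frames.
inner = pd.merge(df1, df2, how="inner", on="Table_Alias")

print(right)
print(inner)
```

On the data in the question every alias appears in both frames, so 'inner', 'right' and 'outer' should all give the same rows; the choice only matters once one side has unmatched keys.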
Upvotes: -1
Reputation: 393933
You want an 'outer' merge:
In [9]:
df.merge(df1, how='outer')
Out[9]:
Table_Name Table_Alias Variable_Name
0 tab_1 t1 Owner
1 tab_1 t1 Owner_Id
2 tab_2 t2 Purchase_date
3 tab_3 t3 Maintenance_cost
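With no on argument, merge joins on the columns the two dfs have in common (here Table_Alias). how='outer' keeps the union of keys from both sides, so a row with no match in the other df is still returned, with NaN in the columns it is missing.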
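To see that behaviour on something other than fully matching data, here is a small sketch. It uses the names df1 and df2 from the question rather than df and df1, and the tab_9 / t9 row is a hypothetical addition with no counterpart in df2. The indicator=True flag asks pandas to add a _merge column recording where each row came from:

```python
import pandas as pd

df1 = pd.DataFrame({
    # tab_9 / t9 is made up for illustration; it has no match in df2.
    "Table_Name": ["tab_1", "tab_2", "tab_3", "tab_9"],
    "Table_Alias": ["t1", "t2", "t3", "t9"],
})
df2 = pd.DataFrame({
    "Table_Alias": ["t1", "t1", "t2", "t3"],
    "Variable_Name": ["Owner", "Owner_Id", "Purchase_date", "Maintenance_cost"],
})

# No on= given: pandas joins on the shared column, Table_Alias.
merged = df1.merge(df2, how="outer", indicator=True)

# Rows that exist only in df1 are kept, with NaN for Variable_Name.
left_only = merged[merged["_merge"] == "left_only"]

print(merged)
print(left_only)
```

Passing on='Table_Alias' explicitly is probably safer in real code, since the inferred key changes silently if the two frames ever gain another column with the same name.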
Upvotes: 2