screechOwl

Reputation: 28139

Python merge 2 lists / SQL JOIN

If I have 2 lists or data frames (pandas) in Python, how do I merge / match / join them?

For example:

List / DF 1:

Table_Name  Table_Alias
  tab_1          t1
  tab_2          t2
  tab_3          t3

List / DF 2:

Table_Alias   Variable_Name
    t1            Owner
    t1            Owner_Id
    t2            Purchase_date
    t3            Maintenance_cost

Desired Result:

Table_Name   Table_Alias   Variable_Name
   tab_1         t1            Owner
   tab_1         t1            Owner_Id
   tab_2         t2            Purchase_date
   tab_3         t3            Maintenance_cost

NOTE : If I was doing this in R, I'd use something like:

df3 <- merge(df1, df2, by = 'Table_Alias', all.y = T)

What's the best way to do this in python?

Upvotes: 1

Views: 199

Answers (2)

Yonas Kassa

Reputation: 3690

I would simply use pd.merge(df1, df2, how='outer', on='talias'):

import pandas as pd

df1 = pd.DataFrame({"table_name": ['tab1', 'tab2', 'tab3'], "talias": ['t1', 't2', 't3']})
df2 = pd.DataFrame({"talias": ['t1', 't1', 't2', 't3'], "vname": ['Owner', 'Owner_Id', 'Purchase_date', 'Maintenance_cost']})

# Outer join on the shared alias column
pd.merge(df1, df2, how='outer', on='talias')


Out:
  table_name talias             vname
0       tab1     t1             Owner
1       tab1     t1          Owner_Id
2       tab2     t2     Purchase_date
3       tab3     t3  Maintenance_cost
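
For comparison with the R call in the question: all.y = T keeps every row of the second frame, which corresponds to how='right' in pandas rather than how='outer'. On this data both give the same four rows, since every alias appears in both frames. A minimal sketch using the question's column names (the frame names df1, df2 and df3 are taken from the R snippet):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Table_Name": ["tab_1", "tab_2", "tab_3"],
    "Table_Alias": ["t1", "t2", "t3"],
})
df2 = pd.DataFrame({
    "Table_Alias": ["t1", "t1", "t2", "t3"],
    "Variable_Name": ["Owner", "Owner_Id", "Purchase_date", "Maintenance_cost"],
})

# how="right" keeps every row of df2, like all.y = T in R's merge()
df3 = pd.merge(df1, df2, how="right", on="Table_Alias")
print(df3)
```

If rows of df1 without a match in df2 should be kept as well, how='outer' is the option to use instead.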

Upvotes: -1

EdChum

Reputation: 393933

You want an 'outer' merge:

In [9]:
df.merge(df1, how='outer')

Out[9]:
  Table_Name Table_Alias     Variable_Name
0      tab_1          t1             Owner
1      tab_1          t1          Owner_Id
2      tab_2          t2     Purchase_date
3      tab_3          t3  Maintenance_cost

With no on argument, merge joins on the columns the two dfs have in common (here Table_Alias). how='outer' keeps the union of keys from both sides, filling in NaN where one side has no match.
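
To see what an outer merge does with keys that appear on only one side, here is a small sketch. The rows tab_4 / t4 and t5 / Price are hypothetical, added only to produce unmatched keys. The indicator=True option adds a _merge column recording where each row came from.

```python
import pandas as pd

# Hypothetical data: t4 exists only on the left, t5 only on the right
left = pd.DataFrame({"Table_Name": ["tab_1", "tab_4"],
                     "Table_Alias": ["t1", "t4"]})
right = pd.DataFrame({"Table_Alias": ["t1", "t5"],
                      "Variable_Name": ["Owner", "Price"]})

# No `on` given: pandas joins on the shared column, Table_Alias
merged = left.merge(right, how="outer", indicator=True)
print(merged)
```

The t1 row is marked both, while t4 and t5 are marked left_only and right_only and carry NaN in the column that came from the other frame.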

Upvotes: 2
