Yoav Sheetrit
Yoav Sheetrit

Reputation: 149

Join DataFrames by a matching key

I'm trying to join two DataFrames that has a matching key. Currently, I've tried with all three possible methods: df.merge, df.join, df.concat but with no luck.

#DataFrame 1: 
#  Timestamp  PageId LoadDuration
#  01/01/2019 1      10
#  01/01/2019 2      20
#  01/01/2019 3      30

#DataFrame 2:
#  Timestamp   PageId    QueryCount
#  01/01/2019  1         5
#  01/01/2019  2         3
#  01/01/2019  3         4 

And what I'm trying to get, as a result DataFrame:

#Result DataFrame:
#  Timestamp  PageId LoadDuration QueryCount
#  01/01/2019 1      10           5
#  01/01/2019 2      20           3
#  01/01/2019 3      30           4

This is my concatenation line:

finalized_grouped_by_df = pd.concat([df1,df2])

But I'm getting something like the following result DataFrame:

#Current Result DataFrame:
#                    LoadDuration
#  (01/01/2019,1)      10
#  (01/01/2019,2)      20
#  (01/01/2019,3)      30
#                    QueryCount
#  (01/01/2019,1)         5
#  (01/01/2019,2)         3
#  (01/01/2019,3)         4 

Any idea how can I join both DataFrames by their matching PageId key?

Upvotes: 0

Views: 868

Answers (3)

Chris McGovern
Chris McGovern

Reputation: 1

I try to avoid merging when I just want to bring in one column. This creates a dictionary with PageId as the key and QueryCount as the values, then creates your QueryCount column off the dictionary.

queryCountDict = df2.set_index('QueryCount')['PageId'].to_dict()

df1['QueryCount'] = df1['PageId'].map(queryCountDict)

Upvotes: 0

Terry
Terry

Reputation: 2811

You can do it using merge

df1.merge(df2 , on = ['Timestamp','PageId'])

Upvotes: 0

No_body
No_body

Reputation: 842

Try

 pd.merge(df1,df2 , on = 'PageId' , how = 'inner')

Upvotes: 2

Related Questions