How do I create a new column based on matching values in two different dataframes?

Question

I have two dataframes:

df1 (a row for every event that happens in the game)

Date	Game	Event Type	Player	Time
02/28/10	USA vs Canada	Faceoff	Sidney Crosby	20:00
02/28/10	USA vs Canada	Pass	Drew Doughty	19:59
02/28/10	USA vs Canada	Pass	Scott Niedermayer	19:42
02/28/10	USA vs Canada	Shot	Sidney Crosby	18:57
02/28/10	USA vs Canada	Takeaway	Dany Heatley	18:49
02/28/10	USA vs Canada	Shot	Dany Heatley	18:02
02/28/10	USA vs Canada	Shot	Sidney Crosby	17:37

df2

Player
Sidney Crosby
Dany Heatley
Scott Niedermayer
Drew Doughty

How do I create a new column in df2 that matches the Player column in each dataframe and counts each row where the Event Type in df1 is "Shot"?

This is the output I would look for in this example:

Player	Shots
Sidney Crosby	2
Dany Heatley	1
Scott Niedermayer	0
Drew Doughty	0

I'm new to Python, so I apologize if there's an easy answer that I'm missing. Thank you!

Quang Hoang · Accepted Answer

You can filter the df1 for shot, then do a value count:

shots = df1.loc[df1['Event Type']=='shot', 'Player'].value_counts()

df2['shots'] = df2['Player'].map(shots)
# or using reindex with `fill_value` option
# shots.reindex(df2['Player'], fill_value=0).values

Bonus: Use crosstab and merge to get all statistics at once:

df2.merge(pd.crosstab(df1['Player'], df1['Event Type']),
          on='Player', how='left')

How do I create a new column based on matching values in two different dataframes?

Answers (1)

Related Questions