Reputation: 962
Suppose I have a DataFrame like this:
Player   Challenge  Description
James    ABC        Desc1
Bob      ABC        Desc1
Bob      XYZ        Desc X
Bob      ABX101     Desc4
Alex     XYZ        Desc X
Mark     ABC123     Desc 123
Jessica  ABC123     Desc 123
Lynn     XYZ        Desc X
Bob      ABX101     Desc4
Alex     ABX101     Desc 4
Mark     ABC        Desc 1
Lynn     ABC        Desc 1
Mark     POQ        Desc 3
Mark     XYZ        Desc X
Mark     ABC        Desc 1
I can group these by Player and Challenge using groupby:
df.groupby(by=['Player', 'Challenge'])
But how can I get a count of the challenges for each player (possibly as a new column), and then the average number of challenges per player?
Upvotes: 0
Views: 47
Reputation: 30940
Use:
count_challenge=df.groupby('Player').Challenge.count()
print(count_challenge)
Player
Alex 2
Bob 4
James 1
Jessica 1
Lynn 2
Mark 5
Name: Challenge, dtype: int64
If you don't want to count duplicates:
count_challenge=df.drop_duplicates(['Challenge','Player']).groupby('Player').Challenge.count()
print(count_challenge)
Player
Alex 2
Bob 3
James 1
Jessica 1
Lynn 2
Mark 4
Name: Challenge, dtype: int64
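As a possible shortcut, nunique() should give the same distinct counts without the drop_duplicates step. A minimal sketch, assuming the question's data is rebuilt by hand (the Description column is left out and unique_challenges is just an illustrative name):

```python
import pandas as pd

# Sample data copied from the question (Description column omitted for brevity)
df = pd.DataFrame({
    "Player": ["James", "Bob", "Bob", "Bob", "Alex", "Mark", "Jessica", "Lynn",
               "Bob", "Alex", "Mark", "Lynn", "Mark", "Mark", "Mark"],
    "Challenge": ["ABC", "ABC", "XYZ", "ABX101", "XYZ", "ABC123", "ABC123", "XYZ",
                  "ABX101", "ABX101", "ABC", "ABC", "POQ", "XYZ", "ABC"],
})

# nunique() counts the distinct challenges per player in one step
unique_challenges = df.groupby("Player")["Challenge"].nunique()
print(unique_challenges)
```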
Then you can calculate the mean:
count_challenge.mean()
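Since the question also asks for the count "in the next column", one way to get that is transform, which returns a value for every original row. This is only a sketch: the data is rebuilt by hand from the question, and the column name ChallengeCount and the variable avg_per_player are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (Description column omitted for brevity)
df = pd.DataFrame({
    "Player": ["James", "Bob", "Bob", "Bob", "Alex", "Mark", "Jessica", "Lynn",
               "Bob", "Alex", "Mark", "Lynn", "Mark", "Mark", "Mark"],
    "Challenge": ["ABC", "ABC", "XYZ", "ABX101", "XYZ", "ABC123", "ABC123", "XYZ",
                  "ABX101", "ABX101", "ABC", "ABC", "POQ", "XYZ", "ABC"],
})

# transform("count") broadcasts each player's total back onto that player's rows
df["ChallengeCount"] = df.groupby("Player")["Challenge"].transform("count")

# Average number of challenges per player (duplicates included)
avg_per_player = df.groupby("Player")["Challenge"].count().mean()

print(df)
print(avg_per_player)
```

With the sample data the average comes out to 2.5 (15 rows spread over 6 players). To ignore repeated attempts, swap count() for nunique() in the last step.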
If you want to know how many times each player took each challenge:
count_differents_challenge=df.groupby('Player').Challenge.value_counts()
print(count_differents_challenge)
Player   Challenge
Alex     ABX101       1
         XYZ          1
Bob      ABX101       2
         ABC          1
         XYZ          1
James    ABC          1
Jessica  ABC123       1
Lynn     ABC          1
         XYZ          1
Mark     ABC          2
         ABC123       1
         POQ          1
         XYZ          1
Name: Challenge, dtype: int64
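If a wide table is easier to read than the MultiIndex output above, pd.crosstab can produce one row per player and one column per challenge. Again just a sketch on hand-rebuilt data; the name table is illustrative.

```python
import pandas as pd

# Sample data copied from the question (Description column omitted for brevity)
df = pd.DataFrame({
    "Player": ["James", "Bob", "Bob", "Bob", "Alex", "Mark", "Jessica", "Lynn",
               "Bob", "Alex", "Mark", "Lynn", "Mark", "Mark", "Mark"],
    "Challenge": ["ABC", "ABC", "XYZ", "ABX101", "XYZ", "ABC123", "ABC123", "XYZ",
                  "ABX101", "ABX101", "ABC", "ABC", "POQ", "XYZ", "ABC"],
})

# Rows are players, columns are challenges, cells are occurrence counts
# (0 where a player never took that challenge)
table = pd.crosstab(df["Player"], df["Challenge"])
print(table)
```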
Upvotes: 2