Reputation: 3659
winner loser winner_matches loser_matches
Dave Harry 1 1
Jim Dave 1 2
Dave Steve 3 1
I'm trying to build a running count of how many matches a player has participated in, based on their name's appearance in either the winner or loser column (i.e., Dave above has a running count of 3 since he's been in every match). I'm new to pandas and have tried a few combinations of cumcount and groupby, but I'm not sure if I just need to manually loop over the dataset and store all the names myself.
EDIT: to clarify, I need the running totals in the dataframe as shown above and not just a Series printed out later on! Thanks
Upvotes: 0
Views: 594
Reputation: 862641
First create a MultiIndex Series with DataFrame.stack, then number each name's occurrences with GroupBy.cumcount (adding 1 so the count starts at 1). To get back a DataFrame, add unstack with add_suffix:
print (df)
winner loser
0 Dave Harry
1 Jim Dave
2 Dave Steve
s = df.stack()
# if the original df has other columns, select only the two name columns first
# s = df[['winner','loser']].stack()
df1 = s.groupby(s).cumcount().add(1).unstack().add_suffix('_matches')
print (df1)
winner_matches loser_matches
0 1 1
1 1 2
2 3 1
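To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the same three-row frame and prints the stacked names next to their running counts. The variable name running is only illustrative. The key point is that stack walks the frame row by row (winner, then loser), so cumcount sees the names in match order.

```python
import pandas as pd

df = pd.DataFrame({
    "winner": ["Dave", "Jim", "Dave"],
    "loser": ["Harry", "Dave", "Steve"],
})

# stack() flattens the frame row by row: row 0 winner, row 0 loser, row 1 winner, ...
s = df.stack()

# Grouping the Series by its own values and calling cumcount() numbers each
# occurrence of a name from 0; add(1) shifts that to a 1-based running count.
running = s.groupby(s).cumcount().add(1)

print(s.tolist())        # ['Dave', 'Harry', 'Jim', 'Dave', 'Dave', 'Steve']
print(running.tolist())  # [1, 1, 1, 2, 3, 1]
```

Because running keeps the (row, column) MultiIndex produced by stack, unstack can pivot it back into one column per original column.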
Finally, append the result to the original DataFrame with join:
df = df.join(df1)
print (df)
winner loser winner_matches loser_matches
0 Dave Harry 1 1
1 Jim Dave 1 2
2 Dave Steve 3 1
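If the frame has more columns than just the two names, the same steps could be wrapped in a small helper. This is a minimal sketch, not part of pandas: the function name add_match_counts, its cols parameter, and the extra round column are assumptions made for illustration.

```python
import pandas as pd


def add_match_counts(df, cols=("winner", "loser")):
    """Return df with a running '<col>_matches' count for each name column.

    Hypothetical helper: it assumes the rows are already in match order.
    """
    # Only stack the name columns so unrelated columns are not counted.
    s = df[list(cols)].stack()
    # 1-based running count of each name's appearances, in row order.
    counts = s.groupby(s).cumcount().add(1)
    # Pivot back to one column per name column and attach to the original frame.
    return df.join(counts.unstack().add_suffix("_matches"))


matches = pd.DataFrame({
    "winner": ["Dave", "Jim", "Dave"],
    "loser": ["Harry", "Dave", "Steve"],
    "round": [1, 2, 3],  # extra column, left untouched by the helper
})

result = add_match_counts(matches)
print(result)
```

Since join aligns on the index, the new columns line up with the original rows as long as the index values are unique.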
Upvotes: 1