pandas groupby values in different column

Question

I have this data frame

frame =  pd.DataFrame({'player1' : ['Joe', 'Steve', 'Bill', 'Doug', 'Steve','Bill','Joe','Steve'],
                      'player2' : ['Bill', 'Doug', 'Steve', 'Joe', 'Bill', 'Steve', 'Doug', 'Bill'],
                      'winner' : ['Joe','Steve' , 'Steve','Doug', 'Bill', 'Steve', 'Doug', 'Steve'],
                      'loser' : ['Bill', 'Doug', 'Bill', 'Joe', 'Steve', 'Bill', 'Joe', 'Bill'],
                       'ones' : 1})

I can keep a running total of how many times the winner has won by doing this.

frame['winners_wins'] = frame.groupby('winner')['ones'].cumsum()

I would like to keep a running count of player1's wins and losses and the same for player2. I think I should be able to do this with the groupby function but I don't know how to write it.

edit:

I didn't say it very the well the first time. I would like to keep track for each individual player. So the desired output would be:

player1 player2 winner  loser   player1_wins  player2_wins
 Joe     Bill     Joe    Bill       1             0
 Steve   Doug     Steve  Doug       1             0
 Bill    Steve    Steve  Bill       0             2
 Doug    Joe      Doug    Joe       1             1
 Steve   Bill     Bill    Steve     2             1 
 Bill    Steve    Steve   Bill      1             3
 Joe     Doug     Doug    Joe       1             2   
 Steve   Bill     Steve   Bill      3             1

unutbu · Accepted Answer

It looks like you want a running total of player1's and player2's wins. Here is a pretty mundane way to do it which uses Python more than Pandas.

Calculations that require stepping through the rows in sequence and using previous results to calculate the next row tend to not to be conducive to Pandas/Numpy operations -- cumsum being an exception. So I don't think there is a slick way to do this using Pandas operations, but I could be wrong.

import pandas as pd
import collections

df = pd.DataFrame({'player1' : ['Joe', 'Steve', 'Bill', 'Doug',
                      'Steve','Bill','Joe','Steve'], 'player2' : ['Bill',
                      'Doug', 'Steve', 'Joe', 'Bill', 'Steve', 'Doug', 'Bill'],
                      'winner' : ['Joe','Steve' , 'Steve','Doug', 'Bill',
                      'Steve', 'Doug', 'Steve'], 'loser' : ['Bill', 'Doug',
                      'Bill', 'Joe', 'Steve', 'Bill', 'Joe', 'Bill'], },
                  columns = ['player1', 'player2', 'winner', 'loser'])

wins = collections.Counter()
def count_wins():
    for idx, row in df.iterrows():
        wins[row['winner']] += 1
        yield wins[row['player1']], wins[row['player2']]
df['player1_wins'], df['player2_wins'] = zip(*list(count_wins()))
print(df)

prints

  player1 player2 winner  loser  player1_wins  player2_wins
0     Joe    Bill    Joe   Bill             1             0
1   Steve    Doug  Steve   Doug             1             0
2    Bill   Steve  Steve   Bill             0             2
3    Doug     Joe   Doug    Joe             1             1
4   Steve    Bill   Bill  Steve             2             1
5    Bill   Steve  Steve   Bill             1             3
6     Joe    Doug   Doug    Joe             1             2
7   Steve    Bill  Steve   Bill             4             1

pandas groupby values in different column

Answers (2)

Related Questions