Reputation: 797
Say I have two columns like this:
home_team away_team
SWE DEN
NOR GER
SWE NOR
GER DEN
GER FRA
I want to create two new columns that hold a running count of the games played so far by the home_team and the away_team, like this:
home_team away_team games_HomeTeam games_AwayTeam
SWE DEN 1 1
NOR GER 1 1
SWE NOR 2 2
GER DEN 2 2
GER FRA 3 1
Upvotes: 0
Views: 94
Reputation: 61910
You could do something like this:
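import pandas as pd

# Interleave the two columns row by row: home, away, home, away, ...
flatten = [e for p in zip(df.home_team, df.away_team) for e in p]
# 1-based running count per team, reshaped back to one row per game
counts = pd.DataFrame((pd.Series(flatten).groupby(flatten).cumcount() + 1).values.reshape(-1, 2),
                      columns=['games_HomeTeam', 'games_AwayTeam'], index=df.index)
print(pd.concat([df, counts], axis=1))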
Output (from a sample frame that uses numeric team ids):
home_team away_team games_HomeTeam games_AwayTeam
0 1 2 1 1
1 3 4 1 1
2 1 3 2 2
3 2 4 2 2
4 1 5 3 1
First flatten the two columns row by row, then group by team and take the cumulative count (cumcount), and reshape the result back into two columns. Finally, concat the result with df.
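For reference, here is a minimal self-contained sketch of the same flatten, cumcount, reshape, and concat steps. The sample frame uses the rows from the question's expected-output table, and the helper name `add_game_counts` is only an illustrative choice, not something from the original post.

```python
import pandas as pd

# Sample data: rows copied from the question's expected-output table
df = pd.DataFrame({
    "home_team": ["SWE", "NOR", "SWE", "GER", "GER"],
    "away_team": ["DEN", "GER", "NOR", "DEN", "FRA"],
})


def add_game_counts(frame):
    """Hypothetical helper: append running per-team game counts to frame."""
    # Flatten row by row (row-major): home, away, home, away, ...
    flat = pd.Series(frame[["home_team", "away_team"]].to_numpy().ravel())
    # cumcount() numbers each team's appearances from 0, so add 1
    running = flat.groupby(flat).cumcount() + 1
    # Two values per game: reshape back to one row per game
    counts = pd.DataFrame(
        running.to_numpy().reshape(-1, 2),
        columns=["games_HomeTeam", "games_AwayTeam"],
        index=frame.index,  # keep the original index so concat aligns rows
    )
    return pd.concat([frame, counts], axis=1)


result = add_game_counts(df)
print(result)
```

With this sample the new columns should match the table in the question: 1, 1, 2, 2, 3 for games_HomeTeam and 1, 1, 2, 2, 1 for games_AwayTeam. Passing `index=frame.index` is a precaution for frames whose index is not the default 0..n-1.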
Upvotes: 2