Reputation: 529
I have this Dataframe
Unnamed: 0 Datetime HomeTeam AwayTeam Ball PossessionMatch_H Ball PossessionMatch_A
0 0 2021-05-24 02:30:00 U. De Chile Everton 68 32
1 1 2021-05-23 21:00:00 Huachipato Colo Colo 48 52
2 2 2021-05-23 18:30:00 Melipilla Antofagasta 47 53
3 3 2021-05-23 02:30:00 U. Espanola U. Catolica 37 63
4 4 2021-05-23 00:00:00 S. Wanderers O'Higgins 29 71
... ... ... ... ... ... ...
57 57 2021-03-28 15:45:00 Palestino Antofagasta 58 42
58 58 2021-03-28 01:00:00 U. Espanola S. Wanderers 50 50
59 59 2021-03-27 22:30:00 Colo Colo Union La Calera 58 42
60 60 2021-03-27 20:00:00 Everton O'Higgins 54 46
61 61 2021-03-27 15:00:00 Curico Unido Melipilla 41 59
I want to split it in multiple dataframes and apply two criteria in "HomeTeam" and "AwayTeam", then calculate the average of Ball Possession and put it in a new column "Ball PossessionMatch_H/MP" if the team is in "HomeTeam" and "Ball PossessionMatch_A/MP" if the team is in "AwayTeam"
Code:
hometeam_count = df.groupby("HomeTeam")["Ball PossessionMatch_H"].count()
hometeam_sum = df.groupby("HomeTeam")["Ball PossessionMatch_H"].sum()
awayteam_count = df.groupby("AwayTeam")["Ball PossessionMatch_A"].count()
awayteam_sum = df.groupby("AwayTeam")["Ball PossessionMatch_A"].sum()
df["Ball PossessionMatch_H/MP"] = df["HomeTeam"].apply(lambda x: ((hometeam_sum.loc[x] if x in hometeam_sum.index else 0) + (awayteam_sum.loc[x] if x in awayteam_sum.index else 0)) / ((hometeam_count.loc[x] if x in hometeam_count.index else 0) + (awayteam_count.loc[x] if x in awayteam_count.index else 0)))
df["Ball PossessionMatch_A/MP"] = df["AwayTeam"].apply(lambda x: ((hometeam_sum.loc[x] if x in hometeam_sum.index else 0) + (awayteam_sum.loc[x] if x in awayteam_sum.index else 0)) / ((hometeam_count.loc[x] if x in hometeam_count.index else 0) + (awayteam_count.loc[x] if x in awayteam_count.index else 0)))
Solution:
hometeam_count_ = df.groupby("HomeTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].count())
hometeam_sum_ = df.groupby("HomeTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].sum())
awayteam_count_ = df.groupby("AwayTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_A"].count())
awayteam_sum_ = df.groupby("AwayTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_A"].sum())
hometeam_count__ = df.groupby("HomeTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_H"].count())
hometeam_sum__ = df.groupby("HomeTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_H"].sum())
awayteam_count__ = df.groupby("AwayTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_A"].count())
awayteam_sum__ = df.groupby("AwayTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_A"].sum())
df["Ball PossessionMatch_H/MP"] = df["HomeTeam"].apply(lambda x: ((hometeam_sum_.loc[x] if x in hometeam_sum_.index else 0) + (awayteam_sum_.loc[x] if x in awayteam_sum_.index else 0)) / ((hometeam_count_.loc[x] if x in hometeam_count_.index else 0) + (awayteam_count_.loc[x] if x in awayteam_count_.index else 0)))
df["Ball PossessionMatch_A/MP"] = df["AwayTeam"].apply(lambda x: ((hometeam_sum__.loc[x] if x in hometeam_sum__.index else 0) + (awayteam_sum__.loc[x] if x in awayteam_sum__.index else 0)) / ((hometeam_count__.loc[x] if x in hometeam_count__.index else 0) + (awayteam_count__.loc[x] if x in awayteam_count__.index else 0)))
Upvotes: 0
Views: 122
Reputation: 2421
Perhaps this is what you are looking for?
hometeam_count = df.groupby("HomeTeam").apply(
lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].count()
)
Upvotes: 1