Reputation: 19
I am trying to create a function that will take the DataFrame as an argument and then count the total amount of "league points" each team has and then return a dictionary of sorted values by score. I have created functions that get me to the desired output, but I just want to know how to wrap it all together under one function.
The function needs to start like this:
def func(df, w=3, l=0, d=1):
I have made a simple dataset, as the actual problem uses a larger one. Each row is a football game between two teams:
import pandas as pd
import numpy as np
data = {'Home Team': ['Swindon', 'Bath', 'Northampton','Manchester', "Newcastle", 'Reigate'],
'Away Team': ['Reigate', 'Manchester', 'Newcastle','Swindon', 'Bath', 'Northampton'],
'Home Goals':[3,1,3,0,1,2],
'Away Goals':[2,1,4,1,0,1],}
df = pd.DataFrame (data, columns = ['Home Team','Away Team','Home Goals','Away Goals','Home League Points', 'Away League Points'])
I have created a function that takes a row and will tell us how many league points each team earns from the game. I then applied this to the whole dataset which creates new columns:
Creating the function:
def leaguetablescore(row, w=3, l=0, d=1):
    # Returns [home points, away points] for a single game
    if row['Home Goals'] < row['Away Goals']:
        return [l, w]
    elif row['Home Goals'] == row['Away Goals']:
        return [d, d]
    else:
        return [w, l]
Applying the function to the dataframe:
df[['Home League Points', 'Away League Points']] = df.apply(leaguetablescore, axis=1, result_type='expand')
I have then created a function that will return a dictionary of total league points that each team has using this new dataset with the new columns:
def TeamsPointsDict(df, teamslist):
    mydictionary = dict()
    for team in teamslist:
        # Total = points earned at home + points earned away
        home_points = df.loc[df['Home Team'] == team, 'Home League Points'].sum()
        away_points = df.loc[df['Away Team'] == team, 'Away League Points'].sum()
        mydictionary[team] = home_points + away_points
    return mydictionary
Output (this is also the output I want from the single function I am trying to create, although it would need to be sorted):
{'Swindon': 6,
'Bath': 1,
'Northampton': 0,
'Manchester': 1,
'Newcastle': 6,
'Reigate': 3}
However, I am also curious how I could pass my DataFrame into a SINGLE function and have it return a dictionary (or DataFrame). I have all these steps created, but is there a way to define one function that does it all from the input? The reason I need the function to start in the specified way is that I am hoping to be able to alter the number of league points each team earns from a win (w), draw (d) or loss (l) if I need to.
I hope this makes sense, and any advice would be much appreciated! I've had fun trying to figure this out, so I hope you do too. :)
PS
Please let me know if the format of this post is OK, as I am relatively new to Stack Overflow!
Upvotes: 0
Views: 1669
Reputation: 14949
Is this what you want?
home_df = df[['Home Team','Home League Points']].rename(columns = {'Home Team': 'Team', 'Home League Points': 'Points'})
away_df = df[['Away Team','Away League Points']].rename(columns = {'Away Team': 'Team', 'Away League Points': 'Points'})
result = dict(pd.concat([home_df,away_df]).groupby('Team',as_index=False).sum().to_dict(orient='split')['data'])
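The idea is to create two separate DataFrames (home and away), concatenate them row-wise, group by team and sum the points, and finally convert the result to a dict. Note that this assumes the 'Home League Points' and 'Away League Points' columns have already been computed.
Output: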
{'Bath': 1,
'Manchester': 1,
'Newcastle': 6,
'Northampton': 0,
'Reigate': 3,
'Swindon': 6}
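To get from here to the single function the question asks for, the point columns can be computed inside the function instead of being stored on the DataFrame first. The following is only a sketch of that idea: the name `league_table` is made up for illustration, and it swaps the row-wise `apply` for `np.select`, which is my choice rather than something from the question.

```python
import numpy as np
import pandas as pd


def league_table(df, w=3, l=0, d=1):
    """Return {team: total points}, highest total first (hypothetical helper)."""
    home_goals = df['Home Goals'].to_numpy()
    away_goals = df['Away Goals'].to_numpy()

    # Points per game for each side; the default value covers draws.
    home_pts = np.select([home_goals > away_goals, home_goals < away_goals], [w, l], default=d)
    away_pts = np.select([away_goals > home_goals, away_goals < home_goals], [w, l], default=d)

    # Stack home and away results into one long Team/Points table.
    long_form = pd.DataFrame({
        'Team': pd.concat([df['Home Team'], df['Away Team']], ignore_index=True),
        'Points': np.concatenate([home_pts, away_pts]),
    })

    # Sum per team, then sort so the dict is built in descending order of points.
    totals = long_form.groupby('Team')['Points'].sum().sort_values(ascending=False)
    return totals.to_dict()
```

Calling `league_table(df)` on the sample data should give the same totals as above, and `league_table(df, w=2)` changes the points for a win. The input DataFrame is left untouched. Teams with equal points may come out in either order.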
Upvotes: 1
Reputation: 28630
I'm not quite sure what you are asking, but what I think you are saying is, you want the points awarded for a win, loss, draw to be configurable?
The 2 other functions I think are fine. To combine them into one, just create a function that takes in the dataframe then does that work.
Also, as a side note: dictionaries/JSON structures are not inherently "sorted". Python dicts do keep insertion order (guaranteed since Python 3.7), so you can get a "sorted" dict by inserting the items in sorted order, but a dict will not keep itself sorted by value.
So, given:
import pandas as pd
import numpy as np
data = {'Home Team': ['Swindon', 'Bath', 'Northampton','Manchester', "Newcastle", 'Reigate'],
'Away Team': ['Reigate', 'Manchester', 'Newcastle','Swindon', 'Bath', 'Northampton'],
'Home Goals':[3,1,3,0,1,2],
'Away Goals':[2,1,4,1,0,1],}
df = pd.DataFrame (data, columns = ['Home Team','Away Team','Home Goals','Away Goals'])
def leaguetablescore(row, points_awarded):
    # Returns [home points, away points] for a single game
    if row['Home Goals'] < row['Away Goals']:
        return [points_awarded['l'], points_awarded['w']]
    elif row['Home Goals'] == row['Away Goals']:
        return [points_awarded['d'], points_awarded['d']]
    else:
        return [points_awarded['w'], points_awarded['l']]

def TeamsPointsDict(df, teamslist):
    mydictionary = dict()
    for team in teamslist:
        # Total = points earned at home + points earned away
        home_points = df.loc[df['Home Team'] == team, 'Home League Points'].sum()
        away_points = df.loc[df['Away Team'] == team, 'Away League Points'].sum()
        mydictionary[team] = home_points + away_points
    return mydictionary
Make a dictionary (or use some other way) to store the values you want for a win, loss and draw. Then write the function that takes in the DataFrame and does the work:
points_awarded = {'w':3,
'l':0,
'd':1}
def get_dict(df):
    # Adds the two points columns to df, then totals them per team
    df[['Home League Points', 'Away League Points']] = df.apply(
        lambda x: leaguetablescore(x, points_awarded), axis=1, result_type='expand')
    teamslist = list(df['Home Team']) + list(df['Away Team'])
    mydictionary = TeamsPointsDict(df, list(set(teamslist)))
    return mydictionary
dictionary = get_dict(df)
print(dictionary)
Output:
{'Reigate': 3, 'Bath': 1, 'Northampton': 0, 'Manchester': 1, 'Swindon': 6, 'Newcastle': 6}
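If the function really has to start with `func(df, w=3, l=0, d=1)` and return the totals sorted by score, the pieces above can be folded into one function. Here is a rough sketch of one way to do that. It is not the code from this answer: it loops over the games with `zip` instead of adding columns to the DataFrame, and it relies on dicts keeping insertion order (Python 3.7+) to return a dict that is sorted at the time it is built.

```python
import pandas as pd


def func(df, w=3, l=0, d=1):
    """Total the league points per team and return a dict sorted by points."""
    totals = {}
    games = zip(df['Home Team'], df['Away Team'], df['Home Goals'], df['Away Goals'])
    for home, away, home_goals, away_goals in games:
        # Same scoring rule as leaguetablescore: (home points, away points).
        if home_goals < away_goals:
            home_pts, away_pts = l, w
        elif home_goals == away_goals:
            home_pts, away_pts = d, d
        else:
            home_pts, away_pts = w, l
        totals[home] = totals.get(home, 0) + home_pts
        totals[away] = totals.get(away, 0) + away_pts

    # Insert the items in descending order of points; the dict keeps that order.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

With this version the points are ordinary keyword arguments, so `func(df, w=2)` changes what a win is worth without touching a global dictionary, and the input DataFrame is not modified.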
Upvotes: 1