Reputation: 25
so I'm brand new to Python and have a question regarding pulling data from a df based on the previous date for a given team. This is an example of what I'm looking at;
import Pandas as pd
df = pd.DataFrame({'Team':['Duke', 'Duke', 'Duke', 'Kentucky', 'Kentucky'],
'Date':['1-1-20', '1-3-20', '1-7-20', '1-8-20', '1-11-20'],
'Points Scored':['85', '90', '75', '73', '82']})
df
Team Date Points Scored
0 Duke 1-1-20 85
1 Duke 1-3-20 90
2 Duke 1-7-20 75
3 Kentucky 1-8-20 73
4 Kentucky 1-11-20 82
The desired output would pull the points scored for each team from the previous date. If for example it's the first game of the year it'd output a 0. I've tried to use the merge function/shifting dates but it seems so messy. Was wondering if anyone has an easier way to get the following output and would apply to a larger data set with all ~353 teams;
Team Date Points Scored Previous Game Points Scored
0 Duke 1-1-20 85 0
1 Duke 1-3-20 90 85
2 Duke 1-7-20 75 90
3 Kentucky 1-8-20 73 0
4 Kentucky 1-11-20 82 73
Thanks in advance for the help!
Upvotes: 2
Views: 138
Reputation: 2643
import pandas as pd
df = pd.DataFrame({'Team':['Duke', 'Duke', 'Duke', 'Kentucky', 'Kentucky'],
'Date':['1-1-20', '1-3-20', '1-7-20', '1-8-20', '1-11-20'],
'Points Scored':['85', '90', '75', '73', '82']})
df['Previous Game Points Scored'] = (df
.groupby('Team')['Points Scored']
.transform('shift')
.fillna(0))
print(df)
Team Date Points Scored Previous Game Points Scored
0 Duke 1-1-20 85 0
1 Duke 1-3-20 90 85
2 Duke 1-7-20 75 90
3 Kentucky 1-8-20 73 0
4 Kentucky 1-11-20 82 73
Upvotes: 2