Reputation: 73
df1=
Date Team1 Team2
6/1 Boston New York
6/13 New York Boston
6/27 Boston New York
I am trying to calculate the number of days since the last time Boston appeared in either column, but I can only figure out how to look it up within one column, using df1['Days since Boston played'] = df1.groupby('Team1')['Date'].diff().fillna(0)
What I would like the output to be:
Date Team1 Team2 Days since Boston played
6/1 Boston New York 0
6/13 New York Boston 12
6/27 Boston New York 14
EDIT - expanding the dataframe to learn how this can be applied to all teams, not just one What I would like the output to be:
Date Team1 Team2 Days since **Team1** played
6/1 Boston New York 0
6/13 New York Chicago 12
6/27 Boston New York 14
6/28 Chicago Boston 15
Upvotes: 1
Views: 45
Reputation: 38415
Use isin to check if Boston is present in either team1 or team2 and find timedelta
df['Date'] = pd.to_datetime(df['Date'], format = '%m/%d')
df.loc[df.isin(['Boston']).any(1),'Days since Boston played'] = df.loc[df.isin(['Boston']).any(1), 'Date'].diff().dt.days
Date Team1 Team2 Days since Boston played
0 1900-06-01 Boston New York NaN
1 1900-06-13 New York Boston 12.0
2 1900-06-27 Boston New York 14.0
If you want the date column back in original format, you can use strftime
df['Date'] = df['Date'].dt.strftime('%m/%d')
Date Team1 Team2 Days since Boston played
0 06/01 Boston New York NaN
1 06/13 New York Boston 12.0
2 06/27 Boston New York 14.0
Upvotes: 2
Reputation: 1267
You can do a groupby on Team1 and then do a difference on the dates:
# Note: you should post a creatable example in your post next time
data = {
'Date': ['2018-06-01', '2018-06-13', '2018-06-27'],
'Team1':['Boston', 'New York', 'Boston'],
'Team2':['New York', 'Boston', 'New York']
}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
df['Time between games'] = df.groupby('Team1')['Date'].diff().fillna(0)
This will actually give you the difference between games for all home teams.
Upvotes: 0