Reputation: 2287
I have a pandas DataFrame with a column "StartTime" that could be any datetime value. I would like to create a second column that gives the StartTime relative to the beginning of the week (i.e., 12am on the previous Sunday). For example, this post is 5 days, 14 hours since the beginning of this week.
StartTime
1 2007-01-19 15:59:24
2 2007-03-01 04:16:08
3 2006-11-08 20:47:14
4 2008-09-06 23:57:35
5 2007-02-17 18:57:32
6 2006-12-09 12:30:49
7 2006-11-11 11:21:34
I can do this, but it's pretty dang slow:
def time_since_week_beg(x):
y = x.to_datetime()
return pd.Timedelta(days=y.weekday(),
hours=y.hour,
minutes=y.minute,
seconds=y.second
)
df['dt'] = df.StartTime.apply(time_since_week_beg)
What I want is something like this, that doesn't result in an error:
df['dt'] = pd.Timedelta(days=df.StartTime.dt.dayofweek,
hours=df.StartTime.dt.hour,
minute=df.StartTime.dt.minute,
second=df.StartTime.dt.second
)
TypeError: Invalid type <class 'pandas.core.series.Series'>. Must be int or float.
Any thoughts?
Upvotes: 0
Views: 1668
Reputation: 109528
You can use a list comprehension:
df['dt'] = [pd.Timedelta(days=ts.dayofweek,
hours=ts.hour,
minutes=ts.minute,
seconds=ts.second)
for ts in df.StartTime]
>>> df
StartTime dt
0 2007-01-19 15:59:24 4 days 15:59:24
1 2007-03-01 04:16:08 3 days 04:16:08
2 2006-11-08 20:47:14 2 days 20:47:14
3 2008-09-06 23:57:35 5 days 23:57:35
4 2007-02-17 18:57:32 5 days 18:57:32
5 2006-12-09 12:30:49 5 days 12:30:49
6 2006-11-11 11:21:34 5 days 11:21:34
Depending on the format of StartTime
, you may need:
...for ts in pd.to_datetime(df.StartTime)
Upvotes: 2