Reputation: 2330
I'm trying to convert a time series index to a seconds of the day i.e. so that the seconds increases from 0-86399 as the day progresses. I currently can recover the time of the day, but am having trouble converting this to seconds in a vectorized way:
df['timeofday'] = df.index.time
Any ideas? Thanks.
Upvotes: 2
Views: 3437
Reputation: 3439
I modified KarlD's answer for datetime with time zone:
d = pd.DataFrame({"t_naive":pd.date_range("20160101","20160102", freq = "2H")})
d['t_utc'] = d['t_naive'].dt.tz_localize("UTC")
d['t_ct'] = d['t_utc'].dt.tz_convert("America/Chicago")
print(d.head())
# t_naive t_utc t_ct
# 0 2016-01-01 00:00:00 2016-01-01 00:00:00+00:00 2015-12-31 18:00:00-06:00
# 1 2016-01-01 02:00:00 2016-01-01 02:00:00+00:00 2015-12-31 20:00:00-06:00
# 2 2016-01-01 04:00:00 2016-01-01 04:00:00+00:00 2015-12-31 22:00:00-06:00
# 3 2016-01-01 06:00:00 2016-01-01 06:00:00+00:00 2016-01-01 00:00:00-06:00
# 4 2016-01-01 08:00:00 2016-01-01 08:00:00+00:00 2016-01-01 02:00:00-06:00
The answer by KarlD gives sec of day in UTC
s0 = (d["t_naive"].values - d["t_naive"].values.astype('datetime64[D]'))/np.timedelta64(1,'s')
s0
# array([ 0., 7200., 14400., 21600., 28800., 36000., 43200.,
# 50400., 57600., 64800., 72000., 79200., 0.])
s1 = (d["t_ct"].values - d["t_ct"].values.astype('datetime64[D]'))/np.timedelta64(1,'s')
s1
# array([ 0., 7200., 14400., 21600., 28800., 36000., 43200.,
# 50400., 57600., 64800., 72000., 79200., 0.])
For sec of day in local time, use:
s2 = (d["t_ct"].view("int64") - d["t_ct"].dt.normalize().view("int64"))//pd.Timedelta(1, unit='s')
#use d.index.normalize() for index
s2.values
# array([64800, 72000, 79200, 0, 7200, 14400, 21600, 28800, 36000,
# 43200, 50400, 57600, 64800], dtype=int64)
or,
s3 = d["t_ct"].dt.hour*60*60 + d["t_ct"].dt.minute*60+ d["t_ct"].dt.second
s3.values
# array([64800, 72000, 79200, 0, 7200, 14400, 21600, 28800, 36000,
# 43200, 50400, 57600, 64800], dtype=int64)
Upvotes: 0
Reputation: 15718
May be a bit overdone, but this would be my answer:
from pandas import date_range, Series, to_datetime
# Some test data
rng = date_range('1/1/2011 01:01:01', periods=3, freq='s')
df = Series(randn(len(rng)), index=rng).to_frame()
def sec_in_day(timestamp):
date = timestamp.date() # We get the date less the time
elapsed_time = timestamp.to_datetime() - to_datetime(date) # We get the time
return elapsed_time.total_seconds()
Series(df.index).apply(sec_in_day)
Upvotes: 3
Reputation: 13757
As @Jeff points out my original answer misunderstood what you were doing. But the following should work and it is vectorized. My answer relies on numpy datetime64
operations (subtract the beginning of the day from the current datetime64
and the divide through with a timedelta64
to get seconds):
>>> df
A
2011-01-01 00:00:00 -0.112448
2011-01-01 01:00:00 1.006958
2011-01-01 02:00:00 -0.056194
2011-01-01 03:00:00 0.777821
2011-01-01 04:00:00 -0.552584
2011-01-01 05:00:00 0.156198
2011-01-01 06:00:00 0.848857
2011-01-01 07:00:00 0.248990
2011-01-01 08:00:00 0.524785
2011-01-01 09:00:00 1.510011
2011-01-01 10:00:00 -0.332266
2011-01-01 11:00:00 -0.909849
2011-01-01 12:00:00 -1.275335
2011-01-01 13:00:00 1.361837
2011-01-01 14:00:00 1.924534
2011-01-01 15:00:00 0.618478
df['sec'] = (df.index.values
- df.index.values.astype('datetime64[D]'))/np.timedelta64(1,'s')
A sec
2011-01-01 00:00:00 -0.112448 0
2011-01-01 01:00:00 1.006958 3600
2011-01-01 02:00:00 -0.056194 7200
2011-01-01 03:00:00 0.777821 10800
2011-01-01 04:00:00 -0.552584 14400
2011-01-01 05:00:00 0.156198 18000
2011-01-01 06:00:00 0.848857 21600
2011-01-01 07:00:00 0.248990 25200
2011-01-01 08:00:00 0.524785 28800
2011-01-01 09:00:00 1.510011 32400
2011-01-01 10:00:00 -0.332266 36000
2011-01-01 11:00:00 -0.909849 39600
2011-01-01 12:00:00 -1.275335 43200
2011-01-01 13:00:00 1.361837 46800
2011-01-01 14:00:00 1.924534 50400
2011-01-01 15:00:00 0.618478 54000
Upvotes: 6