Reputation: 15
I have a DataFrame that looks like this:
Hour Minute Second Value
0 14.0 57.0 17.0 0.0
1 14.0 57.0 18.0 0.0
2 14.0 57.0 19.0 138.6
3 14.0 57.0 20.0 138.6
4 14.0 57.0 21.0 138.6
5 14.0 57.0 22.0 138.6
I want to combine the Hour/Minute/Second columns into a timestamp index, using a date that I already have. I managed to do this using df.apply with datetime.datetime.combine(mydate, datetime.time(hour, min, sec)), but it is too slow.
Is there a way to do this efficiently using built-in pandas functions?
Upvotes: 1
Views: 713
Reputation: 150785
Another option is to multiply Hour and Minute by 3600 and 60 respectively, add Second, convert the sum to a timedelta, and add it to the date:
mydate = pd.to_datetime('2020-02-05')
df['timestamp'] = pd.to_timedelta(df.Hour*3600+df.Minute*60+df.Second,
unit='sec').add(mydate)
Output:
Hour Minute Second Value timestamp
0 14.0 57.0 17.0 0.0 2020-02-05 14:57:17
1 14.0 57.0 18.0 0.0 2020-02-05 14:57:18
2 14.0 57.0 19.0 138.6 2020-02-05 14:57:19
3 14.0 57.0 20.0 138.6 2020-02-05 14:57:20
4 14.0 57.0 21.0 138.6 2020-02-05 14:57:21
5 14.0 57.0 22.0 138.6 2020-02-05 14:57:22
0 2020-02-05 14:57:17
1 2020-02-05 14:57:18
2 2020-02-05 14:57:19
3 2020-02-05 14:57:20
4 2020-02-05 14:57:21
5 2020-02-05 14:57:22
dtype: datetime64[ns]
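Since the question asks for a timestamp index rather than a column, here is a minimal self-contained sketch of the same timedelta idea that ends with a DatetimeIndex. The sample values are copied from the question; the names seconds, offsets and indexed are only illustrative, and dropping the Hour/Minute/Second columns is an assumption about what is wanted afterwards.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Hour": [14.0, 14.0, 14.0],
    "Minute": [57.0, 57.0, 57.0],
    "Second": [17.0, 18.0, 19.0],
    "Value": [0.0, 0.0, 138.6],
})

mydate = pd.Timestamp("2020-02-05")

# Total seconds since midnight, computed on whole columns at once.
seconds = df["Hour"] * 3600 + df["Minute"] * 60 + df["Second"]
offsets = pd.to_timedelta(seconds, unit="s")

# Assumption: the component columns are no longer needed once the index exists.
indexed = df.drop(columns=["Hour", "Minute", "Second"])
indexed.index = pd.DatetimeIndex(mydate + offsets, name="timestamp")

print(indexed)
```

Everything here operates on whole columns, so it avoids the per-row Python calls that make df.apply slow.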
Upvotes: 0
Reputation: 863216
The idea is to multiply Hour by 10000 and Minute by 100, add Second, convert the sum to a string, prepend the date, and parse the result with to_datetime:
s = df['Hour'].mul(10000) + df['Minute'].mul(100) + df['Second']
df['date'] = pd.to_datetime('2015-01-01 ' + s.astype(str), format='%Y-%m-%d %H%M%S.%f')
print(df)
Hour Minute Second Value date
0 14.0 57.0 17.0 0.0 2015-01-01 14:57:17
1 14.0 57.0 18.0 0.0 2015-01-01 14:57:18
2 14.0 57.0 19.0 138.6 2015-01-01 14:57:19
3 14.0 57.0 20.0 138.6 2015-01-01 14:57:20
4 14.0 57.0 21.0 138.6 2015-01-01 14:57:21
5 14.0 57.0 22.0 138.6 2015-01-01 14:57:22
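One thing to watch with the string approach: when Hour is below 10, the digit string is no longer six digits wide, so the %H%M%S split becomes ambiguous. A possible variant, sketched here under the assumption that all three columns hold whole numbers with no missing values, casts to int and zero-pads before parsing. The name padded and the extra sample rows are illustrative and are not part of the answer above.

```python
import pandas as pd

# Illustrative rows, including hours below 10 (not from the original data).
df = pd.DataFrame({
    "Hour": [0.0, 9.0, 14.0],
    "Minute": [5.0, 5.0, 57.0],
    "Second": [3.0, 3.0, 17.0],
})

# Encode HHMMSS as an integer; assumes whole-number values and no NaN.
s = (df["Hour"] * 10000 + df["Minute"] * 100 + df["Second"]).astype(int)

# Left-pad to six digits so every row has the fixed HHMMSS width.
padded = s.astype(str).str.zfill(6)

df["date"] = pd.to_datetime("2015-01-01 " + padded, format="%Y-%m-%d %H%M%S")
print(df)
```

If the seconds can be fractional, the timedelta approach is likely the simpler choice, since it needs no string formatting at all.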
Upvotes: 1