Reputation: 373
I'm a bit new to this and I'm working with datetime data in Python. Two questions:
I have a time column associated with events, but I'm having difficulty declaring it as a time using datetime.time. The column is formatted like this:
0 11:17:43
1 06:00:00
2 06:30:35
3 02:00:00
4 23:00:00
5 13:20:49
6 19:30:00
and I am attempting to declare it as a time object:
data['timeobject'] = datetime.time(data['start_time'], axis = 1)
But I am getting this error message:
TypeError: cannot convert the series to class 'int'
Also, I'd like to take this time object and use it as a dependent variable in scikit-learn classification or regression.
How do I declare it as a time object, and would there be any issues running it through scikit-learn models to predict when an event might happen?
Thank you!
Upvotes: 0
Views: 68
Reputation: 164793
datetime.time does not work in a vectorised fashion. The pandas top-level function pd.to_timedelta does, and it accepts a wide range of formats, including strings in the format you have supplied. Given a dataframe with column 'td':
df['td'] = pd.to_timedelta(df['td'])
print(df['td'])
0 11:17:43
1 06:00:00
2 06:30:35
3 02:00:00
4 23:00:00
5 13:20:49
6 19:30:00
Name: td, dtype: timedelta64[ns]
Underlying the resultant series is an integer array via np.timedelta64. You should expect this to work well with the scikit-learn framework.
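As a rough sketch of preparing such a column for scikit-learn, the timedelta values can be turned into plain floats. This assumes the column is named 'td' as above and that seconds since midnight is an acceptable numeric representation for your model; the 'seconds' column name is made up for the example.

```python
import pandas as pd

# Sample values copied from the question; the column name 'td' follows the answer above.
df = pd.DataFrame({'td': ['11:17:43', '06:00:00', '06:30:35', '02:00:00',
                          '23:00:00', '13:20:49', '19:30:00']})

# Parse the HH:MM:SS strings into timedelta values (vectorised).
df['td'] = pd.to_timedelta(df['td'])

# Convert to a float column holding seconds since midnight.
# 'seconds' is a hypothetical column name chosen for this example.
df['seconds'] = df['td'].dt.total_seconds()

print(df['seconds'].tolist())
```

The resulting float column can be passed to an estimator as a feature or a regression target like any other numeric column.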
Upvotes: 1
Reputation: 13700
You should use pandas.to_datetime rather than the standard library datetime. Note that the format directives are case-sensitive: %H, %M and %S are hour, minute and second.
data['timeobject'] = pd.to_datetime(data['start_time'], format='%H:%M:%S')
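If actual datetime.time objects are wanted, one possible approach (a sketch, not necessarily what the answer intended) is to parse with the %H:%M:%S directives and then take the .dt.time accessor. The sample frame below is made up from the first three values in the question. Because the strings carry no date, pd.to_datetime fills in 1900-01-01 as a placeholder date.

```python
import datetime
import pandas as pd

# Small sample built from the question's values; 'start_time' is the question's column name.
data = pd.DataFrame({'start_time': ['11:17:43', '06:00:00', '06:30:35']})

# Parse the strings; %H, %M, %S are hour, minute, second.
# The date part defaults to 1900-01-01 because the input has none.
parsed = pd.to_datetime(data['start_time'], format='%H:%M:%S')

# .dt.time extracts datetime.time objects, giving an object-dtype column.
data['timeobject'] = parsed.dt.time

print(data['timeobject'].iloc[0])
```

Keep in mind that a column of datetime.time objects has object dtype, so it would still need a numeric conversion before being passed to a scikit-learn model.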
Upvotes: 1