Reputation: 135
I want the time without the date in Pandas. I want to keep the time as dtype datetime64[ns] and not as an object so that I can determine periods between times.
The closest I have gotten is as follows, but it gives back the date in a new column not the time as needed as dtype datetime.
df_pres_mf['time'] = pd.to_datetime(df_pres_mf['time'], format ='%H:%M', errors = 'coerce') # returns date (1900-01-01) and actual time as a dtype datetime64[ns] format
df_pres_mf['just_time'] = df_pres_mf['time'].dt.date
df_pres_mf['normalised_time'] = df_pres_mf['time'].dt.normalize()
df_pres_mf.head()
Returns the date as 1900-01-01 and not the time that is needed.
Edit: Data
time
1900-01-01 11:16:00
1900-01-01 15:20:00
1900-01-01 09:55:00
1900-01-01 12:01:00
Upvotes: 3
Views: 2482
Reputation: 880
You could do it like Vishnudev suggested but then you would have dtype: object (or even strings, after using dt.strftime), which you said you didn't want.
What you are looking for doesn't exist, but the closest thing that I can get you is converting to timedeltas. Which won't seem like a solution at first but is actually very useful.
Convert it like this:
# sample df
df
>>
time
0 2021-02-07 09:22:00
1 2021-05-10 19:45:00
2 2021-01-14 06:53:00
3 2021-05-27 13:42:00
4 2021-01-18 17:28:00
df["timed"] = df.time - df.time.dt.normalize()
df
>>
time timed
0 2021-02-07 09:22:00 0 days 09:22:00 # this is just the time difference
1 2021-05-10 19:45:00 0 days 19:45:00 # since midnight, which is essentially the
2 2021-01-14 06:53:00 0 days 06:53:00 # same thing as regular time, except
3 2021-05-27 13:42:00 0 days 13:42:00 # that you can go over 24 hours
4 2021-01-18 17:28:00 0 days 17:28:00
this allows you to calculate periods between times like this:
# subtract the last time from the current
df["difference"] = df.timed - df.timed.shift()
df
Out[48]:
time timed difference
0 2021-02-07 09:22:00 0 days 09:22:00 NaT
1 2021-05-10 19:45:00 0 days 19:45:00 0 days 10:23:00
2 2021-01-14 06:53:00 0 days 06:53:00 -1 days +11:08:00 # <-- this is because the last
3 2021-05-27 13:42:00 0 days 13:42:00 0 days 06:49:00 # time was later than the current
4 2021-01-18 17:28:00 0 days 17:28:00 0 days 03:46:00 # (see below)
to get rid of odd differences, make it absolute:
df["abs_difference"] = df.difference.abs()
df
>>
time timed difference abs_difference
0 2021-02-07 09:22:00 0 days 09:22:00 NaT NaT
1 2021-05-10 19:45:00 0 days 19:45:00 0 days 10:23:00 0 days 10:23:00
2 2021-01-14 06:53:00 0 days 06:53:00 -1 days +11:08:00 0 days 12:52:00 ### <<--
3 2021-05-27 13:42:00 0 days 13:42:00 0 days 06:49:00 0 days 06:49:00
4 2021-01-18 17:28:00 0 days 17:28:00 0 days 03:46:00 0 days 03:46:00
Upvotes: 1
Reputation: 10970
Use proper formatting according to your date format and convert to datetime
df['time'] = pd.to_datetime(df['time'], format='%Y-%m-%d %H:%M:%S')
Format according to the preferred format
df['time'].dt.strftime('%H:%M')
Output
0 11:16
1 15:20
2 09:55
3 12:01
Name: time, dtype: object
Upvotes: 1