Reputation: 8844
I have a column I_DATE of type string (object) in a dataframe called train, as shown below.
I_DATE
28-03-2012 2:15:00 PM
28-03-2012 2:17:28 PM
28-03-2012 2:50:50 PM
How do I convert I_DATE from string to datetime format and specify the format of the input string?
Also, how do I filter rows based on a range of dates in pandas?
Upvotes: 126
Views: 348844
Reputation: 394041
Use to_datetime. There is no need to specify the format in this case, since the parser is able to figure it out.
In [51]:
pd.to_datetime(df['I_DATE'])
Out[51]:
0 2012-03-28 14:15:00
1 2012-03-28 14:17:28
2 2012-03-28 14:50:50
Name: I_DATE, dtype: datetime64[ns]
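Inference works here because 28 cannot be a month. When the first field could be either a day or a month, pandas reads it month-first unless told otherwise. A minimal sketch of that caveat, using a made-up ambiguous value rather than the question's data:

```python
import pandas as pd

# Hypothetical ambiguous value: could be 3 April or 4 March.
s = pd.Series(['03-04-2012 14:15:00'])

month_first = pd.to_datetime(s)               # default: first field is the month
day_first = pd.to_datetime(s, dayfirst=True)  # treat the first field as the day

print(month_first.iloc[0])
print(day_first.iloc[0])
```

If the data is known to be day-first throughout, passing dayfirst=True (or an explicit format) avoids silently swapped days and months.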
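To access the date/day/time component, assign the converted column back to the dataframe (the dt accessor does not work on a string column) and use the dt accessor: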
In [54]:
df['I_DATE'].dt.date
Out[54]:
0 2012-03-28
1 2012-03-28
2 2012-03-28
dtype: object
In [56]:
df['I_DATE'].dt.time
Out[56]:
0 14:15:00
1 14:17:28
2 14:50:50
dtype: object
You can also filter a datetime column by comparing it against date strings, for example:
In [59]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range(start=dt.datetime(2015, 1, 1), end=dt.datetime.now())})
df[(df['date'] > '2015-02-04') & (df['date'] < '2015-02-10')]
Out[59]:
date
35 2015-02-05
36 2015-02-06
37 2015-02-07
38 2015-02-08
39 2015-02-09
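For a reproducible version of the filter, here is a small sketch with a fixed end date (the end date is an assumption; the answer uses the current date). It also shows one alternative not mentioned in the answer: slicing a DatetimeIndex with date strings through .loc.

```python
import pandas as pd

# Fixed end date so the result does not depend on when the code runs.
df = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-04-01')})

# Boolean mask with exclusive bounds, as in the answer above.
mask = (df['date'] > '2015-02-04') & (df['date'] < '2015-02-10')
by_mask = df[mask]

# Alternative: with the dates as the index, .loc slices by date strings,
# and both endpoints of the slice are included.
by_slice = df.set_index('date').loc['2015-02-05':'2015-02-09']

print(by_mask)
print(by_slice)
```

Both selections cover 5 February through 9 February; the mask keeps the original integer index, while the slice is indexed by date.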
Upvotes: 194
Reputation: 29
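I believe this is a better approach. Dates in pandas are sometimes inconsistently formatted, and this is my go-to solution: format='mixed' (available in pandas 2.0 and later) infers the format of each value individually, and errors='coerce' turns anything that cannot be parsed into NaT instead of raising an error.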
df['I_DATE'] = pd.to_datetime(df['I_DATE'], format='mixed', errors='coerce')
df[(df['I_DATE'] >= pd.to_datetime('2023-07-01')) & (df['I_DATE'] <= pd.to_datetime('2023-09-30'))]
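To see what errors='coerce' does to bad input, here is a minimal sketch with made-up sample values. It uses an explicit format instead of format='mixed' so that it also runs on pandas versions before 2.0; that substitution is mine, not the answer's.

```python
import pandas as pd

# Made-up sample: one valid timestamp and one junk value.
df = pd.DataFrame({'I_DATE': ['28-03-2012 2:15:00 PM', 'not a date']})

df['I_DATE'] = pd.to_datetime(
    df['I_DATE'], format='%d-%m-%Y %I:%M:%S %p', errors='coerce'
)
print(df['I_DATE'].isna().tolist())  # the junk value became NaT

# Comparisons against NaT are False, so coerced rows drop out of the filter.
in_range = df[(df['I_DATE'] >= '2012-01-01') & (df['I_DATE'] <= '2012-12-31')]
print(in_range)
```

Because coerced values vanish silently from range filters, it is worth counting the NaT rows after conversion before relying on the result.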
Upvotes: 0
Reputation: 23121
For a datetime in AM/PM format, the time format is '%I:%M:%S %p'. See all possible format combinations at https://strftime.org/. N.B. If you have a time component, as in the OP, the conversion will be much faster if you pass the format= argument.
df['I_DATE'] = pd.to_datetime(df['I_DATE'], format='%d-%m-%Y %I:%M:%S %p')
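As a quick check of how %I and %p interact, the sketch below parses a few made-up values, including the 12 AM and 12 PM edge cases, and prints the resulting 24-hour values.

```python
import pandas as pd

# Made-up values covering the 12-hour clock edge cases.
s = pd.Series([
    '28-03-2012 12:05:00 AM',  # just after midnight
    '28-03-2012 12:05:00 PM',  # just after noon
    '28-03-2012 2:15:00 PM',   # afternoon, hour not zero-padded
])

parsed = pd.to_datetime(s, format='%d-%m-%Y %I:%M:%S %p')
print(parsed.dt.hour.tolist())  # [0, 12, 14]
```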
To filter a datetime using a range, you can use query:
df = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-04-01')})
df.query("'2015-02-04' < date < '2015-02-10'")
or use between to create a mask and filter. Note that between includes both endpoints by default, whereas the query above excludes them.
df[df['date'].between('2015-02-04', '2015-02-10')]
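The endpoint handling of between can be checked directly. A short sketch, assuming a pandas version recent enough to accept string values for the inclusive parameter (1.3 or later, as far as I know):

```python
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-04-01')})

# Default: both endpoints are included (4 Feb through 10 Feb).
inclusive = df[df['date'].between('2015-02-04', '2015-02-10')]

# inclusive='neither' excludes both endpoints (5 Feb through 9 Feb).
exclusive = df[df['date'].between('2015-02-04', '2015-02-10', inclusive='neither')]

print(len(inclusive), len(exclusive))  # 7 5
```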
Upvotes: 8
Reputation: 1301
Approach: 1
Given an original string in the format 2019/03/04 00:08:48, you can use
updated_df = df['timestamp'].astype('datetime64[ns]')
The result is a Series in this datetime format: 2019-03-04 00:08:48
Approach: 2
updated_df = df.astype({'timestamp':'datetime64[ns]'})
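A self-contained sketch of both approaches on a one-row frame (the sample frame is made up to match the format above). Note that astype takes no format= argument, so it relies on pandas inferring the layout of the strings.

```python
import pandas as pd

# Made-up frame using the string format from Approach 1.
df = pd.DataFrame({'timestamp': ['2019/03/04 00:08:48']})

# Approach 1: converts one column and returns a Series.
as_series = df['timestamp'].astype('datetime64[ns]')

# Approach 2: returns a new DataFrame with the named column converted.
as_frame = df.astype({'timestamp': 'datetime64[ns]'})

print(as_series.iloc[0])
print(as_frame.dtypes)
```

Neither call modifies df in place; assign the result back if the converted column should replace the original.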
Upvotes: 24