Afshin Oroojlooy
Afshin Oroojlooy

Reputation: 1434

convert string date to datetime format in pandas

I have a Pandas dataFrame, with a column of Date:

    ID  Amount   raw-Date   ZIP transaction-ID  Date    flag
749     145552  $100.00 1/15/2018   27614-7901  1342-P0192-F43  1/15/2018   1.0
1307    145552  $100.00 3/15/2018   27614-7901  1342-P0192-F43  3/15/2018   1.0
1672    145552  $100.00 2/15/2018   27614-7901  1342-P0192-F43  2/15/2018   1.0
3508    145552  $100.00 4/15/2018   27614-7901  1342-P0192-F43  4/15/2018   1.0
4144    145552  $250.00 4/24/2018   27614-7901  1234-O8910-B32  4/24/2018   1.0
4145    145552  $100.00 4/24/2018   27614-7901  1234-O8910-B32  4/24/2018   1.0
4787    145552  $100.00 5/15/2018   27614-7901  1342-P0192-F43  5/15/2018   1.0
8350    145552  $212.44 12/21/2018  27614-7901  1342-P0192-F43  12/21/2018  1.0

When I sort them by the Date column, i.e., using data.sort_values('Date'), I get:

ID  Amount   raw-Date   ZIP transaction-ID  Date    flag
749     145552  $100.00 1/15/2018   27614-7901  1342-P0192-F43  1/15/2018   1.0
8350    145552  $212.44 12/21/2018  27614-7901  1342-P0192-F43  12/21/2018  1.0
1672    145552  $100.00 2/15/2018   27614-7901  1342-P0192-F43  2/15/2018   1.0
1307    145552  $100.00 3/15/2018   27614-7901  1342-P0192-F43  3/15/2018   1.0
3508    145552  $100.00 4/15/2018   27614-7901  1342-P0192-F43  4/15/2018   1.0
4144    145552  $250.00 4/24/2018   27614-7901  1234-O8910-B32  4/24/2018   1.0
4145    145552  $100.00 4/24/2018   27614-7901  1234-O8910-B32  4/24/2018   1.0
4787    145552  $100.00 5/15/2018   27614-7901  1342-P0192-F43  5/15/2018   1.0

in which apparently it sorts dates as a string. I tried pd.to_datetime(data['Date']) and again got same sorted result:

    ID  Amount  raw-Date    ZIP Appeal ID   Date    flag 
749     145552  $100.00 1/15/2018   27614-7901  1342-P0192-F43  2018-01-15  1.0
8350    145552  $212.44 12/21/2018  27614-7901  1342-P0192-F43  2018-12-21  1.0
1672    145552  $100.00 2/15/2018   27614-7901  1342-P0192-F43  2018-02-15  1.0
1307    145552  $100.00 3/15/2018   27614-7901  1342-P0192-F43  2018-03-15  1.0
3508    145552  $100.00 4/15/2018   27614-7901  1342-P0192-F43  2018-04-15  1.0
4144    145552  $250.00 4/24/2018   27614-7901  1234-O8910-B32  2018-04-24  1.0
4145    145552  $100.00 4/24/2018   27614-7901  1234-O8910-B32  2018-04-24  1.0
4787    145552  $100.00 5/15/2018   27614-7901  1342-P0192-F43  2018-05-15  1.0

I appreciate any help.

Upvotes: 0

Views: 123

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150735

Your data has duplicated column name Date, which is discouraged. In this case: df['Date'] would give a dataframe with two columns, and pd.to_datetime(df['Date']) would fail.

That said, you can do an apply:

df['Date'] = df['Date'].apply(pd.to_datetime)

After that, df.Date.dtypes would give:

Date    datetime64[ns]
Date    datetime64[ns]
dtype: object

Upvotes: 2

Related Questions