Reputation: 1434
I have a Pandas dataFrame, with a column of Date:
ID Amount raw-Date ZIP transaction-ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 1/15/2018 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 3/15/2018 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2/15/2018 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 4/15/2018 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 5/15/2018 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 12/21/2018 1.0
When I sort them by the Date column, i.e., using data.sort_values('Date')
, I get:
ID Amount raw-Date ZIP transaction-ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 1/15/2018 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 12/21/2018 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2/15/2018 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 3/15/2018 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 4/15/2018 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 5/15/2018 1.0
in which apparently it sorts dates as a string. I tried pd.to_datetime(data['Date'])
and again got same sorted result:
ID Amount raw-Date ZIP Appeal ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 2018-01-15 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 2018-12-21 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2018-02-15 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 2018-03-15 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 2018-04-15 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 2018-04-24 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 2018-04-24 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 2018-05-15 1.0
I appreciate any help.
Upvotes: 0
Views: 123
Reputation: 150735
Your data has duplicated column name Date
, which is discouraged. In this case: df['Date']
would give a dataframe with two columns, and pd.to_datetime(df['Date'])
would fail.
That said, you can do an apply
:
df['Date'] = df['Date'].apply(pd.to_datetime)
After that, df.Date.dtypes
would give:
Date datetime64[ns]
Date datetime64[ns]
dtype: object
Upvotes: 2