Reputation: 693
I have a dataframe with a date column that I want to use as my index. When I convert the date column with pd.to_datetime(df['Date'], errors='raise', dayfirst=True), I get:
df1.head()
Out[60]:
Date Open High Low Close Volume Market Cap
0 2018-03-14 0.789569 0.799080 0.676010 0.701902 479149000 30865600000
1 2018-03-13 0.798451 0.805729 0.778471 0.789711 279679000 31213000000
2 2018-12-03 0.832127 0.838328 0.787882 0.801048 355031000 32529500000
3 2018-11-03 0.795765 0.840407 0.775737 0.831122 472972000 31108000000
4 2018-10-03 0.854872 0.860443 0.793736 0.796627 402670000 33418600000
The Date column is originally a string in dd-mm-yyyy format, but as you can see, the transformation to datetime swaps day and month from row 2 on. How can I get consistent datetimes?
Edit: I think I solved it. Using the answers below about format, I found that the error was in a package I used to generate the data (cryptocmd). I changed the format to %Y-%m-%d in the utils script of the package, and now it seems to work fine.
Upvotes: 0
Views: 225
Reputation: 13175
According to the docs:
dayfirst : boolean, default False
Specify a date parse order if arg is str or its list-likes. If True, parses dates with the day first, eg 10/11/12 is parsed as 2012-11-10. Warning: dayfirst=True is not strict, but will prefer to parse with day first (this is a known bug, based on dateutil behavior).
Emphasis mine. Since you apparently know that your format is "dd-mm-yyyy" you should specify it explicitly:
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y', errors='raise')
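As a minimal sketch of what that looks like end to end: the sample values below are made up for illustration (they are not your real data), and the try/except at the end is only there to show that a string which does not fit the format raises an error instead of being silently reinterpreted.

```python
import pandas as pd

# Hypothetical sample data: day-first strings, as described in the question.
df = pd.DataFrame({
    "Date": ["14-03-2018", "13-03-2018", "12-03-2018", "11-03-2018"],
    "Close": [0.701902, 0.789711, 0.801048, 0.831122],
})

# With an explicit format nothing is guessed: every value is read as dd-mm-yyyy.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y", errors="raise")

# Use the parsed dates as the index, which was the original goal.
df = df.set_index("Date")

# A string that does not match the format raises a ValueError.
try:
    pd.to_datetime(pd.Series(["2018-03-14"]), format="%d-%m-%Y", errors="raise")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

If the conversion raises on your real data, that is a strong hint that the strings are not actually in the format you expect, which is worth checking at the source.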
Upvotes: 1