CMorgan

Reputation: 693

dataframe datetimeindex changes

I have a dataframe with a date column that I want to use as the index. When I convert the column with pd.to_datetime(df['Date'], errors='raise', dayfirst=True), I get:

df1.head()
Out[60]: 
        Date      Open      High       Low     Close     Volume   Market Cap
0 2018-03-14  0.789569  0.799080  0.676010  0.701902  479149000  30865600000
1 2018-03-13  0.798451  0.805729  0.778471  0.789711  279679000  31213000000
2 2018-12-03  0.832127  0.838328  0.787882  0.801048  355031000  32529500000
3 2018-11-03  0.795765  0.840407  0.775737  0.831122  472972000  31108000000
4 2018-10-03  0.854872  0.860443  0.793736  0.796627  402670000  33418600000

The Date column is originally a string in dd-mm-yyyy format, but as you can see, the transformation to datetime messes things up from row 2 on. How can I get consistent datetimes?

Edit: I think I solved it. Using the answer below about format, I found out that the error was in the package I used to generate the data (cryptocmd). I changed the format to %Y-%m-%d in the package's utils script, and now it seems to work fine.
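
For anyone hitting the same symptom, a quick sanity check is to test whether the parsed dates are still in order. The sketch below is only an illustration: the first list copies the dates from the df1.head() output above, while the dd-mm-yyyy strings in the second list are an assumption about what the rows were meant to be (one row per day, newest first).

```python
import pandas as pd

# Dates exactly as shown in the df1.head() output above.
parsed = pd.to_datetime(
    ["2018-03-14", "2018-03-13", "2018-12-03", "2018-11-03", "2018-10-03"],
    format="%Y-%m-%d",
)

# Assumed intended rows (hypothetical): consecutive days as dd-mm-yyyy strings.
intended = pd.to_datetime(
    ["14-03-2018", "13-03-2018", "12-03-2018", "11-03-2018", "10-03-2018"],
    format="%d-%m-%Y",
)

# Daily data listed newest-first should be monotonically decreasing.
print(parsed.is_monotonic_decreasing)    # False: day and month were swapped somewhere
print(intended.is_monotonic_decreasing)  # True
```

If the check fails on data that should be ordered, day and month were probably swapped somewhere upstream.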

Upvotes: 0

Views: 225

Answers (1)

roganjosh

Reputation: 13175

According to the docs:

dayfirst : boolean, default False

Specify a date parse order if arg is str or its list-likes. If True, parses dates with the day first, eg 10/11/12 is parsed as 2012-11-10. Warning: dayfirst=True is not strict, but will prefer to parse with day first (this is a known bug, based on dateutil behavior).

Emphasis mine. Since you apparently know that your format is "dd-mm-yyyy" you should specify it explicitly:

 df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y', errors='raise')
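
Here is a minimal, self-contained sketch of the same idea, including the step of turning the column into the index. The sample rows are made up for illustration (dd-mm-yyyy strings with a Close column), not taken from your actual file. It also shows that an explicit format is strict: a string that does not fit raises instead of being silently reinterpreted.

```python
import pandas as pd

# Hypothetical sample data in dd-mm-yyyy form.
df = pd.DataFrame({
    "Date": ["14-03-2018", "13-03-2018", "12-03-2018", "11-03-2018", "10-03-2018"],
    "Close": [0.701902, 0.789711, 0.801048, 0.831122, 0.796627],
})

# Parse with an explicit format so day and month cannot be swapped.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y", errors="raise")

# Use the parsed column as a DatetimeIndex, oldest row first.
df = df.set_index("Date").sort_index()

# A month-first string does not match the format (there is no month 13),
# so pandas raises a ValueError rather than guessing.
try:
    pd.to_datetime(pd.Series(["03-13-2018"]), format="%d-%m-%Y", errors="raise")
    strict = False
except ValueError:
    strict = True

print(df.index.min(), df.index.max(), strict)
```

Sorting the index is optional, but it makes date-based slicing such as df.loc["2018-03-11":"2018-03-13"] behave predictably.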

Upvotes: 1
