Reputation: 786
I'm using pandas version 0.12.0 to import a CSV file with dates.
The dates are in the following format: 'SEP2005'.
I read the CSV file with pandas:
import pandas as pd
mydata = pd.read_csv('mydata.csv')
mydata.head()
Out[40]:
      Date  Quantity
0  APR2002  282.0000
1  APR2002       NaN
2  APR2002    0.0000
3  APR2002   20.2253
4  APR2002   55.6853
I then turn the Date column into the index using the following:
mydata.index = pd.to_datetime(mydata.pop('Date'))
Here is what is very strange: in the past this parsed my dates and turned them into the format
2002-04-15, which is what I want. Then I would just make sure the days were set to the last day of the month:
mydata.index = mydata.index.to_period('M').to_timestamp('M')
In the past, pandas has done a great job of picking the best date format.
However, when I do this now, I get my DataFrame
back with the index still holding the same text, "APR2002".
As you would guess, the final to_period
call will not work on that.
I have not changed my code and I have not updated pandas, so I'm not sure where this change is coming from.
I don't care too much about the why. What I really need help with is how to format the index column as Year-Month-Day (%Y-%m-%d),
as in 2005-04-30.
I'm coming from R, so any help would be huge!
Upvotes: 3
Views: 2683
Reputation: 20341
You could try
pd.to_datetime(mydata.pop('Date'), format="%b%Y")
but that would expect the date to appear like Apr2002
(note not all caps).
You can specify the expected layout with the format argument, which accepts the same directives as strftime/strptime (defined in the Python datetime documentation). The pandas documentation for to_datetime covers this too.
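To make the suggestion concrete, here is a minimal sketch of the whole flow. It rests on some assumptions: a reasonably recent pandas, an English locale for month abbreviations, and a small made-up DataFrame standing in for mydata.csv. The month names are normalized with str.title() first, in case the parser is strict about capitalization. MonthEnd(0) is used to move each date to the last day of its month; this is one option, not necessarily what the original to_period/to_timestamp code did.

```python
import pandas as pd

# Hypothetical sample data standing in for mydata.csv
mydata = pd.DataFrame({
    "Date": ["APR2002", "APR2002", "SEP2005"],
    "Quantity": [282.0, None, 20.2253],
})

# "APR2002" -> "Apr2002", then parse with an explicit format
# (%b = abbreviated month name, %Y = four-digit year)
dates = pd.to_datetime(mydata.pop("Date").str.title(), format="%b%Y")

# Parsed dates fall on the first of the month; MonthEnd(0) rolls
# each one forward to the last day of that same month
mydata.index = pd.DatetimeIndex(dates) + pd.offsets.MonthEnd(0)

print(mydata)
```

The index is then a DatetimeIndex, so it displays in Year-Month-Day form, and mydata.index.strftime("%Y-%m-%d") returns the same dates as strings if text is needed.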
Upvotes: 2