Reputation: 1033
I have a CSV with some data that looks like such:
I have many of these files, and I want to read them into DataFrame:
df = pd.read_csv(filepath, engine='c')
df['closingDate'] = pd.to_datetime(df['closingDate'], format='%dd-%mmm-%yy')
df['Fut Expiration Date'] = pd.to_datetime(df['Fut Expiration Date'], format='%d-%m-%yy')
I've tried a multitude of formats, but none seem to work. Is there an alternative?
Upvotes: 6
Views: 7382
Reputation: 476584
Actually you do not need to specify the format here. The format is unambiguous, if we convert it without specifying a format, we get:
>>> df
Date
0 1-Dec-99
1 1-Jul-99
2 1-Jun-99
3 1-Nov-99
4 1-Oct-99
5 1-Sep-99
6 2-Aug-99
7 2-Dec-99
>>> pd.to_datetime(df['Date'])
0 1999-12-01
1 1999-07-01
2 1999-06-01
3 1999-11-01
4 1999-10-01
5 1999-09-01
6 1999-08-02
7 1999-12-02
Name: Date, dtype: datetime64[ns]
Alternatively, we can look up the format in the documentation of the datetime
module [Python-doc]. We here se that:
%d Day of the month as a zero-padded 01, 02, ., 31 decimal number. %b Month as locale's abbreviated name. Jan, Feb, ., Dec (en_US); Jan, Feb, ., Dez (de_DE) %y Year without century as a 00, 01, ., 99 zero-padded decimal number.
So we can specify the format as:
>>> pd.to_datetime(df['Date'], format='%d-%b-%y')
0 1999-12-01
1 1999-07-01
2 1999-06-01
3 1999-11-01
4 1999-10-01
5 1999-09-01
6 1999-08-02
7 1999-12-02
Name: Date, dtype: datetime64[ns]
Upvotes: 8
Reputation: 919
Use %b
for a three letter month. Please see the Python strftime
reference: http://strftime.org/
I think you want: w
for the day, b
for the month, and yy
for the year.
I'm assuming the days aren't zero padded, if the days are zero padded then use d
instead of w
.
Upvotes: 2
Reputation: 51335
Check out the directives for datetimes here. The following should work, using 3 letter months and 2 digit years:
df['Fut Expiration Date'] = pd.to_datetime(df['Fut Expiration Date'], format='%d-%b-%y')
Upvotes: 2