Évariste Galois
Évariste Galois

Reputation: 1033

pandas - converting d-mmm-yy to datetime object

I have a CSV with some data that looks like such:

excel doc

I have many of these files, and I want to read them into DataFrame:

df = pd.read_csv(filepath, engine='c')
df['closingDate'] = pd.to_datetime(df['closingDate'], format='%dd-%mmm-%yy')
df['Fut Expiration Date'] = pd.to_datetime(df['Fut Expiration Date'], format='%d-%m-%yy')

I've tried a multitude of formats, but none seem to work. Is there an alternative?

Upvotes: 6

Views: 7382

Answers (3)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476584

Actually you do not need to specify the format here. The format is unambiguous, if we convert it without specifying a format, we get:

>>> df
       Date
0  1-Dec-99
1  1-Jul-99
2  1-Jun-99
3  1-Nov-99
4  1-Oct-99
5  1-Sep-99
6  2-Aug-99
7  2-Dec-99
>>> pd.to_datetime(df['Date'])
0   1999-12-01
1   1999-07-01
2   1999-06-01
3   1999-11-01
4   1999-10-01
5   1999-09-01
6   1999-08-02
7   1999-12-02
Name: Date, dtype: datetime64[ns]

Alternatively, we can look up the format in the documentation of the datetime module [Python-doc]. We here se that:

%d   Day of the month as a zero-padded       01, 02, ., 31
     decimal number.

%b   Month as locale's abbreviated name.     Jan, Feb, ., Dec (en_US);
                                             Jan, Feb, ., Dez (de_DE)

%y   Year without century as a               00, 01, ., 99
     zero-padded decimal number.

So we can specify the format as:

>>> pd.to_datetime(df['Date'], format='%d-%b-%y')
0   1999-12-01
1   1999-07-01
2   1999-06-01
3   1999-11-01
4   1999-10-01
5   1999-09-01
6   1999-08-02
7   1999-12-02
Name: Date, dtype: datetime64[ns]

Upvotes: 8

Owen
Owen

Reputation: 919

Use %b for a three letter month. Please see the Python strftime reference: http://strftime.org/

I think you want: w for the day, b for the month, and yy for the year.

I'm assuming the days aren't zero padded, if the days are zero padded then use d instead of w.

Upvotes: 2

sacuL
sacuL

Reputation: 51335

Check out the directives for datetimes here. The following should work, using 3 letter months and 2 digit years:

df['Fut Expiration Date'] = pd.to_datetime(df['Fut Expiration Date'], format='%d-%b-%y')

Upvotes: 2

Related Questions