steven

Reputation: 11

Convert string date variable to datetime.datetime date variable in pandas

I have a date stored as a string. I know how to convert it to a datetime.datetime object when there are no missing values, but my data has some missing values and the conversion fails.

Let's say input_date is the raw date variable, which is a string. I want to produce an input_date_fmt variable that holds datetime.datetime values. I am trying to run the following:

DF['input_date_fmt'] = np.array([datetime.datetime.strptime(x, "%m/%d/%Y").date()
                                 for x in DF['input_date']])

But the error is

ValueError: time data 'nan' does not match format '%m/%d/%Y'

Can anyone please help?

Upvotes: 1

Views: 3226

Answers (2)

roman

Reputation: 117337

If you have string values 'nan' in your dataframe:

>>> df = pd.DataFrame({'input_date':['01/01/2003', '02/29/2012', 'nan', '03/01/1995']})
>>> df
   input_date
0  01/01/2003
1  02/29/2012
2         nan
3  03/01/1995

you can convert them to NaN before converting to dates:

>>> df.loc[df['input_date'] == 'nan', 'input_date'] = np.nan
>>> df
   input_date
0  01/01/2003
1  02/29/2012
2         NaN
3  03/01/1995

And then you can do your conversion. But an easier way is to use the vectorized to_datetime function to convert the strings to datetimes:

>>> df = pd.DataFrame({'input_date':['01/01/2003', '02/29/2012', 'nan', '03/01/1995']})
>>> pd.to_datetime(df['input_date'])
0   2003-01-01 00:00:00
1   2012-02-29 00:00:00
2                   NaT
3   1995-03-01 00:00:00
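
As a minimal sketch of the same idea with an explicit format, assuming every valid date in the column is month/day/year as in the question: errors='coerce' turns anything that cannot be parsed into NaT, and .dt.date gives plain datetime.date objects like the loop in the question produces. The names parsed and dates_only are only illustrative.

```python
import datetime
import pandas as pd

df = pd.DataFrame({'input_date': ['01/01/2003', '02/29/2012', 'nan', '03/01/1995']})

# Parse with an explicit format; values that cannot be parsed become NaT
parsed = pd.to_datetime(df['input_date'], format='%m/%d/%Y', errors='coerce')
df['input_date_fmt'] = parsed

# Optional: plain datetime.date objects (missing rows stay NaT)
dates_only = parsed.dt.date
```

Keeping the column as datetime64 (parsed) is usually more convenient for later filtering and arithmetic; dates_only is an object column and is only needed if datetime.date values are really required.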

Upvotes: 2

Konstantin Kovrizhnykh

Reputation: 191

You can use a regular expression to parse only the valid dates:

# Requires "import re". Rows that do not look like MM/DD/YYYY become None.
DF['input_date_fmt'] = [datetime.datetime.strptime(x, "%m/%d/%Y").date()
                        if re.match(r'(0[1-9]|1[012])/(0[1-9]|[12][0-9]|3[01])/(19|20)\d\d$', str(x))
                        else None for x in DF['input_date']]

But I agree with Satoru.Logic: what are you going to do with the invalid values?
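
One way to make that decision explicit is a small helper that returns None for anything it cannot parse. This is only a sketch: parse_or_none is a hypothetical name, the sample DataFrame is made up for illustration, and mapping bad values to None is an assumption about what you want.

```python
import datetime
import pandas as pd

def parse_or_none(value, fmt="%m/%d/%Y"):
    """Return a datetime.date, or None if value is missing or malformed."""
    try:
        return datetime.datetime.strptime(value, fmt).date()
    except (TypeError, ValueError):
        # TypeError: value is not a string (e.g. a float NaN)
        # ValueError: string does not match fmt (e.g. 'nan', '02/30/2012')
        return None

# Hypothetical sample data covering the failure cases
DF = pd.DataFrame({'input_date': ['01/01/2003', 'nan', float('nan'), '02/30/2012']})
DF['input_date_fmt'] = [parse_or_none(x) for x in DF['input_date']]
```

Unlike a regular expression check alone, this also catches strings that have the right shape but are not real dates, such as '02/30/2012'.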

Upvotes: 0
