NZ_DJ

Reputation: 341

Converting string value to datetime

I have a DataFrame with a column that holds date-time values stored as object dtype.

    col1    col2            col3
0    A       10     2016-06-05 11:00:00
0    B       11     2016-06-04 00:00:00
0    C       12     2016-06-02 05:00:00
0    D       13     2016-06-03 02:00:00

What I'm trying to do is convert col3 into date-time values so that it gives me just:

 Year-Month-Day-Hour

This is for some datetime feature engineering later on. When I try:

df['col3'] = pd.to_datetime(df['col3'])

I get this error:

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 3008-07-25 00:00:00

Any ideas?

Thanks

Upvotes: 3

Views: 54

Answers (1)

jezrael

Reputation: 862641

You can use the parameter errors='coerce' to convert values outside the Timestamp limits to NaT:

print (df)
  col1  col2                 col3
0    A    10  2016-06-05 11:00:00
0    B    11  2016-06-04 00:00:00
0    C    12  2016-06-02 05:00:00
0    D    13  3008-07-25 00:00:00

df['col3'] = pd.to_datetime(df['col3'], errors='coerce')
print (df)
  col1  col2                col3
0    A    10 2016-06-05 11:00:00
0    B    11 2016-06-04 00:00:00
0    C    12 2016-06-02 05:00:00
0    D    13                 NaT
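
Because errors='coerce' replaces bad values silently, it may help to list what would be lost before overwriting the column. A minimal sketch, assuming a hypothetical helper name rows_lost_to_coerce (it is not part of pandas):

```python
import pandas as pd


def rows_lost_to_coerce(series):
    """Return the original entries that errors='coerce' turns into NaT.

    Hypothetical helper for illustration; not a pandas function.
    """
    parsed = pd.to_datetime(series, errors="coerce")
    # Keep entries that had a value before parsing but are NaT afterwards.
    return series[parsed.isna() & series.notna()]


col3 = pd.Series([
    "2016-06-05 11:00:00",
    "2016-06-04 00:00:00",
    "2016-06-02 05:00:00",
    "3008-07-25 00:00:00",
])
print(rows_lost_to_coerce(col3))
```

Any row printed here is one that would become NaT, so it can be inspected or corrected first.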

Timestamp limits:

In [68]: pd.Timestamp.min
Out[68]: Timestamp('1677-09-21 00:12:43.145225')

In [69]: pd.Timestamp.max
Out[69]: Timestamp('2262-04-11 23:47:16.854775807')
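
These bounds follow from how a nanosecond-resolution Timestamp is stored: a signed 64-bit count of nanoseconds since 1970-01-01. A rough standard-library check of that arithmetic, truncated to microseconds because datetime cannot hold nanoseconds (the variable names are illustrative):

```python
from datetime import datetime, timedelta

MAX_NS = 2**63 - 1  # largest signed 64-bit integer
EPOCH = datetime(1970, 1, 1)

# datetime only resolves microseconds, so drop the last three digits.
span = timedelta(microseconds=MAX_NS // 1000)

upper = EPOCH + span
lower = EPOCH - span

print(lower)
print(upper)
```

The span is roughly 292 years on either side of 1970, which is why a year such as 3008 cannot be represented.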

It is also possible to create Periods, although not easily from strings:

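def conv(x):
    # Slice the fixed-width 'YYYY-MM-DD HH:MM:SS' string into its fields.
    # Note: pandas 2.2+ deprecates the 'H' alias in favor of 'h'.
    return pd.Period(year=int(x[:4]),
                     month=int(x[5:7]),
                     day=int(x[8:10]),
                     hour=int(x[11:13]), freq='H')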

df['col3'] = df['col3'].apply(conv)

print (df)
  col1  col2             col3
0    A    10 2016-06-05 11:00
0    B    11 2016-06-04 00:00
0    C    12 2016-06-02 05:00
0    D    13 3008-07-25 00:00
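
If only the Year-Month-Day-Hour value is needed, the standard library's datetime accepts years up to 9999, so the strings can be reformatted without going through a pandas Timestamp. A minimal sketch, assuming every value follows the 'YYYY-MM-DD HH:MM:SS' layout shown in the question; to_ymdh is a hypothetical helper name:

```python
from datetime import datetime


def to_ymdh(value):
    """Reformat 'YYYY-MM-DD HH:MM:SS' as 'YYYY-MM-DD-HH' (illustrative helper)."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime("%Y-%m-%d-%H")


values = [
    "2016-06-05 11:00:00",
    "2016-06-04 00:00:00",
    "2016-06-02 05:00:00",
    "3008-07-25 00:00:00",
]
print([to_ymdh(v) for v in values])
```

With a DataFrame this could be applied as df['col3'].apply(to_ymdh), which leaves the column as plain strings rather than a datetime dtype.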

Upvotes: 3
