Reputation: 341
I currently have a dataframe which has a column containing date-time values as object datatype.
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 2016-06-03 02:00:00
What I'm trying to do is convert col3 into date-time values so that it would just give me:
Year-Month-Day-Hour
I need this for some datetime feature engineering later on. When I try:
df['col3'] = pd.to_datetime(df['col3'])
I get this error:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 3008-07-25 00:00:00
Any ideas?
Thanks
Upvotes: 3
Views: 54
Reputation: 862641
You can use the parameter errors='coerce' to convert values outside the supported Timestamp range to NaT:
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 3008-07-25 00:00:00
df['col3'] = pd.to_datetime(df['col3'], errors='coerce')
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00:00
0 B 11 2016-06-04 00:00:00
0 C 12 2016-06-02 05:00:00
0 D 13 NaT
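The error occurs because a nanosecond-resolution Timestamp can only represent dates within these limits:
In [68]: pd.Timestamp.min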
Out[68]: Timestamp('1677-09-21 00:12:43.145225')
In [69]: pd.Timestamp.max
Out[69]: Timestamp('2262-04-11 23:47:16.854775807')
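To see which rows were affected and to get the Year-Month-Day-Hour form asked for in the question, something like the following may help. It is only a sketch: the frame is rebuilt from the sample data above, and the names parsed, bad_rows and ymdh are made up for illustration. Note that newer pandas releases may parse strings at a coarser resolution than nanoseconds, in which case the 3008 value might not be coerced at all.

```python
import pandas as pd

# Sample data rebuilt from the question, with one out-of-range date
df = pd.DataFrame({
    "col1": ["A", "B", "C", "D"],
    "col2": [10, 11, 12, 13],
    "col3": ["2016-06-05 11:00:00", "2016-06-04 00:00:00",
             "2016-06-02 05:00:00", "3008-07-25 00:00:00"],
})

# Values that cannot be represented become NaT instead of raising
parsed = pd.to_datetime(df["col3"], errors="coerce")

# Rows that had a value originally but ended up as NaT after parsing
bad_rows = df[parsed.isna() & df["col3"].notna()]

# Year-Month-Day-Hour strings; rows that were coerced to NaT stay missing
ymdh = parsed.dt.strftime("%Y-%m-%d-%H")

print(bad_rows)
print(ymdh)
```

Inspecting bad_rows before overwriting col3 makes it easier to decide whether the coerced values are typos that should be corrected or genuine dates that need a different representation.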
It is also possible to create Periods, which can hold dates outside the Timestamp limits, but building them from strings takes some manual parsing:
def conv(x):
    return pd.Period(year = int(x[:4]),
                     month = int(x[5:7]),
                     day = int(x[8:10]),
                     hour = int(x[11:13]), freq='H')
df['col3'] = df['col3'].apply(conv)
print (df)
col1 col2 col3
0 A 10 2016-06-05 11:00
0 B 11 2016-06-04 00:00
0 C 12 2016-06-02 05:00
0 D 13 3008-07-25 00:00
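If only the Year-Month-Day-Hour text is needed and the out-of-range dates must be kept, a different technique from the Period approach above is to parse with Python's own datetime module, which covers years 1 through 9999. This is a minimal sketch, assuming every value follows the YYYY-MM-DD HH:MM:SS layout shown in the question; the helper name to_ymdh and the column name col3_ymdh are hypothetical.

```python
from datetime import datetime

import pandas as pd

# Two sample values from the question, one beyond the Timestamp limit
df = pd.DataFrame({"col3": ["2016-06-05 11:00:00", "3008-07-25 00:00:00"]})

def to_ymdh(value):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and return 'YYYY-MM-DD-HH'."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime("%Y-%m-%d-%H")

# The result is a column of plain strings, not a datetime64 column
df["col3_ymdh"] = df["col3"].apply(to_ymdh)
print(df)
```

The trade-off is that the new column holds strings, so the .dt accessor and vectorised datetime arithmetic are not available on it.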
Upvotes: 3