jason

Reputation: 3512

pandas timestamp subtraction

I have a timestamp column like this.

In [493]: df_data['last_seen'][:5]
Out[493]: 
1   1838-10-31 01:36:32.493180416
2   1826-08-10 09:38:02.493180416
3   1839-05-04 21:14:42.493180416
4   1831-06-11 17:44:24.493180416
5   1820-01-26 10:32:07.493180416
Name: last_seen

For each row, I want the number of hours between that row's timestamp and the most recent timestamp in the column. So I wrote:

df_data['last_seen'] = df_data['last_seen'] - df_data['last_seen'].max()

This raises an error:

AttributeError: 'Timestamp' object has no attribute 'dtype'

Note that the column is a Series and its elements are Timestamps:

>>> type(df_data['last_seen'])
pandas.core.series.Series

>>> type(df_data['last_seen'][1])
pandas.tslib.Timestamp

Upvotes: 2

Views: 1577

Answers (2)

Jeff

Reputation: 128948

This was a bug. It was fixed in this PR:

https://github.com/pydata/pandas/pull/2899

Upvotes: 1

jason

Reputation: 3512

I had not parsed the dates properly, which is evident from all those dates in the 1800s. After parsing them with the generic dateutil parser, the statement above works.

import dateutil.parser
df_data = pd.read_csv('pet_data.csv', parse_dates=['last_seen'], date_parser=dateutil.parser.parse, skipfooter=1)
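For anyone landing here later, this is a minimal sketch of the whole computation once the column is parsed as datetimes. The CSV contents and the hours_since_latest column name are made up for illustration, and it assumes a pandas version recent enough to have the .dt accessor. It subtracts each row from the maximum, rather than the other way round as in the question, so the hours come out non-negative.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for pet_data.csv.
csv_text = """pet,last_seen
a,2013-02-01 12:00:00
b,2013-02-02 06:30:00
c,2013-02-03 12:00:00
"""

# parse_dates turns the column into datetime64 values instead of strings.
df_data = pd.read_csv(io.StringIO(csv_text), parse_dates=["last_seen"])

# Most recent timestamp minus each row gives a non-negative timedelta.
elapsed = df_data["last_seen"].max() - df_data["last_seen"]

# Convert the timedeltas to fractional hours.
df_data["hours_since_latest"] = elapsed.dt.total_seconds() / 3600.0

print(df_data)
```

Checking df_data['last_seen'].dtype before subtracting is a quick way to confirm the parse worked: it should be a datetime64 dtype, not object.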

Upvotes: 1
