Reputation: 3512
I have a timestamp column that looks like this:
In [493]: df_data['last_seen'][:5]
Out[493]:
1 1838-10-31 01:36:32.493180416
2 1826-08-10 09:38:02.493180416
3 1839-05-04 21:14:42.493180416
4 1831-06-11 17:44:24.493180416
5 1820-01-26 10:32:07.493180416
Name: last_seen
For each row, I want the number of hours that have passed since the most recent timestamp. So I wrote:
df['last_seen'] = df['last_seen'] - df['last_seen'].max()
This raises an error:
AttributeError: 'Timestamp' object has no attribute 'dtype'
Note that when I check the types, I get:
>>> type(df['last_seen'])
pandas.core.series.Series
>>> type(df_data['last_seen'][1])
pandas.tslib.Timestamp
Upvotes: 2
Views: 1577
Reputation: 128948
This was a bug in pandas. It was fixed in this PR:
https://github.com/pydata/pandas/pull/2899
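For reference, here is a minimal sketch of the calculation the question asks for, assuming a recent pandas release and a column that already holds proper datetimes. The sample values are hypothetical, and the subtraction is written as max minus each row so the hours come out positive (the question's expression has the operands the other way round, which gives negative values).

```python
import pandas as pd

# Hypothetical sample data standing in for the 'last_seen' column.
last_seen = pd.Series(pd.to_datetime([
    "2013-02-01 10:00:00",
    "2013-02-01 16:30:00",
    "2013-02-02 10:00:00",
]))

# Subtracting each row from the Series maximum (a Timestamp)
# yields a Series of timedeltas.
delta = last_seen.max() - last_seen

# Convert the timedeltas to fractional hours.
hours_since_latest = delta.dt.total_seconds() / 3600

print(hours_since_latest.tolist())  # [24.0, 17.5, 0.0]
```

The row holding the most recent timestamp gets 0.0, and every other row gets its distance from that timestamp in hours.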
Upvotes: 1
Reputation: 3512
I had not parsed the dates properly, which is evident from all those dates in the 1800s! After parsing the column with the generic dateutil parser, the statement above works.
pd.read_csv('pet_data.csv', parse_dates=['last_seen'], date_parser=dateutil.parser.parse, skipfooter=1)
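As a rough sketch of how one might check that the column was parsed correctly: the CSV content below is hypothetical and stands in for pet_data.csv, and it relies on pandas' built-in date parsing rather than dateutil. Note that in pandas 2.0 and later the date_parser argument is deprecated, so this sketch leaves it out.

```python
import io

import pandas as pd

# Hypothetical CSV content; the real data lives in pet_data.csv.
csv_text = """pet_id,last_seen
1,2013-01-05 08:15:00
2,2013-01-07 20:45:00
"""

# parse_dates asks read_csv to convert the named column to datetimes.
df_data = pd.read_csv(io.StringIO(csv_text), parse_dates=["last_seen"])

# Confirm the column is a datetime dtype rather than plain strings.
print(df_data["last_seen"].dtype)

# A quick range check catches implausible years such as the 1800s.
print(df_data["last_seen"].min(), df_data["last_seen"].max())
```

If the minimum and maximum fall outside the range you expect, the dates were most likely misparsed and the subtraction will give meaningless results.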
Upvotes: 1