Reputation: 796
Have a question regarding how to create a day count type of column in pandas. Given a list of dates, I want to be able to calculate the difference from one date to the previous date in days. Now, I can do this with simple subtraction and it will return me a timedelta object I think. What if I just want an integer number of days. Using .days seems to work with two dates but I can't get it work with a column.
Let's say I do,
df['day_count'] = (df['INDEX_DATE'] - df['INDEX_DATE'].shift(1))
INDEX_DATE day_count
0 2009-10-06 NaT
1 2009-10-07 1 days
2 2009-10-08 1 days
3 2009-10-09 1 days
4 2009-10-12 3 days
5 2009-10-13 1 days
I get '1 days'....I only want 1.
I can use .day like this which does return me a number, but it won't work handling an entire column.
(df['INDEX_DATE'][1] - df['INDEX_DATE'][0]).days
If I try something like this:
df['day_count'] = (df['INDEX_DATE'] - df['INDEX_DATE'].shift(1)).days
I get an error of
AttributeError: 'Series' object has no attribute 'days'
I can work around '1 days' but I'm thinking there must be a better way to do this.
Upvotes: 1
Views: 1886
Reputation: 210842
Try this:
In [197]: df['day_count'] = df.INDEX_DATE.diff().dt.days
In [198]: df
Out[198]:
INDEX_DATE day_count
0 2009-10-06 NaN
1 2009-10-07 1.0
2 2009-10-08 1.0
3 2009-10-09 1.0
4 2009-10-12 3.0
5 2009-10-13 1.0
Upvotes: 2