Mark Bakker

Reputation: 831

Compute time difference of DatetimeIndex

I want to compute the time difference between consecutive times in a DatetimeIndex:

import pandas as pd
p = pd.DatetimeIndex(['1985-11-14', '1985-11-28', '1985-12-14', '1985-12-28'], dtype='datetime64[ns]')

I can compute the time difference of two times:

p[1] - p[0]

gives

Timedelta('14 days 00:00:00')

But p[1:] - p[:-1] doesn't work and gives

DatetimeIndex(['1985-12-28'], dtype='datetime64[ns]', freq=None)

and a future warning:

FutureWarning: using '-' to provide set differences with datetimelike Indexes is deprecated, use .difference()

Any thoughts on how I can (easily) compute the time difference between values in a DatetimeIndex? And why does it work for one value, but not for the entire DatetimeIndex?

Upvotes: 22

Views: 19192

Answers (3)

Alexander

Reputation: 109706

I used None to fill the first difference value, but I'm sure you can figure out how you would like to deal with that case.

>>> [None] + [p[n] - p[n-1] for n in range(1, len(p))]
[None,
 Timedelta('14 days 00:00:00'),
 Timedelta('16 days 00:00:00'),
 Timedelta('14 days 00:00:00')]

BTW, to just get the day difference:

>>> [None] + [(p[n] - p[n-1]).days for n in range(1, len(p))]
[None, 14, 16, 14]
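If a pandas object is preferable to a plain list, the same day counts can probably be obtained without the explicit loop. This is only a sketch, assuming a pandas version in which Series.diff() and the .dt.days accessor are available; the name day_diffs is made up for illustration. The first entry comes back as a missing value rather than None.

```python
import pandas as pd

p = pd.DatetimeIndex(['1985-11-14', '1985-11-28', '1985-12-14', '1985-12-28'])

# diff() subtracts each timestamp from the next one; the first row has no
# predecessor, so it is a missing value instead of None.
# day_diffs is a hypothetical name used only for this example.
day_diffs = p.to_series().diff().dt.days

print(day_diffs)
```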

Upvotes: 0

EdChum

Reputation: 394389

Convert the DatetimeIndex to a Series using to_series() and then call diff to calculate inter-row differences:

In [5]:
p.to_series().diff()

Out[5]:
1985-11-14       NaT
1985-11-28   14 days
1985-12-14   16 days
1985-12-28   14 days
dtype: timedelta64[ns]

As to why it failed: the - operator here attempts a set difference between your two index slices, whereas what you want is to subtract the values of one slice from the other element-wise, which is what diff does.

When you did p[1] - p[0], the - performed a scalar subtraction, but when you apply it to an index, pandas thinks that you're performing a set operation.
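To make the three cases concrete, here is a minimal sketch that keeps them apart: scalar subtraction, row-to-row differences via a Series, and the set difference written out with .difference(), as the FutureWarning suggests. The variable names are only illustrative.

```python
import pandas as pd

p = pd.DatetimeIndex(['1985-11-14', '1985-11-28', '1985-12-14', '1985-12-28'])

# Scalar case: Timestamp minus Timestamp gives a single Timedelta.
single = p[1] - p[0]

# Index case: go through a Series so that diff() computes the difference
# between each row and the one before it.
diffs = p.to_series().diff()

# Set difference, spelled explicitly: values in p[1:] that are not in p[:-1].
only_in_tail = p[1:].difference(p[:-1])

print(single)
print(diffs)
print(only_in_tail)
```

Writing the set operation as .difference() avoids relying on what - happens to mean for an index in a given pandas version.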

Upvotes: 36

johnchase

Reputation: 13725

The - operator is working; it's just not doing what you expect. In the second case it gives the set difference of the two datetime indices, that is, the values that are in p[1:] but not in p[:-1].

There may be a better solution, but it would work to perform the operation element-wise:

[e - k for e,k in zip(p[1:], p[:-1])]
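One possible vectorized variant of the same idea is to subtract the underlying NumPy arrays, which sidesteps the index-level meaning of - altogether. This is a sketch rather than a recommendation; raw, deltas, and days are names chosen for the example.

```python
import numpy as np
import pandas as pd

p = pd.DatetimeIndex(['1985-11-14', '1985-11-28', '1985-12-14', '1985-12-28'])

# .values exposes plain datetime64 arrays, so - is ordinary element-wise
# subtraction here and no index set logic is involved.
raw = p[1:].values - p[:-1].values

# Wrap the result back into pandas if Timedelta semantics are wanted.
deltas = pd.TimedeltaIndex(raw)

# Or divide by one day to get plain floating-point day counts.
days = raw / np.timedelta64(1, 'D')

print(deltas)
print(days)
```

Unlike the approaches that pad the first position, this yields one value fewer than len(p), matching the list comprehension above.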

Upvotes: 1
