How can a DataFrame be shifted to the nearest time index to the one specified?

Question

I have a DataFrame of recorded values whose index is a DatetimeIndex. A value is recorded approximately every 15 minutes.

I want to add a new column that is the fractional difference of the current value from a value 24 hours previously. Since the values are recorded approximately every fifteen minutes, I want to shift to the time index that is closest to 24 hours previously. If I try to do this exactly, I end up with a whole lot of NaNs:

df["value"] / df["value"].shift(freq = datetime.timedelta(days = -1))

How should this shift be done so that the shift is to the nearest possible time index to the one specified? Is there an alternative, easier way to think about this?

Here is an example that illustrates the issue:

import datetime

import pandas as pd

df = pd.DataFrame(
    [
        [pd.Timestamp("2015-07-18 13:53:33.280"), 10],
        [pd.Timestamp("2015-07-19 13:54:03.330"), 20],
        [pd.Timestamp("2015-07-20 13:52:13.350"), 30],
        [pd.Timestamp("2015-07-21 13:56:03.126"), 40],
        [pd.Timestamp("2015-07-22 13:53:51.747"), 50],
        [pd.Timestamp("2015-07-23 13:53:29.346"), 60]
    ],
    columns = [
        "datetime",
        "value"
    ]
)

df.index = df["datetime"]
del df["datetime"]
df.index = pd.to_datetime(df.index.values)

df["change"] = df["value"] / df["value"].shift(freq = datetime.timedelta(days = -1))

piRSquared · Accepted Answer

I'd add one day to the index, then use pd.DataFrame.reindex with method='nearest':

df / df.set_index(df.index + pd.offsets.Day()).reindex(df.index, method='nearest')

                            value
2015-07-18 13:53:33.280  1.000000
2015-07-19 13:54:03.330  2.000000
2015-07-20 13:52:13.350  1.500000
2015-07-21 13:56:03.126  1.333333
2015-07-22 13:53:51.747  1.250000
2015-07-23 13:53:29.346  1.200000
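
To get the new column the question asks for, the same idea can be wrapped in a small function. This is only a sketch: `nearest_lagged` is a hypothetical helper name, and I've used `pd.Timedelta` in place of `pd.offsets` for the lag and the tolerance, which I'd expect to behave the same way here.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    {"value": [10, 20, 30, 40, 50, 60]},
    index=pd.to_datetime([
        "2015-07-18 13:53:33.280",
        "2015-07-19 13:54:03.330",
        "2015-07-20 13:52:13.350",
        "2015-07-21 13:56:03.126",
        "2015-07-22 13:53:51.747",
        "2015-07-23 13:53:29.346",
    ]),
)


def nearest_lagged(series, lag, tolerance=None):
    """Hypothetical helper: for each timestamp, return the value recorded
    nearest to `lag` earlier. Assumes a sorted, unique DatetimeIndex."""
    # Move every observation forward by `lag` so it lines up with "now".
    shifted = pd.Series(series.to_numpy(), index=series.index + lag)
    # Snap each original timestamp to the closest shifted timestamp.
    return shifted.reindex(series.index, method="nearest", tolerance=tolerance)


previous = nearest_lagged(
    df["value"], pd.Timedelta(days=1), tolerance=pd.Timedelta(hours=12)
)
df["change"] = df["value"] / previous
print(df)
```

With the 12 hour tolerance, the first row has no observation close enough to one day earlier, so its `change` is NaN rather than a ratio against itself.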

You can also pass an offset as the tolerance for method='nearest'; rows with no match within that window then come out as NaN:

df / df.set_index(df.index + pd.offsets.Day()).reindex(
    df.index, method='nearest', tolerance=pd.offsets.Hour(12))

                            value
2015-07-18 13:53:33.280       NaN
2015-07-19 13:54:03.330  2.000000
2015-07-20 13:52:13.350  1.500000
2015-07-21 13:56:03.126  1.333333
2015-07-22 13:53:51.747  1.250000
2015-07-23 13:53:29.346  1.200000
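
As for an alternative way to think about it: this is a nearest-key join of the data against itself, which `pd.merge_asof` with `direction="nearest"` can express directly. The sketch below is a different technique from the reindex approach above, not part of it, and the column names `time`, `target`, `past_time` and `past_value` are made up for illustration. It assumes the timestamps are sorted, which `merge_asof` requires.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    {"value": [10, 20, 30, 40, 50, 60]},
    index=pd.to_datetime([
        "2015-07-18 13:53:33.280",
        "2015-07-19 13:54:03.330",
        "2015-07-20 13:52:13.350",
        "2015-07-21 13:56:03.126",
        "2015-07-22 13:53:51.747",
        "2015-07-23 13:53:29.346",
    ]),
)

# Left side: each row plus the timestamp we would like a value for.
current = pd.DataFrame({"time": df.index, "value": df["value"].to_numpy()})
current["target"] = current["time"] - pd.Timedelta(days=1)

# Right side: the same observations, renamed so the columns do not clash.
past = current[["time", "value"]].rename(
    columns={"time": "past_time", "value": "past_value"}
)

# For each target time, pick the observation whose timestamp is closest,
# but only if it lies within 12 hours of the target.
matched = pd.merge_asof(
    current,
    past,
    left_on="target",
    right_on="past_time",
    direction="nearest",
    tolerance=pd.Timedelta(hours=12),
)

# merge_asof keeps the left row order, so the result aligns with df.
df["change"] = (matched["value"] / matched["past_value"]).to_numpy()
print(df)
```

One thing I like about this form is that `matched["past_time"]` shows which observation each row was actually compared against, which makes it easy to check how far off the "24 hours earlier" match really was.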
