Reputation: 16508
I have datetime-indexed data in a DataFrame
. For each date, I would like to compute the pct_change
to the number from the previous year.
import pandas as pd
import numpy as np
idx = pd.date_range('2019-01-01', periods=1000)
df = pd.DataFrame(np.arange(1000), index=idx)
However, I do not want to compute "previous year" by the same day and month. Instead, I want to compare each day in 2020
with the first day that has the same day-of-week in 2019
. What do I mean?
In the year 2020, January 1st was a Wednesday. In 2019, the first Wednesday was January 2nd. So difference by day-of-week would be [value at 2020-01-01]/[value at 2019-01-02]
.
So for the comparison 2020-2019, I would simply need to know that there is a positive offset of +1, and hence go back [365-1] days for each comparison. What would be a generic solution for any two years?
Upvotes: 1
Views: 651
Reputation: 150785
This would do:
s = df.groupby([df.index.year, df.index.dayofweek]).cumcount()
shifted = df.groupby([s, df.index.dayofweek]).shift()
shifted.loc['2020-01-01']
# 0 1.0
# Name: 2020-01-01 00:00:00, dtype: float64
Upvotes: 2