Reputation: 2324
I have a pandas Series with a (tz-localized) DatetimeIndex with one value per day:
tmpr
Out[38]:
2018-01-01 00:00:00+01:00 1.810
2018-01-02 00:00:00+01:00 2.405
2018-01-03 00:00:00+01:00 1.495
2018-01-04 00:00:00+01:00 1.600
2018-01-05 00:00:00+01:00 0.545
...
2020-12-27 00:00:00+01:00 2.655
2020-12-28 00:00:00+01:00 1.705
2020-12-29 00:00:00+01:00 1.255
2020-12-30 00:00:00+01:00 1.405
2020-12-31 00:00:00+01:00 3.000
Freq: D, Name: tmpr, Length: 1096, dtype: float64
which I want to upsample to hourly values, so that each value is repeated 24 times (or 23 or 25, depending on summer/wintertime changeover, but that's a whole other story). Here's what I tried:
tmpr.resample('h').ffill()
Out[39]:
2018-01-01 00:00:00+01:00 1.810
2018-01-01 01:00:00+01:00 1.810
2018-01-01 02:00:00+01:00 1.810
2018-01-01 03:00:00+01:00 1.810
2018-01-01 04:00:00+01:00 1.810
...
2020-12-30 20:00:00+01:00 1.405
2020-12-30 21:00:00+01:00 1.405
2020-12-30 22:00:00+01:00 1.405
2020-12-30 23:00:00+01:00 1.405
2020-12-31 00:00:00+01:00 3.000
Freq: H, Name: tmpr, Length: 26281, dtype: float64
The problem is the final day: I can't get resample to include the 23 hours after 0:00.
Adding a closed parameter doesn't make a difference, either when resampling or when creating the original time series.
(I've tried creating the original Series with a left or a right-closed index: pd.date_range(start=pd.Timestamp(2018, 1, 1), end=pd.Timestamp(2021, 1, 1), freq='D', closed='left') and ... end=pd.Timestamp(2020, 12, 31), but the resulting Series seems the same.)
I could always append an additional day (2021-01-01) with a dummy value, and then remove it at the end, but that's terribly hacky.
Any ideas on how to do this the way it was intended?
PS - In a previous project, using a PeriodIndex instead of a DatetimeIndex, I had no problems. However, I cannot use that here, as a PeriodIndex does not support time zones, which I do need.
Upvotes: 3
Views: 651
Reputation: 150785
Since your data is daily, you can just create the new hourly timestamps yourself and reindex:
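import pandas as pd
# Extend the range 23 hours past the last daily timestamp.
new_timestamps = pd.date_range(tmpr.index[0],
                               tmpr.index[-1] + pd.Timedelta(hours=23),
                               freq='h')
# Reindex onto the hourly timestamps, then forward-fill each day's value.
tmpr.reindex(new_timestamps).ffill()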
Output (for the first half of your sample data):
2018-01-01 00:00:00+01:00 1.810
2018-01-01 01:00:00+01:00 1.810
2018-01-01 02:00:00+01:00 1.810
2018-01-01 03:00:00+01:00 1.810
2018-01-01 04:00:00+01:00 1.810
...
2018-01-05 19:00:00+01:00 0.545
2018-01-05 20:00:00+01:00 0.545
2018-01-05 21:00:00+01:00 0.545
2018-01-05 22:00:00+01:00 0.545
2018-01-05 23:00:00+01:00 0.545
Freq: H, Name: tmpr, Length: 120, dtype: float64
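One caveat: adding a fixed 23 hours assumes the final day is 24 hours long. If the series happens to end on a changeover day, the range ends one hour too late or too early. Below is a minimal, self-contained sketch of a variant that should avoid this by stepping one calendar day past the last timestamp with pd.DateOffset and dropping that endpoint. The helper name upsample_daily_to_hourly, the Europe/Berlin time zone and the three sample values are assumptions made for illustration; they are not taken from the question's data.

```python
import pandas as pd


def upsample_daily_to_hourly(daily: pd.Series) -> pd.Series:
    """Repeat each daily value for every hour of its local calendar day."""
    # DateOffset(days=1) moves to the same wall-clock time on the next day,
    # so it lands on local midnight even after a 23- or 25-hour day.
    end = daily.index[-1] + pd.DateOffset(days=1)
    hourly_index = pd.date_range(daily.index[0], end, freq="h")
    # Keep only timestamps strictly before the next day's midnight.
    hourly_index = hourly_index[hourly_index < end]
    return daily.reindex(hourly_index).ffill()


# Hypothetical sample: three days ending on the 2018 spring changeover day.
idx = pd.date_range("2018-03-23", periods=3, freq="D", tz="Europe/Berlin")
tmpr = pd.Series([1.810, 2.405, 1.495], index=idx, name="tmpr")

hourly = upsample_daily_to_hourly(tmpr)
print(hourly.tail())
```

In this sample the last day (2018-03-25) only has 23 hours, so the result should end at 23:00 on that day rather than spilling over into the next one.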
Upvotes: 1