Reputation: 3961
I have a Pandas DataFrame that contains half hourly data. I want to shift a small range of datetime indices by 1 hour:
import numpy as np
import pandas as pd
from datetime import datetime, date, timedelta
df = pd.DataFrame()
df['datetime'] = pd.date_range(start='2016-01-01', end='2019-01-01', freq='30T')
df = df.set_index('datetime')
print(df.index)
# shift 1 hour forward for custom date range
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df[(df.index>=date1) & (df.index<=date2)].index += pd.Timedelta('1H')
#df.index = df[(df.index>=date1) & (df.index<=date2)].index.shift(periods=1,freq='H')
print(df.index)
I tried some different strategies (which don't work) but I want to know if there is a more compact way to do this. Note that this MWE has freq=30T
but my actual data has freq=None
.
Upvotes: 0
Views: 820
Reputation: 35626
The expression df[mask]
generally produces a weakly linked copy of the original DataFrame containing only the filtered values. Meaning the expression df[mask].index
creates a copy, += pd.Timedelta('1H')
updates the index of that copy and then throws the new DataFrame away since there is no variable reference.
One approach here would be to use values
to update the underlying values (since the "Index does not support mutable operations"):
# shift 1 hour forward for custom date range
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index.values[(date1 <= df.index) & (df.index <= date2)] += pd.Timedelta('1H')
df
:
0
datetime
2016-01-01 01:00:00 0 # Now starts at 01:00:00
2016-01-01 01:30:00 1
2016-01-01 02:00:00 2
2016-01-01 02:30:00 3
2016-01-01 03:00:00 4
In terms of a more concise method there aren't really any options when working with the index.
np.where
works but is not necessarily more concise:
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index = np.where((date1 <= df.index) & (df.index <= date2),
df.index + pd.Timedelta('1H'),
df.index)
df.index.name = 'datetime' # add index name back if needed
Converting to_series
then using between
is also an option, but requires making a complete copy of the index which is likely unnecessary:
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index.values[
df.index.to_series().between(date1, date2)
] += pd.Timedelta('1H')
Index.shift
would also work if using the mask on both sides of the equation, but this requires masking the index twice:
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
m = (date1 <= df.index) & (df.index <= date2)
df.index.values[m] = df.index[m].shift(freq='1H')
Upvotes: 4
Reputation: 348
You can probably use pd.index.shift
From the Pandas documentation:
Notes
This method is only implemented for datetime-like index classes, i.e., DatetimeIndex, PeriodIndex and TimedeltaIndex.
Upvotes: 0