Reputation: 725
I would like to shift the values of one column in a data frame by x rows(hours).
For example, in the following dataframe:
ind = pd.date_range('01 / 01 / 2000', periods=5, freq='12H')
df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
"B": [10, 20, 30, 40, 50],
"C": [11, 22, 33, 44, 55],
"D": [12, 24, 51, 36, 2]},
index=ind)
I would like to shift the values in column A by two hours.
I use the following:
mask = (df.columns.isin(['A']))
cols_to_shift = df.columns[mask]
df[cols_to_shift] = df[cols_to_shift].shift(2,freq='H')
However, all column A's values are filled with NA. I guess it is because the values are shifted to hours that do not exist in the index column.
Is there a way to fix it?
This is the input:
And this is the output:
Thanks
Upvotes: 0
Views: 57
Reputation: 18647
IIUC, you could try assigning your shifted
values, then use pandas.concat
to extend your original DataFrame
. I'm also using DataFrame.sort_index
and DataFrame.fillna
here to sort the results and deal with NaN
:
# Example setup
ind = pd.date_range('01 / 01 / 2000', periods=5, freq='12H')
df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
"B": [10, 20, 30, 40, 50],
"C": [11, 22, 33, 44, 55],
"D": [12, 24, 51, 36, 2]},
index=ind)
mask = (df.columns.isin(['A']))
cols_to_shift = df.columns[mask]
shifted = df[cols_to_shift].shift(2, freq='H')
df[cols_to_shift] = shifted
df = pd.concat([df, shifted]).sort_index().fillna(0)
print(df)
[out]
A B C D
2000-01-01 00:00:00 0.0 10.0 11.0 12.0
2000-01-01 02:00:00 1.0 0.0 0.0 0.0
2000-01-01 12:00:00 0.0 20.0 22.0 24.0
2000-01-01 14:00:00 2.0 0.0 0.0 0.0
2000-01-02 00:00:00 0.0 30.0 33.0 51.0
2000-01-02 02:00:00 3.0 0.0 0.0 0.0
2000-01-02 12:00:00 0.0 40.0 44.0 36.0
2000-01-02 14:00:00 4.0 0.0 0.0 0.0
2000-01-03 00:00:00 0.0 50.0 55.0 2.0
2000-01-03 02:00:00 5.0 0.0 0.0 0.0
Upvotes: 2