Reputation: 21
I have this DataFrame:
import pandas as pd
hour = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]
visitor = [4,6,2,4,3,7,5,7,8,3,2,8,3,6,4,5,1,8,9,4,2,3,4,1]
df = {"Hour":hour, "Total_Visitor":visitor}
df = pd.DataFrame(df)
print(df)
I applied a rolling sum with a window of 6:
df_roll = df.rolling(6, min_periods=6).sum()
print(df_roll)
The first 5 rows are NaN. The problem is that I want the total visitors from 9pm to 3am, so the sum has to start at hour 21, wrap around to hour 0, and continue until hour 3.
How can I do that automatically with rolling?
Upvotes: 2
Views: 1596
Reputation: 863751
I think you need to prepend the last N rows, then use rolling and keep only the last len(df) rows of the result:
N = 6
# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
df_roll = pd.concat([df.iloc[-N:], df]).rolling(N).sum().iloc[-len(df):]
print(df_roll)
Hour Total_Visitor
0 105.0 18.0
1 87.0 20.0
2 69.0 20.0
3 51.0 21.0
4 33.0 20.0
5 15.0 26.0
6 21.0 27.0
7 27.0 28.0
8 33.0 34.0
9 39.0 33.0
10 45.0 32.0
11 51.0 33.0
12 57.0 31.0
13 63.0 30.0
14 69.0 26.0
15 75.0 28.0
16 81.0 27.0
17 87.0 27.0
18 93.0 33.0
19 99.0 31.0
20 105.0 29.0
21 111.0 27.0
22 117.0 30.0
23 123.0 23.0
Compare with the original solution:
df_roll = df.rolling(6, min_periods=6).sum()
print(df_roll)
Hour Total_Visitor
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 15.0 26.0
6 21.0 27.0
7 27.0 28.0
8 33.0 34.0
9 39.0 33.0
10 45.0 32.0
11 51.0 33.0
12 57.0 31.0
13 63.0 30.0
14 69.0 26.0
15 75.0 28.0
16 81.0 27.0
17 87.0 27.0
18 93.0 33.0
19 99.0 31.0
20 105.0 29.0
21 111.0 27.0
22 117.0 30.0
23 123.0 23.0
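The padding idea above can be wrapped in a small helper. This is only a minimal sketch: the name `circular_rolling_sum` is hypothetical, it pads with `window - 1` rows rather than `N` (either works, since only the last `len(df)` rows are kept), and it assumes `window <= len(df)`.

```python
import pandas as pd

def circular_rolling_sum(df: pd.DataFrame, window: int) -> pd.DataFrame:
    """Trailing rolling sum that wraps from the last row back to the first."""
    # Prepend the last (window - 1) rows so the first rows get a full window.
    padded = pd.concat([df.iloc[len(df) - (window - 1):], df])
    # Roll over the padded frame, then keep only the rows of the original frame.
    return padded.rolling(window).sum().iloc[-len(df):]

hour = list(range(24))
visitor = [4, 6, 2, 4, 3, 7, 5, 7, 8, 3, 2, 8, 3, 6, 4, 5, 1, 8, 9, 4, 2, 3, 4, 1]
df = pd.DataFrame({"Hour": hour, "Total_Visitor": visitor})

wrapped = circular_rolling_sum(df[["Total_Visitor"]], 6)
# The window ending at hour 2 covers hours 21, 22, 23, 0, 1, 2.
print(wrapped.loc[2, "Total_Visitor"])  # 20.0
```

Selecting only `Total_Visitor` before rolling avoids summing the `Hour` column, which is meaningless here.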
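A NumPy alternative with strides is more complicated, but faster for one large Series:
import numpy as np
import pandas as pd

def rolling_window(a, window):
    # Build a view of shape (len(a) - window + 1, window) without copying the data
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

N = 3
# fv is the input Series (not defined in this snippet)
# Prepend the last N - 1 values so the first windows wrap around
x = np.concatenate([fv.iloc[-N+1:], fv.to_numpy()])
cv = pd.Series(rolling_window(x, N).sum(axis=1), index=fv.index)
print(cv)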
0 5
1 4
2 4
3 6
4 5
dtype: int64
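For what it's worth, NumPy 1.20 and later ship `sliding_window_view`, which builds the same kind of windowed view without hand-computed strides. The following is a sketch under that version assumption, applied to the visitor counts from the question with a window of 6; the variable names are illustrative.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

visitor = pd.Series([4, 6, 2, 4, 3, 7, 5, 7, 8, 3, 2, 8,
                     3, 6, 4, 5, 1, 8, 9, 4, 2, 3, 4, 1])
N = 6

values = visitor.to_numpy()
# Prepend the last N - 1 values so every position has a full trailing window.
padded = np.concatenate([values[len(values) - (N - 1):], values])
# Each row of the view is one window of length N; the data is not copied.
windows = sliding_window_view(padded, N)
cyclic_sum = pd.Series(windows.sum(axis=1), index=visitor.index)
print(cyclic_sum.head(3).tolist())  # [18, 20, 20]
```

The view is read-only, which is safer than `as_strided` when all you need is a reduction such as `sum`.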
Upvotes: 2
Reputation: 1784
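Although you mentioned a Series, see if this helps:
import pandas as pd

def cyclic_roll(s, n):
    # Append the first n - 1 values so windows can wrap past the end
    # (Series.append was removed in pandas 2.0, so use pd.concat)
    s = pd.concat([s, s.iloc[:n-1]])
    result = s.rolling(n).sum()
    # Move the wrapped sums to the front and drop the leading NaN rows
    return pd.concat([result.iloc[-n+1:], result.iloc[n-1:-n+1]])

fv = pd.DataFrame([1, 2, 3, 4, 5])
cv = fv.apply(cyclic_roll, n=3)
cv.reset_index(inplace=True, drop=True)
print(cv)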
Output
0
0 10.0
1 8.0
2 6.0
3 9.0
4 12.0
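To sanity-check output like the above, a brute-force version with modular indexing can serve as a reference. This is only an illustrative sketch (the function name `cyclic_roll_reference` is made up here), and it is far slower than rolling on large inputs.

```python
def cyclic_roll_reference(values, n):
    """Sum each value with the n - 1 values before it, wrapping around."""
    size = len(values)
    # (i - k) % size is never negative in Python, so indices before
    # position 0 wrap to the end of the list.
    return [sum(values[(i - k) % size] for k in range(n)) for i in range(size)]

print(cyclic_roll_reference([1, 2, 3, 4, 5], 3))  # [10, 8, 6, 9, 12]
```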
Upvotes: 0