Reputation: 3
Consider the code below, which produces the desired output:
import numpy as np
import pandas as pd
sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
series = pd.Series(arr)
shifted = pd.Series(np.append(series, np.zeros(touchdown)))
rolled = shifted.rolling(window=sumvalues, min_periods=1).sum().fillna(0).astype(float)[touchdown:]
print(rolled.values)
As you can see, I want to shift my values backwards by "touchdown" positions and then compute, for every entry, the sum of the "sumvalues" preceding entries.
The issue with the code above is that it is slow, e.g., we are creating a whole Series object just to perform the rolling sum. Is there a smart (fast) way of achieving the same result?
I tried the numpy roll function, but it behaves a bit differently (it wraps values around instead of filling with zeros). I also tried shift in pandas, but it seems inefficient.
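To make the target operation explicit, here is a minimal reference sketch in plain Python. It assumes the semantics implied by the pandas code above: output position i holds the sum of the `sumvalues` entries ending at index i + touchdown, with positions outside the array counted as zero. The helper name `shifted_rolling_sum_reference` is hypothetical and is only meant for checking faster variants, not for production use.

```python
import numpy as np

def shifted_rolling_sum_reference(arr, sumvalues, touchdown):
    """Slow reference (hypothetical helper): windowed sum ending at i + touchdown."""
    n = len(arr)
    out = np.zeros(n, dtype=float)
    for i in range(n):
        end = i + touchdown            # last index of the window (inclusive)
        start = end - sumvalues + 1    # first index of the window
        # Indices outside [0, n - 1] contribute nothing (treated as zeros)
        for j in range(max(start, 0), min(end, n - 1) + 1):
            out[i] += arr[j]
    return out

arr = np.array([1, 2, 3, 4, 5, 6, 7])
result = shifted_rolling_sum_reference(arr, sumvalues=2, touchdown=3)
print(result)
```

Any faster approach can be compared against this loop on small inputs before being trusted on large ones.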
Upvotes: 0
Views: 102
Reputation: 2152
Using padding and convolution
import pandas as pd
import numpy as np
sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
# Method 1:
# Pad with sumvalues - 1 leading zeros (partial windows at the start, like
# min_periods=1) and touchdown trailing zeros (the shift).
padded_arr = np.pad(arr, (sumvalues - 1, touchdown), mode='constant')  # [0 1 2 3 4 5 6 7 0 0 0]
# Rolling sum via convolution, then drop the first `touchdown` entries
rolled = np.convolve(padded_arr, np.ones(sumvalues), mode='valid')[touchdown:]
print(rolled)
"""
[ 7.  9. 11. 13.  7.  0.  0.]
"""
Upvotes: 0
Reputation: 79328
You can make use of convolve after padding the array with zeros:
a1 = np.convolve(np.append(arr, np.zeros(touchdown)), np.ones(sumvalues))
a1[touchdown:touchdown + arr.size]
array([ 7., 9., 11., 13., 7., 0., 0.])
NB: When testing the speed of the various methods, the OP's pandas method seems to outperform the rest when sumvalues and touchdown are large, and it is still on par with the rest when the values are small. I believe the OP should stick with pandas.
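A rough way to run such a comparison yourself is sketched below. The function names `pandas_method` and `convolve_method`, the array size, and the parameter pairs are illustrative assumptions, not the setup used for the remark above. Timings depend heavily on the machine and on the numpy/pandas versions, so no expected numbers are shown.

```python
import timeit

import numpy as np
import pandas as pd

def pandas_method(arr, sumvalues, touchdown):
    # The question's approach: append zeros, rolling sum, drop the first entries
    shifted = pd.Series(np.append(arr, np.zeros(touchdown)))
    rolled = shifted.rolling(window=sumvalues, min_periods=1).sum()
    return rolled.to_numpy()[touchdown:]

def convolve_method(arr, sumvalues, touchdown):
    # Full-mode convolution on the zero-extended array, sliced to arr.size entries
    full = np.convolve(np.append(arr, np.zeros(touchdown)), np.ones(sumvalues))
    return full[touchdown:touchdown + arr.size]

rng = np.random.default_rng(0)
arr = rng.integers(0, 10, size=10_000)  # illustrative input size

for sumvalues, touchdown in [(2, 3), (500, 300)]:
    # Make sure both methods agree before timing them
    assert np.allclose(pandas_method(arr, sumvalues, touchdown),
                       convolve_method(arr, sumvalues, touchdown))
    t_pd = timeit.timeit(lambda: pandas_method(arr, sumvalues, touchdown), number=20)
    t_np = timeit.timeit(lambda: convolve_method(arr, sumvalues, touchdown), number=20)
    print(f"sumvalues={sumvalues}, touchdown={touchdown}: "
          f"pandas {t_pd:.4f}s, convolve {t_np:.4f}s")
```

One plausible reason for the difference at large windows: direct convolution does work proportional to the window length for every output element, whereas a rolling sum can update incrementally as the window slides.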
Upvotes: 1
Reputation: 262224
You can use sliding_window_view with pad:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv
sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
out = swv(np.pad(arr, (sumvalues - 1, touchdown)),
          sumvalues).sum(axis=1)[touchdown:]
Which you can further optimize to:
diff = sumvalues - touchdown
out = swv(np.pad(arr[max(0, 1 - diff):], (max(0, diff - 1), touchdown)),
          sumvalues).sum(axis=1)
Output:
array([ 7, 9, 11, 13, 7, 0, 0])
Output with sumvalues = 5; touchdown = 0:
array([ 1, 3, 6, 10, 15, 20, 25])
Output with sumvalues = 3; touchdown = 1:
array([ 3, 6, 9, 12, 15, 18, 13])
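As a quick sanity check, the two variants above can be wrapped in functions and compared on the three parameter pairs shown. The function names `swv_simple` and `swv_trimmed` are hypothetical labels introduced here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

def swv_simple(arr, sumvalues, touchdown):
    # Pad both sides, sum every window, then drop the first `touchdown` sums
    padded = np.pad(arr, (sumvalues - 1, touchdown))
    return swv(padded, sumvalues).sum(axis=1)[touchdown:]

def swv_trimmed(arr, sumvalues, touchdown):
    # Skip the leading elements whose windows would be discarded anyway
    diff = sumvalues - touchdown
    padded = np.pad(arr[max(0, 1 - diff):], (max(0, diff - 1), touchdown))
    return swv(padded, sumvalues).sum(axis=1)

arr = np.array([1, 2, 3, 4, 5, 6, 7])
for sumvalues, touchdown in [(2, 3), (5, 0), (3, 1)]:
    a = swv_simple(arr, sumvalues, touchdown)
    b = swv_trimmed(arr, sumvalues, touchdown)
    assert np.array_equal(a, b)  # both variants agree on these inputs
    print(sumvalues, touchdown, b)
```

One caveat: the trimmed variant relies on `touchdown - sumvalues + 1` not exceeding `len(arr)`. Beyond that point the slice is empty and the result has more entries than the input (for example with sumvalues = 2 and touchdown = 10 on this 7-element array), while the simple variant still returns 7 values.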
Upvotes: 0