spline regressor

Reputation: 3

Efficient shift and roll in numpy without pd.Series

Consider the code below, which gives the desired output:

import numpy as np
import pandas as pd
sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
series = pd.Series(arr)
shifted = pd.Series(np.append(series, np.zeros(touchdown)))
rolled = shifted.rolling(window=sumvalues, min_periods=1).sum().fillna(0).astype(float)[touchdown:]
print(rolled.values)

As you can see, I want to shift my values back by "touchdown" positions and then compute, for every entry, the sum of the "sumvalues" preceding entries.

The issue with the code above is that it is slow, e.g. we are creating a whole Series object just to perform the rolling sum. Is there a smart (fast) way of achieving the same result?

I tried playing around with NumPy's roll function, but it behaves a bit differently. I also tried shift in pandas, but it seems inefficient.
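
To pin down the target behavior, here is a minimal loop-based sketch of the same operation in plain NumPy. The function name shifted_rolling_sum_reference is made up for illustration; it is meant as a slow but readable reference to check faster versions against, not as a solution.

```python
import numpy as np

def shifted_rolling_sum_reference(arr, sumvalues, touchdown):
    """Hypothetical reference: append `touchdown` zeros, take a trailing
    rolling sum of width `sumvalues` (partial windows allowed, like
    min_periods=1), then drop the first `touchdown` results."""
    arr = np.asarray(arr, dtype=float)
    padded = np.concatenate([arr, np.zeros(touchdown)])
    out = np.empty(arr.size)
    for i in range(arr.size):
        end = i + touchdown + 1          # exclusive end of the window in the padded array
        start = max(0, end - sumvalues)  # shorter window near the start of the array
        out[i] = padded[start:end].sum()
    return out

print(shifted_rolling_sum_reference([1, 2, 3, 4, 5, 6, 7], sumvalues=2, touchdown=3))
# [ 7.  9. 11. 13.  7.  0.  0.]
```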

Upvotes: 0

Views: 102

Answers (3)

Soudipta Dutta

Reputation: 2152

Using padding and convolution

import pandas as pd
import numpy as np

sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
# Method 1:
# Pad with sumvalues-1 zeros in front (so the first windows are partial, like min_periods=1)
# and with touchdown zeros at the end (the shift), to reproduce the output from the question
padded_arr = np.pad(arr, (sumvalues - 1, touchdown), mode='constant')  # [0 1 2 3 4 5 6 7 0 0 0]

# Rolling sum with convolution
rolled = np.convolve(padded_arr, np.ones(sumvalues), mode='valid')[touchdown:]

print(rolled)
"""
[ 7.  9. 11. 13.  7.  0.  0.]
"""

Upvotes: 0

Onyambu

Reputation: 79328

You can use np.convolve after appending "touchdown" zeros to the array:

a1 = np.convolve(np.append(arr, np.zeros(touchdown)), np.ones(sumvalues))
a1[touchdown:touchdown + arr.size]

array([ 7.,  9., 11., 13.,  7.,  0.,  0.])

NB: When timing the various methods, the pandas approach from the question seems to outperform the rest when sumvalues and touchdown are large, and it is still on par with the others when the values are small. I believe OP should stick with pandas.
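
The timing comparison is not shown in the answer, so here is a rough sketch of how one might reproduce it with timeit. The function names, array size, and parameter values are assumptions chosen for illustration, and the numbers will vary by machine and library version, so no timings are quoted.

```python
import timeit

import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def pandas_version(arr, sumvalues, touchdown):
    # Approach from the question: append zeros, rolling sum, drop the first entries
    shifted = pd.Series(np.append(arr, np.zeros(touchdown)))
    return shifted.rolling(window=sumvalues, min_periods=1).sum().to_numpy()[touchdown:]

def convolve_version(arr, sumvalues, touchdown):
    # Full convolution with a window of ones, then slice out the shifted part
    full = np.convolve(np.append(arr, np.zeros(touchdown)), np.ones(sumvalues))
    return full[touchdown:touchdown + arr.size]

def window_version(arr, sumvalues, touchdown):
    # Sliding windows over the padded array, summed along the window axis
    padded = np.pad(arr, (sumvalues - 1, touchdown))
    return sliding_window_view(padded, sumvalues).sum(axis=1)[touchdown:]

# Assumed benchmark sizes; integer-valued floats keep all sums exact
rng = np.random.default_rng(0)
big = rng.integers(0, 10, size=50_000).astype(float)
big_sumvalues, big_touchdown = 500, 500

expected = pandas_version(big, big_sumvalues, big_touchdown)
for fn in (pandas_version, convolve_version, window_version):
    # Check agreement first, then time a few calls
    assert np.allclose(fn(big, big_sumvalues, big_touchdown), expected)
    seconds = timeit.timeit(lambda: fn(big, big_sumvalues, big_touchdown), number=5)
    print(f"{fn.__name__:>16}: {seconds / 5 * 1e3:.2f} ms per call")
```

Varying the array size together with sumvalues and touchdown should show where each method starts to win or lose.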

Upvotes: 1

mozway

Reputation: 262224

You can use a sliding_window_view with pad:

from numpy.lib.stride_tricks import sliding_window_view as swv

sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])

out = swv(np.pad(arr, (sumvalues-1, touchdown)),
          sumvalues).sum(axis=1)[touchdown:]

Which you can further optimize to:

diff = sumvalues-touchdown
out = swv(np.pad(arr[max(0, 1-diff):], (max(0, diff-1), touchdown)),
          sumvalues).sum(axis=1)

Output:

array([ 7,  9, 11, 13,  7,  0,  0])

Output with sumvalues = 5 ; touchdown = 0:

array([ 1,  3,  6, 10, 15, 20, 25])

Output with sumvalues = 3 ; touchdown = 1:

array([ 3,  6,  9, 12, 15, 18, 13])
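
As a sanity check, the two variants above can be compared over a grid of parameters. This is only a sketch: the wrapper names swv_simple and swv_trimmed are made up here, and the grid stays within touchdown - sumvalues < arr.size, because beyond that the trimmed slice is empty and the two forms return arrays of different lengths.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

def swv_simple(arr, sumvalues, touchdown):
    # First form: pad on both sides, sum each window, drop the first `touchdown` sums
    return swv(np.pad(arr, (sumvalues - 1, touchdown)), sumvalues).sum(axis=1)[touchdown:]

def swv_trimmed(arr, sumvalues, touchdown):
    # Second form: skip the leading values that would be dropped anyway
    diff = sumvalues - touchdown
    trimmed = arr[max(0, 1 - diff):]
    return swv(np.pad(trimmed, (max(0, diff - 1), touchdown)), sumvalues).sum(axis=1)

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Compare both forms for every combination in a small grid
for sumvalues in range(1, 7):
    for touchdown in range(0, 6):
        assert np.array_equal(swv_simple(arr, sumvalues, touchdown),
                              swv_trimmed(arr, sumvalues, touchdown))

print(swv_trimmed(arr, 2, 3))  # [ 7  9 11 13  7  0  0]
```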

Upvotes: 0
