Mike Tauber

Reputation: 59

Numpy Rolling Window With Min Periods & Max (Similar To Pandas Rolling But Numpy)

I want to build a sliding window with numpy instead of pandas.rolling, primarily for speed (I could not find this covered anywhere). The window must respect both a minimum and a maximum number of elements, and return NaN when fewer than the minimum number are available. This is similar to pandas.rolling with the window argument (the maximum size) and min_periods set.

For example, with min_periods = 3 and max_periods = 7, the intended windows are:

index  values  intended_window
0      10      np.nan
1      11      np.nan
2      12      [10,11,12]
3      13      [10,11,12,13]
4      14      [10,11,12,13,14]
5      15      [10,11,12,13,14,15]
6      16      [10,11,12,13,14,15,16]
7      17      [11,12,13,14,15,16,17]
8      18      [12,13,14,15,16,17,18]
9      19      [13,14,15,16,17,18,19]

I have seen examples of how to build this sliding window when no minimum or maximum is required, e.g.:

import numpy as np

def rolling_window(a, window):
  shp = a.shape[:-1] + (a.shape[-1] - window + 1, window)
  strides = a.strides + (a.strides[-1],)
  return np.lib.stride_tricks.as_strided(a, shape=shp, strides=strides)
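
For reference, here is what that function returns on the example data (a small sketch; the array `values` is just the example column from the table above). Every window has exactly `window` elements, so there are no rows for the first six indices, which is the gap this question is about:

```python
import numpy as np

def rolling_window(a, window):
    # Fixed-size windows built as a strided view (no data is copied).
    shp = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shp, strides=strides)

values = np.arange(10, 20)           # 10, 11, ..., 19 as in the table
windows = rolling_window(values, 7)  # only full 7-element windows

print(windows.shape)  # (4, 7): rows correspond to indices 6..9 only
print(windows[0])     # [10 11 12 13 14 15 16]
```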

Does anyone know how I can expand this to return windows as in the example above?

Upvotes: 0

Views: 767

Answers (2)

helloWORLD

Reputation: 135

Please try the following.

import numpy as np

def dataframe_striding(dataframe, window):
    '''
    Parameters
    ----------
    dataframe : Input DataFrame, in this case df with columns ['index', 'values'] present.
    window : Tuple (min_periods, max_periods) giving the smallest and largest window length.

    Returns
    -------
    dataframe : Pandas Dataframe

    '''
    lower_end, upper_end = window 
    if lower_end > upper_end:
        raise ValueError('Check window size!')

    results = []
    for i in range(dataframe.shape[0]):
        l = [k for k in dataframe['values'][:i+1]]        
        if len(l) < lower_end:                 # checks for minimum window length
            l = np.nan
            results.append(l)
        elif lower_end <= len(l) <= upper_end: # checks for required window length
            results.append(l)
        else:                                  # checks for maximum window length
            l = l[-upper_end:]
            results.append(l)
        
    dataframe['rolling_output'] = results      # adds the output column to the input dataframe
    return dataframe

# run above function #
final_df = dataframe_striding(df, window = (4,6))

Upvotes: 1

le_camerone

Reputation: 630

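import numpy as np

values = np.linspace(1, 10, num=10)
min_periods, max_periods = 3, 7
window_column = []
for i in range(len(values)):
    # the window holds at most max_periods values and ends at index i
    t = max(0, i - max_periods + 1)
    window = values[t:i + 1]   # i + 1 so the current value is included
    if len(window) < min_periods:
        window_column.append(np.nan)
    else:
        window_column.append(window)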
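
If a fixed-width 2-D array is acceptable instead of ragged lists, the loop can probably be avoided by left-padding with NaN. The following is only a sketch under some assumptions: `rolling_windows_padded` is a made-up helper name, it relies on `sliding_window_view` (NumPy 1.20 or later), it converts the data to float so NaN can be stored, and it assumes the input has no NaNs of its own.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_windows_padded(values, min_periods, max_periods):
    # Hypothetical helper: one row per input element, NaN where no data exists.
    values = np.asarray(values, dtype=float)
    # Left-pad so that the first row already has max_periods slots.
    padded = np.concatenate([np.full(max_periods - 1, np.nan), values])
    # The view is read-only, so copy before masking rows.
    windows = sliding_window_view(padded, max_periods).copy()
    # Number of real (non-padding) values available in each row.
    valid = np.minimum(np.arange(1, len(values) + 1), max_periods)
    # Rows with fewer than min_periods real values become all NaN.
    windows[valid < min_periods] = np.nan
    return windows

values = np.arange(10, 20)
windows = rolling_windows_padded(values, min_periods=3, max_periods=7)
print(windows[2])  # [nan nan nan nan 10. 11. 12.]
print(windows[9])  # [13. 14. 15. 16. 17. 18. 19.]
```

Row i holds the window ending at index i, right-aligned, so NaN-aware reductions such as np.nanmean(windows, axis=1) should behave much like the pandas rolling equivalent (all-NaN rows give NaN, with a RuntimeWarning).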

Upvotes: 0
