Reputation: 759

numpy array equivalent of pandas.shift() function?

I have an array

    [False False False ...  True  True  True]

I want to check if the previous value == current value. In pandas, I can use something like...

np.where(df[col_name].shift(1).eq(df[col_name]), True, False)

I tried SciPy's shift function, but the output isn't what I expect, so maybe I am using it wrong?

np.where(shift(long_gt_price, 1) == (long_gt_price),"-", "Different")

Just to show what I mean when I say it produces incorrect output: the left column is shift(1) and the right column is the unshifted column, so each value on the left should equal the value diagonally above it on the right. At least, that's my understanding of what I want. The False / True change at row 5 on the left and row 4 on the right therefore doesn't make any sense to me.

Upvotes: 5

Answers (6)

Soudipta Dutta

Reputation: 2122

import numpy as np

arr = np.array([False, False, False, True, True, True])

comparison = np.insert(arr[1:] == arr[:-1], 0, False)
print(comparison)
'''
[False  True  True False  True  True]
'''

Using np.roll:

import numpy as np

arr = np.array([False, False, False, True, True, True])

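comparison = arr == np.roll(arr, 1)
# np.roll wraps around, so index 0 was compared with the last element; reset it
comparison[0] = False

print("Original array:", arr)
print("Comparison:    ", comparison)
"""
Original array: [False False False  True  True  True]
Comparison:     [False  True  True False  True  True]
"""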

Upvotes: 0

user2138149

Reputation: 16484

You are looking for the function numpy.roll.

Example usage:

>>> import numpy
>>> x = numpy.arange(10)
>>> numpy.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> numpy.roll(x, -2)
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])

There is a caveat to this. Elements which "roll off" the end of one array are re-introduced at the other end. If this isn't what you want, you will have to set these elements to zero, NaN or some other sensible default. You could also reduce the array length to remove them.

https://numpy.org/doc/stable/reference/generated/numpy.roll.html
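
As a minimal sketch of the two options mentioned above (overwriting the wrapped elements, or trimming them off), assuming a float array so that NaN can be stored. The names k, rolled and trimmed are only illustrative:

```python
import numpy as np

x = np.arange(10, dtype=float)  # float dtype so that NaN can be stored
k = 2                           # number of positions to shift (assumed positive here)

# Option 1: roll, then overwrite the k elements that wrapped around to the front
rolled = np.roll(x, k)
rolled[:k] = np.nan
print(rolled)

# Option 2: roll, then drop the wrapped elements (the result is k elements shorter)
trimmed = np.roll(x, k)[k:]
print(trimmed)
```

For a negative shift the wrapped elements end up at the tail instead, so the slice to overwrite or drop would be the last abs(k) elements.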

Upvotes: 0

Eugene

Reputation: 55

The code below shifts an np.ndarray along a given axis. It should be pretty fast. But beware: the default fill_value is NaN, which only a floating-point array can hold, so for any other dtype you need to pass a fill_value that the dtype can represent.

import numpy as np

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    if shift_value == 0:
        return a
    
    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result

For example:

a = np.arange(100000, dtype=np.float64)
# OK: a float array can hold NaN
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)


# ValueError: cannot convert float NaN to integer
a = np.arange(100000, dtype=np.int32)
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)


# OK: an integer fill value fits an integer array
np_shift(a, shift_value=1, axis=0, fill_value=-15)

If you would rather not have to think about this, you can add a dtype check that casts non-floating input to float64. For example:

def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    # Cast first so the result dtype is the same whether or not a shift happens
    if not np.issubdtype(a.dtype, np.floating):
        a = a.astype(np.float64)

    if shift_value == 0:
        return a
    
    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result
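
For comparison, here is a rough sketch of a variant that does not use np.roll: it allocates the output already filled with fill_value and copies the surviving block into place. The function name shift_filled and the use of np.result_type to pick a dtype that can hold the fill value are my own assumptions, not part of the code above:

```python
import numpy as np

def shift_filled(a, k, axis=0, fill_value=np.nan):
    """Hypothetical helper: shift `a` by `k` along `axis`, filling vacated slots."""
    a = np.asarray(a)
    # Promote the dtype so the fill value can be stored (e.g. int32 + NaN -> float64)
    dtype = np.result_type(a.dtype, np.asarray(fill_value).dtype)
    out = np.full(a.shape, fill_value, dtype=dtype)

    n = a.shape[axis]
    if abs(k) >= n:
        return out  # everything was shifted out; only fill values remain

    src = [slice(None)] * a.ndim
    dst = [slice(None)] * a.ndim
    if k >= 0:
        src[axis] = slice(None, n - k)  # leading block of the input ...
        dst[axis] = slice(k, None)      # ... lands k positions later
    else:
        src[axis] = slice(-k, None)     # trailing block of the input ...
        dst[axis] = slice(None, n + k)  # ... lands |k| positions earlier
    out[tuple(dst)] = a[tuple(src)]
    return out

m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
shifted = shift_filled(m, 1, axis=1)  # shift each row one column to the right
print(shifted)
```

The trade-off is that the result dtype may differ from the input dtype (here int32 becomes float64), which is also what happens with the casting version above.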

Upvotes: 0

Caio Castro

Reputation: 601

A simple function that shifts 1d-arrays, in a similar way to pandas:

import numpy as np

def arr_shift(arr: np.ndarray, shift: int) -> np.ndarray:
    if shift == 0:
        return arr
    # Block of NaNs that fills the vacated positions
    nas = np.full(abs(shift), np.nan)
    if shift > 0:
        res = arr[:-shift]
        return np.concatenate((nas,res))
    res = arr[-shift:]
    return np.concatenate((res,nas))

This is supposed to work with numerical arrays, as the vacated positions are filled with np.nan. It is trivial to use another "null" value by filling the nas array with whatever you want.
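
Applied to the boolean array from the question, this could look like the sketch below. arr_shift is repeated from above only so the snippet runs on its own; the variable names are illustrative. Note that concatenating NaN with booleans promotes the result to float64:

```python
import numpy as np

def arr_shift(arr: np.ndarray, shift: int) -> np.ndarray:
    # Same function as above, repeated so this snippet is self-contained
    if shift == 0:
        return arr
    nas = np.empty(abs(shift))
    nas[:] = np.nan
    if shift > 0:
        return np.concatenate((nas, arr[:-shift]))
    return np.concatenate((arr[-shift:], nas))

arr = np.array([False, False, False, True, True, True])

shifted = arr_shift(arr, 1)   # float64: NaN first, then the booleans as 0.0 / 1.0
same_as_prev = shifted == arr # NaN never compares equal, so index 0 is False
print(shifted.dtype)
print(same_as_prev)
```

Because NaN compares unequal to everything, the first element comes out as False without any special handling.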

Upvotes: 1

Partha Mandal

Reputation: 1441

This seems to be what you want:

shift_by = 1
arr = np.array([False, False, False, True, True, True]).tolist() ## array -> list
shift_arr = [np.nan]*shift_by + arr[:-shift_by]
np.equal(arr,shift_arr)

For a pure NumPy version:

shift_by = 1
arr = np.array([False, False, False, True, True, True])
np.concatenate([np.array([False]*shift_by),np.equal(arr[shift_by:],arr[:-shift_by])])

Upvotes: 0

tstanisl

Reputation: 14107

Why not use slicing?

arr[1:] == arr[:-1]

The result is an array one element shorter than the input, but there is no need to handle border cases.
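
A small sketch of how the shorter result lines up with the original array, assuming the example array from the question. Element i of the result compares arr[i + 1] with arr[i], so adding 1 to an index in the result gives the position in arr; the names same_as_prev and change_idx are only illustrative:

```python
import numpy as np

arr = np.array([False, False, False, True, True, True])

# same_as_prev[i] is True when arr[i + 1] equals arr[i]; length is len(arr) - 1
same_as_prev = arr[1:] == arr[:-1]
print(same_as_prev)

# Positions in arr where the value differs from the one before it
change_idx = np.flatnonzero(~same_as_prev) + 1
print(change_idx)
```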

Upvotes: 2
