Latif Fetahaj

Reputation: 13

How can I find and match patterns in a NumPy array?

I have a large array of zeros and ones, e.g. array = [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]. How can I find where patterns like [0, 0], [0, 1], [1, 0], [1, 1] occur in the array?

Upvotes: 1

Views: 2199

Answers (2)

Joe

Reputation: 7121

You can use a convolution for that, e.g. numpy.convolve:

import numpy as np

data = np.array([1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])

# Replace 0 with -1. Otherwise multiplication by 0 gives different
# patterns the same score, e.g. [1, 0, 1] and [1, 1, 1].
data[data == 0] = -1


kernel = np.array([0, 0, 0, 1, 1, 0, 1, 1])

# same fix for kernel
kernel[kernel == 0] = -1

res = np.convolve(data, kernel, 'full')
print(res)
# >>> [-1  0 -1  2  1  2  5 -2 -2 -2 -2  0 -5 -2  5  0 -1  2  1]

res = np.convolve(data, kernel, 'same')
print(res)
# >>> [ 2  1  2  5 -2 -2 -2 -2  0 -5 -2  5]    

res = np.convolve(data, kernel, 'valid')
print(res)
# >>> [-2 -2 -2 -2  0]

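The higher the result, the better the match. After replacing 0 with -1, every matching element contributes +1, so a perfect match scores the length of the pattern, and its index can be found using np.argmax(). Note that np.convolve reverses the kernel before sliding it, so for a non-symmetric pattern pass the reversed pattern or use np.correlate instead.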
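As a minimal sketch of the scoring idea above, applied to the four patterns from the question and assuming only exact matches are wanted: the helper name find_pattern is hypothetical, and it uses np.correlate rather than np.convolve so the pattern is not reversed.

```python
import numpy as np

def find_pattern(data, pattern):
    """Return start indices where `pattern` occurs exactly in a 0/1 array."""
    # Map 0 -> -1 and 1 -> +1 so that mismatches lower the score.
    d = 2 * np.asarray(data) - 1
    p = 2 * np.asarray(pattern) - 1
    # np.correlate slides the pattern as-is (np.convolve would reverse it).
    scores = np.correlate(d, p, mode="valid")
    # Each matching element adds +1, so a perfect match scores len(pattern).
    return np.flatnonzero(scores == len(p))

data = [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]
for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, find_pattern(data, pattern))
```

Comparing against len(pattern) returns every match, whereas np.argmax() would report only the first best-scoring position.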

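Check the mode keyword and choose what fits your case: 'full' returns every partial overlap (len(data) + len(kernel) - 1 values), 'same' returns an output the length of the longer input, and 'valid' returns only the positions where the two arrays overlap completely.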

There is also scipy.signal.convolve, which might be faster if you are processing lots of data.

Upvotes: 1

Mykola Zotko

Reputation: 17794

You can use this function to create a rolling window array:

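import numpy as np

def rolling_window(a, window):
    # Build a view of overlapping windows along the last axis without copying.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    # as_strided returns a view into a; treat the result as read-only.
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)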


arr = np.array([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
pattern = np.array([1, 0, 1])

arr = rolling_window(arr, pattern.shape[0])
print(arr)

Output:

[[1 1 1]
 [1 1 0]
 [1 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 1]
 [0 1 1]
 [1 1 0]
 [1 0 1]
 [0 1 1]]

Then you can look for matches:

(arr == pattern).all(axis=1)
# [False False False False False False False False  True False]
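The same window-and-compare approach can also be sketched with numpy.lib.stride_tricks.sliding_window_view, assuming NumPy 1.20 or newer is available. The helper name match_starts is hypothetical; it returns the start indices of the matches instead of a boolean mask.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def match_starts(a, pattern):
    """Return the start index of every window of `a` equal to `pattern`."""
    a = np.asarray(a)
    pattern = np.asarray(pattern)
    # Read-only view of shape (len(a) - len(pattern) + 1, len(pattern)).
    windows = sliding_window_view(a, pattern.shape[0])
    # A window matches when all of its elements equal the pattern.
    return np.flatnonzero((windows == pattern).all(axis=1))

arr = np.array([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
print(match_starts(arr, [1, 0, 1]))  # → [8]
```

sliding_window_view computes the shape and strides itself, which avoids the risk of passing wrong values to as_strided.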

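Alternatively, you can use the rolling method in pandas. It needs the original 1-D array, not the window view assigned to arr above, and each True marks the last element of a matching window rather than the first:

import pandas as pd

data = np.array([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
(pd.Series(data).rolling(pattern.shape[0])
    .apply(lambda x: (x == pattern).all(), raw=True)
    .fillna(0).astype('bool'))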

Upvotes: 0
