Reputation: 13
I have a large array of zeros and ones, e.g. array = [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]. How can I find matching patterns like [0, 0], [0, 1], [1, 0], [1, 1] in the array?
Upvotes: 1
Views: 2199
Reputation: 7121
You can use a convolution for that, e.g. numpy.convolve:
import numpy as np
data = np.array([1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
# With plain 0/1 values, some patterns produce identical scores
# because multiplying by 0 hides mismatches,
# e.g. [1, 0, 1] and [1, 1, 1].
# To fix this, we replace every 0 by -1.
data[data == 0] = -1
kernel = np.array([0, 0, 0, 1, 1, 0, 1, 1])
# same fix for kernel
kernel[kernel == 0] = -1
res = np.convolve(data,kernel, 'full')
print(res)
# >>> [-1 0 -1 2 1 2 5 -2 -2 -2 -2 0 -5 -2 5 0 -1 2 1]
res = np.convolve(data,kernel, 'same')
print(res)
# >>> [ 2 1 2 5 -2 -2 -2 -2 0 -5 -2 5]
res = np.convolve(data,kernel, 'valid')
print(res)
# >>> [-2 -2 -2 -2 0]
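The higher the result, the better the match. With the 0 to -1 replacement above, a perfect match scores the length of the pattern (with plain 0/1 values it would be the number of ones in the pattern), and its index can be found using np.argmax(). Note that np.convolve reverses the kernel before sliding it, so for a non-symmetric pattern pass the reversed pattern or use np.correlate.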
Look at the mode keyword (full, same, valid) and choose what is best for your case.
There is also scipy.signal.convolve, which might be faster if you are processing lots of data.
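As a minimal sketch of the scoring idea above, the snippet below searches for each of the four patterns from the question separately. It uses np.correlate instead of np.convolve (so the pattern is not reversed), maps 0 to -1 in both arrays, and treats a score equal to the pattern length as an exact match. The helper name find_pattern is hypothetical, not a NumPy function, and the sketch assumes the pattern is no longer than the data.

```python
import numpy as np

def find_pattern(data, pattern):
    """Return start indices where `pattern` occurs in the 0/1 array `data`.

    Hypothetical helper; assumes len(pattern) <= len(data).
    """
    d = 2 * np.asarray(data) - 1      # map 0 -> -1, 1 -> +1
    p = 2 * np.asarray(pattern) - 1   # same mapping for the pattern
    # 'valid' keeps only positions where the pattern fully overlaps the data,
    # so index k of the result corresponds to data[k:k + len(pattern)].
    scores = np.correlate(d, p, mode="valid")
    # Every element agrees exactly when the score equals the pattern length.
    return np.flatnonzero(scores == len(p))

data = [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]
for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, find_pattern(data, pattern))
# [0, 0] [3 4 5]
# [0, 1] [1 6 9]
# [1, 0] [0 2 8]
# [1, 1] [ 7 10]
```

Comparing against len(p) rather than taking np.argmax() returns every occurrence, not only the first best-scoring one.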
Upvotes: 1
Reputation: 17794
You can use this function to create a rolling window array:
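import numpy as np

def rolling_window(a, window):
    # One row per window position along the last axis
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # Reuse the last-axis stride so consecutive rows overlap (no copy is made)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)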
arr = np.array([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
pattern = np.array([1, 0, 1])
arr = rolling_window(arr, pattern.shape[0])
print(arr)
Output:
[[1 1 1]
 [1 1 0]
 [1 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 1]
 [0 1 1]
 [1 1 0]
 [1 0 1]
 [0 1 1]]
Then you can look for matches:
(arr == pattern).all(axis=1)
# [False False False False False False False False True False]
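A sketch of the same windowed comparison using sliding_window_view, which NumPy provides in numpy.lib.stride_tricks from version 1.20 onward (an assumption about the installed version). It builds the same overlapping windows as the as_strided helper but returns a read-only view, and np.flatnonzero turns the boolean mask into start indices. The variable names windows, matches and starts are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

arr = np.array([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
pattern = np.array([1, 0, 1])

# Read-only view of shape (len(arr) - len(pattern) + 1, len(pattern)) = (10, 3)
windows = sliding_window_view(arr, pattern.shape[0])

# True for each window that equals the pattern element by element
matches = (windows == pattern).all(axis=1)

# Start indices of the matching windows
starts = np.flatnonzero(matches)
print(starts)  # [8]
```

Keeping the windowed view in its own variable also leaves the original 1-D arr available for later use.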
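Alternatively, you can use the rolling method in pandas. It needs the original 1-D array (not the windowed one from above), and each result is aligned with the last element of its window:
import pandas as pd
s = pd.Series([1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])
(s.rolling(pattern.shape[0])
  .apply(lambda x: (x == pattern).all())
  .fillna(0).astype('bool'))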
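For the length-2 patterns in the original question specifically, a different and simpler technique than either windowing approach is to encode each adjacent pair as a number from 0 to 3 and count the codes. This is only a sketch under that assumption (0/1 integer data, patterns of length 2); the names codes, counts and starts_10 are illustrative.

```python
import numpy as np

array = np.array([1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1])

# Encode each adjacent pair (a, b) as 2*a + b:
# [0, 0] -> 0, [0, 1] -> 1, [1, 0] -> 2, [1, 1] -> 3
codes = 2 * array[:-1] + array[1:]

# Number of occurrences of each of the four patterns
counts = np.bincount(codes, minlength=4)
for code, n in enumerate(counts):
    print([code >> 1, code & 1], n)
# [0, 0] 3
# [0, 1] 3
# [1, 0] 3
# [1, 1] 2

# Start indices of one particular pattern, here [1, 0]
starts_10 = np.flatnonzero(codes == 2)
print(starts_10)  # [0 2 8]
```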
Upvotes: 0