Reputation: 5184
I want to find out whether my NumPy vector, needle, appears inside another vector, haystack, as a slice, i.e. a contiguous sub-vector.
I want a function find(needle, haystack) that returns True if and only if there exist integer indexes p and q such that needle equals haystack[p:q], where "equals" means the elements are equal at every position.
Example:
find([2,3,4], [1,2,3,4,5]) == True
find([2,4], [1,2,3,4,5]) == False # not contiguous inside haystack
find([2,3,4], [0,1,2,3]) == False # incomplete
Here I am using lists to simplify the illustration, but really they would be numpy vectors (1-dimensional arrays).
For strings in Python, the equivalent operation is trivial: it is the in operator. For example, "bcd" in "abcde" evaluates to True.
An appendix on dimensionality.
Dear reader, you might be tempted by similar-looking questions, such as testing whether a NumPy array contains a given row, or Checking if a NumPy array contains another array. But a consideration of dimensions shows that the similarity is not helpful here.
A vector is a one-dimensional array. In NumPy terms, a vector of length N has .shape == (N,); its shape tuple has length 1.
The other referenced questions are generally seeking an exact match for a row in a 2-dimensional matrix.
I am seeking to slide my 1-dimensional needle along the same axis of my 1-dimensional haystack like a window, until the entire needle matches the portion of the haystack that is visible through the window.
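The sliding-window idea described above can be sketched with NumPy's sliding_window_view (available since NumPy 1.20). This is only an illustration under assumptions: the name find_window is hypothetical, both inputs are taken to be 1-dimensional, and elements are compared by value with ==.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_window(needle, haystack):
    """Return True if needle occurs as a contiguous run inside haystack."""
    needle = np.asarray(needle)
    haystack = np.asarray(haystack)
    n = needle.shape[0]
    if n == 0:
        return True   # the empty needle equals haystack[0:0]
    if n > haystack.shape[0]:
        return False  # the window cannot fit inside the haystack
    # One row per window position: shape (len(haystack) - n + 1, n).
    windows = sliding_window_view(haystack, n)
    # A row matches when every element in the window equals the needle.
    return bool((windows == needle).all(axis=1).any())

print(find_window([2, 3, 4], [1, 2, 3, 4, 5]))  # True
print(find_window([2, 4], [1, 2, 3, 4, 5]))     # False
print(find_window([2, 3, 4], [0, 1, 2, 3]))     # False
```

The window view itself does not copy the haystack, but the comparison builds a boolean array with one entry per window element, so memory use grows with len(haystack) * len(needle).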
Upvotes: 1
Views: 907
Reputation: 13397
Try a list comprehension:
def find(a, x):
    # Compare a with every window of x that has the same length (works for Python lists).
    return any([x[i:i+len(a)] == a for i in range(1 + len(x) - len(a))])
Outputs:
print(find([2,3,4], [1,2,3,4,5]),
      find([2,4], [1,2,3,4,5]),
      find([2,3,4], [0,1,2,3]),
      find([2,3,4], [0,1,2,3,4,5,2,3,4]))
# True False False True
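Note that the comparison above relies on == between Python lists, which yields a single bool. With NumPy arrays, == is element-wise and returns an array, so any() would raise a ValueError about an ambiguous truth value for needles longer than one element. A minimal sketch of the same loop adapted to arrays, using np.array_equal (the name find_arrays is hypothetical), might look like this:

```python
import numpy as np

def find_arrays(a, x):
    """Same window-by-window search, but safe for NumPy arrays."""
    a = np.asarray(a)
    x = np.asarray(x)
    n = len(a)
    # np.array_equal collapses the element-wise comparison to a single bool.
    return any(np.array_equal(x[i:i + n], a) for i in range(len(x) - n + 1))

print(find_arrays(np.array([2, 3, 4]), np.array([1, 2, 3, 4, 5])))  # True
print(find_arrays(np.array([2, 4]), np.array([1, 2, 3, 4, 5])))     # False
```

The generator expression lets any() stop at the first matching window instead of building the full list first.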
Upvotes: 0
Reputation: 11602
If you are fine with creating copies of the two arrays, you could fall back on Python's in operator for bytes objects. Both arrays must have the same dtype for their byte patterns to be comparable:
import numpy as np

def find(a, b):
    # tobytes() copies each array into a bytes object; "in" then does a substring search.
    return a.tobytes() in b.tobytes()

print(
    find(np.array([2,3,4]), np.array([1,2,3,4,5])),
    find(np.array([2,4]), np.array([1,2,3,4,5])),
    find(np.array([2,3,4]), np.array([0,1,2,3])),
    find(np.array([2,3,4]), np.array([0,1,2,3,4,5,2,3,4])),
)
# True False False True
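One caveat with a raw byte-level search is that a match can start in the middle of an element. For example, on a little-endian machine the bytes of int64 [1] appear at byte offset 1 inside the bytes of int64 [256, 0]. The sketch below is one possible way to guard against that by accepting only matches that start on an element boundary. The name find_bytes_aligned is hypothetical, and promoting both arrays to a common dtype with np.result_type is an assumption about the desired behavior, not part of the answer above.

```python
import numpy as np

def find_bytes_aligned(a, b):
    """Byte-level search that only accepts matches on element boundaries."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Assumption: promote both arrays to a common dtype so equal values
    # have identical byte patterns.
    dtype = np.result_type(a, b)
    needle = a.astype(dtype).tobytes()
    haystack = b.astype(dtype).tobytes()
    size = dtype.itemsize
    start = haystack.find(needle)
    while start != -1:
        if start % size == 0:
            return True  # match begins exactly at the start of an element
        start = haystack.find(needle, start + 1)
    return False

print(find_bytes_aligned(np.array([2, 3, 4]), np.array([1, 2, 3, 4, 5])))  # True
print(find_bytes_aligned(np.array([1], dtype=np.int64),
                         np.array([256, 0], dtype=np.int64)))             # False
```

Because this compares bytes rather than values, floating-point arrays can behave differently from ==: 0.0 and -0.0 have different byte patterns, while NaN values with identical bits would match.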
Upvotes: 1