Guldam Kwak

Reputation: 171

How to check if a list of numpy arrays contains a given test array?

I have a list of numpy arrays, say,

a = [np.random.rand(3, 3), np.random.rand(3, 3), np.random.rand(3, 3)]

and I have a test array, say

b = np.random.rand(3, 3)

I want to check whether a contains b or not. However,

b in a 

throws the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

What is the proper way to do this?

Upvotes: 4

Views: 4115

Answers (6)

Omar Al Zeidi

Reputation: 11

As highlighted by @jotasi, the truth value is ambiguous because the comparison between arrays is element-wise. There was a previous answer to this question here. Overall, your task can be done in various ways:

  1. list-to-array:

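You can use the "in" operator by converting the list to a (3,3,3)-shaped array as follows. Note that for a NumPy array, b in a evaluates (a == b).any(), so it returns True as soon as any single element matches, not only when a whole sub-array equals b: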

    >>> a = [np.random.rand(3, 3), np.random.rand(3, 3), np.random.rand(3, 3)]
    >>> a= np.asarray(a)
    >>> b= a[1].copy()
    >>> b in a
    True

  2. np.all:

    >>> any(np.all((b==a),axis=(1,2)))
    True
    
  3. list comprehension: This is done by iterating over each array:

    >>> any([(b == a_s).all() for a_s in a])
    True
    

Below is a speed comparison of the three approaches above:

Speed Comparison

import numpy as np
import perfplot

perfplot.show(
    setup=lambda n: np.asarray([np.random.rand(3*3).reshape(3,3) for i in range(n)]),
    kernels=[
        lambda a: a[-1] in a,
        lambda a: any(np.all((a[-1]==a),axis=(1,2))),
        lambda a: any([(a[-1] == a_s).all() for a_s in a])
        ],
    labels=[
        'in', 'np.all', 'list_comprehension'
        ],
    n_range=[2**k for k in range(1,20)],
    xlabel='Number of arrays',
    logx=True,
    logy=True,
    )
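
One caveat about the first approach is worth checking before relying on the timings. The following is a minimal sketch (the names stack and probe are made up for illustration) showing that "in" on a stacked array can report a match even when no whole sub-array equals the test array, whereas the np.all reduction does not:

```python
import numpy as np

# Three 3x3 arrays of zeros, stacked into shape (3, 3, 3).
stack = np.zeros((3, 3, 3))

# A test array that shares only ONE element (the top-left zero) with them.
probe = np.ones((3, 3))
probe[0, 0] = 0.0

# ndarray's "in" is equivalent to (stack == probe).any(): a single hit suffices.
print(probe in stack)

# Requiring all 9 elements of a sub-array to match gives the intended answer.
print(bool(np.all(stack == probe, axis=(1, 2)).any()))
```

The first print reports a match and the second does not, so the "in" kernel is answering a different question than the other two.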

Upvotes: 1

Upasana Mittal

Reputation: 2680

Use array_equal from numpy:

    import numpy as np
    a = [np.random.rand(3,3),np.random.rand(3,3),np.random.rand(3,3)]
    b = np.random.rand(3,3)

    for i in a:
        if np.array_equal(b,i):
            print("yes")
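
If a single True/False answer is wanted instead of a print per match, the loop can be folded into any(). This is only a sketch; contains_array is a hypothetical helper name, not something NumPy provides:

```python
import numpy as np

def contains_array(candidates, target):
    """Return True if any array in `candidates` equals `target` exactly."""
    # np.array_equal returns False (rather than raising) when shapes differ.
    return any(np.array_equal(target, c) for c in candidates)

a = [np.zeros((3, 3)), np.ones((3, 3)), np.ones((2, 2))]

print(contains_array(a, np.ones((3, 3))))       # True
print(contains_array(a, np.full((3, 3), 2.0)))  # False
```

Because array_equal checks shapes first, this also works when the list holds arrays of different shapes.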

Upvotes: 0

Nils Werner

Reputation: 36859

You can just make one array of shape (3, 3, 3) out of a:

a = np.asarray(a)

Then compare it with b. Since we're comparing floats here, we should use isclose():

np.all(np.isclose(a, b), axis=(1, 2))

For example:

a = [np.random.rand(3,3),np.random.rand(3,3),np.random.rand(3,3)]
a = np.asarray(a)
b = a[1, ...]       # set b to some value we know will yield True

np.all(np.isclose(a, b), axis=(1, 2))
# array([False,  True, False])
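
To turn that per-array mask into a yes/no answer, one option is to reduce it with any(). The sketch below assumes all arrays share the same shape (so they can be stacked) and uses isclose's default tolerances; the seeded generator and the 1e-12 offset are just illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
a = [rng.random((3, 3)) for _ in range(3)]
b = a[1] + 1e-12   # equal to a[1] up to tiny floating-point noise

stacked = np.asarray(a)                             # shape (3, 3, 3)
mask = np.all(np.isclose(stacked, b), axis=(1, 2))  # one bool per array
found = bool(mask.any())

print(found)                  # whether any array is close to b
print(np.flatnonzero(mask))   # indices of the matching arrays

# For comparison: an exact test misses b because of the added noise.
print(any(np.array_equal(b, x) for x in a))
```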

Upvotes: 5

jotasi

Reputation: 5177

As pointed out in this answer, the documentation states that:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

a[0]==b is an array, though, containing an element-wise comparison of a[0] and b. The overall truth value of this array is ambiguous: are they the same if all elements match, if most match, or if at least one matches? Therefore, numpy forces you to be explicit about what you mean. Here, you want to test whether all elements are the same. You can do that by using numpy's all method:

any((b is e) or (b == e).all() for e in a)

or put in a function:

def numpy_in(arrayToTest, listOfArrays):
    return any((arrayToTest is e) or (arrayToTest == e).all()
               for e in listOfArrays)
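
The identity check in that expression is not redundant. As a small illustration (the arrays x and y are made up, and the function is restated with snake_case names so the block runs on its own), an array containing NaN never compares equal element-wise, even to a copy of itself:

```python
import numpy as np

def numpy_in(array_to_test, list_of_arrays):
    # Same logic as above: identity first, then an element-wise equality test.
    return any((array_to_test is e) or (array_to_test == e).all()
               for e in list_of_arrays)

x = np.array([1.0, np.nan])
y = x.copy()

print(numpy_in(x, [x]))   # True, found through the identity check
print(numpy_in(y, [x]))   # False, because nan != nan element-wise
```

If NaNs in the same positions should count as equal, np.array_equal(y, x, equal_nan=True) is one alternative to consider for the equality part.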

Upvotes: 0

samorr

Reputation: 81

This error occurs because, if a and b are numpy arrays, a == b doesn't return True or False but an array of boolean values obtained by comparing a and b element-wise.

You can try something like this:

np.any([np.all(a_s == b) for a_s in a])

  • [np.all(a_s == b) for a_s in a] creates a list of boolean values by iterating over the elements of a and checking whether every element of b matches the corresponding element of that array.

  • np.any then checks whether any value in that list is True.

Upvotes: 0

FHTMitchell

Reputation: 12156

Ok so in doesn't work because it's effectively doing

def in_(obj, iterable):
    for elem in iterable:
        if obj == elem:
            return True
    return False

Now, the problem is that for two ndarrays a and b, a == b is an array (try it), not a boolean, so if a == b fails. The solution is to define a new function:

import numpy as np

def array_in(arr, list_of_arr):
    # True if some array in the list matches arr in every element
    for elem in list_of_arr:
        if (arr == elem).all():
            return True
    return False

a = [np.arange(5)] * 3
b = np.ones(5)

array_in(b, a) # --> False

Upvotes: 0
