user10415648

Check whether a variable is not equal to any of a vector's elements

I have a vector dogSpecies showing all four unique dog species under investigation.

import numpy as np

# the set of possible dog species
dogSpecies = [1, 2, 3, 4]

I also have a data vector containing integer numbers corresponding to the records of dog species of all dogs tested.

# species of examined dogs
data = np.array([1, 1, 2, -1, 0, 2, 3, 5, 4])

Some of the records in data contain values other than 1, 2, 3 or 4 (such as -1, 0 or 5). If an element of data is not equal to any element of dogSpecies, that occurrence should be marked as False in a boolean error evaluation vector.

# initially all the elements of the boolean error evaluation vector are True
errorEval = np.ones(np.size(data, axis=0), dtype=bool)

Ideally my errorEval vector would look like this:

errorEval = np.array([True, True, True, False, False, True, True, False, True])

I want a piece of code that checks whether each element of data is equal to none of the elements of the dogSpecies vector. For some reason, my code marks every single element of the errorEval vector as False.

for i in range(np.size(data, axis=0)):
    # validation of the species
    if data[i] != dogSpecies:
        errorEval[i] = False

I understand that I cannot compare a single element with a vector of four elements like above, but how do I do this then?

Upvotes: 2

Views: 2010

Answers (3)

Code Pope

Reputation: 5459

As @FHTMitchell stated, you have to use in to check whether an element is in a list.
But you can use a list comprehension, which is shorter and usually faster than a normal loop:

errorEval = np.array([True if elem in dogSpecies else False for elem in data])
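Because elem in dogSpecies already evaluates to a boolean, the conditional expression can be dropped. A minimal runnable sketch, assuming data was meant to be built from a list (the np.array call in the question is missing its brackets):

```python
import numpy as np

dogSpecies = [1, 2, 3, 4]
data = np.array([1, 1, 2, -1, 0, 2, 3, 5, 4])  # assumed intended form of the question's data

# `elem in dogSpecies` is already True or False, so no if/else is needed
errorEval = np.array([elem in dogSpecies for elem in data])
print(errorEval)
```

The resulting array has dtype bool and matches the errorEval vector the question asks for.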

Upvotes: 0

gold_cy

Reputation: 14236

If I understand correctly, you have a dataframe and a list of dog species. This should achieve what you want.

import pandas as pd
df = pd.DataFrame({'dog': [1,3,4,5,1,1,8,9,0]})

   dog
0    1
1    3
2    4
3    5
4    1
5    1
6    8
7    9
8    0


df['errorEval'] = df['dog'].isin(dogSpecies).astype(int)

   dog  errorEval
0    1          1
1    3          1
2    4          1
3    5          0
4    1          1
5    1          1
6    8          0
7    9          0
8    0          0

df.errorEval.values
# array([1, 1, 1, 0, 1, 1, 0, 0, 0])

If you don't want to create a new column then you can do:

df.assign(errorEval=df['dog'].isin(dogSpecies).astype(int)).errorEval.values
# array([1, 1, 1, 0, 1, 1, 0, 0, 0])
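Since the question asks for a boolean vector rather than 0/1 integers, the astype(int) step can be left out. A small sketch on the question's own values, assuming they are held in a pandas Series (the question itself uses a numpy array):

```python
import pandas as pd

dogSpecies = [1, 2, 3, 4]
data = pd.Series([1, 1, 2, -1, 0, 2, 3, 5, 4])  # assumption: data stored as a Series

# isin returns a boolean Series; to_numpy() extracts the underlying array
errorEval = data.isin(dogSpecies).to_numpy()
print(errorEval)
```

Here errorEval is a numpy boolean array with False at the positions holding -1, 0 and 5.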

Upvotes: 0

FHTMitchell

Reputation: 12156

Isn't this just what you want?

for index, elem in enumerate(data):
    if elem not in dogSpecies:
        errorEval[index] = False

This is probably not very fast, since it doesn't use any vectorized numpy ufuncs, but if the array isn't very large that won't matter. Converting dogSpecies to a set will also speed things up.
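Both ideas can be sketched side by side. The first variant is the loop with dogSpecies converted to a set; the second uses np.isin, which this answer does not name and which is shown here only as one possible vectorized option. The names speciesSet, errorEvalLoop and errorEvalVec are made up for the illustration.

```python
import numpy as np

dogSpecies = [1, 2, 3, 4]
data = np.array([1, 1, 2, -1, 0, 2, 3, 5, 4])

# Loop variant: membership tests against a set instead of a list
speciesSet = set(dogSpecies)
errorEvalLoop = np.ones(data.shape[0], dtype=bool)
for index, elem in enumerate(data):
    if elem not in speciesSet:
        errorEvalLoop[index] = False

# Vectorized variant: np.isin tests every element of data in one call
errorEvalVec = np.isin(data, dogSpecies)

print(errorEvalLoop)
print(errorEvalVec)
```

Both arrays come out identical, so for small inputs the choice is mostly a matter of readability.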


As an aside, your Python looks very C/Java-esque. I'd suggest reading the Python style guide (PEP 8).

Upvotes: 1
