Tezcatlipoca

Reputation: 51

How to compare each element of two lists in Python

I am looking for a Python operator similar to %in% in R. It checks each element of one list for membership in another list and returns a boolean array. It would do the following:

a=['word1','word2','word3','word4']
b=['word2','word4']
a %in% b
>>False True False True

The closest thing I have found is pd.Series.str.contains, but it is not vectorized in the way I need, i.e. it only looks for one element at a time. Hopefully somebody knows.

Upvotes: 3

Views: 13950

Answers (3)

iBug

Reputation: 37227

List comprehension:

[item in b for item in a]

This creates a new list in a way similar to the following code:

newList = []
for item in a:
    newList.append(item in b)

where item in b evaluates to True if item exists in b, and to False otherwise.


As mentioned in comments (thanks Paul Rooney!), the speed of this can be improved if you make b into a set:

b_set = set(b)
result = [item in b_set for item in a]

This is because the lookup operation item in b takes constant time on average if b is a set, whereas if b is a list, each element has to be compared in turn until a matching one is found.

The speed improvement is not very noticeable if b is small, but for a list b containing hundreds of elements, this can be a great improvement.
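If this comes up often, the set-based version can be wrapped in a small helper. This is only a sketch: the name elementwise_in is made up for illustration (it is not a built-in or library function), and it assumes every element of the second list is hashable, since the elements are placed in a set.

```python
def elementwise_in(values, candidates):
    """For each element of values, report whether it occurs in candidates.

    Hypothetical helper for illustration; assumes candidates holds hashable items.
    """
    # Build the set once so each membership test is constant time on average.
    lookup = set(candidates)
    return [value in lookup for value in values]


a = ['word1', 'word2', 'word3', 'word4']
b = ['word2', 'word4']
print(elementwise_in(a, b))  # [False, True, False, True]
```

The result has the same length and order as the first argument, which is how %in% behaves in R.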

Upvotes: 4

vash_the_stampede

Reputation: 4606

Using a list comprehension, we can check whether each item of a is in b, yielding True if so and False otherwise:

l = [True if i in b else False for i in a]
print(*l)
False True False True

Expanded loop

l = []
for i in a:
    if i in b:
        l.append(True)
    else:
        l.append(False)

Upvotes: 0

Bi Rico

Reputation: 25813

Because Python isn't primarily a numerical or scientific language, it doesn't come with some things that are available by default in MATLAB or R. That said, almost anything you'd need from those languages is available in the NumPy/SciPy ecosystem. For example, NumPy has an in1d function:

import numpy
a = ['word1','word2','word3','word4']
b = ['word2','word4']

print(numpy.in1d(a, b))
# [False  True False  True]

Upvotes: 1
