pogo
pogo

Reputation: 1550

How to check if all items in a list are there in another list?

I have two lists say

List1 = ['a','c','c']
List2 = ['x','b','a','x','c','y','c']

Now I want to find out if all elements of List1 are there in List2. In this case all there are. I can't use the subset function because I can have repeated elements in lists. I can use a for loop to count the number of occurrences of each item in List1 and see if it is less than or equal to the number of occurrences in List2. Is there a better way to do this?

Thanks.

Upvotes: 34

Views: 45493

Answers (6)

poke
poke

Reputation: 387707

When number of occurrences doesn't matter, you can still use the subset functionality, by creating a set on the fly:

>>> list1 = ['a', 'c', 'c']
>>> list2 = ['x', 'b', 'a', 'x', 'c', 'y', 'c']
>>> set(list1).issubset(list2)
True

If you need to check if each element shows up at least as many times in the second list as in the first list, you can make use of the Counter type and define your own subset relation:

>>> from collections import Counter
>>> def counterSubset(list1, list2):
        c1, c2 = Counter(list1), Counter(list2)
        for k, n in c1.items():
            if n > c2[k]:
                return False
        return True
   
>>> counterSubset(list1, list2)
True
>>> counterSubset(list1 + ['a'], list2)
False
>>> counterSubset(list1 + ['z'], list2)
False

If you already have counters (which might be a useful alternative to store your data anyway), you can also just write this as a single line:

>>> all(n <= c2[k] for k, n in c1.items())
True

Upvotes: 52

abarnert
abarnert

Reputation: 365747

I can't use the subset function because I can have repeated elements in lists.

What this means is that you want to treat your lists as multisets rather than sets. The usual way to handle multisets in Python is with collections.Counter:

A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.

And, while you can implement subset for multisets (implemented with Counter) by looping and comparing counts, as in poke's answer, this is unnecessary—just as you can implement subset for sets (implemented with set or frozenset) by looping and testing in, but it's unnecessary.

The Counter type already implements all the set operators extended in the obvious way for multisets.<1 So you can just write subset in terms of those operators, and it will work for both set and Counter out of the box.

With (multi)set difference:2

def is_subset(c1, c2):
    return not c1 - c2

Or with (multi)set intersection:

def is_subset(c1, c2):
    def c1 & c2 == c1

1. You may be wondering why, if Counter implements the set operators, it doesn't just implement < and <= for proper subset and subset. Although I can't find the email thread, I'm pretty sure this was discussed, and the answer was that "the set operators" are defined as the specific set of operators defined in the initial version of collections.abc.Set (which has since been expanded, IIRC…), not all operators that set happens to include for convenience, in the exact same way that Counter doesn't have named methods like intersection that's friendly to other types than & just because set does.

2. This depends on the fact that collections in Python are expected to be falsey when empty and truthy otherwise. This is documented here for the builtin types, and the fact that bool tests fall back to len is explained here—but it's ultimately just a convention, so that "quasi-collections" like numpy arrays can violate it if they have a good reason. It holds for "real collections" like Counter, OrderedDict, etc. If you're really worried about that, you can write len(c1 - c2) == 0, but note that this is against the spirit of PEP 8.

Upvotes: 1

fferri
fferri

Reputation: 18940

A solution using Counter and the builtin intersection method (note that - is proper multiset difference, not an element-wise subtraction):

from collections import Counter

def is_subset(l1, l2):
    c1, c2 = Counter(l1), Counter(l2)
    return not c1 - c2

Test:

>>> List1 = ['a','c','c']
>>> List2 = ['x','b','a','x','c','y','c']
>>> is_subset(List1, List2)
True

Upvotes: 1

DevPlayer
DevPlayer

Reputation: 5579

Be aware of the following:

>>>listA = ['a', 'a', 'b','b','b','c']
>>>listB = ['b', 'a','a','b','c','d']
>>>all(item in listB for item in listA)
True

If you read the "all" line as you would in English, This is not wrong but can be misleading, as listA has a third 'b' but listB does not.

This also has the same issue:

def list1InList2(list1, list2):
    for item in list1:
        if item not in list2:
            return False
    return True

Just a note. The following does not work:

>>>tupA = (1,2,3,4,5,6,7,8,9)
>>>tupB = (1,2,3,4,5,6,6,7,8,9)
>>>set(tupA) < set(TupB)
False

If you convert the tuples to lists it still does not work. I don't know why strings work but ints do not.

Works but has same issue of not keeping count of element occurances:

>>>set(tupA).issubset(set(tupB))
True

Using sets is not a comprehensive solution for multi-occurrance element matching.

But here is a one-liner solution/adaption to shantanoo's answer without try/except:

all(True if sequenceA.count(item) <= sequenceB.count(item) else False for item in sequenceA)

A builtin function wrapping a list comprehension using a ternary conditional operator. Python is awesome! Note that the "<=" should not be "==".

With this solution sequence A and B can be type tuple and list and other "sequences" with "count" methods. The elements in both sequences can be most types. I would not use this with dicts as it is now, hence the use "sequence" instead of "iterable".

Upvotes: 7

jeffam217
jeffam217

Reputation: 117

This will return true is all the items in List1 are in List2

def list1InList2(list1, list2):
    for item in list1:
        if item not in list2:
            return False
    return True

Upvotes: 0

shantanoo
shantanoo

Reputation: 3704

def check_subset(list1, list2):
    try:
        [list2.remove(x) for x in list1]
        return 'all elements in list1 are in list2'
    except:
        return 'some elements in list1 are not in list2'

Upvotes: -1

Related Questions