user28383

Reputation: 3

List comparison with comprehension giving inadequate result in Python

Consider the following lists, shown with example values (device name, serial number, and other values that don't really matter).

Note that len below is the actual list length: one list has 2019 elements and the other has 2100.

devices_list_foo = ['1', 'HP monitor 500', '1', 'L9K12AZU', 'foo', 'bar']
>>> len(devices_list_foo)
2019


devices_list_bar = ['london-foobar-street', 'hpmon500', 'L9K12AZU', 'some value']
>>> len(devices_list_bar)
2100

I had to find matches between the two lists and write them to a different list. I did it with the following line:

common_elements = set(i[3] for i in devices_list_foo).intersection(i[2] for i in devices_list_bar)

This gave me 588 common serials between the lists. Then I had to make lists of what's left after removing these 588, i.e. the machines without a match. Since 2019 - 588 = 1431 and 2100 - 588 = 1512, I expect lists of 1431 and 1512 machines.

Here's what I tried. Since common_elements is a set, I was able to use a list comprehension:

devices_missing_list_foo = [x for x in devices_list_foo if x[3] not in common_elements]
>>> len(devices_missing_list_foo)
1347

devices_missing_list_bar = [x for x in devices_list_bar if x[2] not in common_elements]
>>> len(devices_missing_list_bar)
1512

The 1512 seems to be correct, but why do I get 1347 instead of 1431? How could I investigate this?

Upvotes: 0

Views: 54

Answers (1)

UlfR

Reputation: 4395

I'm not entirely sure I understand your question, but one pitfall, I think, is that there seem to be duplicated values in the lists, so a set built from a list is shorter than the original list. Example:

>>> test_list = [1,2,3,1]
>>> len(test_list)
4
>>> test_set = set(test_list)
>>> len(test_set)
3

Edit

So we know:

  • You have duplicates in your original lists
  • common_elements is a set, so there are no duplicates there

But since we do not know how many times a specific value from common_elements occurs in the original lists (it could be once, twice, or even more), your sums do not add up. Yet another example:

>>> a=[1,1,1,2,3]
>>> b=[3,3,3,4,5]
>>> set(a)
set([1, 2, 3])
>>> set(b)
set([3, 4, 5])
>>> common_elements=set(a).intersection(b)
>>> common_elements
set([3])
>>> a_missing=[x for x in a if x not in common_elements]
>>> b_missing=[x for x in b if x not in common_elements]
>>> a_missing
[1, 1, 1, 2]
>>> b_missing
[4, 5]

Note that len(b_missing) + len(common_elements) != len(b): here 2 + 1 = 3, but len(b) is 5, because the common value 3 occurs three times in b.
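To investigate this on the real data, one option is to count how often each serial occurs with collections.Counter. The following is only a sketch: the rows in foo and bar are made-up sample data, and the names foo, bar, duplicated_in_foo and removed_from_foo are hypothetical. I'm assuming, as in the question, that the serial sits at index 3 in one list and index 2 in the other.

```python
from collections import Counter

# Made-up sample rows (assumption): serial at index 3 in foo, index 2 in bar.
foo = [
    ['1', 'HP monitor 500', '1', 'L9K12AZU', 'foo', 'bar'],
    ['2', 'HP monitor 500', '1', 'L9K12AZU', 'foo', 'bar'],  # same serial again
    ['3', 'Dell monitor', '1', 'XYZ', 'foo', 'bar'],
]
bar = [
    ['london-foobar-street', 'hpmon500', 'L9K12AZU', 'some value'],
    ['paris', 'other', 'QQQ', 'some value'],
]

# Count how many rows carry each serial in each list.
foo_counts = Counter(row[3] for row in foo)
bar_counts = Counter(row[2] for row in bar)

# Same result as the set(...).intersection(...) line in the question.
common = set(foo_counts) & set(bar_counts)

# Common serials that occur more than once in foo: these explain the gap.
duplicated_in_foo = {s: n for s, n in foo_counts.items() if s in common and n > 1}

# The comprehension drops every row with a common serial, not one row per serial.
removed_from_foo = sum(foo_counts[s] for s in common)
missing_foo = [row for row in foo if row[3] not in common]

print(duplicated_in_foo)
print(len(foo) - removed_from_foo == len(missing_foo))
```

With your lists, removed_from_foo would presumably come out larger than 588, and the serials in duplicated_in_foo are the ones to look at.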

Upvotes: 2
