Reputation: 4621
I have been up far too long tonight working on a long program, and I have hit a simple roadblock. Can anyone tell me why this code behaves the way it does?
I have two lists. I want list2 to contain only the numbers that are not in list1. Logically this seems like it should work, but it doesn't at all. Why?
list1 = [1,2,3,4,5,6,7,8]
list2 = [12,15,16,7,34,23,5,23,76,89,9,45,4]
for ch in list2:
    if ch in list1:
        list2.remove(ch)
return list2
Somehow, this returns: [15, 7, 5, 23, 76, 9, 4]
Why?
And how can I accomplish what I need?
Upvotes: 4
Views: 150
Reputation: 24439
Don't modify a list while iterating over it.
What you want can be expressed directly with a list comprehension:
list2 = [ch for ch in list2 if ch not in list1]
It is more readable, and unlike solutions with sets it will not remove duplicates from list2 or change item order.
UPDATE: when list1 is big, creating a set from it once, before the comprehension runs, will actually speed things up:
list1_set = set(list1)
list2 = [ch for ch in list2 if ch not in list1_set]
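A minimal sketch of the same idea wrapped in a function, run on the lists from the question. The name `without_items` is made up for illustration, and it assumes the elements are hashable so they can go into a set.

```python
def without_items(items, excluded):
    """Return the elements of `items` that are not in `excluded`.

    Order and duplicates in `items` are preserved.
    """
    excluded_set = set(excluded)  # built once, not once per element
    return [item for item in items if item not in excluded_set]


list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

list2 = without_items(list2, list1)
print(list2)  # [12, 15, 16, 34, 23, 23, 76, 89, 9, 45]
```

Note that both 23s survive and the remaining items keep their original order.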
Upvotes: 5
Reputation: 34324
When you modify a sequence you are iterating over, it will yield unexpected results. I'd do it this way, which takes advantage of fast set operations.
list2 = list(set(list2) - set(list1))
Whether this is faster or slower than using a list comprehension depends on the sizes of list1 and list2, and whether you can make one into a set as part of initialization rather than multiple times in a loop.
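One thing to be aware of with the set approach: it drops duplicates and does not promise any particular order. A small sketch comparing it with a plain list comprehension on the lists from the question (the variable names are only for illustration):

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

by_comprehension = [ch for ch in list2 if ch not in list1]
by_set_difference = list(set(list2) - set(list1))

print(len(by_comprehension))      # 10: both 23s are kept
print(len(by_set_difference))     # 9: the duplicate 23 is gone
print(sorted(by_set_difference))  # [9, 12, 15, 16, 23, 34, 45, 76, 89]
```

The result is sorted before printing because the order of elements coming out of a set should not be relied on.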
Upvotes: 9
Reputation: 5918
This is an interesting point. Let me explain why this happens.
When you write for a in list in Python, Python looks at the list by position: index 0, then index 1, and so on. So it first looks at 12 and removes it. Then it looks at index 1, except now 15 is at index 0 and 16 is at index 1. It removes 16. So 15 was never checked and is left in the list. Then it similarly skips 7 and removes 34...
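The skipping can be made visible by recording which elements the loop actually visits. This is only a sketch, and it assumes the loop removes items that are not in list1, which is what the walkthrough above describes and what reproduces the output shown in the question; `visited` is a helper list added for illustration.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

visited = []  # every element the for loop actually hands us
for ch in list2:
    visited.append(ch)
    if ch not in list1:   # assumption: remove items that are NOT in list1
        list2.remove(ch)  # shifts everything after it one slot to the left

print(visited)  # [12, 16, 34, 5, 23, 89, 45]
print(list2)    # [15, 7, 5, 23, 76, 9, 4]
```

Only 7 of the 13 elements are ever visited. Each removal shifts the rest of the list left by one, so the element that slides into the current position is never examined.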
The way to avoid this is of course not to iterate on the same list from which elements are being removed. You can make a copy of the second list. Check if a member in this copy is in the first list. If it isn't, remove it from the second list. I am sure some of the suggestions that have been posted will work for you. This was the explanation.
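Iterating over a copy might look like the sketch below. The condition here follows the goal stated in the question (drop anything that is found in list1); flip it to `not in` if you want the opposite filter.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

# list2[:] is a shallow copy, so removals from list2 do not disturb the loop.
for ch in list2[:]:
    if ch in list1:
        list2.remove(ch)  # removes the first element equal to ch

print(list2)  # [12, 15, 16, 34, 23, 23, 76, 89, 9, 45]
```

Keep in mind that remove() searches the list on every call, so for long lists building a new list is likely to be the cheaper option.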
Upvotes: 1