tknickman

Reputation: 4621

What is wrong with this Python list removal loop?

I have been up far too long tonight working on a long program, and I have hit a simple roadblock. Can anyone tell me why this code behaves the way it does?

I have two lists. I want list2 to contain only the numbers that are not in list1. Logically, this seems like it should work, but it doesn't at all. Why?

list1 = [1,2,3,4,5,6,7,8]
list2 = [12,15,16,7,34,23,5,23,76,89,9,45,4]


for ch in list2:
    if ch in list1:
         list2.remove(ch)

return list2

Somehow, this returns: [15, 7, 5, 23, 76, 9, 4]

Why?

And how can I accomplish what I need?

Upvotes: 4

Views: 150

Answers (3)

hamstergene

Reputation: 24439

Don't modify a list while iterating over it.

What you want can be expressed directly with a list comprehension:

list2 = [ch for ch in list2 if ch not in list1]

It is more readable, and unlike solutions with sets it will not remove duplicates from list2 or change item order.
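As a quick check of that claim, here is a minimal sketch using the lists from the question. The name `filtered` is only illustrative; it keeps the original list2 around for comparison.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

# Build a new list instead of mutating list2 while looping over it.
# "filtered" is a hypothetical name used only for this illustration.
filtered = [ch for ch in list2 if ch not in list1]

# Both copies of 23 survive, and the original order is kept.
print(filtered)  # [12, 15, 16, 34, 23, 23, 76, 89, 9, 45]
```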

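UPDATE: when list1 is big, creating a set from it once, before the comprehension, will actually speed things up, because membership tests on a set do not have to scan every element:

set1 = set(list1)
list2 = [ch for ch in list2 if ch not in set1]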

Upvotes: 5

Michael Hoffman

Reputation: 34324

When you modify a sequence you are iterating over, it will yield unexpected results. I'd do it this way, which takes advantage of fast set operations.

list2 = list(set(list2) - set(list1))

Whether this is faster or slower than using a list comprehension depends on the sizes of list1 and list2, and whether you can make one into a set as part of initialization rather than multiple times in a loop.
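One thing worth seeing with the question's data is what the set difference does to duplicates and ordering. The sketch below is only an illustration; `unique_left` is a made-up name, and the result is sorted before printing because a set does not promise any particular order.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

# Set difference: every value of list2 that is not in list1.
# "unique_left" is a hypothetical name used only for this illustration.
unique_left = list(set(list2) - set(list1))

# The duplicate 23 collapses to one entry, and the order of the
# result is arbitrary, so sort it to get a stable view.
print(sorted(unique_left))  # [9, 12, 15, 16, 23, 34, 45, 76, 89]
print(len(unique_left))     # 9
```

If duplicates or the original order matter, a list comprehension is the safer choice.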

Upvotes: 9

Abhranil Das

Reputation: 5918

This is an interesting point. Let me explain why this happens.

When you use for a in list in Python, Python looks at element 1, element 2, and so on, in sequence. So it first looks at 12 and removes it. Then it looks at element 2, except that now 15 is element 1 and 16 is element 2. It removes 16. So 15 was never checked and is left in the list. Then it similarly skips 7 and removes 34...
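The skipping can be made visible with a small trace. This is a sketch, not the asker's exact code: it assumes the test is `ch not in list1`, because that is the variant that reproduces the output quoted in the question, and `visited` is a hypothetical helper list added only to record what the loop actually looked at.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

visited = []  # hypothetical helper: elements the for loop actually saw
for ch in list2:
    visited.append(ch)
    # Assumption: inverted test, chosen to match the quoted output.
    if ch not in list1:
        # Removing shifts later items one slot left, so the item that
        # slides into the current position is never visited.
        list2.remove(ch)

print(visited)  # [12, 16, 34, 5, 23, 89, 45]
print(list2)    # [15, 7, 5, 23, 76, 9, 4]
```

Only 7 of the 13 elements are ever examined; 15, 7, 76, 9 and 4 are among those the loop steps over.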

The way to avoid this is, of course, not to iterate over the same list you are removing elements from. You can make a copy of the second list and iterate over that instead. Check whether each member of the copy is in the first list. If it is, remove it from the second list. I am sure some of the suggestions that have been posted will work for you. This was the explanation.
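The copy-based approach might look like the following minimal sketch. It assumes the goal stated in the question (drop from list2 everything that appears in list1) and uses a slice, list2[:], as one common way to make the copy.

```python
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = [12, 15, 16, 7, 34, 23, 5, 23, 76, 89, 9, 45, 4]

# Iterate over a shallow copy so that removals from list2
# cannot shift the elements the loop is walking through.
for ch in list2[:]:
    if ch in list1:
        list2.remove(ch)

print(list2)  # [12, 15, 16, 34, 23, 23, 76, 89, 9, 45]
```

This modifies list2 in place, which matters if other code holds a reference to the same list. Each remove call scans the list again, so for large inputs a list comprehension is likely to be cheaper.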

Upvotes: 1
