List Duplicate Removal Issue?

I wrote some code that removes duplicates from a list in Python. Here it is:

List = [4, 2, 3, 1, 7, 4, 5, 6, 5]
NewList = []

for i in List:
    if List[i] not in NewList:
        NewList.append(i)

print ("Original List:", List)
print ("Reworked List:", NewList)

However, the output is:

Original List: [4, 2, 3, 1, 7, 4, 5, 6, 5]
Reworked List: [4, 2, 3, 7, 6]

Why is the 1 missing from the output?

Upvotes: 0

Answers (4)

jpp

Reputation: 164773

The way you iterate over the list is not correct. Your loop iterates over the elements, but then uses each element as an index instead of as a value. Your code doesn't raise an error only because the values in your list happen to be valid list indices.
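To illustrate the "valid indices by accident" point, here is a small sketch. The list `values` is hypothetical (it is not from the question) and contains a number larger than the last valid index, so using an element as an index fails:

```python
# Hypothetical list: 10 is not a valid index for a 3-element list.
values = [0, 10, 1]

error_message = None
try:
    for i in values:
        # i is an element, but values[i] treats it as a position.
        values[i]
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

With the list from the question, every element is between 0 and 8, so the lookup succeeds and the mistake goes unnoticed.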

You have a few options:

#1 Iterate over elements directly

Use the elements of the list directly as you iterate over them:

NewList = []
for el in L:
    if el not in NewList:
        NewList.append(el)

#2 Iterate over list index

This is often considered an anti-pattern, but it is not invalid. You can iterate over the range of the size of the list and then use list indexing:

NewList = []
for idx in range(len(L)):
    if L[idx] not in NewList:
        NewList.append(L[idx])

In both cases, notice how we avoid naming variables after built-ins. Don't use list or List; you can use L instead.

#3 unique_everseen

It's more efficient to use hashing for O(1) lookup complexity. There is a unique_everseen recipe in the itertools docs, also available as toolz.unique in the third-party toolz library. It works by keeping a set of seen items and checking each item against it as you iterate.

from toolz import unique

NewList = list(unique(L))
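If you would rather not install toolz, the seen-set idea can be sketched in a few lines. This is a simplified version, not the full itertools recipe (it has no key argument), and the name `unique_in_order` is made up for this example. It assumes the items are hashable:

```python
def unique_in_order(items):
    """Yield each hashable item once, keeping first-seen order."""
    seen = set()
    for item in items:
        if item not in seen:  # O(1) average-case set lookup
            seen.add(item)
            yield item


L = [4, 2, 3, 1, 7, 4, 5, 6, 5]
NewList = list(unique_in_order(L))
print(NewList)  # [4, 2, 3, 1, 7, 5, 6]
```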

Upvotes: 0

MisterMiyagi

Reputation: 51999

Your code is not doing what you think it does. The problem is these two constructs:

for i in List:  # 1
    if List[i]  # 2

1. Here you are using i to represent the elements inside the list: 4, 2, 3, ...
2. Here you are using i to represent the indices of the List: 0, 1, 2, ...

Obviously, 1. and 2. are not compatible. In short, your check is performed for a different element than the one you put in your list.
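A quick way to see the mismatch is to print each element next to the value the check actually looks at. This is only a diagnostic sketch using the List from the question; the name `pairs` is introduced here for illustration:

```python
List = [4, 2, 3, 1, 7, 4, 5, 6, 5]

# Pair each element i with List[i], the value the original check tests.
pairs = [(i, List[i]) for i in List]
for element, checked in pairs:
    print(element, "-> checks", checked)
```

The fourth pair is (1, 2): when i is 1, the check looks at List[1], which is 2. Since 2 was already appended earlier, the 1 is skipped.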

You can fix this by treating i consistently at both steps:

for i in List:
    if i not in NewList:
        NewList.append(i)

Upvotes: 0

Arbaz Siddiqui

Reputation: 481

Using set() kills the order. You can try this:

>>> from collections import OrderedDict
>>> NewList = list(OrderedDict.fromkeys(List))
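On Python 3.7 and later, a plain dict also preserves insertion order, so the same idea should work without the import. A minimal sketch, assuming the List from the question:

```python
List = [4, 2, 3, 1, 7, 4, 5, 6, 5]

# dict keys are unique and keep insertion order (guaranteed since Python 3.7).
NewList = list(dict.fromkeys(List))
print(NewList)  # [4, 2, 3, 1, 7, 5, 6]
```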

Upvotes: 1

Bernhard

Reputation: 1273

You misunderstood how for loops in Python work. If you write for i in List:, i takes the values from the list one after another, so in your case 4, 2, 3, ...

I assume you thought it'd be counting up.

There are several ways of removing duplicates from a list in Python that you don't need to write yourself, like converting it to a set and back to a list.

list(set(List))
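Keep in mind that a set does not guarantee the original order. If order matters, one possible workaround is to sort the unique values by where they first appear. A sketch, assuming the List from the question:

```python
List = [4, 2, 3, 1, 7, 4, 5, 6, 5]

unique_values = set(List)  # duplicates removed, order not guaranteed

# Restore first-occurrence order by sorting on each value's first index.
NewList = sorted(unique_values, key=List.index)
print(NewList)  # [4, 2, 3, 1, 7, 5, 6]
```

List.index scans the list for every value, so this is fine for short lists but slow for long ones.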

Also, you should read PEP 8 and name your variables differently, but that's just a side note.

Also, if you really want a loop with indices, you can use enumerate in Python.

for idx, value in enumerate(myList):
    print(idx)
    print(value)  # same as myList[idx]
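As a small usage sketch, enumerate can also tell you where the duplicates sit. Here myList is assumed to hold the values from the question, and `seen` and `duplicate_positions` are names made up for this example:

```python
myList = [4, 2, 3, 1, 7, 4, 5, 6, 5]

seen = set()
duplicate_positions = []
for idx, value in enumerate(myList):
    if value in seen:
        # This value appeared earlier, so idx is a duplicate position.
        duplicate_positions.append(idx)
    seen.add(value)

print(duplicate_positions)  # [5, 8]
```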

Upvotes: 0
