H. Hasin

Reputation: 197

Python - list comprehension, 2D list

I'm trying to figure out how to delete duplicates from a 2D list. For example:

x= [[1,2], [3,2]]

I want the result:

[1, 2, 3]

in this order.

I don't understand why my code doesn't do that:

def removeDuplicates(listNumbers):
    finalList=[]
    finalList=[number for numbers in listNumbers for number in numbers if number not in finalList]
    return finalList

Written as a nested for loop, it would look the same:

def removeDuplicates(listNumbers):
    finalList=[]
    for numbers in listNumbers:
        for number in numbers:
            if number not in finalList:
                finalList.append(number)
    return finalList

The "problem" is that this second version runs perfectly. The other requirement is that order is important. Thanks.

Upvotes: 1

Views: 2290

Answers (4)

Iron Fist

Reputation: 10951

In your list comprehension, finalList is always an empty list: nothing is appended to it while the comprehension runs, even though it looks that way. That is not the case in the second version (the nested for loop), which appends on every iteration.

What I would do instead is use a set:
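
To make this concrete, here is a minimal sketch of the comprehension version run on the question's input. The function and variable names are renamed for illustration and are not from the original post. Every membership test sees the empty list, so nothing is filtered out:

```python
def remove_duplicates_comprehension(list_numbers):
    final_list = []
    # The whole comprehension is evaluated first, while final_list still
    # refers to the empty list, so "n not in final_list" is always True.
    final_list = [n for numbers in list_numbers for n in numbers
                  if n not in final_list]
    return final_list


result = remove_duplicates_comprehension([[1, 2], [3, 2]])
print(result)  # [1, 2, 3, 2] (flattened, but the duplicate 2 is still there)
```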

>>> set(i for sub_l in x for i in sub_l)
{1, 2, 3}

EDIT: Alternatively, if order matters, and staying close to your attempt:

>>> final_list = []
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> list(filter(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[]  # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]

Or

>>> list(map(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[None, None, None, None]  # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]

EDIT2: As timgeb mentioned, the map and filter versions build lists that are useless in the end and, worse, consume memory. So I would go with the nested for loop from your last code example. If you still want to use a list comprehension for the flattening step, then:

>>> x_flat = [i for sub_l in x for i in sub_l]
>>> final_list = []
>>> for number in x_flat:
...     if number not in final_list:
...         final_list.append(number)
...
>>> final_list
[1, 2, 3]

Upvotes: 3

timgeb

Reputation: 78670

You declare finalList as the empty list first, so

if number not in finalList

will be True all the time.

The right-hand side of the assignment (the whole comprehension) is evaluated before the assignment takes place, so finalList still refers to the empty list during every membership test.

Iterate over the iterator that chain.from_iterable gives you and remove duplicates in the usual way:

>>> from itertools import chain
>>> x=[[1,2],[3,2]]
>>> 
>>> seen = set()
>>> result = []
>>> for item in chain.from_iterable(x):
...     if item not in seen:
...         result.append(item)
...         seen.add(item)
... 
>>> result
[1, 2, 3]

Further reading: How do you remove duplicates from a list in Python whilst preserving order?
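
A shorter order-preserving variant, offered as a sketch rather than as part of this answer's method: dict keys are unique and keep insertion order (guaranteed by the language from Python 3.7), so dict.fromkeys can stand in for the explicit seen set. This assumes the items are hashable.

```python
from itertools import chain

x = [[1, 2], [3, 2]]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
result = list(dict.fromkeys(chain.from_iterable(x)))
print(result)  # [1, 2, 3]
```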

edit:

You don't need the import to flatten the list; you could just use the generator expression

(item for sublist in x for item in sublist)

instead of chain.from_iterable(x).

Upvotes: 1

Julien Spronck

Reputation: 15423

There is no way in Python to refer to the current comprehension. In fact, if you remove the line finalList=[], which does nothing, you would get an error.

You can do it in two steps (note that a set does not preserve the original order):
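
A small sketch of that error, using a hypothetical function name: without the initial assignment, the name is not bound yet when the comprehension looks it up.

```python
def remove_duplicates_without_init(list_numbers):
    # final_list is only bound once the comprehension has finished, so the
    # lookup inside the condition fails on the very first item.
    final_list = [n for numbers in list_numbers for n in numbers
                  if n not in final_list]
    return final_list


caught = None
try:
    remove_duplicates_without_init([[1, 2], [3, 2]])
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    caught = exc

print(repr(caught))
```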

finalList = [number for numbers in listNumbers for number in numbers]
finalList = list(set(finalList))

or if you want a one-liner:

finalList = list(set(number for numbers in listNumbers for number in numbers))
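
Since the question asks for the original order, here is one possible way to keep it while still using a set. This is a sketch, not part of the answer above: it sorts the unique values by the position of their first occurrence. The input is chosen so that the original order differs from sorted order, and flat.index makes this quadratic, so it only suits small lists.

```python
list_numbers = [[3, 1], [2, 1]]

flat = [number for numbers in list_numbers for number in numbers]

# set() removes duplicates; sorting by first index restores the input order.
final_list = sorted(set(flat), key=flat.index)
print(final_list)  # [3, 1, 2]
```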

Upvotes: 0

Ilja

Reputation: 2114

The expression on the right-hand side is evaluated first, before the result of the list comprehension is assigned to finalList. In your second approach, by contrast, you write to the list between iterations. That's the difference.

This may be related to the reason the manuals warn about unexpected behaviour when you modify an iterable inside a for loop that iterates over it.

You could use the built-in set() to remove duplicates. You have to flatten your list first; Python has no built-in flatten() function, so use a comprehension or itertools.chain.from_iterable for that.
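
As an illustration of the kind of surprise that warning refers to (this example is an assumption for illustration, not taken from the question), removing items from a list while looping over it skips elements, whereas looping over a copy does not:

```python
items = [1, 2, 2, 3]
for item in items:
    if item == 2:
        items.remove(item)  # mutates the list being iterated
print(items)  # [1, 2, 3] (the second 2 was skipped)

safe = [1, 2, 2, 3]
for item in safe[:]:  # iterate over a shallow copy instead
    if item == 2:
        safe.remove(item)
print(safe)  # [1, 3]
```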

Upvotes: 1
