Reputation: 197
I'm trying to figure out how to delete duplicates from a 2D list. For example:
x= [[1,2], [3,2]]
I want the result:
[1, 2, 3]
in this order.
Actually, I don't understand why my code doesn't do that:
def removeDuplicates(listNumbers):
    finalList=[]
    finalList=[number for numbers in listNumbers for number in numbers if number not in finalList]
    return finalList
If I write it in nested for-loop form, it looks the same:
def removeDuplicates(listNumbers):
    finalList=[]
    for numbers in listNumbers:
        for number in numbers:
            if number not in finalList:
                finalList.append(number)
    return finalList
The "problem" is that this second version works perfectly, while the first one does not. Order is also important. Thanks.
Upvotes: 1
Views: 2290
Reputation: 10951
finalList is always an empty list while your list comprehension runs, even though you expect it to be appended to along the way. That is not the same as the second version (the double for loop), where the list really is updated on each iteration.
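To make this concrete, here is a minimal sketch of the comprehension version from the question (the snake_case names are only illustrative). Because the condition always checks the still-empty list, nothing is filtered out:

```python
def remove_duplicates_comprehension(list_numbers):
    final_list = []
    # The whole comprehension is evaluated before the assignment happens,
    # so final_list is still [] for every membership test.
    final_list = [number for numbers in list_numbers for number in numbers
                  if number not in final_list]
    return final_list


result = remove_duplicates_comprehension([[1, 2], [3, 2]])
print(result)  # [1, 2, 3, 2] -- the duplicate 2 is still there
```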
What I would do instead is use set:
>>> set(i for sub_l in x for i in sub_l)
{1, 2, 3}
EDIT: Alternatively, if order matters, and staying close to your attempt:
>>> final_list = []
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> list(filter(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[]  # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]
Or:
>>> list(map(lambda x: final_list.append(x) if x not in final_list else None, x_flat))
[None, None, None, None]  # useless list that is thrown away and consumes memory
>>> final_list
[1, 2, 3]
EDIT2:
As mentioned by timgeb, the map and filter versions build lists that are useless in the end and, worse, consume memory. So I would go with the nested for loop as you did in your last code example, but if you want to keep the flattening list comprehension, then:
>>> x_flat = [i for sub_l in x for i in sub_l]
>>> final_list = []
>>> for number in x_flat:
...     if number not in final_list:
...         final_list.append(number)
...
Upvotes: 3
Reputation: 78670
You bind finalList to the empty list first, so the condition if number not in finalList will be True every time.
The right hand side of your comprehension will be evaluated before the assignment takes place.
Iterate over the iterator that chain.from_iterable gives you and remove duplicates in the usual way:
>>> from itertools import chain
>>> x=[[1,2],[3,2]]
>>>
>>> seen = set()
>>> result = []
>>> for item in chain.from_iterable(x):
...     if item not in seen:
...         result.append(item)
...         seen.add(item)
...
>>> result
[1, 2, 3]
Further reading: How do you remove duplicates from a list in Python whilst preserving order?
edit:
You don't need the import to flatten the list; you could just use the generator expression (item for sublist in x for item in sublist) instead of chain.from_iterable(x).
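For reference, the import-free variant might look like the sketch below. The function name unique_in_order is made up for illustration, and the approach assumes the items are hashable, since they go into a set:

```python
def unique_in_order(nested):
    """Flatten a list of lists and drop duplicates, keeping first occurrences."""
    seen = set()   # fast membership checks
    result = []    # preserves the order of first appearance
    for item in (item for sublist in nested for item in sublist):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(unique_in_order([[1, 2], [3, 2]]))  # [1, 2, 3]
```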
Upvotes: 1
Reputation: 15423
There is no way in Python to refer to the list being built by the current comprehension. In fact, if you removed the line finalList=[], which does nothing useful, you would get an error.
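A small sketch of that error, assuming the comprehension sits inside a function as in the question (names are illustrative). The exact exception class and message vary between Python versions, so the example catches NameError, which also covers its subclass UnboundLocalError:

```python
def remove_duplicates_no_init(list_numbers):
    # No "final_list = []" beforehand: the name is not bound yet
    # when the comprehension's condition looks it up.
    final_list = [number for numbers in list_numbers for number in numbers
                  if number not in final_list]
    return final_list


try:
    remove_duplicates_no_init([[1, 2], [3, 2]])
    raised = False
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    raised = True
    print(type(exc).__name__, exc)
```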
You can do it in two steps:
finalList = [number for numbers in listNumbers for number in numbers]
finalList = list(set(finalList))
or if you want a one-liner:
finalList = list(set(number for numbers in listNumbers for number in numbers))
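Note that a set does not keep the original order, which the question asks for. One possible order-preserving one-liner, which is a different technique from the set-based code above, relies on dict keys keeping insertion order (guaranteed from Python 3.7 on). This is a sketch that assumes the items are hashable:

```python
list_numbers = [[1, 2], [3, 2]]

# dict.fromkeys keeps only the first occurrence of each key, in insertion order.
final_list = list(dict.fromkeys(
    number for numbers in list_numbers for number in numbers))

print(final_list)  # [1, 2, 3]
```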
Upvotes: 0
Reputation: 2114
The expression on the right-hand side is evaluated first, before the result of the list comprehension is assigned to finalList. In your second approach, by contrast, you write to the list on every iteration. That's the difference.
This may be related to the reason the manuals warn about unexpected behaviour when you modify the iterable you are iterating over inside a for loop.
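As a rough illustration of that kind of surprise (an example chosen here, not taken from the question), removing elements from a list while looping over it can skip items:

```python
items = [1, 2, 2, 3]

for item in items:
    if item == 2:
        # Removing shifts the remaining elements left, so the loop's
        # internal index skips over the second 2.
        items.remove(item)

print(items)  # [1, 2, 3] -- one 2 survived
```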
You could use the built-in set() to remove duplicates (you have to flatten your list first; Python has no built-in flatten(), so use a comprehension or itertools.chain.from_iterable).
Upvotes: 1