jlengrand

Reputation: 12827

Weird behaviour when removing duplicates from a list

I have a list of integers.

What I would like to do is sort them and remove all duplicates. I found two different solutions online. Both give the same result, which is not the one I expect.

a = integer_combinations(5, 5)
print a
>>[4, 8, 16, 32, 9, 27, 81, 243, 16, 64, 256, 1024, 25, 125, 625, 3125]

b = sorted(a)
print b
>>[4, 8, 9, 16, 16, 25, 27, 32, 64, 81, 125, 243, 256, 625, 1024, 3125]

c = dict().fromkeys(sorted(a)).keys()
print c
>>[32, 64, 4, 1024, 625, 8, 9, 256, 16, 81, 243, 3125, 25, 27, 125]

Another method, using sets:

d = list(set(b))
print d
>>[32, 64, 4, 1024, 625, 8, 9, 256, 16, 81, 243, 3125, 25, 27, 125]

What I expect is:
>>[4, 8, 9, 16, 25, 27, 32, 64, 81, 125, 243, 256, 625, 1024, 3125]

Does anyone know the reason for this behaviour?

Thanks!

Upvotes: 2

Views: 140

Answers (5)

Samvel

Reputation: 182

set() is an unordered collection. Like a dictionary, it arranges its elements by hash value for fast lookup, not by insertion order or sorted order. Therefore list(set(...)) returns the items in an arbitrary order. Use instead:

sorted(set(...))

if you need a sorted sequence.
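
A minimal sketch of this approach, using the list from the question. The helper name sorted_unique is made up for illustration:

```python
def sorted_unique(values):
    """Return the distinct items of `values` in ascending order."""
    # set() drops duplicates but promises no order, so sort afterwards.
    return sorted(set(values))


a = [4, 8, 16, 32, 9, 27, 81, 243, 16, 64, 256, 1024, 25, 125, 625, 3125]
result = sorted_unique(a)
print(result)
```

sorted() accepts any iterable and always returns a new list, so no extra list() call is needed.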

Upvotes: 3

chepner

Reputation: 531878

The keys method returns the keys of a dictionary in an undefined (yet consistent between calls) order, regardless of how the dictionary is created. [EDIT: as pointed out in the comment, the order is consistent as long as the dictionary remains unchanged.]

Upvotes: 0

Praveen Gollakota

Reputation: 38980

The built-in set type was introduced in Python 2.4 (the sets module appeared in 2.3). The solution proposed by @aix is the most Pythonic if you are using Python >= 2.4.

In your code, the following line ...

c = dict().fromkeys(sorted(a)).keys()

creates a dict whose keys are the elements of a and whose values all default to None, and then simply retrieves the keys with the keys() method. Since dictionaries have no defined order, the keys come back in an arbitrary order rather than the sorted order in which they were inserted. You would need to sort them again. In any case, you should really use sorted(set(a)) as already proposed.
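
To illustrate the ordering point, here is a small sketch (the variable names are only for illustration). It keeps the dict.fromkeys() deduplication but moves the sort to the end, so the result no longer depends on how the dictionary happens to order its keys:

```python
a = [4, 8, 16, 32, 9, 27, 81, 243, 16, 64, 256, 1024, 25, 125, 625, 3125]

# Deduplicate first: the dict keeps one key per distinct value.
unique_keys = dict.fromkeys(a)

# Sort last: sorted() iterates over the keys and returns a new sorted list.
c = sorted(unique_keys)
print(c)
```

Note that on Python 3.7 and later, dictionaries do preserve insertion order, so sorting before deduplicating would appear to work there. Sorting last is still the choice that does not depend on the Python version.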

Upvotes: 2

NPE

Reputation: 500703

Here is what I would use:

>>> a = [4, 8, 16, 32, 9, 27, 81, 243, 16, 64, 256, 1024, 25, 125, 625, 3125]
>>> sorted(set(a))
[4, 8, 9, 16, 25, 27, 32, 64, 81, 125, 243, 256, 625, 1024, 3125]

The reason your code doesn't work as expected is that dict does not guarantee any particular ordering of its keys. Similarly, set has no guarantees as to the ordering of its elements.

Therefore, the sorting step has to come right at the end.

Upvotes: 7

soulcheck

Reputation: 36777

A dictionary doesn't guarantee that iterating over (or printing) its keys follows insertion order.

Use collections.OrderedDict for that.
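
A sketch of that suggestion, assuming the input is sorted before it is deduplicated. OrderedDict.fromkeys() keeps the first occurrence of each key in insertion order, so the sorted order survives:

```python
from collections import OrderedDict

a = [4, 8, 16, 32, 9, 27, 81, 243, 16, 64, 256, 1024, 25, 125, 625, 3125]

# Sort first, then deduplicate; OrderedDict remembers insertion order.
c = list(OrderedDict.fromkeys(sorted(a)))
print(c)

# Without the sort, duplicates are dropped but first-seen order is kept.
first_seen = list(OrderedDict.fromkeys(a))
print(first_seen)
```

This is mostly useful when you want to drop duplicates while keeping the original order. For a sorted result, sorted(set(a)) is shorter.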

Upvotes: 0
