Chris
Chris

Reputation: 710

Time complexity in sorting a list by converting it to a set and back into a list

I recently watched Raymond Hettingers talk about python dictionaries (and by extension sets...) and he mentioned that integers hash to themselves and that adding integers to a dict (or set...) will insert them in order and as long as you don't delete items the order will be preserved in python 3.6 (and probably above?). In the answer to this question it is stated that dictionaries preserve order of insertion, but for sets it seams like integers are ordered according to their value.

Now: according to the time-complexity section of python.org and in more detail here it is stated that the average time-complexity of adding an element to a set is O(1). This means if you have an unsorted list of integers it should be possible to sort them by simply doing:

sorted_list = list(set(unsorted_list))

This seams to be the case as far as I have tested it (did it a few 1000 times with random sequences).

My question is now: Does this mean it is possible to sort integers in python in O(n) time?

To me it would seam so, as it takes O(n) to build the set and O(n) to convert the set back to a list or am I missing something here?

Upvotes: 2

Views: 1554

Answers (3)

MisterMiyagi
MisterMiyagi

Reputation: 50116

No. Sets do not allow sorting integers. While the hashes of integers are well-defined, the order of sets is arbitrary.

The order of sets may vary by implementation, process and instance.

# CPython 3.7.4
>>> list({1, 8})
[8, 1]
>>> list({8, 1})
[8, 1]
# PyPy 3.6.9 (PyPy 7.3.0)
>>> list({1, 8})
[1, 8]
>>> list({8, 1})
[8, 1]
# CPython 2.7.10
>>> list({1, 8})
[8, 1]
>>> list({8, 1})
[8, 1]
# Jython 2.7.1 (java13.0.2)
>>> list({1, 8})
[1, 8]
>>> list({8, 1})
[1, 8]

The order of sets may also depend on the history of an instance.

# CPython 3.7.4
>>> a = {1, 3, 4, 8}
>>> list(a)
[8, 1, 3, 4]
>>> a.add(2)
>>> list(a)
[1, 2, 3, 4, 8]
>>> a.discard(2)
>>> list(a)
[1, 3, 4, 8]

Upvotes: 5

Kelly Bundy
Kelly Bundy

Reputation: 27609

No, not in general. You must've tried it with special cases, for example where the unsorted input list contains all numbers from 0 to n, each once.

Here's a simple case that fails:

>>> list(set([8, 1]))
[8, 1]

Done with CPython 3.8.1 32-bit.

Upvotes: 6

Błotosmętek
Błotosmętek

Reputation: 12927

Generally, O(n) sorting is possible for integers, strings and many other types of data. O(n log n) is the best you can do with sorting algorithms that only use comparisons (>, <, ==) to determine the order of items, but for many types, you're not limited to such algorithms. In particular, see Radix sort for sorting integers.

Upvotes: 4

Related Questions