Why is random.shuffle so much slower than using sorted function?

Question

When using pythons random.shuffle function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random()) than random.shuffle(l). As far as I understand, both ways produce completely random lists, so why does shuffle take so much longer?

Below are the times using timeit module.

from timeit import timeit
setup = 'import random
l = list(range(1000))'

# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))

# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))

ShadowRanger · Accepted Answer

On CPython (the reference interpreter) random.shuffle is implemented in Python (and implemented in terms of _randbelow, itself a Python wrapper around getrandbits, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted (and random.random) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.

Why is random.shuffle so much slower than using sorted function?

Answers (1)

Related Questions