mazore
mazore

Reputation: 1024

Why is random.shuffle so much slower than using sorted function?

When using pythons random.shuffle function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random()) than random.shuffle(l). As far as I understand, both ways produce completely random lists, so why does shuffle take so much longer?

Below are the times using timeit module.

from timeit import timeit
setup = 'import random\nl = list(range(1000))'

# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))

# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))

Upvotes: 6

Views: 1226

Answers (1)

ShadowRanger
ShadowRanger

Reputation: 155477

On CPython (the reference interpreter) random.shuffle is implemented in Python (and implemented in terms of _randbelow, itself a Python wrapper around getrandbits, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted (and random.random) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.

Upvotes: 5

Related Questions