Reputation: 1024
When using pythons random.shuffle
function, I noticed it went significantly faster to use sorted(l, key=lambda _: random.random())
than random.shuffle(l)
. As far as I understand, both ways produce completely random lists, so why does shuffle
take so much longer?
Below are the times using timeit
module.
from timeit import timeit
setup = 'import random\nl = list(range(1000))'
# 5.542 seconds
print(timeit('random.shuffle(l)', setup=setup, number=10000))
# 1.878 seconds
print(timeit('sorted(l, key=lambda _: random.random())', setup=setup, number=10000))
Upvotes: 6
Views: 1226
Reputation: 155477
On CPython (the reference interpreter) random.shuffle
is implemented in Python (and implemented in terms of _randbelow
, itself a Python wrapper around getrandbits
, the C level function that ultimately implements it, and which can end up being called nearly twice as often as strictly necessary in an effort to ensure the outputs are unbiased); sorted
(and random.random
) are implemented in C. The overhead of performing work in Python is higher than performing similar work in C.
Upvotes: 5