Ollu_
Ollu_

Reputation: 126

Choice vs. Shuffle, Python

I was wondering which of shuffle() and choice() was more efficient. For example, I have a N element list long_list and need a random item from the list, is it quicker to do:

shuffle(long_list)
return long_list[0]

or

return choice(long_list)

Upvotes: 1

Views: 3455

Answers (2)

AlanK
AlanK

Reputation: 9843

random.choice() will always be faster as it does lookups in constant time O(1)

random.shuffle() requires iterating over the full sequence to execute the shuffle and this would be O(n), and then your lookup which is constant time O(1)

The source code is available to read here

Upvotes: 4

Blckknght
Blckknght

Reputation: 104752

It will surely be faster to use random.choice.

You can figure this out pretty easily by looking at the code. (The random module in cpython is mostly implemented in Python, only the low-level random number generation is in C.) Here are the relevant parts, which happen to be right next to each other:

def choice(self, seq):
    """Choose a random element from a non-empty sequence."""
    try:
        i = self._randbelow(len(seq))
    except ValueError:
        raise IndexError('Cannot choose from an empty sequence') from None
    return seq[i]

def shuffle(self, x, random=None):
    """Shuffle list x in place, and return None.
    Optional argument random is a 0-argument function returning a
    random float in [0.0, 1.0); if it is the default None, the
    standard random.random will be used.
    """

    if random is None:
        randbelow = self._randbelow
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = randbelow(i+1)
            x[i], x[j] = x[j], x[i]
    else:
        _int = int
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = _int(random() * (i+1))
            x[i], x[j] = x[j], x[i]

The choice method only generates one random number, and uses it to index into the sequence it was given. The shuffle method, on the other hand, loops over the length of the sequence and swaps elements around as it goes. So choice will take O(1) time, while shuffle takes O(N) time.

In most cases, if you just need one value, you want to use random.choice. If you want several non-repeating selections, use random.sample. Only if you're going to use all the values should you use random.shuffle. Note that even if you do expect to select many items, you should still use random.choice if you want it to be possible for the same item to be picked more than once. Sampling and shuffling will never repeat items (unless the items were repeated in the input list).

Upvotes: 3

Related Questions