Reputation: 789

Limit the number of repeats in pseudo random python list

For a science experiment, I need to generate a pseudo-random order for administering two different tests, 10 times each. I've used this code:

import random
randy  = [1] * 10 + [2] * 10 
random.shuffle(randy)

This gives me a nicely shuffled order for testing, however I need to ensure that the max number of repeated tests is not larger than 3. In other words, don't administer the "1" test more than 3 times in a row.

Can anyone think of a good way to do this? Shuffling multiple times does not guarantee success. Any way I can robustly check the shuffled list and change it accordingly? Thanks!

Upvotes: 3

Answers (6)

Benoît Valéry

Reputation: 1

Here is an optimized version of Veedrac's answer, where you want to get only one correct list. More interesting if you want to get a sequence on the fly, but less if you want to avoid sequence repetition.

from random import shuffle
from itertools import groupby

def get_binary_sequence(sequence_length, maximum_repetitions):
    order = [True, False] * int(sequence_length/2)
    while True:
        shuffle(order)
        if all(len(list(group)) <= maximum_repetitions _, group in groupby(order)):
            return order

Upvotes: 0

Veedrac

Reputation: 60147

John Y's solution makes you search the whole solution space; although this is bearable it's hardly worth doing. Instead, just sample optimistically:

import random

sequences = []
order = [1, 0] * 10

while len(sequences) < 10:
    random.shuffle(order)

    if order in sequences:
        continue

    sequences.append(order[:])

Then to remove groups of length 4, you could check with something like

from itertools import groupby

while len(sequences) < 10:
    random.shuffle(order)

    if order in sequences:
        continue

    if all(len(list(group)) < 4 for _, group in groupby(order)):
        sequences.append(order[:])

Upvotes: 2

John Y

Reputation: 14529

For a problem of such small size, I disagree with @texasflood's comment that precomputing all the possibilities and then just picking from among them would be grossly inefficient. In fact, the stated parameters are so small that it is very manageable to just generate all the possibilities using sheer brute force, as I'll demonstrate below.

In your particular case, you always only run 20 tests, and you only have 2 possible tests to choose from. So you know that you cannot possibly have more than 2**20 sequences, even with no other constraints. This is only 1048576 possibilities, easily manageable with today's memory.

Further, according to your problem statement, you're constrained to using 10 of one test and 10 of the other. That reduces the number of possibilities to 184756. (Using classical probability counting techniques, this is computed as 20!/(10!*10!).)

And that's before you've even eliminated the possibilities with runs of four (or more) of the same test in a row.

So, my strong recommendation is to just do the work of computing all the usable possibilities, and then use random.choice on this collection of possibilities.

To get you started, here's a simple loop to get all the possible sequences that contain exactly 10 zeros and 10 ones:

sequences = []
for n in range(2**20):
    b = bin(n)[2:].zfill(20)
    if b.count('1') == 10:
        sequences.append(b)

Note that the bin function (which requires Python 2.6 or later) generates a binary string representation of an integer, which starts with '0b' (thus the [2:] to strip it off).

I'll leave it as an exercise for the reader to eliminate the four-in-a-row sequences. (Hint: You can simply refine the sample code I gave above, by testing for the existence of '1111' or the existence of '0000' in the binary string. You will wind up with a total of 66486 usable sequences, quite a small number by today's standards.)

Upvotes: 2

Reut Sharabani

Reputation: 31339

Here is an optimistic-retry strategy:

#!/usr/bin/env python
from random import choice

def added1(lst, bank):
    if len(bank) == 0:
        return lst
    selection = choice(bank)
    lst.append(selection)
    bank.remove(selection)
    if selection == 1:
        return added11(lst, bank)
    return added2(lst, bank)

def added11(lst,bank):
    if len(bank) == 0:
        return lst
    bank.remove(2)
    lst.append(2)
    return added2(lst, bank)

def added2(lst, bank):
    if len(bank) == 0:
        return lst
    selection = choice(bank)
    lst.append(selection)
    bank.remove(selection)
    if selection == 2:
        return added22(lst, bank)
    return added1(lst, bank)

def added22(lst,bank):
    if len(bank) == 0:
        return lst
    bank.remove(1)
    lst.append(1)
    return added1(lst, bank)

def start(lst, bank):
    bank_bkp = bank[:]
    while True:
        try:
            if len(bank) == 0:
                return lst
            selection = choice(bank)
            lst.append(selection)
            bank.remove(selection)
            if selection == 1:
                return added1(lst, bank)
            return added2(lst, bank)
        except:
            # retry
            bank = bank_bkp[:]
            lst = []


print start([], [1] * 10 + [2] * 10)

Output:

[1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 2, 2]

It is based on simple functions representing states in this automaton:

which enforces the rules, and a bank of options. If the bank of options runs out - it tries again.

It could potentially take a lot of time, but it doesn't:

print timeit.repeat('start([], [1] * 10 + [2] * 10)', setup="from __main__ import start", number=10000, repeat=3)

Output:

[0.14524006843566895, 0.14585399627685547, 0.14375996589660645]

Note: This is recursive so a bank with over ~2000 members requires you to explicitly allow deeper recursions.

Upvotes: 2

Peter

Reputation: 3495

This doesn't keep exactly 10 1's and 10 2's so may not be what you're after, but it goes based on the chance (currently 50% for each), and you can add new tests if needed.

import random
from operator import itemgetter

#randy = [ [item,amount], ... ]
randy  = [[1,10],[2,10]]

#This turns the above list into the same format of your 'randy'
itemList = [j for k in[([i[0]]*i[1])for i in randy]for j in k]  

randomList = [-1]  #This stops the check from causing problems at the start
for i in range(len(itemList)):
    while True:
        newChoice = random.choice( itemList )
        if len(set(randomList[-2:]+[newChoice]))-1: #Checks the last 2 values plus the new value aren't all the same
            randomList.append( newChoice )
            break
shuffledList = randomList[1:]

Upvotes: 0

Martijn Arts

Reputation: 743

I would make your own shuffler since it would probably be the fastest and most elegant option:

randy = []

ones = [1] * 10
twos = [2] * 10

for i in range(20):
    if len(randy) > 3 and randy[i-1] == randy[i-2] == randy[i-3]:
        randy.append(ones.pop() if randy[i-1] == 1 else twos.pop())
    else:
        randy.append(random.choice([ones, twos]).pop())

Upvotes: 0

Limit the number of repeats in pseudo random python list

Answers (6)

Related Questions