Reputation: 4167
I have this issue where I would like to shuffle a list, but only do so slightly. Say, I want only a small number of elements to be moved. Is there a simple way to get this done?
Right now the best I can think of is building my own method by hand, but is there some way to use the random library to do this for me?
Upvotes: 5
Views: 2404
Reputation: 3984
With the same interpretation as pjs' answer (initial ordering more or less preserved), you can move a window of a given width and shuffle the underlying elements. To avoid any bias, move this window randomly.
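from random import shuffle

def shuffle_window(array, width):
    # Visit every window start position in random order to avoid directional bias.
    indexes = list(range(len(array) - width + 1))
    shuffle(indexes)
    for i in indexes:
        # Shuffle the slice in a temporary list, then write it back in place.
        window = array[i : i + width]
        shuffle(window)
        array[i : i + width] = window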
Examples:
for width in range(6):
    array = list(range(25))
    shuffle_window(array, width)
    print(array)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
[2, 0, 4, 1, 3, 7, 5, 6, 8, 11, 9, 12, 10, 15, 13, 14, 17, 16, 19, 18, 20, 23, 21, 24, 22]
[2, 4, 0, 3, 7, 1, 8, 6, 12, 5, 11, 10, 14, 9, 16, 15, 17, 13, 19, 18, 20, 24, 21, 22, 23]
[0, 6, 1, 2, 5, 3, 4, 12, 8, 7, 11, 9, 14, 13, 10, 16, 19, 17, 24, 23, 18, 20, 15, 21, 22]
[3, 2, 14, 6, 1, 9, 8, 12, 13, 0, 4, 15, 5, 17, 11, 16, 7, 23, 21, 20, 10, 18, 19, 22, 24]
Upvotes: 0
Reputation: 141
Just to provide an intuition on a possible use case of "slightly shuffling": Imagine decrypting a cipher text (permutation-only) with MCMC. One can consider an initial guess of mapping informed by prior knowledge. One of the many ways to construct such an informed prior is, e.g. to ensure the ranks of frequency statistics of each letter in the de-ciphered text match the ranks of some frequency statistics of each letter in a large corpus. In this case, one might only want to "slightly shuffle" this initial guess across an ensemble of MCMC runs.
A straightforward implementation, since MCMC is already stochastic, is to allow shuffling only within a certain "window" for each letter. Of course, if this is done sequentially, a letter can still end up moving farther than one window, but with a larger vocabulary this should still give a slight enough shuffle.
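One possible reading of the windowed idea above, as a minimal sketch: walk the list once and swap each position with a partner at most `window` slots ahead. The function name `local_swap_shuffle` and its `rng` parameter are hypothetical, introduced only for illustration. Because the pass is sequential, an element can be carried to the right more than once, which is the larger "move" mentioned above; it can never end up more than `window` slots to the left of where it started.

```python
import random

def local_swap_shuffle(items, window, rng=random):
    """Return a copy where each position is swapped with one at most `window` slots ahead."""
    result = list(items)
    n = len(result)
    for i in range(n):
        # Choose a partner no more than `window` positions to the right (may be i itself).
        j = rng.randint(i, min(i + window, n - 1))
        result[i], result[j] = result[j], result[i]
    return result
```

With `window=0` the list comes back unchanged; small values such as 1 or 2 keep most elements near their starting positions.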
Upvotes: 0
Reputation: 5380
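Use the shuffle method of Python's random module. It takes a list and, optionally, a random function as arguments, where random is a function that returns a float in the range [0.0, 1.0). This lets you influence how shuffle reorders the given list.
You can supply your own function. Note that the random argument was deprecated in Python 3.9 and removed in 3.11, so this only works on older versions.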
import random
def rand():
    return random.random() / 5
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
random.shuffle(arr, random=rand)
# OUTPUT: [9, 3, 4, 5, 6, 7, 8, 1, 2]
Upvotes: 0
Reputation: 16174
to show what some of these solutions are doing, I find it helps to run each algorithm many times (Monte Carlo style) and look at the distribution of outcomes
first, a tidied up version of @meta4's solution, as it was the most fleshed out:
from random import randrange

def partial_shuffle(l, factor=5):
    n = len(l)
    for _ in range(factor):
        a, b = randrange(n), randrange(n)
        l[b], l[a] = l[a], l[b]
which we can run many times by doing:
import numpy as np
n = 8
orig = list(range(n))
occur = np.zeros((n, n), int)
for _ in range(100000):
    x = orig[:]
    partial_shuffle(x)  # default factor=5; the table below matches five swaps, not one
    occur[orig, x] += 1
if we print out the occurrences table as percentages we get:
[[33.5 9.6 9.5 9.4 9.4 9.6 9.5 9.5]
[ 9.6 33.2 9.7 9.5 9.6 9.6 9.4 9.4]
[ 9.5 9.6 33.2 9.5 9.6 9.5 9.6 9.5]
[ 9.5 9.3 9.6 33.4 9.5 9.5 9.5 9.6]
[ 9.4 9.6 9.4 9.6 33.3 9.5 9.7 9.5]
[ 9.6 9.5 9.6 9.6 9.4 33.3 9.5 9.6]
[ 9.4 9.7 9.5 9.5 9.5 9.6 33.2 9.7]
[ 9.5 9.5 9.6 9.5 9.7 9.5 9.6 33.2]]
each row gives the probability (in percent) of the item at that index ending up in each column. in this case (when n=8) the algorithm leaves an element where it was ~33% of the time, and otherwise sends it to any other position with equal probability
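The ~33% diagonal can be cross-checked without numpy. Assuming each swap picks both indexes uniformly and independently, an element is back at its starting index after k swaps with probability 1/n + (1 - 1/n)(1 - 2/n)^k. For n = 8 and k = 5 that is about 0.333, whereas a single swap would give about 0.78. The sketch below compares that closed form against a small simulation; the helper names are made up for illustration.

```python
from random import Random

def stay_probability(n, swaps):
    """Closed-form chance that an element ends at its starting index after `swaps` random swaps."""
    return 1 / n + (1 - 1 / n) * (1 - 2 / n) ** swaps

def simulate_stay(n, swaps, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, tracking element 0 only."""
    rng = Random(seed)
    stayed = 0
    for _ in range(trials):
        x = list(range(n))
        for _ in range(swaps):
            a, b = rng.randrange(n), rng.randrange(n)
            x[a], x[b] = x[b], x[a]
        stayed += x[0] == 0
    return stayed / trials

print(round(stay_probability(8, 5), 3))  # 0.333
```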
I can then run (a tidied up) version of pjs's code with:
from random import gauss
orderliness = 2
occur = np.zeros((n, n), int)
for _ in range(100000):
    x = sorted(orig, key=lambda i: gauss(i * orderliness, 1))
    occur[orig, x] += 1
which gives very different output:
[[91.9 7.9 0.1 0. 0. 0. 0. 0. ]
[ 7.9 84.1 7.8 0.1 0. 0. 0. 0. ]
[ 0.1 7.8 84.1 7.9 0.1 0. 0. 0. ]
[ 0. 0.1 7.9 84.1 7.7 0.1 0. 0. ]
[ 0. 0. 0.1 7.7 84.2 7.8 0.1 0. ]
[ 0. 0. 0. 0.1 7.9 84.2 7.7 0.1]
[ 0. 0. 0. 0. 0.1 7.7 84.2 7.9]
[ 0. 0. 0. 0. 0. 0.1 7.9 91.9]]
i.e. items tend to remain close to where they started
this sort of table is great at detecting bias in the distribution, which there doesn't seem to be evidence of above. but, for example, Artyom's solution (shuffle(x, lambda: random() / 5)) gives the following:
[[ 0. 37.4 0. 0. 0. 16.7 23.8 22.1]
[ 0. 0. 100. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 100. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 100. 0. 0. 0. ]
[ 1.7 0. 0. 0. 0. 83.3 11.9 3. ]
[ 9. 7.4 0. 0. 0. 0. 64.2 19.4]
[ 26.7 17.9 0. 0. 0. 0. 0. 55.5]
[ 62.6 37.4 0. 0. 0. 0. 0. 0. ]]
which probably isn't what the OP wanted. the high probability off diagonal represents rotating the array by one element
Upvotes: 3
Reputation: 19853
One interpretation is to strongly or weakly retain the initial ordering. The weakest retention would be a completely random shuffle, the strongest would be to not deviate from the initial ordering.
This can be accomplished by creating a tuple consisting of the original index scaled by a constant, plus some randomness, followed by the value. Sort the tuples, then iterate through to recover the original values in their new order. If the scale factor for the index is near zero, the new order will be random. If it's near 1, things will tend to strongly but not perfectly retain their original ordering. If it's larger, the result becomes unlikely to be shuffled.
import random
orderliness = 0.75
def tuplify(x, y):
    return (orderliness * y + random.gauss(0,1), x)
values = [i+1 for i in range(20)]
print(values)
pairs = list(map(tuplify, values, range(len(values))))
pairs.sort()
partially_ordered_values = [p[1] for p in pairs]
print(partially_ordered_values)
This produces, for example:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20] # initial ordering
[2, 1, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 11, 14, 17, 16, 15, 18, 19, 20] # weakly shuffled
The tendency to shuffle is determined by the relative magnitudes of orderliness and the standard deviation in random.gauss().
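To experiment with that trade-off, the same idea can be wrapped in a function that exposes both knobs. This is only a sketch: `noisy_sort_shuffle`, `sigma`, and `rng` are names introduced here, not part of the code above. It sorts indexes rather than (key, value) tuples, so the values themselves never need to be comparable.

```python
import random

def noisy_sort_shuffle(values, orderliness=0.75, sigma=1.0, rng=random):
    """Return values reordered by sorting on (scaled index + Gaussian noise)."""
    # One noisy key per position: larger orderliness or smaller sigma keeps more order.
    keys = [orderliness * i + rng.gauss(0, sigma) for i in range(len(values))]
    order = sorted(range(len(values)), key=keys.__getitem__)
    return [values[i] for i in order]
```

For instance, `noisy_sort_shuffle(list(range(20)), orderliness=2.0)` should move fewer elements than the default, while `orderliness=0.0` amounts to a full random shuffle.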
Upvotes: 4
Reputation: 733
One could also interpret slightly shuffled in the sense that there is a probability for shuffling elements at every step of the Fisher-Yates algorithm @rossum and @meta4 mentioned (instead of having a fixed number of elements shuffled).
import random

def conditional_fy(l, p):
    """Shuffle elements of a list with a given probability

    Args:
        l: list
        p: shuffle probability
            (0: elements are never shuffled,
             1: elements are always shuffled)
    """
    assert 0 <= p <= 1
    for i in range(len(l) - 1, 0, -1):
        shuffle = random.random()
        if shuffle < p:
            # j < i, so a triggered step always moves the element at i
            j = random.randint(0, i - 1)
            l[i], l[j] = l[j], l[i]
Upvotes: 0
Reputation: 448
from random import randint

def partial_shuffle(l, factor=5):
    for _ in range(factor):
        # pick two random indexes; randint is inclusive at both ends, so stop at len(l) - 1
        a, b = randint(0, len(l) - 1), randint(0, len(l) - 1)
        l[b], l[a] = l[a], l[b]  # swap the values at those indexes
    return l
This is the partial Fisher-Yates shuffle @rossum recommended.
''.join(partial_shuffle(list('abcdefghijklmnopqrstuvwxyz'), 2))
This example yielded "abcdefnhijklmgopqrsyuvwxtz" on one run, but will yield something else on a different run.
Upvotes: 1
Reputation: 15685
Use a Fisher-Yates shuffle, but do not run it for the entire list. Just run one step for each entry you want moved: 5 steps to move 5 entries, 10 steps to move 10 entries.
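A minimal sketch of that suggestion, assuming the forward variant of Fisher-Yates (the name `partial_fisher_yates` and the `rng` parameter are illustrative only). Each step swaps one slot with a uniformly chosen slot from the remainder of the list, so `steps` iterations change at most `2 * steps` positions, and possibly fewer when a slot is swapped with itself.

```python
import random

def partial_fisher_yates(items, steps, rng=random):
    """Run only the first `steps` iterations of a Fisher-Yates shuffle, in place."""
    n = len(items)
    for i in range(min(steps, n - 1)):
        # Swap slot i with a uniformly chosen slot from i to the end (possibly itself).
        j = rng.randrange(i, n)
        items[i], items[j] = items[j], items[i]
    return items
```

Note that this version always works on the first `steps` slots, so the changes are concentrated at the front of the list; if that matters, the slots to process could be chosen at random instead.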
Upvotes: -1