Reputation: 31
I have an array of 10 elements and am looking for the fastest way to randomly generate n combinations of size m in Python with those 10 elements.
I have to do that for all possible sizes of the combinations (m goes from 1 to the number of elements) and n also varies, it increases when the number of combination possible increases.
For example if my input array is: [a, b, c, d, e, f, g, h, i, j]
for n = 4 and m = 3 the output should be:
(e, b, c)
(i, a, d)
(g, j, e)
(i, f, e)
There can't be twice the same element in one permutation.
I am aware of the itertool function that generates all combination of a given size, but I only need n combinations, not all. So I am not sure that using itertools is the fastest solution for my problem (since I have then to select randomly n combinations among all those generated).
Upvotes: 2
Views: 871
Reputation: 1684
The more-itertools
library has some useful functions that can index into a set of combinations at high speed.
from more_itertools import combination_index, nth_combination
import random
import pprint
digit_range = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
combination_size = 3
sample_size = 10
domain = digit_range * combination_size
last_combination = tuple(digit_range[-1] * combination_size)
last_index = combination_index(last_combination, domain)
sample_indexes = random.sample(range(0, last_index + 1), sample_size)
sample = [
nth_combination(domain, combination_size, index)
for index
in sample_indexes
]
pprint.pprint(sorted(sample), indent=4)
Results:
python rand_combinations.py
[ ('b', 'e', 'i'),
('b', 'f', 'a'),
('b', 'i', 'j'),
('c', 'j', 'd'),
('d', 'j', 'h'),
('f', 'g', 'b'),
('g', 'b', 'c'),
('g', 'f', 'g'),
('h', 'a', 'd'),
('i', 'c', 'g')]
The code above selects thousands of random, unique combinations in a second or so.
Method:
more_itertools.combination_index()
random.sample()
more_itertools.nth_combination()
Upvotes: 1
Reputation: 104569
For a small list, this seems as good as anything else:
def shuffle(items, m=len(items)):
for i in range(m):
j = math.floor(random.random() * len(items))
items[i],items[j] = items[j],items[i]
def makeRandomPermutations(items, n, m):
clone = items[0:]
permutations=[]
for i in range(n):
shuffle(clone,m)
permutations.append(clone[0:m])
return permutations
Example:
>>> items=['a','b','d','e','f','g','h','i','j']
>>> perms = makeRandomPermutations(items,4,3)
>>> for p in perms:
... print(p)
['b', 'i', 'j']
['a', 'g', 'e']
['e', 'j', 'a']
['a', 'd', 'h']
>>>
Upvotes: 0
Reputation: 473
Using itertools
will be the fastest on every platforms because of its efficient implementation. Combinations are not computed upfront but only when iterating.
Doing this should give you the best results :
from itertools import combinations, islice
#: Generates an iterator for 10 elements among all combinations
#: Note that it is not random at all
input_data = [1, 2, 3, 4, 5]
m_size = 3
all_combinations = combinations(input_data, m_size)
subset = islice(all_combinations, 10)
#: To add randomness
from random import shuffle
shuffle(input_data)
all_combinations = combinations(input_data, m_size)
subset = islice(all_combinations, 10)
Upvotes: 0
Reputation: 6224
Unique Element
For this approach make sure to not pass m
(sample size) value greater than the array
(population size) size. Or you can add an if statement to check the m
value.
import random
def generate_combinations(array, m, n):
"""Generate n random combinations of size m"""
if m > len(array): # Changing the m value to array length if it is greater than length of array
m = len(array)
return [random.sample(array, k=m) for _ in range(n)] # for unique element
array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = 5 # Number of combinations
m = 10
print(generate_combinations(array, m, n))
Sample Output
[[8, 1, 2, 5, 4, 3, 6, 7, 10, 9],
[9, 6, 7, 1, 8, 2, 10, 3, 4, 5],
[8, 3, 7, 10, 6, 9, 2, 5, 1, 4],
[10, 8, 9, 2, 6, 5, 7, 4, 1, 3],
[5, 4, 1, 7, 10, 6, 3, 9, 8, 2]]
For non-unique element
import random
def generate_combinations(array, m, n):
"""Generate n random combinations of size m using list comprehension"""
return [random.choices(array, k=m) for _ in range(n)]
array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = 5 # Number of combinations
m = 10
print(generate_combinations(array, m, n))
Sample Output
[[9, 8, 4, 2, 5, 4, 5, 5, 6, 1],
[2, 6, 10, 3, 6, 1, 7, 6, 2, 4],
[1, 9, 7, 9, 8, 2, 10, 1, 1, 7],
[10, 4, 5, 7, 8, 9, 5, 1, 4, 1],
[9, 3, 3, 5, 9, 4, 9, 8, 10, 6]]
or you can use numpy
(Unique)
import numpy as np
def generate_combinations(array, m, n):
if m> len(array):
m = len(array)
return [np.random.choice(array, (1, m), replace=False).tolist()[0] for _ in range(n)]
array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = 5 # Number of combinations
m = 10
print(generate_combinations(array, m, n))
Sample Output
[[6, 7, 9, 10, 8, 4, 1, 2, 3, 5],
[8, 5, 2, 10, 7, 3, 9, 4, 1, 6],
[8, 1, 4, 10, 3, 6, 9, 7, 2, 5],
[7, 4, 8, 5, 3, 9, 6, 10, 1, 2],
[1, 8, 7, 3, 9, 2, 5, 4, 10, 6]]
Upvotes: 2
Reputation: 262204
A comment on the existing answers.
Disclaimer, as originally noted in a comment, my preconception was that codester_09 was the most straightforward and likely the most efficient.
First the incorrect answers:
itertools.combinations
.combinations
and shuffle
)Now a timing of the two correct answers codester_09 and selbie.
I generated random input lists, and set m
and n
to be 1/100th and 1/10th of the length of the input, then computed the total running time and plotted with perfplot
.
I had to wrap selbie's makeRandomPermutation
function to repeat it n
times:
def selbie(array):
m = len(array)//100
n = len(array)//10
return [makeRandomPermutation(array, m) for _ in range(n)]
The winner is the simple: [random.sample(array, k=m) for _ in range(n)]
Upvotes: 1
Reputation: 219
Modifying this solution which originally gives all combinations at time complexity O(N^2). This is the modified version:
cnt=0
def printCombination(arr, size, m,n):
global cnt
data = [0]*m
combinationUtil(arr, data, 0,
size - 1, 0, m,n)
def combinationUtil(arr, data, start,
end, index, m, n):
global cnt
if (index == m):
for j in range(m):
print(data[j], end = " ");
print()
cnt+=1
return
i = start;
while(i <= end and end - i + 1 >= m - index and cnt<n):
data[index] = arr[i]
combinationUtil(arr, data, i + 1,
end, index + 1, m, n)
i += 1
arr = [1, 2, 3, 4, 5]
m = 3
size = len(arr)
n= 5
printCombination(arr, size, m, n)
This will print all combinations if needed but isn't completely random (not sure if that's you want or not)
Upvotes: 0