user14262940
user14262940

Reputation: 1

All Possible Combinations from an array with boundary conditions in Python?

I have a list of 4 items, and I would like to find all possible combinations of these 4 items for a given array length. Ex: if I specify that I want the length to be 16, I want to generate all possible combinations of 16 items using the elements from the original list.

I also have restrictions, so I want each element in the original list to appear a certain number of times in the generated combinations.

Ex: colors = np.array(["red","blue", "green", "neon green"])

I want to generate all possible 16 element arrays that use these given colors in a certain ratio: "red" must appear 4 times, "blue" must appear 4 times, and "green" + "neon green" together must appear 8 times (it doesn't matter if we have 8 greens, 0 neon greens, or vice versa, or a mix of both as long as the sum of both in the combination is 8.

I tried using itertools.product to generate all possible combinations and then looping through each combination to see if it meets my criteria for a "valid array," but for a 16-element long combination, it times out (although this method does work if I want to do 4 or 8-element long combinations).

This is my current code:

def validCombinations(rows, columns):
    possibleCombinations = product(colors,repeat=rows) #generates all the possible combinations
    possibleCombos = [] #possible combinations that match our 1:2:1 ratio restriction
    

    counter = 1

    #loops through each combination and puts each ion in that combination into an array (array a)
    for c in product(possibleCombinations,repeat=columns):
        a = []
        for i in c:
            for j in i:
                a.append(j)


        #if a (the combination) contains a 1:2:1 ratio, then add it to the array of possible combos 
        if a.count("red") == (rows *columns)/4 and a.count("blue") == (rows *columns)/4 and (a.count("green") + a.count("neon green")) == (rows*columns)/2:
              possibleCombos.append(a)

  
    
    print(len(possibleCombos))
    return(possibleCombos)

validCombinations(2,2) #for a list of 4 elements
#validCombinations(4,4) #for a list of 16 elements
#validCombinations(2,4) etc..

Is itertools.product() the right approach, or is there a faster alternative?

Upvotes: 0

Views: 1209

Answers (2)

VirtualScooter
VirtualScooter

Reputation: 1888

Like Alexander Wu is stating, you might want to re-consider your problem definition. Using the itertools.product is most likely the best way to get close, but your current code introduces inefficiencies causing it to run too long for relatively small sizes.

However, if you shoot for having valid combinations, the code below will generate that much more efficiently:

import itertools

colors1 = ('blue', 'red')
colors2 = ('green', 'neon green')

def equalCount(c1, sz, curr):
    """ Even representation of c1 in curr. """
    return not all(curr.count(c) == sz for c in c1)

def validCombinations1(c1, c2, size):
    """ List valid combinations of colors."""
    sz = int(size / 2)
    col_iter1 = itertools.filterfalse(lambda x: equalCount(c1, sz/2, x),
                  itertools.product(c1, repeat=sz))
    col_iter2 = itertools.product(c2, repeat=sz)
    vc = tuple(itertools.product(col_iter1, col_iter2))
    return tuple(v[0]+v[1] for v in vc)

vc1 = validCombinations1(colors1, colors2, 4)
print(vc1)
print(len(vc1))
vc2 = validCombinations1(colors1, colors2, 8)
print(len(vc2))
vc3 = validCombinations1(colors1, colors2, 16)
print(len(vc3))

Output:

# (('blue', 'red', 'green', 'green'),
#  ('blue', 'red', 'green', 'neon green'),
#  ('blue', 'red', 'neon green', 'green'),
#  ('blue', 'red', 'neon green', 'neon green'),
#  ('red', 'blue', 'green', 'green'),
#  ('red', 'blue', 'green', 'neon green'),
#  ('red', 'blue', 'neon green', 'green'),
#  ('red', 'blue', 'neon green', 'neon green'))
# 8
# 48
# 17920

Your code seems to produce duplicate permutations; note that the code above also produces some permutations, but no duplicates. But without knowing a bit more about the "rules", it's difficult to build a more appropriate program.

I ran some timing on my laptop to compare results. The code above, even with permutation generation, compares favorably, with 10 microseconds versus 235 microseconds for size 4.

%timeit validCombinations(2, 2)
235 µs ± 48.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit validCombinations1(colors1, colors2, 4)
7.93 µs ± 608 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit validPermutations(validCombinations1(colors1, colors2, 4))
10.5 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit validCombinations1(colors1, colors2, 8)
46.4 µs ± 5.58 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit validCombinations1(colors1, colors2, 16)
5.87 ms ± 499 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Upvotes: 0

Alexander Wu
Alexander Wu

Reputation: 483

I don't think the problem you're trying to solve is feasible. There is probably a faster way than your current approach, but even the theoretically optimal solution would take very long unless you have an extremely fast computer running in a fast language, and since you're using Python, I assume that is not the case.

You are searching all possible combinations of the four colors. There are a total of 4**16 == 4294967296 combinations (each item has 4 choices, apply the product counting rule), and assuming generating each one takes a millisecond, this would take 1193 hours. Clearly it is not feasible in Python to iterate over all of those.

Even if there were a better way that only generates the combinations that fit your criteria, that is still comb(16, 4) * comb(12, 4) * 2**8 == 230630400 combinations (choose 4 locations out of 16 for red, choose 4 locations out of 12 for blue, then each of the remaining 8 positions can be one of two colors), and again assuming each one takes a millisecond to process, that is 64 hours.

You should consider changing your implementation to not require generating every combination. Perhaps you only need to check whether something is a valid combination. Or maybe you don't actually need such large numbers; if you only input relatively small numbers the code you're currently using isn't that much slower than the optimal.

Upvotes: 4

Related Questions