fred.schwartz

Reputation: 2155

Python Combinations and product

I have a list of DataFrame column names:

L = [AA, AS, AD, BB, BC, CD, CF, CG]

and I need all combinations of items, in no particular order.

However, I can only have one name starting with A in each combination BUT I can have multiple names starting with C or none.

For names starting with B, I must have at least one but could have more.

So I need all combinations of

A=[AA,AS,AD] #only one of these
B=[BB,BC]  #at least one of these
all_others=[CD,CF,CG]  #All, 1, 2 or none of these

So far I have this code:

from itertools import product

for choices in product(['AA','AS','AD',None],['BB', 'BC', None], ['CD','CF', None],):
    print(' '.join(column for column in choices if column))

This works, but it only allows one value starting with C, whereas I want to produce all combinations of the C names. Can anyone suggest a good edit?

To summarise: I need all combinations of the names in the list, with the one rule that you can't have more than 1 variable starting with A and more than one variable starting with B.

Upvotes: 3

Views: 2939

Answers (3)

Ben Mares

Reputation: 2157

Here is a more robust/general way of doing the sort of thing you want. I start by defining a helper function:

from itertools import combinations, chain, product

def subsets_of_length(s, lengths):
    return chain.from_iterable(combinations(s,l) for l in lengths)

It produces the following output:

>>> list(subsets_of_length(['a','b','c'], range(2,4)))
[('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]

>>> list(subsets_of_length(['d','e'], range(0,2)))
[(), ('d',), ('e',)]

Now we want to combine two or more subsets as follows:

>>> for choices in product(
...     subsets_of_length(['a','b','c'], range(2,4)),
...     subsets_of_length(['d','e'], range(0,2)),
... ):
...     print(' '.join(str(subset) for subset in choices))

('a', 'b') ()
('a', 'b') ('d',)
('a', 'b') ('e',)
('a', 'c') ()
('a', 'c') ('d',)
('a', 'c') ('e',)
('b', 'c') ()
('b', 'c') ('d',)
('b', 'c') ('e',)
('a', 'b', 'c') ()
('a', 'b', 'c') ('d',)
('a', 'b', 'c') ('e',)

But we want to chain these tuples together. Thus we should do:

>>> for choices in map(chain.from_iterable, product(
...     subsets_of_length(['a','b','c'], range(2,4)),
...     subsets_of_length(['d','e'], range(0,2)),
... )):
...     print(' '.join(column for column in choices if column))

a b
a b d
a b e
a c
a c d
a c e
b c
b c d
b c e
a b c
a b c d
a b c e

The code for the case of your edited question would be:

for choices in map(chain.from_iterable,product(
    subsets_of_length(['AA','AS','AD'], [1]),       #only one of these
    subsets_of_length(['BB','BC'], [1,2]),          #at least one of these
    subsets_of_length(['CD','CF','CG'], [0,1,2,3]), #All, 1, 2 or none of these
)):
    print(' '.join(column for column in choices if column))
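
As a sanity check, the same product can be collected into a list and counted. This is only a minimal sketch: it repeats the helper from above so that it runs on its own, and the name `results` is introduced here purely for illustration. With exactly one of three A names, one or two of two B names, and any subset of three C names, there should be 3 * 3 * 8 = 72 combinations.

```python
from itertools import chain, combinations, product

def subsets_of_length(s, lengths):
    # All subsets of s whose size is one of the given lengths.
    return chain.from_iterable(combinations(s, l) for l in lengths)

# Hypothetical name: `results` holds each combination as a flat tuple of column names.
results = [
    tuple(chain.from_iterable(choice))
    for choice in product(
        subsets_of_length(['AA', 'AS', 'AD'], [1]),          # exactly one A
        subsets_of_length(['BB', 'BC'], [1, 2]),             # at least one B
        subsets_of_length(['CD', 'CF', 'CG'], range(0, 4)),  # any subset of the C names
    )
]

print(len(results))  # 3 * 3 * 8 = 72
```

Keeping the combinations as tuples rather than joined strings may be more convenient if they are later used to select DataFrame columns.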

Upvotes: 1

Mehrdad Pedramfar

Reputation: 11073

Try this instead of your for loop:

import itertools

for choices in itertools.product(['AA','AS','AD',None], ['BB','BC',None], [' '.join(k) for i in range(3) for k in itertools.combinations(['CD','CF'], i)]):
    print(' '.join(column for column in choices if column))

The output is:

AA BB
AA BB CD
AA BB CF
AA BB CD CF
AA BC
AA BC CD
AA BC CF
AA BC CD CF
AA
AA CD
AA CF
AA CD CF
AS BB
AS BB CD
AS BB CF
AS BB CD CF
AS BC
AS BC CD
AS BC CF
AS BC CD CF
AS
AS CD
AS CF
AS CD CF
AD BB
AD BB CD
AD BB CF
AD BB CD CF
AD BC
AD BC CD
AD BC CF
AD BC CD CF
AD
AD CD
AD CF
AD CD CF
BB
BB CD
BB CF
BB CD CF
BC
BC CD
BC CF
BC CD CF

CD
CF
CD CF

I recommend replacing None with '' or removing the None entries altogether.
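
One possible reading of that suggestion, sketched under the assumption that an empty string should stand for "nothing chosen from this group": since '' is falsy just like None, the same `if column` filter drops it. The names `c_choices` and `lines` are illustrative only.

```python
import itertools

# Every subset of the C names, joined into one string ('' for the empty subset).
c_choices = [' '.join(k) for i in range(3)
             for k in itertools.combinations(['CD', 'CF'], i)]

lines = []
for choices in itertools.product(['AA', 'AS', 'AD', ''], ['BB', 'BC', ''], c_choices):
    # Empty strings are falsy, so they are skipped the same way None was.
    lines.append(' '.join(column for column in choices if column))

print(len(lines))  # 4 * 3 * 4 = 48
```

Using one placeholder type throughout means every element is a string, which avoids mixing None and '' in the same product.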

Upvotes: 2

Ben Mares

Reputation: 2157

Sure, to express

all_others=[CD,CF,CG]  #All, 1, 2 or none of these

break it up as

all_others=[CD]  #one or none of these
all_others=[CF]  #one or none of these
all_others=[CG]  #one or none of these

Then your code becomes

from itertools import product

for choices in product(['AA','AS','AD',None],['BB', 'BC', None], ['CD', None], ['CF', None], ['CG', None],):
    print(' '.join(column for column in choices if column))

This handles this particular example. However, if you have many items starting with C, they can be handled more systematically as follows:

from itertools import product

for choices in product(['AA','AS','AD',None],['BB', 'BC', None], *product(['CD', 'CF', 'CG'], [None]),):
    print(' '.join(column for column in choices if column))

To explain what's going on, taking the product of ['CD', 'CF', 'CG'] with [None] yields an iterator containing

('CD', None), ('CF', None), ('CG', None)

These are precisely the arguments we wish to pass to product. The * operator unpacks the elements of an iterator into separate function arguments. Thus the above two code snippets are equivalent.
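
The equivalence can be checked directly. This is a small sketch; the names `a_choices`, `b_choices`, `explicit` and `unpacked` are made up for the comparison and do not appear in the snippets above.

```python
from itertools import product

a_choices = ['AA', 'AS', 'AD', None]
b_choices = ['BB', 'BC', None]

# Written out by hand: one [name, None] pair per C column.
explicit = list(product(a_choices, b_choices, ['CD', None], ['CF', None], ['CG', None]))

# Generated: product(..., [None]) yields ('CD', None), ('CF', None), ('CG', None),
# and * unpacks those three tuples as separate arguments to the outer product.
unpacked = list(product(a_choices, b_choices, *product(['CD', 'CF', 'CG'], [None])))

print(explicit == unpacked)  # True
print(len(unpacked))         # 4 * 3 * 2 * 2 * 2 = 96
```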

Upvotes: 1
