Reputation: 2155
I have a list of DataFrame column names:
L = [AA, AS, AD, BB, BC, CD, CF, CG]
and I need all combinations of these items, in no particular order.
However, each combination can contain only one name starting with A, but it can contain several names starting with C, or none.
As for the B's, each combination must contain at least one, but could contain more.
So I need all combinations of
A=[AA,AS,AD] #only one of these
B=[BB,BC] #at least one of these
all_others=[CD,CF,CG] #All, 1, 2 or none of these
So far I have this code:
from itertools import product
for choices in product(['AA', 'AS', 'AD', None], ['BB', 'BC', None], ['CD', 'CF', None]):
    print(' '.join(column for column in choices if column))
This works; however, it only allows one value starting with C, and I want to produce
all combinations of the C names. Can anyone see a good edit I can make?
To summarise: I need all combinations of the names in the list, with one rule: a combination can't contain more than one name starting with A or more than one name starting with B.
Upvotes: 3
Views: 2939
Reputation: 2157
Here is a more robust/general way of doing the sort of thing you want. I start by defining a helper function:
from itertools import combinations, chain, product

def subsets_of_length(s, lengths):
    # All subsets of s whose size is one of the values in lengths
    return chain.from_iterable(combinations(s, l) for l in lengths)
It produces the following output:
>>> list(subsets_of_length(['a','b','c'], range(2,4)))
[('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]
>>> list(subsets_of_length(['d','e'], range(0,2)))
[(), ('d',), ('e',)]
Now we want to combine two or more subsets as follows:
>>> for choices in product(
...         subsets_of_length(['a','b','c'], range(2,4)),
...         subsets_of_length(['d','e'], range(0,2)),
...     ):
...     print(' '.join(str(subset) for subset in choices))
...
('a', 'b') ()
('a', 'b') ('d',)
('a', 'b') ('e',)
('a', 'c') ()
('a', 'c') ('d',)
('a', 'c') ('e',)
('b', 'c') ()
('b', 'c') ('d',)
('b', 'c') ('e',)
('a', 'b', 'c') ()
('a', 'b', 'c') ('d',)
('a', 'b', 'c') ('e',)
But we want to chain these tuples together, so we should do:
>>> for choices in map(chain.from_iterable, product(
...         subsets_of_length(['a','b','c'], range(2,4)),
...         subsets_of_length(['d','e'], range(0,2)),
...     )):
...     print(' '.join(column for column in choices if column))
...
a b
a b d
a b e
a c
a c d
a c e
b c
b c d
b c e
a b c
a b c d
a b c e
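To see what the chain.from_iterable step does on its own, here is a small sketch applied to single elements of the product above. The variable names are mine, chosen only for illustration.

```python
from itertools import chain

# One element of the product above: a tuple of per-group subsets.
choice = (('a', 'b'), ('d',))

# chain.from_iterable walks each inner tuple in turn, yielding a flat stream.
flat = list(chain.from_iterable(choice))
print(flat)  # ['a', 'b', 'd']

# An empty subset simply contributes nothing.
print(list(chain.from_iterable((('a', 'c'), ()))))  # ['a', 'c']
```

Because empty subsets vanish during flattening, no None placeholders are needed with this approach.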
The code for the case of your edited question would be:
for choices in map(chain.from_iterable, product(
        subsets_of_length(['AA','AS','AD'], [1]),  # only one of these
        subsets_of_length(['BB','BC'], [1,2]),  # at least one of these
        subsets_of_length(['CD','CF','CG'], [0,1,2,3]),  # all, 1, 2 or none of these
    )):
    print(' '.join(column for column in choices if column))
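If you would rather collect the combinations than print them, the same idea can be wrapped in a function. This is only a sketch: constrained_combinations and its (names, allowed_sizes) argument format are hypothetical names I made up for illustration, not part of itertools.

```python
from itertools import chain, combinations, product

def subsets_of_length(s, lengths):
    """Yield every subset of s whose size is one of the given lengths."""
    return chain.from_iterable(combinations(s, l) for l in lengths)

def constrained_combinations(groups):
    """Hypothetical helper: groups is a list of (names, allowed_sizes) pairs."""
    per_group = [subsets_of_length(names, sizes) for names, sizes in groups]
    # Pick one subset per group, then flatten each pick into a single tuple.
    return [tuple(chain.from_iterable(choice)) for choice in product(*per_group)]

result = constrained_combinations([
    (['AA', 'AS', 'AD'], [1]),           # exactly one A
    (['BB', 'BC'], [1, 2]),              # at least one B
    (['CD', 'CF', 'CG'], [0, 1, 2, 3]),  # any subset of the C names
])
print(len(result))  # 3 * 3 * 8 = 72
print(result[0])    # ('AA', 'BB')
```

The count is a handy sanity check: 3 ways to pick the A, 3 non-empty subsets of the B's, and 8 subsets of the C's.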
Upvotes: 1
Reputation: 11073
Try this instead of your for loop:
import itertools
c_choices = [' '.join(k) for i in range(3) for k in itertools.combinations(['CD', 'CF'], i)]
for choices in itertools.product(['AA', 'AS', 'AD', None], ['BB', 'BC', None], c_choices):
    pass  # do what you need
The output for choices, using print(' '.join(column for column in choices if column)), is:
AA BB
AA BB CD
AA BB CF
AA BB CD CF
AA BC
AA BC CD
AA BC CF
AA BC CD CF
AA
AA CD
AA CF
AA CD CF
AS BB
AS BB CD
AS BB CF
AS BB CD CF
AS BC
AS BC CD
AS BC CF
AS BC CD CF
AS
AS CD
AS CF
AS CD CF
AD BB
AD BB CD
AD BB CF
AD BB CD CF
AD BC
AD BC CD
AD BC CF
AD BC CD CF
AD
AD CD
AD CF
AD CD CF
BB
BB CD
BB CF
BB CD CF
BC
BC CD
BC CF
BC CD CF
CD
CF
CD CF
I recommend replacing None with '' or removing the None entries.
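A minimal sketch of that last suggestion, under two assumptions of mine: '' stands in for None, and CG from the question's list is added to the C names.

```python
import itertools

c_names = ['CD', 'CF', 'CG']  # assumption: CG added from the question's list

# Every subset of the C names, pre-joined into one string ('' for the empty subset).
c_choices = [' '.join(k)
             for i in range(len(c_names) + 1)
             for k in itertools.combinations(c_names, i)]

rows = []
for choices in itertools.product(['AA', 'AS', 'AD', ''], ['BB', 'BC', ''], c_choices):
    # '' is falsy, so the same filter that dropped None also drops it.
    rows.append(' '.join(column for column in choices if column))

print(len(rows))  # 4 * 3 * 8 = 96, including one empty row
print(rows[:4])   # ['AA BB', 'AA BB CD', 'AA BB CF', 'AA BB CG']
```

The single empty row comes from choosing nothing in every group; filter it out if you do not want it.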
Upvotes: 2
Reputation: 2157
Sure, to express
all_others=[CD,CF,CG] #All, 1, 2 or none of these
break it up as
all_others=[CD] #one or none of these
all_others=[CF] #one or none of these
all_others=[CG] #one or none of these
Then your code becomes:
from itertools import product
for choices in product(['AA', 'AS', 'AD', None], ['BB', 'BC', None], ['CD', None], ['CF', None], ['CG', None]):
    print(' '.join(column for column in choices if column))
This handles this particular example. However, if you have many items starting with C, they can be handled more systematically as follows:
from itertools import product
for choices in product(['AA', 'AS', 'AD', None], ['BB', 'BC', None], *product(['CD', 'CF', 'CG'], [None])):
    print(' '.join(column for column in choices if column))
To explain what's going on: taking the product of ['CD', 'CF', 'CG'] with [None] yields an iterator containing
('CD', None), ('CF', None), ('CG', None)
These are precisely the arguments we wish to pass to product. The * operator unpacks the elements of an iterable into separate positional arguments, so the two code snippets above are equivalent.
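A quick way to convince yourself of the equivalence is to build both products and compare them. This is just a sketch; the names a_choices, b_choices, c_args, explicit and unpacked are mine.

```python
from itertools import product

a_choices = ['AA', 'AS', 'AD', None]
b_choices = ['BB', 'BC', None]

# What the star-unpacking expands to: one (name, None) pair per C name.
c_args = list(product(['CD', 'CF', 'CG'], [None]))
print(c_args)  # [('CD', None), ('CF', None), ('CG', None)]

# The hand-written version with one two-element list per C name...
explicit = list(product(a_choices, b_choices, ['CD', None], ['CF', None], ['CG', None]))
# ...and the version that unpacks the pairs as extra arguments.
unpacked = list(product(a_choices, b_choices, *c_args))

print(explicit == unpacked)  # True
print(len(unpacked))         # 4 * 3 * 2 * 2 * 2 = 96
```

Note that this version still allows combinations with no A or no B, since None is one of the choices in those groups.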
Upvotes: 1