Reputation: 285
I have a nested list and would like to make a product of two items.
test = [[('juice', 'NOUN'), ('orange', 'FLAVOR')],
[('juice', 'NOUN'), ('orange', 'FLAVOR'), ('lemon', 'FLAVOR')],
[('orange', 'FLAVOR'), ('chip', 'NOUN')]]
What I expect is something like this:
[(('juice', 'NOUN'), ('lemon', 'FLAVOR')),
(('juice', 'NOUN'), ('chip', 'NOUN')),
(('orange', 'FLAVOR'), ('lemon', 'FLAVOR')),
(('orange', 'FLAVOR'), ('chip', 'NOUN')),
(('lemon', 'FLAVOR'), ('chip', 'NOUN'))]
That is to say, I would like to get the permutations across lists, but only for unique items. I would prefer to use itertools. Previously, I tried list(itertools.product(*test)), but I realized it produces tuples that take one item from every sublist (so their length equals the number of sublists) rather than pairs.
My current code:
import itertools
unique_list = list(set(itertools.chain(*test)))
list(itertools.combinations(unique_list, 2))
My thought process is to get the unique items in the nested list first, so that the nested list becomes [[('juice', 'NOUN'), ('orange', 'FLAVOR')], [('lemon', 'FLAVOR')], [('chip', 'NOUN')]], and then use itertools.combinations to permute. Yet it also permutes within a list (i.e. juice and orange appear together), which I do not want in my results.
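To make the problem concrete, here is a minimal sketch of what the flattened approach yields. The sorted() call is an addition for reproducible output only and is not part of the attempt above; the name unique_items is likewise just illustrative.

```python
import itertools

test = [[('juice', 'NOUN'), ('orange', 'FLAVOR')],
        [('juice', 'NOUN'), ('orange', 'FLAVOR'), ('lemon', 'FLAVOR')],
        [('orange', 'FLAVOR'), ('chip', 'NOUN')]]

# Flatten and deduplicate; sorting is only here to make the order deterministic.
unique_items = sorted(set(itertools.chain.from_iterable(test)))
pairs = list(itertools.combinations(unique_items, 2))

# Four unique items give 6 pairs, one more than the 5 wanted.
print(len(pairs))
# The extra pair combines juice and orange, which share a sublist.
print((('juice', 'NOUN'), ('orange', 'FLAVOR')) in pairs)
```

Flattening discards the information about which sublist an item first came from, which is why the within-list pair cannot be filtered out afterwards.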
Upvotes: 1
Views: 483
Reputation: 15204
This does what you want without fixing the size of the original list to 3:
Input:
test = [[('juice', 'NOUN'), ('orange', 'FLAVOR')],
[('juice', 'NOUN'), ('orange', 'FLAVOR'), ('lemon', 'FLAVOR')],
[('juice', 'NOUN'), ('chip', 'NOUN')]]
First, reformat input to remove duplicates (see note 1):
test = [[x for x in sublist if x not in sum(test[:i], [])] for i, sublist in enumerate(test)]
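As a possible alternative to the one-liner above, the same filtering can be sketched with a running set, which avoids rebuilding the concatenated list for every sublist. This assumes the items are hashable (tuples of strings are); the helper name drop_seen is hypothetical and not part of the original answer.

```python
def drop_seen(nested):
    """Keep, in each sublist, only items absent from all earlier sublists."""
    seen = set()
    result = []
    for sublist in nested:
        # Filter against items from previous sublists only.
        result.append([item for item in sublist if item not in seen])
        # Record this sublist's items after filtering it.
        seen.update(sublist)
    return result


test = [[('juice', 'NOUN'), ('orange', 'FLAVOR')],
        [('juice', 'NOUN'), ('orange', 'FLAVOR'), ('lemon', 'FLAVOR')],
        [('juice', 'NOUN'), ('chip', 'NOUN')]]

deduped = drop_seen(test)
print(deduped)
```

Like the one-liner, this sketch does not remove repeats inside a single sublist; it only drops items that already appeared in an earlier one.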
Finally, take every pair of sublists and get the product of each pair:
from itertools import combinations, product
for c in combinations(test, 2):
    for x in product(*c):
        print(x)
which produces:
(('juice', 'NOUN'), ('lemon', 'FLAVOR'))
(('orange', 'FLAVOR'), ('lemon', 'FLAVOR'))
(('juice', 'NOUN'), ('chip', 'NOUN'))
(('orange', 'FLAVOR'), ('chip', 'NOUN'))
(('lemon', 'FLAVOR'), ('chip', 'NOUN'))
Note 1: the deduplication step relies on sum(test[:i], []), which "adds" (concatenates) all the previous sublists together so that only one membership check is needed.
There is also a list-comprehension version of the above for compactness and style points:
res = [x for c in combinations(test, 2) for x in product(*c)]
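For convenience, the two steps can be put together in one self-contained sketch. It is run here on the input from the question rather than the variant above, and the function name cross_pairs is hypothetical.

```python
from itertools import combinations, product


def cross_pairs(nested):
    """Pair items from different sublists, ignoring repeats of earlier items."""
    # Step 1: drop items that already appeared in an earlier sublist.
    deduped = [[x for x in sublist if x not in sum(nested[:i], [])]
               for i, sublist in enumerate(nested)]
    # Step 2: for every pair of sublists, take the product of the two.
    return [pair for c in combinations(deduped, 2) for pair in product(*c)]


test = [[('juice', 'NOUN'), ('orange', 'FLAVOR')],
        [('juice', 'NOUN'), ('orange', 'FLAVOR'), ('lemon', 'FLAVOR')],
        [('orange', 'FLAVOR'), ('chip', 'NOUN')]]

res = cross_pairs(test)
for pair in res:
    print(pair)
```

This should give the five pairs listed in the question, though in a different order, because the pairs are grouped by the sublists they come from.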
Upvotes: 1