Bryan

Reputation: 6149

Extracting Unique String Combinations from List of List in Python

I'm trying to extract all the unique combinations of strings from a list of lists in Python, where order within a combination does not matter. For example, in the code below, ['a', 'b', 'c'] and ['b', 'a', 'c'] count as the same combination (duplicates), while ['a', 'b', 'c'] and ['a', 'e', 'f'], or ['a', 'b', 'c'] and ['d', 'e', 'f'], are distinct.

I've tried converting my list of lists to a list of tuples and using sets to compare elements, but all elements are still being returned.

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]

# converting list of list to list of tuples, so they can be converted into a set
combos = [tuple(item) for item in combos]
combos = set(combos)

grouping_list = set()
for combination in combos:
    if combination not in grouping_list:
        grouping_list.add(combination)
##

print(grouping_list)
# Output: set([('a', 'b', 'c'), ('c', 'a', 'b'), ('d', 'e', 'f'), ('c', 'b', 'a'), ('c', 'f', 'b')])

Upvotes: 0

Views: 140

Answers (3)

Saksham Varma

Reputation: 2140

How about this:

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
print([list(y) for y in set(''.join(sorted(c)) for c in combos)])
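
One caveat worth noting: joining the sorted strings only works as a key when every element is a single character. The sketch below uses made-up multi-character data (the `combos_multi` list is hypothetical, not from the question) to show different combinations collapsing into the same joined string, and how a sorted tuple keeps them apart:

```python
# Hypothetical data: elements longer than one character.
combos_multi = [['ab', 'c'], ['a', 'bc'], ['c', 'ab']]

# Joining loses the element boundaries: every combo becomes 'abc'.
joined_keys = set(''.join(sorted(c)) for c in combos_multi)

# A sorted tuple keeps each element intact, so only true duplicates merge.
tuple_keys = set(tuple(sorted(c)) for c in combos_multi)

print(joined_keys)
print(sorted(tuple_keys))
# [('a', 'bc'), ('ab', 'c')]
```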

Upvotes: 0

ferhatelmas

Reputation: 3978

>>> set(tuple(set(combo)) for combo in combos)
{('a', 'c', 'b'), ('c', 'b', 'f'), ('e', 'd', 'f')}

Simple, but if a combo contains repeated elements, this returns the wrong answer, because converting to a set drops the repeats. In that case, sorting is the way to go, as suggested in the other answers.
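
If the elements within each combo are known to be distinct, a `frozenset` could serve as the key directly, which avoids depending on the order in which a `set` happens to be iterated when it is turned into a tuple. This is only a sketch using the `combos` list from the question; the `dupes` list is hypothetical and illustrates the repeated-element problem:

```python
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'],
          ['c', 'a', 'b'], ['c', 'f', 'b']]

# frozenset is hashable and ignores order, so equal combos collapse.
unique = set(frozenset(combo) for combo in combos)
print(len(unique))  # 3

# Hypothetical data with a repeated element: a set drops the repeat,
# so ['a', 'a', 'b'] and ['a', 'b', 'b'] wrongly look identical.
dupes = [['a', 'a', 'b'], ['a', 'b', 'b']]
by_set = set(frozenset(d) for d in dupes)
by_sorted = set(tuple(sorted(d)) for d in dupes)
print(len(by_set))     # 1
print(len(by_sorted))  # 2
```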

Upvotes: 1

Joost

Reputation: 4134

How about sorting (and using a Counter)?

from collections import Counter

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
combos = Counter(tuple(sorted(item)) for item in combos)
print(combos)

returns:

Counter({('a', 'b', 'c'): 3, ('d', 'e', 'f'): 1, ('b', 'c', 'f'): 1})

EDIT: I'm not sure I'm understanding your question correctly. You can use a Counter to count occurrences, or use a set if you're only interested in the distinct combinations and not in how often each one occurs.

Something like:

combos = set(tuple(sorted(item)) for item in combos)

which simply returns:

set([('a', 'b', 'c'), ('d', 'e', 'f'), ('b', 'c', 'f')])
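
If the input order matters, or you would rather keep the first spelling of each combination instead of its sorted form, the same sorted-tuple key can drive a plain loop. This is a rough sketch; the names `seen` and `first_seen` are made up for illustration:

```python
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'],
          ['c', 'a', 'b'], ['c', 'f', 'b']]

seen = set()       # sorted-tuple keys already encountered
first_seen = []    # first occurrence of each combination, in input order
for item in combos:
    key = tuple(sorted(item))
    if key not in seen:
        seen.add(key)
        first_seen.append(item)

print(first_seen)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['c', 'f', 'b']]
```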

Upvotes: 2
