Reputation: 100
For example, I have a list:
lst = ["abc bca","bca abc","cde def"]
I want to treat the elements "abc bca" and "bca abc" as the same (duplicates). What approach should I use?
Upvotes: 0
Views: 85
Reputation: 7471
I'm not sure what you mean exactly by "I want to consider the elements the same", but you could use this approach if you wanted to return a set of "unique" items:
original_list = ["abc bca", "bca abc", "cde def"]
modified_list = []
for original_one_item in original_list:
    original_one_items = original_one_item.split(' ')
    original_one_items.sort()
    modified_list.append(" ".join(original_one_items))
modified_list = set(modified_list)
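This sorts the words in each item, so "bca abc" becomes "abc bca" and collapses into the same entry. The result is a set of normalized strings; original_list itself is not modified.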
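If the order of the list and the original spelling of the first occurrence matter, the same idea can be sketched with a "seen" set. The function name `dedupe_ignoring_word_order` is hypothetical, chosen only for this illustration:

```python
def dedupe_ignoring_word_order(items):
    """Keep the first occurrence of each item, comparing items by their sorted words."""
    seen = set()
    result = []
    for item in items:
        # A sorted tuple of words is hashable and identical for "abc bca" and "bca abc".
        key = tuple(sorted(item.split()))
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


print(dedupe_ignoring_word_order(["abc bca", "bca abc", "cde def"]))
# ['abc bca', 'cde def']
```

Unlike the set-based version, this returns a list in the original order and keeps each surviving item exactly as it was written.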
Upvotes: 0
Reputation: 118031
>>> [' '.join(j) for j in set(tuple(sorted(i.split())) for i in lst)]
['abc bca', 'cde def']
The way this works is by first splitting the strings on whitespace:
>>> [i.split() for i in lst]
[['abc', 'bca'], ['bca', 'abc'], ['cde', 'def']]
Then sort each sublist and convert it to a tuple:
>>> [tuple(sorted(i.split())) for i in lst]
[('abc', 'bca'), ('abc', 'bca'), ('cde', 'def')]
Lastly, you can create a set, since we converted each sublist to a tuple, which is hashable (whereas a list is not).
>>> set(tuple(sorted(i.split())) for i in lst)
{('abc', 'bca'), ('cde', 'def')}
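The outermost list comprehension simply uses join to turn each tuple back into a whitespace-joined string. Note that the words come back in sorted order, and that a set does not guarantee the order of the results.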
Upvotes: 3
Reputation: 2361
You can convert your strings to sets of words:
>>> lst = ["abc bca","bca abc","cde def"]
>>> new_lst = [frozenset(x.split(' ')) for x in lst]
Then you can use any method of finding duplicates in the list, for example collections.Counter:
>>> import collections
>>> print([item for item, count in collections.Counter(new_lst).items() if count > 1])
[frozenset(['abc', 'bca'])]
>>>
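To see which original strings fall into the same group, the frozenset can be used as a dictionary key. This is only a sketch; the helper name `group_by_word_set` is an assumption made for the example:

```python
from collections import defaultdict


def group_by_word_set(items):
    """Map each frozenset of words to the original strings that produce it."""
    groups = defaultdict(list)
    for item in items:
        groups[frozenset(item.split())].append(item)
    return groups


lst = ["abc bca", "bca abc", "cde def"]
duplicates = [group for group in group_by_word_set(lst).values() if len(group) > 1]
print(duplicates)  # [['abc bca', 'bca abc']]

# Caveat: a frozenset ignores repeated words, so these two compare equal.
print(frozenset("abc abc bca".split()) == frozenset("abc bca".split()))  # True
```

If repeated words should count, a sorted tuple of the words is probably the safer key, since it keeps every occurrence.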
Upvotes: 0
Reputation: 3244
>>> from collections import Counter
>>> lst = ["abc bca","bca abc","cde def"]
>>> c = Counter(lst)
>>> c
Counter({'abc bca': 1, 'cde def': 1, 'bca abc': 1})
>>> for i in c:
...     if c[i] > 1:
...         print(i)
...
>>> lst = ["abc","bca","bca","abc","cde","def"]
>>> c = Counter(lst)
>>> for i in c:
...     if c[i] > 1:
...         print(i)
...
abc
bca
>>>
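The first session above prints nothing because Counter compares the whole strings, and "abc bca" differs from "bca abc". One possible way to keep the original list, assuming word order should be ignored, is to normalize each string before counting:

```python
from collections import Counter

lst = ["abc bca", "bca abc", "cde def"]

# Normalize each string to a sorted tuple of its words before counting.
counts = Counter(tuple(sorted(s.split())) for s in lst)

for words, n in counts.items():
    if n > 1:
        print(" ".join(words), n)  # abc bca 2
```

The keys are tuples of sorted words, so the printed form is the normalized string rather than either original spelling.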
Upvotes: 1