Reputation: 33273
I have a list like the following:
potential_labels = ['foo', 'foo::bar', 'foo::bar::baz', "abc", "abc::cde::def", "bleh"]
The desired output is:
desired_output = ['foo::bar::baz', "abc::cde::def", "bleh"]
This is because, for the root "foo", 'foo::bar::baz' is the longest sequence; for "abc" it is "abc::cde::def"; and for "bleh" it is "bleh" itself.
Is there a Python built-in function that does this? I feel like there is something in itertools that almost does it, but I can't figure it out.
Upvotes: 2
Views: 609
Reputation: 402813
Option 1
max + groupby should do it, as long as labels sharing a root are adjacent in the list (groupby only groups consecutive items).
import itertools
data = potential_labels  # the list from the question
r = [max(g, key=len) for _, g in
     itertools.groupby(data, key=lambda x: x.split('::')[0])]
r
['foo::bar::baz', 'abc::cde::def', 'bleh']
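If labels with the same root are not guaranteed to sit next to each other, one possible workaround is to sort by root before grouping. The following is only a sketch: the helper names root and longest_per_root and the shuffled sample list are made up for illustration, and the result comes back ordered by root rather than in the original order.

```python
from itertools import groupby

def root(label):
    # Everything before the first '::' is treated as the root.
    return label.split('::')[0]

def longest_per_root(labels):
    # groupby only merges adjacent items, so sort by root first.
    ordered = sorted(labels, key=root)
    return [max(group, key=len) for _, group in groupby(ordered, key=root)]

# Hypothetical input where labels sharing a root are not adjacent.
shuffled = ['foo', 'abc', 'foo::bar::baz', 'bleh', 'abc::cde::def', 'foo::bar']
print(longest_per_root(shuffled))
```

Without the sort, groupby would treat each separated run of "foo" labels as its own group and return more than one result for that root.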
Option 2
A simpler solution uses collections.OrderedDict:
from collections import OrderedDict
o = OrderedDict()
for x in data:
o.setdefault(x.split('::')[0], []).append(x)
r = [sorted(o[k], key=len)[-1] for k in o]
r
['foo::bar::baz', 'abc::cde::def', 'bleh']
Not exactly a one-liner, but what is pythonic is subjective after all.
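A variation on Option 2, sketched under the assumption of Python 3.7+ (where a plain dict preserves insertion order): keep only the longest label seen so far for each root instead of collecting and sorting every group. The function name longest_per_root is made up for illustration.

```python
def longest_per_root(labels, sep='::'):
    best = {}  # root -> longest label seen so far, in first-seen order
    for label in labels:
        key = label.split(sep)[0]
        # Replace the stored label only when a strictly longer one appears.
        if key not in best or len(label) > len(best[key]):
            best[key] = label
    return list(best.values())

potential_labels = ['foo', 'foo::bar', 'foo::bar::baz', "abc", "abc::cde::def", "bleh"]
print(longest_per_root(potential_labels))
```

This works for unordered input and makes a single pass. One difference to be aware of: when two labels under the same root have equal length, this sketch keeps the first one, whereas sorted(...)[-1] keeps the last.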
Upvotes: 3
Reputation: 3785
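You can use a list comprehension that keeps only the labels that are not a prefix of any other label. The joined string gets a leading "\0" so that the first label is counted like the rest:
>>> joined = "\0" + "\0".join(potential_labels)
>>> [label for label in potential_labels if joined.count("\0" + label) == 1]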
['foo::bar::baz', 'abc::cde::def', 'bleh']
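The same prefix idea can be written more explicitly with str.startswith. This is a sketch, not a drop-in replacement: the name leaf_labels is made up, and it only drops a label when another label extends it at a '::' boundary, so 'foo' is not removed merely because 'foobar' exists.

```python
def leaf_labels(labels, sep='::'):
    # Keep a label only if no other label extends it past a separator.
    return [
        label for label in labels
        if not any(other.startswith(label + sep) for other in labels)
    ]

potential_labels = ['foo', 'foo::bar', 'foo::bar::baz', "abc", "abc::cde::def", "bleh"]
print(leaf_labels(potential_labels))
```

Note that this answers a slightly different question than the per-root approaches: for ['foo::a', 'foo::b::c'] it keeps both labels, since neither is a prefix of the other. It also compares every pair of labels, so it is quadratic in the length of the list.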
Upvotes: 1