frazman

Reputation: 33273

Pythonic way to find the longest sequence for each root

I have a list like the following:

potential_labels = ['foo', 'foo::bar', 'foo::bar::baz', "abc", "abc::cde::def", "bleh"]

The desired output is:

desired_output = ['foo::bar::baz', "abc::cde::def", "bleh"]

This is because for the root "foo" the longest sequence is 'foo::bar::baz', for "abc" it is "abc::cde::def", and for "bleh" it is "bleh" itself.

Is there a built-in Python function that does this? I feel like there is something in itertools that almost does this, but I can't figure it out.

Upvotes: 2

Views: 609

Answers (2)

cs95

Reputation: 402813

Option 1
max + groupby should do it.

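import itertools

# group consecutive labels by root, then keep the longest label in each group
r = [max(g, key=len) for _, g in
     itertools.groupby(potential_labels, key=lambda x: x.split('::')[0])]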

r
['foo::bar::baz', 'abc::cde::def', 'bleh']
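
One caveat: itertools.groupby only groups consecutive items, so Option 1 relies on labels with the same root being adjacent in the list. Here is a minimal sketch for unordered input, assuming it is acceptable for the result to come back in sorted-root order rather than input order. The names root and longest_per_root are made up for illustration.

```python
import itertools


def root(label):
    # The root is everything before the first '::'.
    return label.split('::')[0]


def longest_per_root(labels):
    # Sort by root so that labels sharing a root become adjacent,
    # which is what groupby needs to put them in one group.
    ordered = sorted(labels, key=root)
    return [max(group, key=len)
            for _, group in itertools.groupby(ordered, key=root)]


shuffled = ['foo', 'abc', 'foo::bar::baz', 'bleh', 'abc::cde::def', 'foo::bar']
print(longest_per_root(shuffled))
# ['abc::cde::def', 'bleh', 'foo::bar::baz']
```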

Option 2
A simpler solution uses collections.OrderedDict, which also works when labels with the same root are not adjacent:

from collections import OrderedDict

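o = OrderedDict()
for x in potential_labels:
    # collect labels under their root (the part before the first '::')
    o.setdefault(x.split('::')[0], []).append(x)

# the longest label in each group sorts last
r = [sorted(o[k], key=len)[-1] for k in o]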

r
['foo::bar::baz', 'abc::cde::def', 'bleh']

Not exactly a one-liner, but what counts as Pythonic is subjective after all.
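
As a variation on Option 2: on Python 3.7+ a plain dict preserves insertion order, so the same idea can be done in one pass by remembering only the longest label seen so far for each root. This is a sketch, and the function name longest_per_root_dict is hypothetical. "Longest" here means character length, as with key=len above, and on a tie the first label seen wins.

```python
def longest_per_root_dict(labels):
    best = {}  # root -> longest label seen so far (insertion-ordered)
    for label in labels:
        key = label.split('::')[0]
        # Replace only when this label is strictly longer, so ties keep
        # the first label encountered.
        if key not in best or len(label) > len(best[key]):
            best[key] = label
    return list(best.values())


potential_labels = ['foo', 'foo::bar', 'foo::bar::baz',
                    'abc', 'abc::cde::def', 'bleh']
print(longest_per_root_dict(potential_labels))
# ['foo::bar::baz', 'abc::cde::def', 'bleh']
```

Unlike the groupby version, this keeps the roots in the order they first appear, even if the input is shuffled.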

Upvotes: 3

Ivan De Paz Centeno

Reputation: 3785

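You can use a list comprehension that keeps only the labels that are not a prefix of any other label, i.e. those that appear exactly once at the start of an entry in the joined string: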

>>> joined = "\0" + "\0".join(potential_labels)  # leading "\0" so the first label is matched too
>>> [label for label in potential_labels if joined.count("\0" + label) == 1]
['foo::bar::baz', 'abc::cde::def', 'bleh']
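
Note that the string-count approach matches raw substrings, so a label like 'foo' would also count as a prefix of 'foobar'. Here is a sketch that compares labels segment by segment instead, assuming '::' is the only separator. The name leaf_labels is made up for illustration. Like the comprehension above, it keeps every label that no other label extends, so a root with two separate branches yields two results.

```python
def leaf_labels(labels):
    # Split every label once into its '::'-separated segments.
    parts = [label.split('::') for label in labels]
    result = []
    for label, segs in zip(labels, parts):
        # A label is "extended" if some other label starts with exactly
        # the same segments and then has at least one more.
        extended = any(
            len(other) > len(segs) and other[:len(segs)] == segs
            for other in parts
        )
        if not extended:
            result.append(label)
    return result


potential_labels = ['foo', 'foo::bar', 'foo::bar::baz',
                    'abc', 'abc::cde::def', 'bleh']
print(leaf_labels(potential_labels))
# ['foo::bar::baz', 'abc::cde::def', 'bleh']
```

This is quadratic in the number of labels, which is fine for short lists like the one in the question.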

Upvotes: 1
