Einar

Reputation: 4933

Find elements of a list that contain substrings from another list in Python

In my code, I have two lists of different lengths, which I'll call "main" and "secondary". I need to use the elements in secondary to select elements in main. However, the elements of secondary are only substrings of the strings in main. In code:

main = ["pinecone", "treeleaf", "dishwasher"]
secondary = ["pine", "washer", "unrelated", "flowerbed"]

Usually secondary is much longer than main (I mention this in case a solution carries a performance penalty). How do I select the elements of "main" based on "secondary" in the most efficient (and Pythonic) way possible? If it were a function, I'd expect

>>> selected_items = select_items(main, secondary)
>>> print selected_items
["pinecone", "dishwasher"]

Thanks!

Upvotes: 3

Views: 3711

Answers (3)

Dannid

Reputation: 1697

A similar approach works when your main list and secondary list are the same:

In [2]: main = ["pinecone", "treeleaf", "dishwasher"] + ["pine", "washer", "unrelated", "flowerbed"]

In [4]: [x for x in main for y in main if y in x and x != y]
Out[4]: ['pinecone', 'dishwasher']

Note, you can get the partially matching string instead (or even both!):

In [5]: [y for x in main for y in main if y in x and x != y]
Out[5]: ['pine', 'washer']

In [6]: [(y,x) for x in main for y in main if y in x and x != y]
Out[6]: [('pine', 'pinecone'), ('washer', 'dishwasher')]

Upvotes: 0

phihag

Reputation: 287835

def select_items(strings, substrs):
    return [m for m in strings if any(s in m for s in substrs)]
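
For reference, here is a small usage sketch with the lists from the question. The function is repeated so the snippet runs on its own; the docstring and the variable names outside the function are only illustrative. Note that any() stops at the first substring that matches, so the long secondary list is not always scanned in full.

```python
def select_items(strings, substrs):
    """Return the items of `strings` that contain at least one item of `substrs`."""
    # any() short-circuits: it stops at the first substring found in m.
    return [m for m in strings if any(s in m for s in substrs)]


main = ["pinecone", "treeleaf", "dishwasher"]
secondary = ["pine", "washer", "unrelated", "flowerbed"]

selected_items = select_items(main, secondary)
print(selected_items)  # ['pinecone', 'dishwasher']
```

The result keeps the order of main, and an empty secondary list gives an empty result.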

Upvotes: 0

Michael Markert

Reputation: 4026

Naive approach:

In [2]: main = ["pinecone", "treeleaf", "dishwasher"]

In [3]: secondary = ["pine", "washer", "unrelated", "flowerbed"]

In [4]: [x for x in main if any(x in y or y in x for y in secondary)]
Out[4]: [u'pinecone', u'dishwasher']
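
Since the question says secondary is usually much longer than main, one possible variation (a sketch, not something measured here) is to combine all substrings into a single compiled regular expression and search each element of main once. The name select_items_regex is made up for this example, and whether it is actually faster depends on the data, so it is worth timing on real input.

```python
import re


def select_items_regex(strings, substrs):
    """Hypothetical variant: test every string against one combined pattern."""
    if not substrs:
        # An empty alternation would match every string, so bail out early.
        return []
    # re.escape makes each substring match literally, even if it contains
    # characters such as "." or "*".
    pattern = re.compile("|".join(re.escape(s) for s in substrs))
    return [m for m in strings if pattern.search(m)]


main = ["pinecone", "treeleaf", "dishwasher"]
secondary = ["pine", "washer", "unrelated", "flowerbed"]

print(select_items_regex(main, secondary))  # ['pinecone', 'dishwasher']
```

Unlike the comprehension above, this only checks whether an element of secondary occurs inside an element of main, which is the direction the question asks for.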

Upvotes: 6
