Reputation: 11096
I have a list of strings named strList which contains around 800,000 to 2,200,000 elements. Each element contains around 100 characters. I have another list of strings called findStrs which usually contains fewer than 5 elements (5- to 10-character strings). I want to select the elements of strList that contain all of the elements in findStrs. How can I do that efficiently in Python? Here is my current approach, but I wonder whether there is a more efficient solution using list comprehensions:
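finalStrList = []
for strr in strList:
    # Collect the search strings found in this element
    temp = []
    for findStr in findStrs:
        if findStr in strr:
            temp.append(findStr)
    # Keep the element only if every search string was found
    if len(temp) == len(findStrs):
        finalStrList.append(strr)
print(finalStrList)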
I tried to devise a list comprehension-based method as well but, not surprisingly, it does not work:
[strr for strr in strList if [findStr in strr for findStr in findStrs]]
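The likely reason this attempt keeps every string is that the inner list always has one entry per search string, and any non-empty list is truthy in Python, even one that holds only False values. A minimal sketch with made-up sample data (the values below are assumptions for illustration, not the real data) shows the difference once the checks are wrapped in all():

```python
# Hypothetical sample data, for illustration only.
strList = ["alpha beta gamma", "alpha only", "beta gamma"]
findStrs = ["alpha", "gamma"]

# The inner list always has len(findStrs) items, so it is truthy
# whenever findStrs is non-empty, regardless of the values inside.
broken = [strr for strr in strList if [findStr in strr for findStr in findStrs]]

# all() inspects the values themselves instead of the list's length.
fixed = [strr for strr in strList if all(findStr in strr for findStr in findStrs)]

print(broken)
print(fixed)
```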
Upvotes: 1
Views: 1169
Reputation: 198
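If most strings do not match, you can cut the running time by stopping at the first search string that is missing instead of checking all of them. The worst case is unchanged, but non-matching strings are rejected earlier.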
finalStrList = []
for strr in strList:
    flag = True
    for findStr in findStrs:
        if findStr not in strr:
            # One missing search string is enough to reject this element
            flag = False
            break
    if flag:
        finalStrList.append(strr)
print(finalStrList)
Upvotes: 0
Reputation: 11096
As juanpa.arrivillaga suggested in the comments section, I can do what I want easily using the following list comprehension-based solution:
[s for s in strList if all([x in s for x in findStrs])]
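A small variation that may help on large inputs: passing all() a generator expression instead of a list lets it stop at the first search string that is missing, so the remaining checks for that element are skipped. The sketch below wraps this in a helper; the function name select_containing_all and the sample data are assumptions for illustration only.

```python
def select_containing_all(strings, needles):
    """Return the strings that contain every needle as a substring."""
    # Generator expression: all() stops at the first needle that is missing.
    return [s for s in strings if all(x in s for x in needles)]


# Hypothetical sample data, for illustration only.
strList = ["red green blue", "red blue", "green only", "blue green red"]
findStrs = ["red", "green"]

finalStrList = select_containing_all(strList, findStrs)
print(finalStrList)
```

Note that with an empty findStrs, all() returns True for every element, so the whole list is returned.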
Upvotes: 1