Reputation: 197
I have an algorithm that depends on the number of matching items (including repeats) between two lists. For example:
a = ["test", "win", "win", "bike", "bike", "bike", "gem", "nine"]
b = ["test", "win", "let", "bike", "four"]
d = set(a).intersection(b)
Would give me:
{"test", "win", "bike"}
The output I would like is:
{"test", "win", "win", "bike", "bike", "bike"}
I figure I could take the output and count how many times each intersected word occurs in list a, etc. But this is quite a few extra steps, and I am hoping there is a simpler way to achieve this output.
My question is, based on the provided example lists, how can I achieve the desired second output of:
{"test", "win", "win", "bike", "bike", "bike"}
Upvotes: 1
Views: 72
Reputation: 107134
You can use collections.Counter to obtain the counts of each distinct item, and use the Counter.elements method to produce the desired list with items repeated according to their counts, after filtering them by membership in b (convert b to a set for efficient membership lookups):
from collections import Counter
set_b = set(b)
print(list(Counter({k: c for k, c in Counter(a).items() if k in set_b}).elements()))
This outputs:
['test', 'win', 'win', 'bike', 'bike', 'bike']
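If only the counts from a matter and the original order of a should be kept, a plain list comprehension may be enough. This is a minimal sketch rather than part of the answer above; the helper name keep_common is hypothetical:

```python
def keep_common(a, b):
    """Return the items of a that also occur in b, keeping a's order and repeats."""
    set_b = set(b)  # a set gives fast average-case membership tests
    return [item for item in a if item in set_b]


a = ["test", "win", "win", "bike", "bike", "bike", "gem", "nine"]
b = ["test", "win", "let", "bike", "four"]
print(keep_common(a, b))  # ['test', 'win', 'win', 'bike', 'bike', 'bike']
```

Unlike the Counter version, this keeps items in the order they appear in a instead of grouping equal items together; for this example both give the same list.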
Upvotes: 1
Reputation: 70003
You should use a list instead of a set for the output, because a set cannot hold duplicates. You can still take advantage of the code you have already defined:
a = ["test", "win", "win", "bike", "bike", "bike", "gem", "nine"]
b = ["test", "win", "let", "bike", "four"]
s = set(a).intersection(b)
output = []
for e in s:
    n = max(a.count(e), b.count(e))
    output.extend([e] * n)
>>> output
['win', 'win', 'bike', 'bike', 'bike', 'test']
Basically, each common element is repeated as many times as it appears in whichever list contains it most often. The order of the elements in output may vary between runs, because sets are unordered.
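The loop above calls list.count twice per common element, and each call scans a whole list. For longer lists, the same rule could be sketched with collections.Counter so that each list is counted only once. The function name repeat_common_by_max is hypothetical, and sorting the common elements is an added assumption to make the order deterministic:

```python
from collections import Counter


def repeat_common_by_max(a, b):
    """Repeat each element common to a and b by its larger count in either list."""
    count_a, count_b = Counter(a), Counter(b)
    common = count_a.keys() & count_b.keys()  # elements present in both lists
    output = []
    for item in sorted(common):  # sorted only to get a stable order
        output.extend([item] * max(count_a[item], count_b[item]))
    return output


a = ["test", "win", "win", "bike", "bike", "bike", "gem", "nine"]
b = ["test", "win", "let", "bike", "four"]
print(repeat_common_by_max(a, b))  # ['bike', 'bike', 'bike', 'test', 'win', 'win']
```

If the counts should always come from a alone, replace the max(...) call with count_a[item].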
Upvotes: 1