Dawn17

Reputation: 8297

How to find top n overlapping items in two lists of tuples (Python)

Given

a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]

and

b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]

If n = 3, I want to consider only the top 3 elements of each list and return the first elements (the strings) that appear in both.

For example, I want to return ['AB','EF'] in this case.

How can I do this?

Upvotes: 1

Views: 150

Answers (4)

Levi Lesches

Reputation: 1611

Well, first, we can start off with a for loop. We want to loop from 0 to n, look at the tuples of a and b at each index, and check whether their first elements match.

matches = [a[index][0] for index in range(n) if a[index][0] == b[index][0]]

Which does the same thing as:

matches = []
for index in range(n):
    if a[index][0] == b[index][0]:
        matches.append(a[index][0])
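
Note that the loop above compares the two lists position by position, so it only finds a string that sits at the same index in both lists. A minimal sketch of a position-independent variant, assuming the a, b, and n from the question (the helper name top_n_overlap is made up for illustration):

```python
a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]

def top_n_overlap(xs, ys, n):
    """Hypothetical helper: keys shared by the first n tuples of xs and ys, in xs order."""
    keys_in_ys = {key for key, _ in ys[:n]}  # a set makes each membership test O(1)
    return [key for key, _ in xs[:n] if key in keys_in_ys]

print(top_n_overlap(a, b, 3))  # ['AB', 'EF']
```

Because the result is built by walking xs, it keeps the order of the first list, which a plain set intersection would not guarantee.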

Upvotes: 0

ShmulikA

Reputation: 3744

Use set intersection, which has better complexity than repeated in checks against a list:

def overlapping(x, y, topn=3):
    return {i[0] for i in x[:topn]} & {i[0] for i in y[:topn]}

overlapping(a,b)

outputs:

{'AB', 'EF'}

Explanation:

{i[0] for i in x[:topn]}

A set comprehension, equivalent to set(i[0] for i in x[:topn]).

{...} & {...}

Set intersection, equivalent to set(...).intersection(set(...)).
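
The slice x[:topn] takes "top n" to mean the first n entries of each list. If it instead means the n largest values (an assumption, since the question does not say), the lists could be sorted by the second tuple element before slicing. A minimal sketch, with top_keys as a hypothetical helper name:

```python
a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]

def top_keys(pairs, n):
    """Hypothetical helper: keys of the n tuples with the largest second element."""
    # sorted() is stable, so tuples with equal values keep their original order.
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return {key for key, _ in ranked[:n]}

print(top_keys(a, 3) & top_keys(b, 3))  # {'EF'}
```

With the question's data this gives a different answer than plain slicing, because 'GG' enters the top 3 of a and 'AB' drops out.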

Upvotes: 0

Stephen Rauch

Reputation: 49812

You could use a Counter for this:

Code:

a = [('AB', 11), ('CD', 12), ('EF', 13), ('GG', 1332)]
b = [('AB', 411), ('XX', 132), ('EF', 113), ('AF', 113), ('FF', 113)]

from collections import Counter
counts = Counter(x[0] for x in a[:3] + b[:3])
print([x for x, c in counts.items() if c == 2])

And without any imports, use a set:

print(set(x[0] for x in a[:3]).intersection(set(x[0] for x in b[:3])))

Results:

['AB', 'EF']
{'AB', 'EF'}
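
One caveat with the Counter version: c == 2 is also true when a string appears twice within a single list's top n. A small sketch with made-up data (not from the question) shows the effect and one possible way around it, deduplicating each slice before counting:

```python
from collections import Counter

# Made-up data: 'AB' appears twice in the top 3 of a and not at all in b.
a = [('AB', 1), ('AB', 2), ('CD', 3)]
b = [('XX', 1), ('YY', 2), ('ZZ', 3)]
n = 3

naive = Counter(x[0] for x in a[:n] + b[:n])
naive_result = [x for x, c in naive.items() if c == 2]
print(naive_result)  # ['AB'], a false positive

# Deduplicate each slice first so a key counts at most once per list.
keys_a = {x[0] for x in a[:n]}
keys_b = {x[0] for x in b[:n]}
deduped = Counter(list(keys_a) + list(keys_b))
deduped_result = [x for x, c in deduped.items() if c == 2]
print(deduped_result)  # []
```

The set-based version does not have this problem, since each set already holds every string at most once.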

Upvotes: 3

eagle

Reputation: 890

Do you mean like this?

def overlapping(n, tups_a, tups_b):
    # Use a distinct local name so the function name is not shadowed.
    shared = set(map(lambda x: x[0], tups_a[:n])).intersection(set(map(lambda x: x[0], tups_b[:n])))
    return list(shared)

overlap = overlapping(3, a, b)

['AB', 'EF']

Upvotes: 1
