GnaNoelk
GnaNoelk

Reputation: 53

Finding common elements between tuples in a list in python

If I have a list ts of tuples in python:

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]

How do I obtain a list containing common elements between 2 or more such tuples?

Assume that both the tuples in ts and the the elements in tuples are already numerically sorted.

For this example, the intended output should be:

ts_output = [703, 803, 903]

Below is my working so far:

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
ts = set(ts)

t1 = set(w for w,x in ts for y,z in ts if w == y) # t1 should only contain 803
print("t1: ", t1)

t2 = set(y for w,x in ts for y,z in ts if x == y) # t2 should only contain 703
print("t2: ", t2)

t3 = set(x for w,x in ts for y,z in ts if x == z) # t3 should only contain 903
print("t3: ", t3)

And this is the corresponding output:

t1: {803, 901, 902, 702, 703}
t2: {703}
t3: {704, 805, 806, 903, 703}

From above, only t2 gave the intended output, but I'm not sure what happened to t1 and t3.

You may use this alternative input to test your code, and it should give the exact same output:

 ts = [(701,703), (702,703), (703,704), (803,805), (803,806), (901,903), (902,903), (903,904)]

Upvotes: 2

Views: 2792

Answers (4)

Rahul
Rahul

Reputation: 11560

import collections

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
flat_list = [item for sublist in ts for item in sublist]
duplicates = [item for item, count in collections.Counter(flat_list).items() if count > 1]
print(duplicates)

Explanation:

Given your input, you first need to flat your list.

#1 Simple and pythonic
flat_list = [item
                for sublist in ts
                    for item in sublist]

#2 More efficient.
import itertools
flat_list = itertools.chain.from_iterable(ts)

In case of method #1 your flat_list will be list object in in case of method #2 it will be generator object. Both will behave same for iteration.

Now you can count the elements in your flat_list. If they are greater than 1 they are duplicates.

for item, count in collections.Counter(flat_list).items():
    if count > 1:
        print(item)

or you can use more pythonic list comprehension.

duplicates = [item
                 for item, count in collections.Counter(flat_list).items()
                     if count > 1]

Upvotes: 6

jamylak
jamylak

Reputation: 133754

Here's an answer that solves it by only passing through once instead of twice and builds result as it goes (not sure if its faster or slower in practice for super large ts)

>>> from collections import Counter
>>> from itertools import chain
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> def find_common(ts):
...   c = Counter()
...   for num in chain.from_iterable(ts):
...     c[num] += 1
...     if c[num] == 2:
...       yield num
... 
>>> list(find_common(ts))
[703, 803, 903]

And without Counter

>>> def find_common(ts):
...   seen, dupes = set(), set()
...   for num in chain.from_iterable(ts):
...     if num in seen and num not in dupes:
...       dupes.add(num)
...       yield num
...     seen.add(num)
>>> list(find_common(ts))
[703, 803, 903]

Upvotes: 1

jamylak
jamylak

Reputation: 133754

>>> from collections import Counter
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> c = Counter(el for t in ts for el in t)
>>> [k for k in c if c[k] >= 2]
[703, 803, 903]

Upvotes: 1

Sohaib Farooqi
Sohaib Farooqi

Reputation: 5676

You need to flatten your list of tuples. You can do this using itertools.chain

>>> from itertools import chain

>>> flat_list = list(chain(*ts))
>>> flat_list
>>> [702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]

Or you can also use itertools.chain.from_iterables to do the same like, However this doesn't require iterable unpacking

>>> flat_list = list(itertools.chain.from_iterable(ts))
>>> flat_list
>>> [702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]

After this step you can use Collections.Counter to count occurance of each element in flat list and filter once which occurs more than one.

>>> from collections import Counter
>>> c = Counter(flat_list)
>>> c
>>> Counter({803: 2, 903: 2, 703: 2, 704: 1, 805: 1, 806: 1, 901: 1, 902: 1, 702: 1}) 

Then finally filter c

>>> [k for k,v in c.items() if v>1]
>>> [803, 903, 703]

Upvotes: 5

Related Questions