claudiadast
claudiadast

Reputation: 611

How to identify partial tuple matches in a list of tuples

If I have a dictionary consisting of a list of tuples like so:

d = {'ENSG00000186092': [('ENST00000641515', '3'),
                        ('ENST00000641515', '1'),
                        ('ENST00000641515', '2'),
                        ('ENST00000335137', '1')],
    'ENSG00000284662': [('ENST00000332831', '1')],
    'ENSG00000284733': [('ENST00000426406', '1')]}

How can I identify if, for each key, there are tuples whereby the first element of the tuples don't match but the second elements do?

For instance, in the above example, we would only see one "hit", and that'd be for the key ENSG00000186092 because of:

('ENST00000641515', '1')
('ENST00000335137', '1')

Upvotes: 0

Views: 1303

Answers (3)

mujjiga
mujjiga

Reputation: 16896

d = {'ENSG00000186092': [('ENST00000641515', '3'),
                        ('ENST00000641515', '1'),
                        ('ENST00000641515', '2'),
                        ('ENST00000335137', '1')],
    'ENSG00000284662': [('ENST00000332831', '1')],
    'ENSG00000284733': [('ENST00000426406', '1')]}

for k, a in d.items():
    a_s = sorted(a, key=lambda x: (x[1], x[0]))
    for i in range(len(a_s)-1):
        if a_s[i][1] == a_s[i+1][1] and a_s[i][0] != a_s[i+1][0]:
            print (k, a_s[i], a_s[i+1])
  • Sort the tuples by second element then by first element of the tuples so that the tuples come together by second element and then by first element
  • Check the condition between current tuple and next tuple in the sorted list and print them if condition is met
  • Time Complexity if d has k items and list size is 'n' then it is O(k*nlogn) [k for outer loop and nlogn for sorting]

Upvotes: 1

Kirk Strauser
Kirk Strauser

Reputation: 30947

Being very verbose:

d = {
    "ENSG00000186092": [
        ("ENST00000641515", "3"),
        ("ENST00000641515", "1"),
        ("ENST00000641515", "2"),
        ("ENST00000335137", "1"),
    ],
    "ENSG00000284662": [("ENST00000332831", "1")],
    "ENSG00000284733": [("ENST00000426406", "1")],
}

def has_duplicates(list_of_tuples):
    seen = set()
    for _, value in list_of_tuples:
        if value in seen:
            return True
        seen.add(value)
    return False

dupes = [key for key, value in d.items() if has_duplicates(value)]

print(dupes)

The has_duplicates function takes a value from your dictionary. It returns True if any of the tuples in that value have the same second value.

The list comprehension at the return gives you all the keys were has_duplicates is True.

Upvotes: 0

grahamlyons
grahamlyons

Reputation: 717

Would a convoluted list-comprehension be of interest to you?

[
  k for k, v in d.items()
  if any(
    (i, j)
    for i, j in v
    for x, y in v
    if i != x and j == y
  )
]
>>> ['ENSG00000186092']
  1. Loops over the dictionary
  2. Loops over the list of tuples for each key
  3. For each tuple, loops through the same list and checks to see that the first entries don't match but the second do
  4. If there are any found then record that key from step 2.

Upvotes: 1

Related Questions