Debajyoti Sengupta
Debajyoti Sengupta

Reputation: 127

Filtering a (Nx1) list in Python

I have a list of the form

[(2,3),(4,3),(3,4),(1,4),(5,4),(6,5)]

I want to scan the list and return those elements whose (i,1) are repeated. (I apologize I couldn't frame this better)

For example, in the given list the pairs are (2,3),(4,3) and I see that 3 is repeated so I wish to return 2 and 4. Similarly, from (3,4),(1,4),(5,4) I will return 3, 1, and 5 because 4 is repeated.

I have implemented the bubble search but that is obviously very slow.

for i in range(0,p):

    for j in range(i+1,p):
        if (arr[i][1] == arr[j][1]):
            print(arr[i][0],arr[j][0])

How do I go about it?

Upvotes: 2

Views: 109

Answers (2)

Davide Fiocco
Davide Fiocco

Reputation: 5914

Using numpy allows to avoid for loops:

import numpy as np
l = [(2,3),(4,3),(3,4),(1,4),(5,4),(6,5)]

a = np.array(l) 
items, counts = np.unique(a[:,1], return_counts=True)
is_duplicate = np.isin(a[:,1], items[counts > 1]) # get elements that have more than one count
print(a[is_duplicate, 0]) # return elements with duplicates

# tuple(map(tuple, a[is_duplicate, :])) # use this to get tuples in output

(toggle comment to get output in form of tuples)

pandas is another option:

import pandas as pd
l = [(2,3),(4,3),(3,4),(1,4),(5,4),(6,5)]

df = pd.DataFrame(l, columns=list(['first', 'second']))
df.groupby('second').filter(lambda x: len(x) > 1)

Upvotes: 0

jpp
jpp

Reputation: 164623

You can use collections.defaultdict. This will return a mapping from the second item to a list of first items. You can then filter for repetition via a dictionary comprehension.

from collections import defaultdict

lst = [(2,3),(4,3),(3,4),(1,4),(5,4),(6,5)]

d = defaultdict(list)

for i, j in lst:
    d[j].append(i)

print(d)

# defaultdict(list, {3: [2, 4], 4: [3, 1, 5], 5: [6]})

res = {k: v for k, v in d.items() if len(v)>1}

print(res)

# {3: [2, 4], 4: [3, 1, 5]}

Upvotes: 1

Related Questions