Reputation: 462

Count the duplicates in a list of tuples

I have a list of tuples: a = [(1,2),(1,4),(1,2),(6,7),(2,9)]. I want to check whether an element of one tuple matches the element at the same position in another tuple, and count how many times this occurs.

For example, if only the first element of some tuples is duplicated, return each tuple along with the number of times its first element occurs. I can do that with the following code:

a = [(1,2), (1,4), (1,2), (6,7), (2,9)]

coll_list = []
for t in a:
    coll_cnt = 0
    for b in a:
        if b[0] == t[0]:
            coll_cnt = coll_cnt + 1
    print "%s,%d" %(t,coll_cnt)
    coll_list.append((t,coll_cnt))

print coll_list

Is there a more efficient way to do this?

Upvotes: 5

Answers (5)

Yonas Kassa

Reputation: 3720

Use the collections library. In the following code, val_1 and val_2 give you the counts of the first and second elements of the tuples, respectively.

import collections
val_1=collections.Counter([x for (x,y) in a])
val_2=collections.Counter([y for (x,y) in a])

>>> print val_1
Counter({1: 3, 2: 1, 6: 1})

This is the number of occurrences of the first element of each tuple.

>>> print val_2
Counter({2: 2, 9: 1, 4: 1, 7: 1})

This is the number of occurrences of the second element of each tuple.
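The same idea can be extended to tuples of any length. The following is only a sketch: it assumes every tuple has the same number of elements (zip stops at the shortest one), and the name position_counts is made up for illustration. The print calls are written with parentheses so the snippet should run on both Python 2.7 and Python 3.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# zip(*a) transposes the list: it yields one tuple of values per position,
# so each Counter below counts the values seen at a single position.
position_counts = [Counter(column) for column in zip(*a)]

print(position_counts[0])  # counts for the first elements
print(position_counts[1])  # counts for the second elements
```

Here position_counts[0] plays the role of val_1 and position_counts[1] the role of val_2.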

Upvotes: 6

Andy

Reputation: 50640

You can use a Counter

from collections import Counter
a = [(1,2),(1,4),(1,2),(6,7),(2,9)]
counter=Counter(a)
print counter

This will output:

Counter({(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1})

It is a dictionary-like object with the items (tuples in this case) as keys and the number of times each key was seen as values. Your (1,2) tuple is seen twice, while all the others are seen only once.

>>> counter[(1,2)]
2

If you are interested in each position of the tuple separately, you can apply the same logic to each position.

first_element = Counter([x for (x,y) in a])
second_element = Counter([y for (x,y) in a])

first_element and second_element now each hold a Counter of how many times each value is seen at that position in the tuples.

>>> first_element
Counter({1: 3, 2: 1, 6: 1})
>>> second_element
Counter({2: 2, 9: 1, 4: 1, 7: 1})

Again, these are dictionary-like objects, so you can directly look up how often a specific value appeared:

>>> first_element[2]
1

Among the first elements of your tuples, the value 2 appeared once.
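To get back the exact list built in the question (each tuple paired with the number of tuples sharing its first element), one possible approach is to count once and then look up each tuple's count. This is a sketch rather than the only way to do it; first_counts is an illustrative name, while coll_list is the name used in the question.

```python
from collections import Counter

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# One pass to count the first elements ...
first_counts = Counter(t[0] for t in a)

# ... and one pass to pair every tuple with the count of its first element.
coll_list = [(t, first_counts[t[0]]) for t in a]

print(coll_list)
# [((1, 2), 3), ((1, 4), 3), ((1, 2), 3), ((6, 7), 1), ((2, 9), 1)]
```

This makes two passes over the list instead of the nested loops in the question, so the work grows linearly with the length of the list.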

Upvotes: 13

balgam

Reputation: 1462

A dictionary may work better. In your code, you loop over the whole list once for every element of the list, which makes the complexity O(n^2). And this is not a good thing :)

The best way is to traverse the list once and use one or two conditions per iteration. Here is my first solution for this kind of problem.

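a = [(1,2),(1,4),(1,2),(6,7),(2,9)]

# Count how often each first element occurs, in a single pass.
counts = {}  # not named "dict", which would shadow the built-in type
for (i, j) in a:
    if i in counts:  # "in" replaces has_key(), which Python 3 removed
        counts[i] += 1
    else:
        counts[i] = 1

print(counts)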

For this input, the code prints:

{1: 3, 2: 1, 6: 1}
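If both positions are of interest, the single traversal can count them together. The sketch below swaps the if/else for collections.defaultdict, which starts missing keys at 0; the names first_counts and second_counts are only illustrative.

```python
from collections import defaultdict

a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# defaultdict(int) returns 0 for a missing key, so no membership test is needed.
first_counts = defaultdict(int)
second_counts = defaultdict(int)

# A single traversal updates the counts for both positions.
for first, second in a:
    first_counts[first] += 1
    second_counts[second] += 1

print(dict(first_counts))
print(dict(second_counts))
```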

I hope it will be helpful.

Upvotes: 3

tschm

Reputation: 2955

Using pandas this is simple and very fast:

import pandas
print(pandas.Series(data=[(1,2),(1,4),(1,2),(6,7),(2,9)]).value_counts())

(1, 2)    2
(1, 4)    1
(6, 7)    1
(2, 9)    1
dtype: int64

Upvotes: 2

Sudipta

Reputation: 4971

You can build a count_map and store the count of each tuple as its value.

>>> count_map = {}
>>> for t in a:
...     count_map[t] = count_map.get(t, 0) + 1
... 
>>> count_map
{(1, 2): 2, (6, 7): 1, (2, 9): 1, (1, 4): 1}
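If only the tuples that actually repeat are wanted, the map can be filtered afterwards. A minimal sketch, assuming "duplicate" means a count greater than 1; the name duplicates is made up for illustration.

```python
a = [(1, 2), (1, 4), (1, 2), (6, 7), (2, 9)]

# Build the count map: each tuple maps to the number of times it occurs.
count_map = {}
for t in a:
    count_map[t] = count_map.get(t, 0) + 1

# Keep only the tuples that occur more than once.
duplicates = {t: n for t, n in count_map.items() if n > 1}

print(duplicates)
# {(1, 2): 2}
```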

Upvotes: 4
