How to count an element in a list inside a list in Python

Question

Let's say I have a list like this:

[(9600002, 42, 3),
(9600001, 17, 3),
(9600003, 11, 1),
(9600002, 14, 5),
(9600001, 17, 1),
(9600003, 11, 4),
(9600001, 17, 4),
(9600001, 14, 3),
(9600002, 42, 6),
(9600002, 42, 1)]

The first number is the user_id, the second is the tv_program_code, and the third is the season_id.

My question

How can I find the programs for which a user has subscribed to more than one season, and then print the user_id and the tv_program_code? For example:

9600001 17

Alternatively, can you suggest which data structure I should use?

jpp · Accepted Answer

One method is to use collections.Counter.

The idea is to count the number of seasons per (user, program) combination using a Counter, which is a dictionary subclass.

Then filter for counts greater than 1 with a dictionary comprehension.

from collections import Counter

lst = [(9600002, 42, 3), (9600001, 17, 3), (9600003, 11, 1),
       (9600002, 14, 5), (9600001, 17, 1), (9600003, 11, 4),
       (9600001, 17, 4), (9600001, 14, 3), (9600002, 42, 6),
       (9600002, 42, 1)]

c = Counter()

for user, program, season in lst:
    c[(user, program)] += 1

print(c)

# Counter({(9600002, 42): 3, (9600001, 17): 3, (9600003, 11): 2,
#          (9600002, 14): 1, (9600001, 14): 1})

res = {k: v for k, v in c.items() if v > 1}

print(res)

# {(9600002, 42): 3, (9600001, 17): 3, (9600003, 11): 2}

print(res.keys())

# dict_keys([(9600002, 42), (9600001, 17), (9600003, 11)])
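
To print the result in the format the question asks for (user_id followed by tv_program_code), one option is to loop over the filtered pairs. The following is a minimal sketch using the same data; the name repeated is only illustrative, and building the Counter directly from a generator is an alternative to the explicit loop above, not a requirement.

```python
from collections import Counter

lst = [(9600002, 42, 3), (9600001, 17, 3), (9600003, 11, 1),
       (9600002, 14, 5), (9600001, 17, 1), (9600003, 11, 4),
       (9600001, 17, 4), (9600001, 14, 3), (9600002, 42, 6),
       (9600002, 42, 1)]

# Count rows per (user, program) pair; the season value itself is ignored.
c = Counter((user, program) for user, program, _season in lst)

# Keep the pairs that occur more than once ('repeated' is an illustrative name).
repeated = [pair for pair, count in c.items() if count > 1]

for user, program in repeated:
    print(user, program)

# 9600002 42
# 9600001 17
# 9600003 11
```

This assumes every (user, program, season) row appears only once. If the same season could be listed twice for a pair, deduplicating first (for example with set(lst)) would make the count reflect distinct seasons, at the cost of losing the original row order.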

Note on Counter versus defaultdict(int)

In the benchmark below, Counter is about twice as slow as defaultdict(int). You can easily switch to defaultdict(int) if performance matters and none of these Counter features are relevant to you:

Missing Counter keys don't get added automatically when querying.
You can add / subtract Counter objects.
Counter offers additional methods, e.g. elements, most_common.
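
The three differences listed above can be seen in a short sketch. The keys and counts here are made-up illustrative values, not taken from the question's data.

```python
from collections import Counter, defaultdict

c = Counter({(9600001, 17): 2})
d = defaultdict(int, {(9600001, 17): 2})

# Looking up a missing key returns 0 in both cases...
missing_c = c[(9600009, 99)]
missing_d = d[(9600009, 99)]

# ...but only the defaultdict stores the missing key as a side effect.
print(len(c), len(d))  # 1 2

# Counter objects can be added (or subtracted) directly.
total = c + Counter({(9600001, 17): 1, (9600002, 42): 4})
print(total)  # Counter({(9600002, 42): 4, (9600001, 17): 3})

# most_common(n) returns the n highest counts as (key, count) pairs.
print(total.most_common(1))  # [((9600002, 42), 4)]
```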

Benchmarking on Python 3.6.2.

from collections import defaultdict, Counter

lst = lst * 100000

def counter(lst):
    c = Counter()
    for user, program, season in lst:
        c[(user, program)] += 1
    return c

def dd(lst):
    d = defaultdict(int)
    for user, program, season in lst:
        d[(user, program)] += 1
    return d

%timeit counter(lst)  # 900 ms
%timeit dd(lst)       # 450 ms
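
Note that %timeit is an IPython magic and will not run in a plain Python script. A rough equivalent using the standard-library timeit module might look like the sketch below. The list size, the repeat count, and the name rows are arbitrary choices for illustration, and the timings will vary by machine and Python version, so no figures are shown.

```python
from collections import Counter, defaultdict
from timeit import timeit

# Same sample data as above, repeated to get a measurable workload.
rows = [(9600002, 42, 3), (9600001, 17, 3), (9600003, 11, 1),
        (9600002, 14, 5), (9600001, 17, 1), (9600003, 11, 4),
        (9600001, 17, 4), (9600001, 14, 3), (9600002, 42, 6),
        (9600002, 42, 1)] * 1000

def counter(lst):
    c = Counter()
    for user, program, season in lst:
        c[(user, program)] += 1
    return c

def dd(lst):
    d = defaultdict(int)
    for user, program, season in lst:
        d[(user, program)] += 1
    return d

# Sanity check: both approaches produce the same counts.
assert dict(counter(rows)) == dict(dd(rows))

# Total time in seconds for 5 calls of each function.
t_counter = timeit(lambda: counter(rows), number=5)
t_dd = timeit(lambda: dd(rows), number=5)
print(f"Counter: {t_counter:.3f} s, defaultdict(int): {t_dd:.3f} s")
```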
