kyrenia
kyrenia

Reputation: 5575

Summing list of counters in python

I am looking to sum a list of counters in python. For example to sum:

counter_list = [Counter({"a":1, "b":2}), Counter({"b":3, "c":4})]

to give Counter({'b': 5, 'c': 4, 'a': 1})

I can get the following code to do the summation:

counter_master = Counter()
for element in counter_list:
    counter_master = counter_master + element

But I am confused as to why counter_master = sum(counter_list) results in the error TypeError: unsupported operand type(s) for +: 'int' and 'Counter' ? Given it is possible to add counters together, why is it not possible to sum them?

Upvotes: 37

Views: 10528

Answers (3)

Laura
Laura

Reputation: 1

I wanted to compare the performance of different solutions objectively.

from collections import Counter
from functools import reduce

counter_list = [Counter({'apple': 5, 'banana': 5}), Counter({'apple': 3, 'orange': 3}), Counter({'banana': 4}), Counter({'orange': 4, 'banana': 4})] * 1000

# From Omair S solution.
def sum_counters_binary(counter_list):

    '''
    Recursive counter with a O(log(n)) Complexity
    '''

    if len(counter_list) > 10:

        counter_0 = sum_counters_binary(counter_list[:int(len(counter_list)/2)])
        counter_1 = sum_counters_binary(counter_list[int(len(counter_list)/2):])

        return sum([counter_0, counter_1], Counter())

    else:

        return sum(counter_list, Counter())

print(len(counter_list))
# 4000
%timeit reduce(lambda c1, c2: c1 + c2, counter_list)
# 13.9 ms ± 3.82 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit sum(counter_list,Counter())
# 8.52 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit sum_counters_binary(counter_list)
# 14.4 ms ± 2.73 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

I'd say proceed with sum(counter_list,Counter()) because it is the cleanest / most Pythonic.

By the way, the sum_counters solution isn't O(log(n)), it is O(n). You are still adding up every Counter, you are just splitting the list of Counters into smaller sublists, and then adding those sublists. The amount of work for each sublist is smaller, but the number of lists to add increases. There isn't any way to get around O(n) here. It is slightly slower because of the function call overhead.

Upvotes: 0

Omair S
Omair S

Reputation: 41

This version was faster in my experience. This is upper bound by O(log(n)).

def sum_counters(counter_list):

    '''
    Recursive counter with a O(log(n)) Complexity
    '''

    if len(counter_list) > 10:

        counter_0 = sum_counters(counter_list[:int(len(counter_list)/2)])
        counter_1 = sum_counters(counter_list[int(len(counter_list)/2):])

        return sum([counter_0, counter_1], Counter())

    else:

        return sum(counter_list, Counter())

global_counter = sum_counters(counter_list)

Upvotes: 4

vaultah
vaultah

Reputation: 46603

The sum function has the optional start argument which defaults to 0. Quoting the linked page:

sum(iterable[, start])

Sums start and the items of an iterable from left to right and returns the total

Set start to (empty) Counter object to avoid the TypeError:

In [5]: sum(counter_list, Counter())
Out[5]: Counter({'b': 5, 'c': 4, 'a': 1})

Upvotes: 75

Related Questions