Reputation: 5575
I am looking to sum a list of counters in python. For example to sum:
counter_list = [Counter({"a":1, "b":2}), Counter({"b":3, "c":4})]
to give Counter({'b': 5, 'c': 4, 'a': 1})
I can get the following code to do the summation:
counter_master = Counter()
for element in counter_list:
counter_master = counter_master + element
But I am confused as to why counter_master = sum(counter_list)
results in the error TypeError: unsupported operand type(s) for +: 'int' and 'Counter'
? Given it is possible to add counters together, why is it not possible to sum them?
Upvotes: 37
Views: 10528
Reputation: 1
I wanted to compare the performance of different solutions objectively.
from collections import Counter
from functools import reduce
counter_list = [Counter({'apple': 5, 'banana': 5}), Counter({'apple': 3, 'orange': 3}), Counter({'banana': 4}), Counter({'orange': 4, 'banana': 4})] * 1000
# From Omair S solution.
def sum_counters_binary(counter_list):
'''
Recursive counter with a O(log(n)) Complexity
'''
if len(counter_list) > 10:
counter_0 = sum_counters_binary(counter_list[:int(len(counter_list)/2)])
counter_1 = sum_counters_binary(counter_list[int(len(counter_list)/2):])
return sum([counter_0, counter_1], Counter())
else:
return sum(counter_list, Counter())
print(len(counter_list))
# 4000
%timeit reduce(lambda c1, c2: c1 + c2, counter_list)
# 13.9 ms ± 3.82 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit sum(counter_list,Counter())
# 8.52 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit sum_counters_binary(counter_list)
# 14.4 ms ± 2.73 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
I'd say proceed with sum(counter_list,Counter())
because it is the cleanest / most Pythonic.
By the way, the sum_counters
solution isn't O(log(n)), it is O(n). You are still adding up every Counter, you are just splitting the list of Counters into smaller sublists, and then adding those sublists. The amount of work for each sublist is smaller, but the number of lists to add increases. There isn't any way to get around O(n) here. It is slightly slower because of the function call overhead.
Upvotes: 0
Reputation: 41
This version was faster in my experience. This is upper bound by O(log(n)).
def sum_counters(counter_list):
'''
Recursive counter with a O(log(n)) Complexity
'''
if len(counter_list) > 10:
counter_0 = sum_counters(counter_list[:int(len(counter_list)/2)])
counter_1 = sum_counters(counter_list[int(len(counter_list)/2):])
return sum([counter_0, counter_1], Counter())
else:
return sum(counter_list, Counter())
global_counter = sum_counters(counter_list)
Upvotes: 4
Reputation: 46603
The sum
function has the optional start argument which defaults to 0. Quoting the linked page:
sum(iterable[, start])
Sums start and the items of an iterable from left to right and returns the total
Set start to (empty) Counter
object to avoid the TypeError
:
In [5]: sum(counter_list, Counter())
Out[5]: Counter({'b': 5, 'c': 4, 'a': 1})
Upvotes: 75