list group and sum by column

Question

I have the following

    (Decimal('1.000'), Decimal('419.760000'), Decimal('4.197600000'), Decimal('423.957600000'))             
(Decimal('1.000'), Decimal('62.370000'), Decimal('0.623700000'), Decimal('62.993700000'))           
(Decimal('2.000'), Decimal('7.920000'), Decimal('0.079200000'), Decimal('7.999200000'))

And I'd like to group them by the first column and sum the other columns (sum grouped by the first column, summarized for each column separately)...but I don't know how to do that...

I'm new to python...any pointers?

Thanks, BR

Lauritz V. Thaulow · Accepted Answer

Assuming those three tuples are items in a tuple or list numbers:

column_sums = [sum(items) for items in zip(*numbers)]

Rereading your question, I think you may instead mean you want to group all numbers except the first of each row by what the number in the first row is, and then get the sum of each group. If so, do it like this:

from collections import defaultdict

grouped = defaultdict(list)

for tpl in numbers:
    grouped[tpl[0]].extend(tpl[1:])

group_sums = dict((key, sum(lst)) for key, lst in grouped.items())

If you don't need the itermediate grouped variable, you may optimize like this:

group_sums = defaultdict(int)

for tpl in numbers:
    group_sums[tpl[0]] += sum(tpl[1:])

Re: comment

This would have been so much easier if you gave an example of the output you wanted in the first place. For example, you could have added this to your post:

From the above example I want this output:

{Decimal('1.000'): [
   Decimal('482.130000'), Decimal('4.821300000'), Decimal('486.951300000')],
Decimal('2.000'): [
   Decimal('7.920000'), Decimal('0.079200000'), Decimal('7.999200000')]}

Then I could have posted this answer right away:

from itertools import izip_longest

group_sums = {}

for tpl in numbers:
    previous_sum = group_sums.get(tpl[0], [])
    iterator = izip_longest(previous_sum, tpl[1:], fillvalue=0)
    group_sums[tpl[0]] = [prev + num for prev, num in iterator]

This also works if the number of columns varies within a group. Please tell me I've understood the question correctly this time. :)

list group and sum by column

Answers (2)

Related Questions