tim.rohrer

Reputation: 295

Sum second value in tuple for each given first value in tuples using Python

I'm working with a large set of records and need to sum a given field for each customer account to reach an overall account balance. While I can probably put the data in any reasonable form, I figured the easiest would be a list of tuples (cust_id, balance_contribution) built as I process each record. After the round of processing, I'd like to add up the second item for each cust_id, and I am trying to do it without looping through the data thousands of times.

As an example, the input data could look like:

[(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(2,20.00)]

And I want the output to be something like this:

[(1,125.00),(2,50.00)]

I've read other questions where people just wanted to add up the second element of every tuple using something of the form sum(j for i, j in a), but that does not separate the totals by the first element.

This discussion, python sum tuple list based on tuple first value, puts the values in a list assigned to each key (cust_id) in a dictionary. I suppose I could then figure out how to add up the values in each list?
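
For reference, a minimal sketch of that dictionary-of-lists idea, assuming the per-customer lists have already been built (the names grouped and balances are hypothetical, used only for illustration):

```python
# Hypothetical intermediate result: each cust_id mapped to a list of its contributions.
grouped = {1: [125.50, 24.50, -25.00], 2: [30.00, 20.00]}

# Sum each list to get one balance per customer.
balances = {cust_id: sum(values) for cust_id, values in grouped.items()}

print(sorted(balances.items()))  # [(1, 125.0), (2, 50.0)]
```

This works, but it keeps every individual value in memory before summing, which is what I was hoping to avoid.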

Any thoughts on a better approach to this?

Thank you in advance.

Upvotes: 4

Views: 2768

Answers (4)

C.B.

Reputation: 8326

Here's an itertools solution:

>>> from itertools import groupby
>>> x
[(1, 125.5), (2, 30.0), (1, 24.5), (1, -25.0), (2, 20.0)]
>>> sorted(x)
[(1, -25.0), (1, 24.5), (1, 125.5), (2, 20.0), (2, 30.0)]
>>> # groupby only groups adjacent items, so the input must be sorted by key first
>>> for a, b in groupby(sorted(x), key=lambda item: item[0]):
...     print(a, sum(item[1] for item in b))
...
1 125.0
2 50.0
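
If the list-of-tuples output from the question is wanted, the same idea could be wrapped in a function. This is only a sketch: the name sum_by_first is hypothetical, and operator.itemgetter is swapped in for the lambda.

```python
from itertools import groupby
from operator import itemgetter


def sum_by_first(pairs):
    """Return [(key, total), ...] with the second items summed per first item."""
    # groupby only merges adjacent items, so sort by the key first.
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


x = [(1, 125.5), (2, 30.0), (1, 24.5), (1, -25.0), (2, 20.0)]
print(sum_by_first(x))  # [(1, 125.0), (2, 50.0)]
```

The sort makes this O(n log n), so for very large inputs a single-pass dictionary approach is likely cheaper.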

Upvotes: 1

Uri Goren

Reputation: 13692

People usually like one-liners in Python:

[(uk,sum([vv for kk,vv in data if kk==uk])) for uk in set([k for k,v in data])]

When

data=[(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(3,20.00)]

The output is

[(1, 125.0), (2, 30.0), (3, 20.0)]

Upvotes: 1

bgreen-litl

Reputation: 464

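import collections

def total(records):
    # Accumulate a running total per customer in a single pass.
    dct = collections.defaultdict(int)
    for cust_id, contrib in records:
        dct[cust_id] += contrib

    # list() so the result is a list of (cust_id, total) tuples on Python 3 as well.
    return list(dct.items())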
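
A quick check against the sample data from the question, as a sketch. The function is repeated so the snippet runs on its own; the records variable and the sorted call are assumptions added for the demonstration.

```python
import collections


def total(records):
    # Accumulate a running total per customer in a single pass.
    dct = collections.defaultdict(int)
    for cust_id, contrib in records:
        dct[cust_id] += contrib
    return list(dct.items())


# Sample input taken from the question.
records = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# sorted() only makes the display order explicit.
print(sorted(total(records)))  # [(1, 125.0), (2, 50.0)]
```

Each record is visited exactly once, so the cost grows linearly with the number of records.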

Upvotes: 4

beiller

Reputation: 3135

Would the following code be useful?

in_list = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (3, 20.00)]
totals = {}
for uid, x in in_list:
    if uid not in totals:
        totals[uid] = x
    else:
        totals[uid] += x

print(totals)

Output:

{1: 125.0, 2: 30.0, 3: 20.0}

Upvotes: 1
