sam cammegh

Reputation: 55

How can I find the average of each similar entry in a list of tuples?

I have this list of tuples

[('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]

How do I find the average of the numbers coupled with each name, i.e. the average of all the numbers stored in a tuple with Jem, and then output them? In this example, the output would be:

Jem 9.66666666667
Sam 6

Upvotes: 4

Views: 2156

Answers (3)

Finn

Reputation: 2099

You can also use a list comprehension:

l = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]

def avg(l):
    return sum(l)/len(l)

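# compare names with ==, not "is": identity checks on strings are not reliable
result = [(n, avg([v[1] for v in l if v[0] == n])) for n in set(n[0] for n in l)]
# result contains ('Jem', 9.666666666666666) and ('Sam', 6.0); set order is not guaranteed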

Upvotes: 0

Adam Smith

Reputation: 54223

There are a couple of ways to do this. One is easy, one is pretty.

Easy:

Use a dictionary! It's easy to build a for loop that goes through your tuples and appends the second element to a list in a dictionary, keyed on the first element.

d = {}
tuples = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]
for key, val in tuples:
    # unpack in the loop header; naming the loop variable "tuple" would shadow the built-in
    d.setdefault(key, []).append(val)

Once it's in a dictionary, you can do:

for name, values in d.items():
    print("{name} {avg}".format(name=name, avg=sum(values)/len(values)))
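Putting those two steps together, here is a minimal sketch as a reusable function. The name average_by_key is only an illustration, and it assumes Python 3, where / is true division:

```python
def average_by_key(pairs):
    """Return {key: mean of the values paired with that key}."""
    grouped = {}
    for key, val in pairs:
        # setdefault creates the list the first time a key is seen
        grouped.setdefault(key, []).append(val)
    # every list holds at least one value, so len() is never zero
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


tuples = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]
averages = average_by_key(tuples)
for name, avg in averages.items():
    print("{name} {avg}".format(name=name, avg=avg))
```

Returning a dictionary keeps the printing separate from the grouping, so the averages can be reused elsewhere.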

Pretty:

Use itertools.groupby. It only groups consecutive items, so it works only if your data is sorted by the key you want to group by (in this case, t[0] for each t in tuples). That makes it less than ideal here, but it's a nice way to highlight the function.

from itertools import groupby

tuples = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]
tuples.sort(key=lambda tup: tup[0])
# tuples is now [('Jem', 10), ('Jem', 9), ('Jem', 10), ('Sam', 10), ('Sam', 2)]

groups = groupby(tuples, lambda tup: tup[0])

This builds a lazy iterator whose contents look kind of like:

[('Jem', [('Jem', 10), ('Jem', 9), ('Jem', 10)]),
 ('Sam', [('Sam', 10), ('Sam', 2)])]

We can use that to build our names and averages:

for groupname, grouptuples in groups:
    values = [t[1] for t in grouptuples]
    print("{name} {avg}".format(name=groupname, avg=sum(values)/len(values)))
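For reference, a self-contained sketch of the groupby version, assuming Python 3. It uses sorted() so the original list is left untouched, and operator.itemgetter(0) in place of the lambda. The function name grouped_averages is hypothetical:

```python
from itertools import groupby
from operator import itemgetter


def grouped_averages(pairs):
    """Return a list of (name, average) tuples, ordered by name."""
    first = itemgetter(0)
    result = []
    # groupby only merges adjacent items, so sort by the same key first
    for name, group in groupby(sorted(pairs, key=first), key=first):
        values = [value for _, value in group]
        result.append((name, sum(values) / len(values)))
    return result


tuples = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]
for name, avg in grouped_averages(tuples):
    print("{name} {avg}".format(name=name, avg=avg))
```

Each group is consumed inside the loop because groupby's sub-iterators become invalid once the outer iterator advances.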

Upvotes: 5

Cory Kramer

Reputation: 117981

Seems like a straightforward case for collections.defaultdict:

from collections import defaultdict
l = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]
d = defaultdict(list)
for key, value in l:
    d[key].append(value)

Then calculate the mean:

from numpy import mean
for key in d:
    print(key, mean(d[key]))

Output

Jem 9.66666666667
Sam 6.0
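If you'd rather not pull in numpy just for this, the standard library's statistics.mean (Python 3.4+) should work the same way here. A minimal sketch; the variable names are only for illustration:

```python
from collections import defaultdict
from statistics import mean

pairs = [('Jem', 10), ('Sam', 10), ('Sam', 2), ('Jem', 9), ('Jem', 10)]

# group the values by name, exactly as above
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

# mean() raises StatisticsError on an empty list, but every list here has at least one value
averages = {key: mean(values) for key, values in grouped.items()}
for key, avg in averages.items():
    print(key, avg)
```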

Upvotes: 5
