Reputation: 2112
I have a dict like:
dict = [{'a':2, 'b':3}, {'b':4}, {'a':1, 'c':5}]
I need the average value for each distinct key. The result should look like:
avg = [{'a':1.5, 'b':3.5, 'c':5}]
I can get the sum for every key, but I can't figure out how to count how often each key occurs in order to compute the average.
Upvotes: 2
Views: 5593
Reputation: 152587
This can be easily done with pandas:
>>> import pandas
>>> df = pandas.DataFrame([{'a':2, 'b':3}, {'b':4}, {'a':1, 'c':5}])
>>> df.mean()
a    1.5
b    3.5
c    5.0
dtype: float64
If you need a dictionary as result:
>>> dict(df.mean())
{'a': 1.5, 'b': 3.5, 'c': 5.0}
Upvotes: 6
Reputation: 1773
I thought I'd add an alternative answer using PyFunctional:
from functional import seq

l = [{'a':2, 'b':3}, {'b':4}, {'a':1, 'c':5}]
a = (seq(l)
     # convert each dictionary to a sequence of (key, value) pairs
     .map(lambda d: seq(d).map(lambda k: (k, d[k])))
     .flatten()
     # pair each value with a count of 1
     .map(lambda kv: (kv[0], (kv[1], 1)))
     # per key: sum of values and sum of counts
     .reduce_by_key(lambda a, b: (a[0]+b[0], a[1]+b[1]))
     # average = sum of values / count
     .map(lambda kv: (kv[0], float(kv[1][0])/kv[1][1]))
     # convert to dict
     .to_dict()
)
print(a)
Output (key order may vary):
{'a': 1.5, 'c': 5.0, 'b': 3.5}
Upvotes: 1
Reputation: 167
You can use a for loop that keeps a running sum and a counter for each key, and then divide each sum by its counter.
Also, it is odd that you are calling the list a dict...
I'd suggest something like this:
- Create a new dict: letter_count = {}
- Loop over the current dicts and over each letter/value pair inside them
- If the letter is not in letter_count yet, add it with the item's value and a counter of 1
- If it already exists, add the item's value to the stored sum (+= number) and increase the counter by one
- Once the loop is done, divide each sum by its counter
- Return the resulting dict of averages
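The steps above can be sketched roughly as follows. This is only one possible reading of the outline: the function name average_by_key and the choice to store a [sum, counter] pair per key in letter_count are assumptions made for illustration, not something the answer specifies.

```python
def average_by_key(dicts):
    """Average the values of each key across a list of dicts (illustrative helper)."""
    letter_count = {}  # key -> [running sum, number of occurrences]
    for d in dicts:
        for key, value in d.items():
            if key not in letter_count:
                # First time this key is seen: start its sum and counter.
                letter_count[key] = [value, 1]
            else:
                # Key already seen: extend the sum and bump the counter.
                letter_count[key][0] += value
                letter_count[key][1] += 1
    # Divide each sum by its counter to get the average.
    return {key: total / count for key, (total, count) in letter_count.items()}


data = [{'a': 2, 'b': 3}, {'b': 4}, {'a': 1, 'c': 5}]
result = average_by_key(data)
print(result)
```

With the list from the question, every key ends up with the mean of the values it actually appears with, so a key that occurs only once keeps its single value.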
Upvotes: 1
Reputation: 152587
You could create an intermediate dictionary that collects all encountered values as lists:
dct = [{'a':2, 'b':3}, {'b':4}, {'a':1, 'c':5}]
from collections import defaultdict
intermediate = defaultdict(list)
for subdict in dct:
    for key, value in subdict.items():
        intermediate[key].append(value)
# intermediate is now: defaultdict(list, {'a': [2, 1], 'b': [3, 4], 'c': [5]})
And finally calculate the average by dividing the sum of each list by the length of each list:
for key, value in intermediate.items():
    print(key, sum(value)/len(value))
which prints (the key order may differ depending on the Python version):
b 3.5
c 5.0
a 1.5
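If a dictionary is wanted instead of printed lines, the same intermediate mapping can feed a dict comprehension. A minimal sketch, assuming Python 3 (where / is true division); the name averages is introduced here only for illustration:

```python
from collections import defaultdict

dct = [{'a': 2, 'b': 3}, {'b': 4}, {'a': 1, 'c': 5}]

# Collect every value seen for each key.
intermediate = defaultdict(list)
for subdict in dct:
    for key, value in subdict.items():
        intermediate[key].append(value)

# Average each list: sum of the values divided by how many there are.
averages = {key: sum(values) / len(values) for key, values in intermediate.items()}
print(averages)
```

Keeping the full lists costs more memory than tracking a sum and a counter, but it leaves the raw values available if other statistics are needed later.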
Upvotes: 2