abcd

Reputation: 171

Python: Mean of several dictionaries with the same keys

I am trying to find the mean of several dictionaries that share the same keys (the number of dictionaries depends on user choice). Each value is an n-dimensional numpy array.

I got a working solution using this method (shared via the IPython notebook viewer).

The function I used is:

def metaa(lis,name):
    x = len(lis)
    pr=""
    for i in range(x):
        if i == 0:
            pr = pr+name+"["+str(i)+"][x]"
        else:
            pr = pr+"+"+name+"["+str(i)+"][x]"
    pr = "("+pr+")/"+str(x)                 
    return pr

I created dictionaries like this.

import numpy as np
a1 = np.random.randint(100,size=(3,10))
a2 = np.random.randint(100,size=(3,10))
a3 = np.random.randint(100,size=(3,10))
al=[a1,a2,a3]
dicta = {'a1':a1,'a2':a2,'a3':a3}
dictb = {'a1':a1,'a2':a2,'a3':a3}
R = [dicta,dictb]

I used the same values in both dictionaries for testing. I called the function like this.

Res = {}
for x in R[0]:
    Res[x] = eval(metaa(R,'R'))

I think this method is hackish. Is there a better way of solving this?

Upvotes: 3

Views: 1266

Answers (1)

ojdo

Reputation: 8900

Building a string just to eval it is not very elegant. It is better to use reduce in combination with np.add, together with list [] and dict {} comprehensions. First, convert the list of dictionaries R to a dictionary of lists S:

S = {k: [d[k] for d in R] for k in R[0]}

Now each key maps to a plain list of numpy arrays, which can be summed using np.add and then divided by the length of that list:

S = {'a1': [array([[ 32, 120,  80, 380, 360, 212, 188,  56, 312, 112],
                   [388, 348, 196, 236,  60, 200, 224, 208,  24, 104],
                   [324, 296,  24, 152, 220,  12, 104,  52, 232, 196]]),
            array([[ 32, 120,  80, 380, 360, 212, 188,  56, 312, 112],
                   [388, 348, 196, 236,  60, 200, 224, 208,  24, 104],
                   [324, 296,  24, 152, 220,  12, 104,  52, 232, 196]])],
     'a2': [array([[30, 82, 99, 72, 79, 98, 93, 93, 28, 46],
                   [ 8, 17, 50, 59, 85, 73, 48, 97, 87, 41],
                   [98, 36, 27, 55, 98, 39, 73, 51, 27, 33]]),
            array([[30, 82, 99, 72, 79, 98, 93, 93, 28, 46],
                   [ 8, 17, 50, 59, 85, 73, 48, 97, 87, 41],
                   [98, 36, 27, 55, 98, 39, 73, 51, 27, 33]])],
     'a3': [array([[78, 24, 87, 83, 30, 14, 88, 57, 55, 73],
                   [76, 94, 99, 58, 63, 34, 70, 81, 45, 20],
                   [32, 61,  0,  3, 33, 33, 38, 90, 11,  3]]),
            array([[78, 24, 87, 83, 30, 14, 88, 57, 55, 73],
                   [76, 94, 99, 58, 63, 34, 70, 81, 45, 20],
                   [32, 61,  0,  3, 33, 33, 38, 90, 11,  3]])]}
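As a small check of the regrouping step, the sketch below rebuilds S from two toy dictionaries. The names d1, d2 and dicts are made up for illustration and do not appear in the question, and the sketch assumes every dictionary has the same keys as the first one.

```python
import numpy as np

# Hypothetical toy input: two dictionaries with identical keys.
d1 = {'a1': np.array([1, 2, 3]), 'a2': np.array([10, 20, 30])}
d2 = {'a1': np.array([3, 4, 5]), 'a2': np.array([30, 40, 50])}
dicts = [d1, d2]

# Regroup: one list of arrays per key, in the order the dictionaries appear.
S = {k: [d[k] for d in dicts] for k in dicts[0]}

print(sorted(S))     # the keys taken from the first dictionary
print(len(S['a1']))  # one array per input dictionary
```

The arrays are not copied here; each list simply holds references to the original arrays.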

Calculate the mean:

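from functools import reduce  # a builtin on Python 2, must be imported on Python 3
T = {k: reduce(np.add, v) / len(v) for k, v in S.items()}  # on Python 3, / returns float arrays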

Now T is a dict of numpy arrays with mean values:

T = {'a1': array([[ 32, 120,  80, 380, 360, 212, 188,  56, 312, 112],
                  [388, 348, 196, 236,  60, 200, 224, 208,  24, 104],
                  [324, 296,  24, 152, 220,  12, 104,  52, 232, 196]]),
     'a2': array([[30, 82, 99, 72, 79, 98, 93, 93, 28, 46],
                  [ 8, 17, 50, 59, 85, 73, 48, 97, 87, 41],
                  [98, 36, 27, 55, 98, 39, 73, 51, 27, 33]]),
     'a3': array([[78, 24, 87, 83, 30, 14, 88, 57, 55, 73],
                  [76, 94, 99, 58, 63, 34, 70, 81, 45, 20],
                  [32, 61,  0,  3, 33, 33, 38, 90, 11,  3]])}
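A variant that avoids reduce altogether is to let NumPy average along a new leading axis. The sketch below swaps in np.mean(..., axis=0) for the np.add step, so it is an alternative rather than the method shown above. It assumes all arrays under a given key have the same shape, and the helper name mean_of_dicts is hypothetical. Note that np.mean returns float arrays even for integer input.

```python
import numpy as np

def mean_of_dicts(dicts):
    """Element-wise mean per key across a list of dicts of equally shaped arrays."""
    # np.mean turns each list of arrays into one array with a new leading
    # axis and averages along that axis.
    return {k: np.mean([d[k] for d in dicts], axis=0) for k in dicts[0]}

# Hypothetical toy data, not taken from the question.
a = {'x': np.array([[1, 2], [3, 4]]), 'y': np.array([0, 10])}
b = {'x': np.array([[3, 4], [5, 6]]), 'y': np.array([10, 20])}

T = mean_of_dicts([a, b])
print(T['x'])  # element-wise mean of the two 'x' arrays
```

Because the number of dictionaries is only used through the list length, the same call works for any number of dictionaries the user supplies.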

Upvotes: 4
