carobnodrvo

Reputation: 1051

Compute mean from a list of NumPy arrays of different sizes

What is the most efficient way of computing mean and std of a Python list containing NumPy arrays of different sizes? For example:

l = [np.array([1,2,3]), np.array([4,5,6,7]), np.array([8])]

Using a loop and adding the values up manually is a valid solution, but I am looking for something more sophisticated.

Upvotes: 2

Views: 3446

Answers (3)

hpaulj

Reputation: 231385

In [208]: l = [np.array([1,2,3]), np.array([4,5,6,7]), np.array([8])]  

Making an array from l doesn't do much for us, since the arrays differ in shape:

In [209]: np.array(l)                                                                                        
Out[209]: array([array([1, 2, 3]), array([4, 5, 6, 7]), array([8])], dtype=object)

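Out[209] is a 1d object-dtype array. It can't be flattened any further. (NumPy 1.24 and later raise a ValueError for this call unless dtype=object is passed explicitly.)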

hstack is useful, turning the list of arrays into one array:

In [210]: np.hstack(l)                                                                                       
Out[210]: array([1, 2, 3, 4, 5, 6, 7, 8])
In [211]: np.mean(_)                                                                                         
Out[211]: 4.5

If the list contains 2d arrays, as revealed in a comment:

In [212]: ll = [np.ones((2,4)), np.zeros((3,4)), np.ones((1,4))*2]                                           
In [213]: ll                                                                                                 
Out[213]: 
[array([[1., 1., 1., 1.],
        [1., 1., 1., 1.]]), array([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]), array([[2., 2., 2., 2.]])]
In [214]: np.vstack(ll)                                                                                      
Out[214]: 
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [2., 2., 2., 2.]])
In [215]: np.mean(_, axis=0)                                                                                 
Out[215]: array([0.66666667, 0.66666667, 0.66666667, 0.66666667])

np.concatenate(..., axis=0) would work for both cases.
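
As a minimal sketch of that point, the snippet below applies np.concatenate to both the 1d list from the question and the 2d list ll, and also computes the standard deviation that the question asks about. The variable names are illustrative only, and np.std is left at its default, the population standard deviation (ddof=0); pass ddof=1 if the sample estimate is wanted.

```python
import numpy as np

# 1d case: the list from the question
l = [np.array([1, 2, 3]), np.array([4, 5, 6, 7]), np.array([8])]
# 2d case: arrays with different row counts but the same number of columns
ll = [np.ones((2, 4)), np.zeros((3, 4)), np.ones((1, 4)) * 2]

# Join along the first axis, then reduce over all elements
flat = np.concatenate(l, axis=0)
mean_1d = flat.mean()
std_1d = flat.std()  # population std (ddof=0)

# Same call for the 2d case; reduce per column with axis=0
stacked = np.concatenate(ll, axis=0)  # shape (6, 4)
mean_2d = stacked.mean(axis=0)
std_2d = stacked.std(axis=0)
```

For the 2d case this only works when the arrays agree on every axis except the first.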

Upvotes: 1

Rahul charan

Reputation: 837

Method 1: you can use itertools:

import itertools
import numpy as np
l = [np.array([1,2,3]), np.array([4,5,6,7]), np.array([8])]
new_l = list(itertools.chain(*l))
print(new_l)
print(f"The mean is:\t{np.mean(new_l)} ") 

Output

[1, 2, 3, 4, 5, 6, 7, 8]
The mean is:    4.5 

Method 2: I think a basic for loop, written here as a list comprehension, is the better choice:

l = [np.array([1,2,3]), np.array([4,5,6,7]), np.array([8])]
new_l = [var for my_list in l for var in my_list]
np.mean(new_l)
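
Both methods build an intermediate Python list before NumPy sees the data. A possible variant, sketched here under the assumption that every array in the list is 1d, feeds the chained iterator straight into np.fromiter so the values land directly in one array. The name flat is illustrative only.

```python
import itertools
import numpy as np

l = [np.array([1, 2, 3]), np.array([4, 5, 6, 7]), np.array([8])]

# chain.from_iterable yields the elements of each array in turn;
# np.fromiter consumes that iterator and builds a single 1d float array
flat = np.fromiter(itertools.chain.from_iterable(l), dtype=float)

mean = flat.mean()
std = flat.std()  # population std (ddof=0)
```

For large inputs np.concatenate is likely faster, since it copies whole buffers instead of iterating element by element in Python.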

Upvotes: 0

lenik

Reputation: 23528

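This works once each per-array mean is weighted by the array's length (a is the list from the question). A plain np.mean( map( np.mean, a ) ) fails on Python 3, where map returns an iterator, and an unweighted mean of the means gives about 5.17 here rather than 4.5:

np.average( list( map( np.mean, a ) ), weights=list( map( len, a ) ) )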

"Look, ma, no loops!!" =)

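Another way would be to join the arrays first. np.array( a ).flatten() does not do this for arrays of different sizes, since the result is a 1d object array that flatten leaves as is, so use np.concatenate:

np.mean( np.concatenate( a ) )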
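
A mean of per-array means matches the overall mean only when each array is weighted by its size. The sketch below illustrates this, assuming a is the list from the question, and shows how the mean and std can be combined from per-array counts, sums and sums of squares without joining the arrays. The helper names (counts, sums, sq_sums) are illustrative, and the E[x^2] - mean^2 formula can lose precision when the mean is large relative to the spread.

```python
import numpy as np

a = [np.array([1, 2, 3]), np.array([4, 5, 6, 7]), np.array([8])]

# Per-array summary statistics
counts = np.array([x.size for x in a])
sums = np.array([x.sum() for x in a], dtype=float)
sq_sums = np.array([(x.astype(float) ** 2).sum() for x in a])

# Combine them into statistics over all elements
n = counts.sum()
mean = sums.sum() / n
var = sq_sums.sum() / n - mean ** 2  # population variance (ddof=0)
std = np.sqrt(var)

# Equivalent mean: per-array means weighted by array size
weighted = np.average([x.mean() for x in a], weights=counts)

# Unweighted mean of means, which differs when the sizes differ
unweighted = np.mean([x.mean() for x in a])
```

Here mean and weighted agree with np.mean(np.concatenate(a)), while unweighted does not.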

Upvotes: 2
