Reputation: 145

How to sum/average a specific subset of columns or rows and return the new ndarray in numpy?

For the sake of illustration, imagine I have the following ndarray:

x = [[0.5,  0.3,  0.1,  0.1],
     [0.4,  0.1,  0.3,  0.2],
     [0.4,  0.3,  0.2,  0.1],
     [0.6,  0.1,  0.1,  0.2]]

I want to sum the two vectors at columns 1 and 2 (starting the count from 0) so that the new ndarray would be:

y = [[0.5,  0.4,  0.1],
     [0.4,  0.4,  0.2],
     [0.4,  0.5,  0.1],
     [0.6,  0.2,  0.2]]

And then, I want to average the vectors at rows 1 and 2 so that the final result would be:

z = [[0.5,  0.4,   0.1 ],
     [0.4,  0.45,  0.15],
     [0.6,  0.2,   0.2 ]]

Is there an efficient way to do that in numpy in one command? I really need efficiency as this operation is going to be applied in a nested loop.

Thanks in advance

Upvotes: 0

Answers (3)

Nathan

Reputation: 3648

@hpaulj's solution is very good; be sure to read it.

You can sum columns quite easily:

a_summed = np.sum(a[:,1:3], axis=1)

You can also take the mean of multiple rows:

a_mean = np.mean(a[1:3], axis=0)

All that remains is to write each result back into the array and delete the column or row that is no longer needed, so it becomes:

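import numpy as np

a = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.4, 0.1, 0.3, 0.2],
              [0.4, 0.3, 0.2, 0.1],
              [0.6, 0.1, 0.1, 0.2]])

a_summed = np.sum(a[:,1:3], axis=1)  # sum of columns 1 and 2
a[:, 1] = a_summed                   # store the sum in column 1
a = np.delete(a, 2, 1)               # drop the now-redundant column 2
a_mean = np.mean(a[1:3], axis=0)     # mean of rows 1 and 2
a[1] = a_mean                        # store the mean in row 1
a = np.delete(a, 2, 0)               # drop the now-redundant row 2
print(a)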

Upvotes: 2

hpaulj

Reputation: 231530

In [68]: x = [[0.5,  0.3,  0.1,  0.1], 
    ...:      [0.4,  0.1,  0.3,  0.2], 
    ...:      [0.4,  0.3,  0.2,  0.1], 
    ...:      [0.6,  0.1,  0.1,  0.2]]                                                           
In [69]: x=np.array(x)

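Ufuncs like np.add have a reduceat method that lets us perform the action over groups of rows or columns. Each index in the list marks where a group starts, so [0,1,3] along axis 1 gives the groups column 0, columns 1-2, and column 3. With that, the first reduction is easy (but takes a little playing to understand the parameters):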

In [70]: np.add.reduceat(x,[0,1,3], axis=1)                                                      
Out[70]: 
array([[0.5, 0.4, 0.1],
       [0.4, 0.4, 0.2],
       [0.4, 0.5, 0.1],
       [0.6, 0.2, 0.2]])

Apparently mean is not a ufunc, so I had to settle for add to reduce the rows:

In [71]: np.add.reduceat(Out[70],[0,1,3],axis=0)                                                 
Out[71]: 
array([[0.5, 0.4, 0.1],
       [0.8, 0.9, 0.3],
       [0.6, 0.2, 0.2]])

and then divide by the row count to get the mean. I could generalize that to use the same [0,1,3] used in the reduceat, but for now just use a column array:

In [72]: np.add.reduceat(Out[70],[0,1,3],axis=0)/np.array([1,2,1])[:,None]                       
Out[72]: 
array([[0.5 , 0.4 , 0.1 ],
       [0.4 , 0.45, 0.15],
       [0.6 , 0.2 , 0.2 ]])
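
The generalization mentioned above could be sketched roughly as follows: the size of each group is the gap between consecutive start indices, with the axis length closing the last group. The helper name `group_mean_reduceat` is hypothetical (it is not part of numpy), and the sketch assumes the start indices are strictly increasing and begin at 0.

```python
import numpy as np

def group_mean_reduceat(arr, starts, axis=0):
    """Mean over contiguous groups along `axis`; `starts` holds each group's first index.

    Hypothetical helper: assumes `starts` is strictly increasing and begins at 0.
    """
    starts = np.asarray(starts)
    # Group sizes: distance between consecutive starts; the last group runs to the end.
    counts = np.diff(np.append(starts, arr.shape[axis]))
    sums = np.add.reduceat(arr, starts, axis=axis)
    # Reshape counts so that it broadcasts along the reduced axis only.
    shape = [1] * arr.ndim
    shape[axis] = -1
    return sums / counts.reshape(shape)

x = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.4, 0.1, 0.3, 0.2],
              [0.4, 0.3, 0.2, 0.1],
              [0.6, 0.1, 0.1, 0.2]])

y = np.add.reduceat(x, [0, 1, 3], axis=1)      # sum columns 1 and 2
z = group_mean_reduceat(y, [0, 1, 3], axis=0)  # average rows 1 and 2
print(z)
```

For the index list [0,1,3] on an axis of length 4, the computed counts are [1, 2, 1], which matches the hand-written column array used above.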

and the whole thing in one expression:

In [73]: np.add.reduceat(np.add.reduceat(x,[0,1,3], axis=1),[0,1,3],axis=0)/ np.array([1,2,1])[:,None]                                                                                    
Out[73]: 
array([[0.5 , 0.4 , 0.1 ],
       [0.4 , 0.45, 0.15],
       [0.6 , 0.2 , 0.2 ]])

Upvotes: 1

jodobear

Reputation: 51

Since you are changing the size of the original matrix, it would be better to do it in two steps, as mentioned in the previous answers. If you want to do it in one command, though, you could do it as follows, which also makes for a nice generalized solution:

import numpy as np

x = np.array(([0.5,  0.3,  0.1,  0.1, 1],
                [0.4,  0.1,  0.3,  0.2, 1],
                [0.4,  0.3,  0.2,  0.1, 1],
                [0.6,  0.1,  0.1,  0.2, 1]))

def sum_columns(matrix, col_start, col_end):
    return np.column_stack((matrix[:, 0:col_start],
                            np.sum(matrix[:, col_start:col_end + 1], axis=1),
                            matrix[:, col_end + 1:]))

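def avgRows_summedColumns(matrix, row_start, row_end):
    # np.vstack is used because its alias np.row_stack is deprecated in NumPy 2.0
    return np.vstack((matrix[0:row_start, :],
                      np.mean(matrix[row_start:row_end + 1, :], axis=0),
                      matrix[row_end + 1:, :]))  # all rows after the averaged block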

# call the entire operation in one command
print(avgRows_summedColumns(sum_columns(x, 1, 2), 1, 2))

This way it doesn't matter how big your matrix is.
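
As a rough sanity check of that claim, the sketch below applies the same slice-and-stack idea to a 6x7 array of random values and compares the result with `np.add.reduceat`. The names `merge_columns_sum` and `merge_rows_mean` are hypothetical stand-ins written for this check, the trailing slice is taken as `end + 1:`, and `np.vstack` is assumed in place of `np.row_stack`.

```python
import numpy as np

def merge_columns_sum(matrix, start, end):
    """Replace columns start..end (inclusive) with their sum (hypothetical helper)."""
    return np.column_stack((matrix[:, :start],
                            matrix[:, start:end + 1].sum(axis=1),
                            matrix[:, end + 1:]))

def merge_rows_mean(matrix, start, end):
    """Replace rows start..end (inclusive) with their mean (hypothetical helper)."""
    return np.vstack((matrix[:start],
                      matrix[start:end + 1].mean(axis=0),
                      matrix[end + 1:]))

rng = np.random.default_rng(0)
big = rng.random((6, 7))

# Sum columns 2..4, then average rows 1..3 of the result.
result = merge_rows_mean(merge_columns_sum(big, 2, 4), 1, 3)

# Same grouping via reduceat: columns {0},{1},{2,3,4},{5},{6}; rows {0},{1,2,3},{4},{5}.
col_sums = np.add.reduceat(big, [0, 1, 2, 5, 6], axis=1)
row_sums = np.add.reduceat(col_sums, [0, 1, 4, 5], axis=0)
reference = row_sums / np.array([1, 3, 1, 1])[:, None]

print(result.shape)
print(np.allclose(result, reference))
```

Both routes should agree up to floating-point rounding; which one is faster inside a nested loop is worth timing on your own data.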

Upvotes: 1
