Karnivaurus

Reputation: 24121

Calculating means of a NumPy array by grouping rows

I have an m-by-n NumPy array A, where each row represents one observation of some data. Each row is also assigned to one of c classes, and the class for each row is stored in an m-by-1 NumPy array B. I now want to compute the mean observation M for each class. How can I do this?

For example:

A = numpy.array([[1, 2, 3], [1, 2, 3], [3, 4, 5], [4, 5, 6]])
B = numpy.array([1, 0, 0, 1]) # the first row is class 1, the second row is class 0 ...
M = ...  # Do something

This should give me the output:

>>> M
numpy.array([[2, 3, 4], [2.5, 3.5, 4.5]])

Here, row i in M is the mean for class i.

Upvotes: 1

Views: 337

Answers (3)

Eelco Hoogendoorn

Reputation: 10759

This is a typical grouping problem, which can be solved in a single line using the numpy_indexed package (disclaimer: I am its author):

import numpy_indexed as npi
npi.group_by(B).mean(A)

Upvotes: 0

Daniel

Reputation: 19547

Another way to do this is with NumPy's ufunc.at method (here numpy.add.at), which accumulates correctly even when an index is repeated:

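import numpy

A = numpy.array([[1, 2, 3], [1, 2, 3], [3, 4, 5], [4, 5, 6]])
B = numpy.array([1, 0, 0, 1])

# uinds maps each row to the position of its class in u (0 .. c-1)
u, uinds = numpy.unique(B, return_inverse=True)
M = numpy.zeros((u.shape[0], A.shape[-1]))
# Index with uinds rather than B so that labels need not be 0 .. c-1
numpy.add.at(M, uinds, A)
# Divide each class sum by the number of rows in that class
M /= numpy.bincount(uinds)[:, None]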

M
array([[ 2. ,  3. ,  4. ],
       [ 2.5,  3.5,  4.5]])
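
The same steps can be wrapped in a small helper. This is only a sketch: the name group_means is made up for illustration, and it assumes B holds one label per row of A. It also returns the sorted unique labels, so row i of the result can be matched to its class even when the labels are not 0 .. c-1.

```python
import numpy as np

def group_means(A, B):
    """Return (labels, means); means[i] is the mean of the rows of A labelled labels[i]."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B).ravel()  # accept a flat or an m-by-1 label array
    labels, inverse = np.unique(B, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.zeros((labels.size, A.shape[1]))
    np.add.at(sums, inverse, A)  # unbuffered add: repeated indices accumulate
    counts = np.bincount(inverse, minlength=labels.size)
    return labels, sums / counts[:, None]

A = np.array([[1, 2, 3], [1, 2, 3], [3, 4, 5], [4, 5, 6]])
B = np.array([10, 5, 5, 10])  # arbitrary integer labels instead of 0 and 1
labels, M = group_means(A, B)
```

Here labels holds the classes in sorted order (5, then 10), and M has one row of means per entry in labels.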

As mentioned, pandas would make this easier:

>>> import pandas as pd
>>> pd.DataFrame(A).groupby(B).mean()
     0    1    2
0  2.0  3.0  4.0
1  2.5  3.5  4.5

Upvotes: 2

eickenberg

Reputation: 14377

As mentioned in a comment, depending on where you want to go with this, pandas may be more useful. But it is still possible with plain NumPy:

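import numpy
A = numpy.array([[1, 2, 3], [1, 2, 3], [3, 4, 5], [4, 5, 6]])
B = numpy.array([1, 0, 0, 1])

# m-by-c boolean matrix: entry (i, j) is True if row i belongs to class j
class_indicators = B[:, numpy.newaxis] == numpy.unique(B)
# The pseudo-inverse is a c-by-m matrix that averages the rows of each class
mean_operator = numpy.linalg.pinv(class_indicators.astype(float))

means = mean_operator.dot(A)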

This works for any number of classes, but as you can see, it is somewhat cumbersome.
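
To see why the pseudo-inverse produces means: every row of the indicator matrix C has exactly one nonzero entry, so its columns are orthogonal and C^T C is diagonal with the class counts on the diagonal. The pseudo-inverse (C^T C)^-1 C^T is therefore just C^T with each row divided by its class count. A minimal sketch checking this on the example data (the variable names are chosen here for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3], [3, 4, 5], [4, 5, 6]])
B = np.array([1, 0, 0, 1])

# m-by-c indicator matrix with exactly one 1.0 per row
indicators = (B[:, np.newaxis] == np.unique(B)).astype(float)
counts = indicators.sum(axis=0)  # number of rows in each class

# Explicit averaging operator: transpose, then divide each row by its class count
explicit_operator = indicators.T / counts[:, np.newaxis]
pinv_operator = np.linalg.pinv(indicators)

means = explicit_operator.dot(A)
same_operator = np.allclose(explicit_operator, pinv_operator)
```

The explicit form avoids the SVD that pinv computes internally, though it still builds a dense m-by-c matrix.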

Upvotes: 3
