Brad Davis

Reputation: 1170

Calculating variance within cells across matrices in Python

I have several matrices with identical dimensions, X by Y. I want to calculate the variance for each cell across the matrices, so that the resulting output matrix also has dimensions X by Y. For example:

matrix1 = [[1,1,1], [2,2,2], [3,3,3]]
matrix2 = [[2,2,2], [3,3,3], [4,4,4]]
matrix3 = [[3,3,3], [4,4,4], [5,5,5]]

Using position (0,0) in each matrix as an example, I first need to calculate the mean, which would be (1+2+3)/3 = 2:

matrix_sum = matrix1 + matrix2 + matrix3

matrix_mean = matrix_sum / 3

Next I'd calculate the population variance, which would be:

[(1-2)^2 + (2-2)^2 + (3-2)^2] / 3 = 2/3

I'd like to be able to do this for an indeterminate but small number of matrices (say 50). The matrices will always be square, at most 250 x 250. This is my current approach:

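# Sum of squared deviations from the mean, cell by cell (indices start at 0)
for x in range(matrix_mean.shape[0]):
    for y in range(matrix_mean.shape[1]):
        standard_deviation_matrix.iat[x, y] = (
            pow(matrix_mean.iat[x, y] - matrix1.iat[x, y], 2)
            + pow(matrix_mean.iat[x, y] - matrix2.iat[x, y], 2)
            + pow(matrix_mean.iat[x, y] - matrix3.iat[x, y], 2)
        )

# Dividing by N - 1 gives the sample variance; divide by 3 for the population variance
standard_deviation_matrix = standard_deviation_matrix / (3-1)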

Here, matrix_mean is just (matrix1 + matrix2 + matrix3) / 3 (i.e. the mean within each cell across the matrices).

This seems to work, but it's super slow and super clunky; it's how I'd do it in C. Is there an easier/better/more Pythonic way to do this?

Thanks

Upvotes: 0

Views: 57

Answers (2)

DYZ

Reputation: 57033

Convert each matrix into a NumPy array, stack the arrays (this adds another dimension at the front), and calculate the variance along that dimension:

import numpy as np

m1 = np.array(matrix1)
...
m = np.stack([m1, m2, ...])
m.var(axis=0)
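
As a minimal sketch of the same idea for any number of matrices, using the example data from the question: the helper name `cellwise_variance` is made up for illustration, and it assumes every input has the same shape and can be converted by `np.asarray` (nested lists, NumPy arrays, or all-numeric pandas DataFrames).

```python
import numpy as np

matrix1 = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
matrix2 = [[2, 2, 2], [3, 3, 3], [4, 4, 4]]
matrix3 = [[3, 3, 3], [4, 4, 4], [5, 5, 5]]


def cellwise_variance(matrices):
    """Population variance of each cell across equally shaped matrices.

    Hypothetical helper, not part of NumPy.
    """
    # Convert every input to a float array, then stack along a new leading axis
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    # stacked has shape (n_matrices, rows, cols); axis 0 runs across the matrices
    return stacked.var(axis=0)


result = cellwise_variance([matrix1, matrix2, matrix3])
print(result.shape)  # (3, 3)
```

Because the reduction runs over the whole stack at once, there is no Python-level loop over cells, which should be fine for something like 50 matrices of 250 x 250.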

Upvotes: 2

Quang Hoang

Reputation: 150735

You can try:

import numpy as np

all_mat = np.stack([matrix1, matrix2, matrix3])
mat_mean = all_mat.mean(axis=0)
variance = np.var(all_mat, axis=0)

Which gives you:

array([[0.66666667, 0.66666667, 0.66666667],
       [0.66666667, 0.66666667, 0.66666667],
       [0.66666667, 0.66666667, 0.66666667]])

Or for the std:

np.std(all_mat, axis=0)

And you get:

array([[0.81649658, 0.81649658, 0.81649658],
       [0.81649658, 0.81649658, 0.81649658],
       [0.81649658, 0.81649658, 0.81649658]])
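
One thing that may be worth checking: the loop in the question divides by (3-1), which is the sample variance, while `np.var` and `np.std` divide by N by default. A small sketch of the difference, assuming the same three example matrices; `ddof` is the standard NumPy parameter for this, and the variable names are only illustrative.

```python
import numpy as np

all_mat = np.stack([
    [[1, 1, 1], [2, 2, 2], [3, 3, 3]],
    [[2, 2, 2], [3, 3, 3], [4, 4, 4]],
    [[3, 3, 3], [4, 4, 4], [5, 5, 5]],
])

# ddof=0 (the default) divides by N: population variance
pop_var = np.var(all_mat, axis=0)

# ddof=1 divides by N - 1: sample variance, matching the division by (3-1)
sample_var = np.var(all_mat, axis=0, ddof=1)
sample_std = np.std(all_mat, axis=0, ddof=1)
```

Which one you want depends on whether the matrices are the whole population or a sample drawn from it.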

Upvotes: 2
