Oliver Crow

Reputation: 334

Calculating Mean & STD for Batch [Python/Numpy]

I am looking for an efficient way to calculate the mean and standard deviation (std) per channel over a batch.


Details:

Each batch has shape [128, 32, 32, 3].

There are many batches (a naive method takes ~4 min over all of them).

I would like to output two arrays: (meanR, meanG, meanB) and (stdR, stdG, stdB).


(Also, if there is an efficient way to perform arithmetic operations on the batches after calculating this, that would be helpful. For example, subtracting the mean of the whole dataset from each image.)

Upvotes: 3

Views: 4548

Answers (3)

MaxU - stand with Ukraine

Reputation: 210982

If I understood you correctly, you want to calculate the mean and std values per channel over all images:

Demo: 2 images of (2,2,3) shape each (for the sake of simplicity):

In [189]: a
Out[189]:
array([[[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]],


       [[[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]]])

In [190]: a.shape
Out[190]: (2, 2, 2, 3)

In [191]: np.mean(a, axis=(0,1,2))
Out[191]: array([ 11.5,  12.5,  13.5])

In [192]: np.einsum('ijkl->l', a)/float(np.prod(a.shape[:3]))
Out[192]: array([ 11.5,  12.5,  13.5])
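
The std per channel can be obtained the same way. A minimal sketch, assuming the same demo array rebuilt with np.arange (the names means and stds are only illustrative). Note that np.std uses ddof=0, the population standard deviation, by default:

```python
import numpy as np

# Rebuild the demo array: 2 images of shape (2, 2, 3) holding 1..24.
a = np.arange(1, 25).reshape(2, 2, 2, 3)

# Reduce over batch, height and width; one value per channel remains.
means = np.mean(a, axis=(0, 1, 2))
stds = np.std(a, axis=(0, 1, 2))  # ddof=0 (population std) by default

print(means)  # [11.5 12.5 13.5]
print(stds)   # the same value for every channel in this demo
```

Pass ddof=1 to np.std if the sample standard deviation is wanted instead.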

Speed measurements:

In [202]: a = np.random.randint(255, size=(128,32,32,3))

In [203]: %timeit np.mean(a, axis=(0,1,2))
9.48 ms ± 822 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [204]: %timeit np.einsum('ijkl->l', a)/float(np.prod(a.shape[:3]))
1.82 ms ± 22.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
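
For the whole-dataset part of the question, one possible approach is to accumulate per-channel sums and sums of squares across batches and derive the mean and std at the end. This is only a sketch: channel_stats is a hypothetical helper, the random batches stand in for real data, and the std comes from E[x^2] - E[x]^2 in float64, which should be fine for 8-bit pixel values but can lose precision for data with a very large mean relative to its spread.

```python
import numpy as np

def channel_stats(batches):
    """Per-channel mean and std over an iterable of (N, H, W, C) batches."""
    total = None
    total_sq = None
    count = 0
    for batch in batches:
        # Work in float64 so integer pixel data cannot overflow.
        x = np.asarray(batch, dtype=np.float64)
        flat = x.reshape(-1, x.shape[-1])  # (N*H*W, C)
        if total is None:
            total = np.zeros(flat.shape[1])
            total_sq = np.zeros(flat.shape[1])
        total += flat.sum(axis=0)
        total_sq += (flat * flat).sum(axis=0)
        count += flat.shape[0]
    mean = total / count
    # Clip tiny negative values caused by rounding before the sqrt.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Stand-in data: 4 random batches shaped like the ones in the question.
rng = np.random.default_rng(0)
batches = [rng.integers(0, 256, size=(128, 32, 32, 3)) for _ in range(4)]

mean, std = channel_stats(batches)

# Broadcasting aligns the (3,) statistics with the last (channel) axis.
normalized = (batches[0] - mean) / std
```

Because the statistics have shape (3,), subtracting them from a (128, 32, 32, 3) batch needs no loop or reshape.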

Upvotes: 5

jebarry will

Reputation: 86

You can calculate the per-channel mean and std of R, G, B by reducing over every axis except the last one:

import numpy as np

a = np.random.rand(128, 32, 32, 3)
# Reduce over batch, height and width; the channel axis (3) is kept.
means = np.mean(a, axis=(0, 1, 2))  # (meanR, meanG, meanB)
stds = np.std(a, axis=(0, 1, 2))    # (stdR, stdG, stdB)

Here axis=(0, 1, 2) lists the axes that are averaged away (batch, rows, columns), so the remaining axis 3 holds one value per color (R, G, B). See also the related question "Get mean of 2D slice of a 3D array in numpy". I hope this helps.

Upvotes: 2

ZYYYY

Reputation: 101

I assume you want the mean over multiple axes (if I understood you correctly). numpy.mean(a, axis=None) already supports this: when axis is a tuple of ints, the mean is taken over all of the listed axes.

I'm not sure what you mean by the naive method.
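
To illustrate the tuple form of axis, here is a small sketch using a random array with the batch shape from the question (the variable names are only illustrative). Passing keepdims=True keeps the reduced axes with length 1, which can make the later arithmetic on the batch more explicit:

```python
import numpy as np

a = np.random.rand(128, 32, 32, 3)

# A tuple of axes reduces over all of them in one call.
mean = a.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
std = a.std(axis=(0, 1, 2), keepdims=True)    # shape (1, 1, 1, 3)

# The kept length-1 axes broadcast against the full batch.
centered = a - mean
scaled = centered / std

print(mean.shape, scaled.shape)  # (1, 1, 1, 3) (128, 32, 32, 3)
```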

Upvotes: 2
