Numpy computes different standard deviation when axis is specified

Question

While tracking down a related problem, I noticed that np.std seems to return different values depending on whether the axis keyword argument is specified or the corresponding column is sliced out manually. Consider the following snippet:

import numpy as np

np.random.seed(123)

a = np.empty(shape=(100, 2), dtype=float)
a[:, 0] = np.random.uniform()
a[:, 1] = np.random.uniform()

print(np.std(a, axis=0)[0] == np.std(a[:, 0]))  # Should be the same.
print(np.std(a, axis=0)[1] == np.std(a[:, 1]))  # Should be the same.

However, the two computations don't return the same result. Further inspection reveals:

>>> print('column 0: {:e} vs {:e}'.format(np.std(a, axis=0)[0], np.std(a[:, 0])))
column 0: 7.771561e-16 vs 2.220446e-16
>>> print('column 1: {:e} vs {:e}'.format(np.std(a, axis=0)[1], np.std(a[:, 1])))
column 1: 4.440892e-16 vs 0.000000e+00
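
For scale, the discrepancies above are all within a few multiples of double-precision machine epsilon (about 2.22e-16). Below is a minimal sketch, reusing the array from the snippet, that compares the two results with a tolerance instead of ==. The names eps, by_axis and by_column are introduced here purely for illustration.

```python
import numpy as np

np.random.seed(123)

# Same setup as in the snippet: each column holds one constant value.
a = np.empty(shape=(100, 2), dtype=float)
a[:, 0] = np.random.uniform()
a[:, 1] = np.random.uniform()

# Machine epsilon for float64, the natural yardstick for rounding noise.
eps = np.finfo(float).eps

# Reduce over the rows for both columns at once.
by_axis = np.std(a, axis=0)

# Slice each column out first, then reduce the resulting 1-D array.
by_column = np.array([np.std(a[:, 0]), np.std(a[:, 1])])

print('eps = {:e}'.format(eps))

# Exact equality may fail, but a tolerance-based comparison passes.
print(np.allclose(by_axis, by_column))
```

If the goal is only to check that both approaches agree, np.allclose (or np.isclose for scalars) is probably the more appropriate test than ==.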

I don't see why the two computations would return different results, since formally they describe the same procedure: slicing the column out manually or letting numpy do the job via the axis argument shouldn't make a difference.
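
One caveat is that "the same procedure" only holds in exact arithmetic. Floating-point addition is not associative, so two mathematically identical sums can round differently if the terms are grouped in a different order. The sketch below shows this in plain Python. Whether np.std actually accumulates in a different order in the two cases above is an assumption here, not something confirmed in this post.

```python
# The same three numbers, added with two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

# The two groupings do not round to the same float64 value.
print(left_to_right == regrouped)

# The gap is on the order of machine epsilon, like the differences above.
print('{:e}'.format(abs(left_to_right - regrouped)))
```

A mean computed from a differently grouped sum can be off by a comparable amount, which would be enough to turn an exact zero standard deviation into a value around 1e-16.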

I am using Python 3.5.2 and numpy 1.15.0.
