Reputation: 121
I've found something interesting in Python's NumPy: np.ma.average is a lot slower than arr.mean (where arr is a plain array):
>>> arr = np.full((3, 3), -9999, dtype=float)
>>> arr
array([[-9999., -9999., -9999.],
       [-9999., -9999., -9999.],
       [-9999., -9999., -9999.]])
%timeit np.ma.average(arr, axis=0)
The slowest run took 49.32 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 191 µs per loop
%timeit arr.mean(axis=0)
The slowest run took 6.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.41 µs per loop
With random numbers:
arr = np.random.random((3,3))
%timeit arr.mean(axis=0)
The slowest run took 6.17 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.78 µs per loop
%timeit np.ma.average(arr, axis=0)
1000 loops, best of 3: 186 µs per loop
--> That's nearly 24 times slower.
numpy.ma.average(a, axis=None, weights=None, returned=False)
    Return the weighted average of array over the given axis.

numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
    Compute the arithmetic mean along the specified axis.
Why is ma.average so much slower than arr.mean? Mathematically they are the same (correct me if I'm wrong). My guess is that it has something to do with the weights option on ma.average, but shouldn't there be a fallback if no weights are passed?
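For what it's worth, the results themselves agree, it's only the runtime that differs. A quick sanity check on the unmasked array from above:
>>> np.ma.allclose(np.ma.average(arr, axis=0), arr.mean(axis=0))
True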
Upvotes: 2
Views: 1739
Reputation: 121
Thanks to @WillemVanOnsem and @sascha in the comments above.
Edit: this applies to small arrays; see the accepted answer for more information.
Masked operations are slow; try to avoid them:
mask = self.local_pos_history[:, 0] > -9
local_pos_hist_masked = self.local_pos_history[mask]
avg = local_pos_hist_masked.mean(axis=0)
The old version with masked arrays:
mask = np.ma.masked_where(self.local_pos_history > -9, self.local_pos_history)
local_pos_hist_mask = self.local_pos_history[mask].reshape(len(self.local_pos_history) // 3, 3)
avg_pos = self.local_pos_history
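Here is a minimal, self-contained sketch of the same comparison with toy data (the array, the -9999 fill value, and the variable names are just placeholders for my use case):

import numpy as np

# Toy stand-in for self.local_pos_history: xyz positions, -9999 means "no data".
local_pos_history = np.array([[1.0, 2.0, 3.0],
                              [-9999.0, -9999.0, -9999.0],
                              [4.0, 5.0, 6.0]])

# Fast path: plain boolean indexing, no masked array involved.
valid = local_pos_history[:, 0] > -9
avg = local_pos_history[valid].mean(axis=0)        # [2.5, 3.5, 4.5]

# Slow path: the same result via numpy.ma, kept only for comparison.
masked = np.ma.masked_less(local_pos_history, -9)
avg_ma = np.ma.average(masked, axis=0)             # masked_array([2.5, 3.5, 4.5])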
np.average is nearly as fast as arr.mean:
%timeit np.average(arr, axis=0)
The slowest run took 5.81 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.89 µs per loop
%timeit np.mean(arr, axis=0)
The slowest run took 6.44 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.74 µs per loop
Just for clarification: this is still a test on a small array.
Upvotes: 0
Reputation: 152765
A good way to find out why something is slower is to profile it. I'll use the third-party library line_profiler and the IPython command %lprun (see for example this blog) here:
%load_ext line_profiler
import numpy as np
arr = np.full((3, 3), -9999, dtype=float)
%lprun -f np.ma.average np.ma.average(arr, axis=0)
Line # Hits Time Per Hit % Time Line Contents
==============================================================
519 def average(a, axis=None, weights=None, returned=False):
...
570 1 1810 1810.0 30.5 a = asarray(a)
571 1 15 15.0 0.3 m = getmask(a)
572
573 # inspired by 'average' in numpy/lib/function_base.py
574
575 1 5 5.0 0.1 if weights is None:
576 1 3500 3500.0 59.0 avg = a.mean(axis)
577 1 591 591.0 10.0 scl = avg.dtype.type(a.count(axis))
578 else:
...
608
609 1 7 7.0 0.1 if returned:
610 if scl.shape != avg.shape:
611 scl = np.broadcast_to(scl, avg.shape).copy()
612 return avg, scl
613 else:
614 1 5 5.0 0.1 return avg
I removed some irrelevant lines.
So actually 30% of the time is spent in np.ma.asarray (something that arr.mean doesn't have to do!).
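To see that fixed cost in isolation, one could time just the conversion by itself (a rough sketch; the numbers will vary by machine):
%timeit np.ma.asarray(arr)   # the up-front conversion that np.ma.average pays for
%timeit arr.mean(axis=0)     # the plain mean, for comparison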
However, the relative times change drastically if you use a bigger array:
arr = np.full((1000, 1000), -9999, dtype=float)
%lprun -f np.ma.average np.ma.average(arr, axis=0)
Line # Hits Time Per Hit % Time Line Contents
==============================================================
519 def average(a, axis=None, weights=None, returned=False):
...
570 1 609 609.0 7.6 a = asarray(a)
571 1 14 14.0 0.2 m = getmask(a)
572
573 # inspired by 'average' in numpy/lib/function_base.py
574
575 1 7 7.0 0.1 if weights is None:
576 1 6924 6924.0 86.9 avg = a.mean(axis)
577 1 404 404.0 5.1 scl = avg.dtype.type(a.count(axis))
578 else:
...
609 1 6 6.0 0.1 if returned:
610 if scl.shape != avg.shape:
611 scl = np.broadcast_to(scl, avg.shape).copy()
612 return avg, scl
613 else:
614 1 6 6.0 0.1 return avg
This time the np.ma.MaskedArray.mean call takes up almost 90% of the time.
Note: you could also dig deeper and look into np.ma.asarray, np.ma.MaskedArray.count, or np.ma.MaskedArray.mean and check their line profiles. But I just wanted to show that there are lots of called functions that add to the overhead.
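For example, one could continue in the same way (commands only, output omitted; this assumes line_profiler is still loaded as above):
%lprun -f np.ma.asarray np.ma.average(arr, axis=0)
%lprun -f np.ma.MaskedArray.mean np.ma.average(arr, axis=0)
%lprun -f np.ma.MaskedArray.count np.ma.average(arr, axis=0)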
So the next question is: did the relative times between np.ndarray.mean and np.ma.average also change? At least on my computer the difference is much smaller now:
%timeit np.ma.average(arr, axis=0)
# 2.96 ms ± 91 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit arr.mean(axis=0)
# 1.84 ms ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This time it's not even 2 times slower. I assume for even bigger arrays the difference will get even smaller.
This is also something that is actually quite common with NumPy: the constant factors are quite high even for plain NumPy functions (see for example my answer to the question "Performance in different vectorization method in numpy"). For np.ma these constant factors are even bigger, especially if you don't use an np.ma.MaskedArray as input. But even though the constant factors might be high, these functions still excel with big arrays.
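If you want to watch the constant factors wash out, a rough sketch that sweeps the array size and compares the two could look like this (timings will of course differ per machine):

import timeit
import numpy as np

for n in (3, 100, 1000, 3000):
    arr = np.full((n, n), -9999, dtype=float)
    t_ma = timeit.timeit(lambda: np.ma.average(arr, axis=0), number=20)
    t_mean = timeit.timeit(lambda: arr.mean(axis=0), number=20)
    print(n, round(t_ma / t_mean, 1))   # the ratio should shrink as n grows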
Upvotes: 3