Reputation: 3048
Say I have a 500000x1 array called A. I want to divide this array into 1000 equal sections and then calculate the mean of each section. So I will end up with a 1000x1 array called B, in which B[1] is the mean of A[1:500], B[2] is the mean of A[501:1000], and so on. Since I will be doing this many, many times, I want to do it efficiently. What's the most effective way of doing this in Matlab/Python?
Upvotes: 1
Views: 165
Reputation: 221624
NumPy/Python
We could reshape to have 500 columns and then compute the average along the second axis -
A.reshape(-1,500).mean(axis=1)
Sample run -
In [89]: A = np.arange(50)+1;
In [90]: A.reshape(-1,5).mean(1)
Out[90]: array([ 3., 8., 13., 18., 23., 28., 33., 38., 43., 48.])
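The reshape approach above could be wrapped in a small helper. This is only a sketch: the name block_means and the length check are assumptions added for illustration, not part of the original answer. It assumes the array length is an exact multiple of the block size, as in the question.

```python
import numpy as np

def block_means(a, block_size):
    """Mean of each consecutive block of `block_size` elements (hypothetical helper)."""
    a = np.asarray(a).ravel()  # flatten so a 500000x1 column works as well
    if a.size % block_size != 0:
        raise ValueError("array length must be a multiple of block_size")
    # each row of the reshaped array holds one block; average across each row
    return a.reshape(-1, block_size).mean(axis=1)

A = np.arange(50) + 1
B = block_means(A, 5)
print(B)
```

For a contiguous array the reshape does not copy the data, so the cost is dominated by the reduction itself.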
Runtime test:
An alternative method to get those average values would be with the old-fashioned way of computing the sum and then dividing by the number of elements involved in the summation. Let's time these two methods -
In [107]: A = np.arange(500000)+1;
In [108]: %timeit A.reshape(-1,500).mean(1)
1000 loops, best of 3: 1.19 ms per loop
In [109]: %timeit A.reshape(-1,500).sum(1)/500.0
1000 loops, best of 3: 583 µs per loop
That looks like quite an improvement with the alternative method! But wait: for integer input, the mean method computes in float by default, and that conversion overhead showed up here, whereas sum stays in integer arithmetic until the final division.
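A quick way to see the type difference is to inspect the result dtypes. This is a minimal sketch; the variable names are chosen here for illustration, and the exact integer width depends on the platform.

```python
import numpy as np

A_int = np.arange(500000) + 1            # integer input, as in the first timing
blocks = A_int.reshape(-1, 500)

row_sums = blocks.sum(axis=1)            # stays an integer type
row_means = blocks.mean(axis=1)          # computed and returned as float64

print(A_int.dtype, row_sums.dtype, row_means.dtype)
```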
So, if we use float type input arrays, we would have a different and a fair scenario -
In [144]: A = np.arange(500000).astype(float)+1;
In [145]: %timeit A.reshape(-1,500).mean(1)
1000 loops, best of 3: 534 µs per loop
In [146]: %timeit A.reshape(-1,500).sum(1)/500.0
1000 loops, best of 3: 516 µs per loop
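Outside IPython, the same comparison could be reproduced with the standard timeit module. The sketch below also checks that both routes give the same values; the repeat count of 200 is an arbitrary choice, and no timings are quoted because they vary with machine and NumPy version.

```python
import timeit
import numpy as np

A = np.arange(500000).astype(float) + 1

via_mean = A.reshape(-1, 500).mean(axis=1)
via_sum = A.reshape(-1, 500).sum(axis=1) / 500.0

# both routes should agree up to floating-point rounding
assert np.allclose(via_mean, via_sum)

n = 200  # number of repetitions per measurement (arbitrary)
t_mean = timeit.timeit(lambda: A.reshape(-1, 500).mean(axis=1), number=n)
t_sum = timeit.timeit(lambda: A.reshape(-1, 500).sum(axis=1) / 500.0, number=n)

print("mean:    %.1f us per call" % (t_mean / n * 1e6))
print("sum / n: %.1f us per call" % (t_sum / n * 1e6))
```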
MATLAB
With column-major ordering, we would reshape to have 500 rows and then average along the first dimension -
mean(reshape(A,500,[]),1)
Sample run -
>> A = 1:50;
>> mean(reshape(A,5,[]),1)
ans =
3 8 13 18 23 28 33 38 43 48
Runtime test:
Let's try out the old-fashioned way here too -
>> A = 1:500000;
>> func1 = @() mean(reshape(A,500,[]),1);
>> timeit(func1)
ans =
0.0013021
>> func2 = @() sum(reshape(A,500,[]),1)/500.0;
>> timeit(func2)
ans =
0.0012291
Upvotes: 3