Cleb

Reputation: 25997

Why does nansum work for input that exceeds matrix dimensions?

I am wondering about MATLAB's nansum function.

When I use the example from the documentation

X = magic(3);
X([1 6:9]) = repmat(NaN, 1, 5);

X =

   NaN     1   NaN
     3     5   NaN
     4   NaN   NaN

and then call

>> nansum(X, 1)

ans =

     7     6     0

>> nansum(X, 2)

ans =

     1
     8
     4

it works as expected.

However, what I did not expect is that it also works for

>> nansum(X, 400)

ans =

     0     1     0
     3     5     0
     4     0     0

What is the reasoning here? Why doesn't this fail with an error saying that dim exceeds the matrix dimensions?

Upvotes: 3

Views: 199

Answers (3)

Wolfie

Reputation: 30045

In MATLAB, all arrays/matrices have infinitely many trailing singleton dimensions.

A singleton dimension is a dimension, dim, where size(A,dim) = 1. It's called a trailing singleton dimension when it comes after all non-singleton dimensions (i.e. it doesn't change the structure of the matrix).
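As a quick check of that definition, here is a minimal sketch (the variable names are only illustrative) that asks MATLAB for the size of dimensions beyond the two that are displayed:

```matlab
A = ones(3, 4);

nd   = ndims(A);       % number of non-trailing dimensions: 2
s3   = size(A, 3);     % size of the 3rd dimension: 1
s400 = size(A, 400);   % size of the 400th dimension: also 1

fprintf('ndims = %d, size(A,3) = %d, size(A,400) = %d\n', nd, s3, s400);
```

size does not error for a dimension past ndims(A); it simply reports 1, which is what makes such a dimension a trailing singleton.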

Any function (including nansum) which can operate on a specific dimension can do so on any one of the infinitely many singleton dimensions. Often you won't see any effect (for instance, using max or sum in this way simply returns the input [1]), but nansum replaces NaN with zero, so that's all that happens.

Note that nansum(A,dim) is the same as sum(A,dim,'omitnan'); you can see this by typing edit nansum. My examples therefore use sum for simplicity. See the bottom of this answer for references on the defined behaviour.
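A small sketch of the difference, reusing the matrix from the question. It assumes a MATLAB release in which sum accepts the 'omitnan' flag; the variable names are only illustrative:

```matlab
X = magic(3);
X([1 6:9]) = NaN;

Y = sum(X, 400);              % plain sum over a singleton dimension: X comes back, NaNs included
Z = sum(X, 400, 'omitnan');   % NaNs are skipped before each one-element sum

sameAsInput = isequaln(Y, X); % isequaln treats NaN as equal to NaN
disp(sameAsInput)
disp(Z)
```

If nansum is equivalent to the 'omitnan' form as described above, Z should match the nansum(X, 400) output shown in the question.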

Let's try to visualise this:

A = ones(3,4);
size( A ) % >> ans = [3, 4]
% Under the hood:
% size( A ) = [3, 4, 1, 1, 1, 1, ...]
sum( A, 1 )   % Sum through the rows, or the 1st dimension, which has 3 elements per sum
              % >> ans = [3 3 3 3]
sum( A, 2 )   % Sum through the columns, or the 2nd dimension, which has 4 elements per sum
              % >> ans = [4; 4; 4]
sum( A, 400 ) % Sum through the ???, the 400th dimension, which has 1 element per sum
              % >> ans = [1 1 1 1; 1 1 1 1; 1 1 1 1]

If you wanted, you could reshape the original matrix to have singleton 2nd through 399th dimensions to further this:

% Set up dimensions as [3, 1, 1, ..., 1, 1, 4], for a 400-D array!
dims = num2cell( [3 ones(1,398), 4] );
% Note we'll now still have trailing singleton dims, but have 398 in the structure too
B = reshape( A, dims{:} ); 

Now we can do a similar sum example. The final thing to know is that squeeze removes non-trailing singleton dimensions; we can use this to tidy up the outputs:

sum( B, 1 ); % >> ans(:,:,1,1,1,...,1) = 3 
             % >> ans(:,:,1,1,1,...,2) = 3
             % >> ans(:,:,1,1,1,...,3) = 3
             % >> ans(:,:,1,1,1,...,4) = 3
squeeze( sum( B, 1 ) ); % >> ans = [3; 3; 3; 3] 

% similarly  
squeeze( sum( B, 2 ) );   % >> ans = [1 1 1 1; 1 1 1 1; 1 1 1 1]
squeeze( sum( B, 400 ) ); % >> ans = [4; 4; 4]

Now that we've reshaped things, summing along the 400th dimension does what summing along the 2nd dimension did originally, and vice versa. This would be easier to visualise if you replaced 400 with 3!


[1] See the sum and max documentation as examples where the behaviour is explicitly defined "if dim is greater than ndims(A)." In both cases the implementation is made more efficient by just returning A. In the case of nansum there has to be some computation in case elements are NaN.

Upvotes: 10

Andrea Bellizzi

Reputation: 497

The matrix doesn't have an explicit 400th dimension (or rather, it has infinitely many implicit singleton dimensions). The implementation of sum simply returns the matrix when the dim input exceeds the number of matrix dimensions, as the documentation of sum says:

sum returns A when dim is greater than ndims(A) or when size(A,dim) is 1

https://it.mathworks.com/help/matlab/ref/sum.html#btv6ok6-3
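Both cases named in that quote can be checked with a short sketch (the variable names are only illustrative):

```matlab
A = magic(3);
v = [1 2 3];

r1 = sum(A, 5);   % dim is greater than ndims(A)
r2 = sum(v, 1);   % size(v, 1) is 1

unchangedA = isequal(r1, A);
unchangedV = isequal(r2, v);
disp([unchangedA, unchangedV])
```

In both calls each sum covers exactly one element, so the input is returned unchanged.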

Upvotes: 1

Nicky Mattsson

Reputation: 3052

The matrix does have a 400th dimension. Its size is 1. So when you sum along this dimension, it just returns the matrix you give as input with the NaNs omitted / counted as 0.

It is the same with the standard sum:

>> A = magic(3)

A =

     8     1     6
     3     5     7
     4     9     2

>> sum(A,400)

ans =

     8     1     6
     3     5     7
     4     9     2

EDIT: A possibly better example is

A = 5;

This variable has size(A) = [1,1]; in other words, the size of dimension 1 is 1, just like that of its 400th dimension, but it still makes sense to sum:

sum(A)

ans =

     5

Upvotes: 2
