user2860703

Reputation: 483

Counting complete rows across multiple arrays in MATLAB

I have two arrays and I need to count the number of rows that do not contain a NaN in any column of either array. I want the sample size that results when an array of inputs is used to train against a vector of targets (rows containing a NaN are not used). Here is an example of my current solution:

% A matrix
A = [
   -0.0057   14.8750  293.2000    2.3743         0       NaN   -0.1186       NaN   38.1000
    2.1543   10.2240  294.0200    1.7650         0       NaN    0.0962       NaN   30.4800
    2.6071    7.1014  266.4000    1.3941         0       NaN   -0.1110   23.6660   27.9400
    0.9736   10.5730  271.2000    1.8700         0       NaN   -0.2457   31.7290   27.9400
   -0.7138   13.6430  286.3100    2.0655         0       NaN   -0.5152   44.3640   27.9400
    4.4969    5.5410  280.1600    0.6042         0       NaN   -0.2783   47.9240   27.9400
    5.4186    2.5648  251.6900    0.2323         0       NaN   -0.0879   39.6710   25.4000
    4.3641    3.4062  266.7800    0.5696         0       NaN   -0.0638   26.9330   25.4000
   -0.3348    8.2900  258.8900    1.3736         0       NaN   -0.0414   59.2570   25.4000
    0.3007    8.3617  274.7400    1.3929         0       NaN   -0.3473   46.6710   25.4000
    3.0400    4.6077  267.3400    0.9704         0    0.5178   -0.2080   32.4850   25.4000
    2.1950    7.7303  253.8300    1.3545         0    0.4927   -0.0870   31.4520   25.4000
   -0.4413    4.2283  275.7400    0.4724         0    0.3687   -0.2470   40.3630   27.9400
   -0.8667    4.0397  261.0800    0.6118         0    0.4143   -0.4723   28.7360   27.9400
   -8.0407    2.2782  158.9600    0.4654         0    0.1775   -0.9863   56.7880   30.4800
  -15.4630    2.0072  230.4100    0.2572         0    0.0530   -2.2110   71.3660   35.5600
  -14.7670    6.6983  293.4800    0.9218         0    0.1224   -4.3823   42.2330   38.1000
   -8.5713    4.2573  249.6900    0.5928         0    0.2057   -4.6927   37.2790   38.1000
  -13.4820    1.4811  120.2200    0.2327         0    0.0542   -4.1213   76.5140   38.1000
  -15.6230    3.9040  300.8400    0.2369         0    0.0602   -3.4780   71.9860   NaN]

% And a vector of targets
B = [
       NaN
       NaN
    1.2009
    0.6404
    0.5739
    0.6846
    0.4121
    0.7475
    0.5931
    0.5706
    0.8581
    0.9910
       NaN
    0.5652
    0.4008
       NaN
    0.4585
    0.5463
    0.2903
    0.3150]

% Inputs
Alogic = isnan(A);             % logical matrix of nans for drivers used
AlogicNaNSum = sum(Alogic,2);          % sum by row
ANaNSumlogic = AlogicNaNSum >0;            % logical by row with 0 for complete, 1 for some row containing NaNs

% Target
Blogic = isnan(B);                    % logical version of target with 0 for complete, 1 for containing NaN
SumNaNrows = Blogic + ANaNSumlogic; % add logical vectors, with 0 meaning no NaNs in any column

% Final number of rows with no NaN in any column
complete = sum(SumNaNrows(:)==0) 

It seems like there should be a more elegant way to do this (fewer lines of code) that would still apply to vectors and/or matrices with the same number of rows. There are already many posts about finding and replacing NaN rows, but I haven't found much about counting the total number of complete rows across arrays.

Upvotes: 0

Views: 35

Answers (1)

Suever

Reputation: 65460

You can do this using some basic logical operations. As you've shown, isnan creates a logical matrix the same size as its input that is true wherever there is a NaN. Calling any on that matrix with 2 as the second argument works along each row, giving a column vector that is true for every row of A containing at least one NaN. The element-wise or (|) then combines this with isnan(B), so the result is true if a row in A has a NaN value or there is a NaN value in the corresponding location in B.

toremove = any(isnan(A), 2) | isnan(B);

Then, if you simply want the number of complete rows, negate that mask and sum it:

complete = sum(~(any(isnan(A), 2) | isnan(B)));
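If the same mask is also needed to drop the incomplete rows before training, it can be applied with logical indexing. This is a minimal sketch using small made-up data (not the matrices from the question); Aclean and Bclean are names chosen here for illustration:

```matlab
% Hypothetical data, not taken from the question
A = [1 NaN; 2 3; 4 5; NaN 6];
B = [7; 8; NaN; 9];

% true for rows with a NaN in A or in the matching element of B
toremove = any(isnan(A), 2) | isnan(B);

% Keep only the complete rows in both arrays
Aclean = A(~toremove, :);
Bclean = B(~toremove);

% The sample size is the number of rows that survive
complete = size(Aclean, 1);
```

Here only the second row is complete, so the count agrees with sum(~toremove).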

You could also flip the logic around a little and check for rows that have no NaN values. The result will be the same:

tokeep = all(~isnan(A), 2) & ~isnan(B);
complete = sum(tokeep);
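The two formulations are complements of each other (De Morgan's laws), which is easy to check on a small example. The data below is made up for illustration and is not from the question:

```matlab
% Hypothetical data, not taken from the question
A = [1 NaN; 2 3; 4 5; NaN 6];
B = [7; NaN; 8; 9];

toremove = any(isnan(A), 2) | isnan(B);    % rows with at least one NaN
tokeep   = all(~isnan(A), 2) & ~isnan(B);  % rows with no NaN at all

% The masks should be exact complements
assert(isequal(tokeep, ~toremove));

complete = sum(tokeep);
```

Only the third row has no NaN in either array, so both approaches count one complete row.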

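Yet another alternative would be to simply append B as a new column of A and check the resulting matrix for rows which don't contain any NaN values. This works for vectors and matrices alike, as long as they have the same number of rows:

tokeep = ~any(isnan([A, B]), 2);
complete = sum(tokeep);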
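The concatenation approach extends naturally to more than two arrays. As a rough sketch, an anonymous function can accept any number of inputs with matching row counts; countComplete is a hypothetical name introduced here, and the data is made up:

```matlab
% Hypothetical helper: counts rows with no NaN across all inputs.
% Every input must have the same number of rows.
countComplete = @(varargin) sum(~any(isnan([varargin{:}]), 2));

% Made-up data, not taken from the question
A = [1 NaN; 2 3; 4 5; 6 7];
B = [7; 8; NaN; 9];
C = [1; 2; 3; 4];

n = countComplete(A, B, C);
```

Rows 2 and 4 are complete in all three arrays here, so n is 2. Concatenating copies the data, which is unlikely to matter for arrays of this size but may for very large ones.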

Upvotes: 2
