Reputation: 483
I have two arrays and need to count the rows that do not contain a NaN in any column of either array. I want the effective sample size when an array of inputs is used to train against a vector of targets (rows containing a NaN are not used). Here is an example of my current solution:
% A matrix
A = [
-0.0057 14.8750 293.2000 2.3743 0 NaN -0.1186 NaN 38.1000
2.1543 10.2240 294.0200 1.7650 0 NaN 0.0962 NaN 30.4800
2.6071 7.1014 266.4000 1.3941 0 NaN -0.1110 23.6660 27.9400
0.9736 10.5730 271.2000 1.8700 0 NaN -0.2457 31.7290 27.9400
-0.7138 13.6430 286.3100 2.0655 0 NaN -0.5152 44.3640 27.9400
4.4969 5.5410 280.1600 0.6042 0 NaN -0.2783 47.9240 27.9400
5.4186 2.5648 251.6900 0.2323 0 NaN -0.0879 39.6710 25.4000
4.3641 3.4062 266.7800 0.5696 0 NaN -0.0638 26.9330 25.4000
-0.3348 8.2900 258.8900 1.3736 0 NaN -0.0414 59.2570 25.4000
0.3007 8.3617 274.7400 1.3929 0 NaN -0.3473 46.6710 25.4000
3.0400 4.6077 267.3400 0.9704 0 0.5178 -0.2080 32.4850 25.4000
2.1950 7.7303 253.8300 1.3545 0 0.4927 -0.0870 31.4520 25.4000
-0.4413 4.2283 275.7400 0.4724 0 0.3687 -0.2470 40.3630 27.9400
-0.8667 4.0397 261.0800 0.6118 0 0.4143 -0.4723 28.7360 27.9400
-8.0407 2.2782 158.9600 0.4654 0 0.1775 -0.9863 56.7880 30.4800
-15.4630 2.0072 230.4100 0.2572 0 0.0530 -2.2110 71.3660 35.5600
-14.7670 6.6983 293.4800 0.9218 0 0.1224 -4.3823 42.2330 38.1000
-8.5713 4.2573 249.6900 0.5928 0 0.2057 -4.6927 37.2790 38.1000
-13.4820 1.4811 120.2200 0.2327 0 0.0542 -4.1213 76.5140 38.1000
-15.6230 3.9040 300.8400 0.2369 0 0.0602 -3.4780 71.9860 NaN]
% And a vector of targets
B = [
NaN
NaN
1.2009
0.6404
0.5739
0.6846
0.4121
0.7475
0.5931
0.5706
0.8581
0.9910
NaN
0.5652
0.4008
NaN
0.4585
0.5463
0.2903
0.3150]
% Inputs
Alogic = isnan(A); % logical matrix of nans for drivers used
AlogicNaNSum = sum(Alogic,2); % sum by row
ANaNSumlogic = AlogicNaNSum >0; % logical by row with 0 for complete, 1 for some row containing NaNs
% Target
Blogic = isnan(B); % logical version of target with 0 for complete, 1 for containing NaN
SumNaNrows = Blogic + ANaNSumlogic; % add logical vectors, with 0 meaning no NaNs in any column
% Final number of rows with no NaN in any column
complete = sum(SumNaNrows(:)==0)
It seems like there should be a more elegant way to do this (fewer lines of code) that could still apply to vectors and/or matrices of the same length. There are many posts already about finding and replacing NaN rows like this and this, but I haven't found as much about counting the total number of complete rows across arrays.
Upvotes: 0
Views: 35
Reputation: 65460
You can do this using some basic logical operations. As you've shown, isnan creates a logical matrix the size of your input that is true wherever there is a NaN. Calling any on that matrix with a second argument of 2 reduces across the columns, giving a column vector that is true for each row of A containing at least one NaN. The element-wise OR operator (|) then combines this with isnan(B), so the result is true if a row of A has a NaN or the corresponding element of B is NaN.
toremove = any(isnan(A), 2) | isnan(B);
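To make the row-wise reduction concrete, here is a minimal sketch on a tiny made-up example. X, y, rowHasNaN and badRow are hypothetical names and values chosen for illustration, not the data from the question:

```matlab
% Toy data for illustration only (hypothetical values)
X = [1 NaN; 2 3; 4 5];
y = [0.1; NaN; 0.3];

% True for each row of X that contains at least one NaN -> [1; 0; 0]
rowHasNaN = any(isnan(X), 2);

% True if either X or y has a NaN in that row -> [1; 1; 0]
badRow = rowHasNaN | isnan(y);
```

Only the third row survives here, because the first row has a NaN in X and the second has a NaN in y.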
Then, if you simply want the number of complete rows, negate that mask and sum it:
complete = sum(~(any(isnan(A), 2) | isnan(B)));
You could also flip the logic around and check for rows that have no NaN values. The result will be the same:
tokeep = all(~isnan(A), 2) & ~isnan(B);
complete = sum(tokeep);
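If you want to convince yourself that the two formulations agree, a quick check along these lines should do it. This is only a sketch on hypothetical toy values (X, y, removeMask and keepMask are made-up names):

```matlab
% Toy data for illustration only (hypothetical values)
X = [1 NaN; 2 3; 4 5; NaN NaN];
y = [0.1; NaN; 0.3; 0.4];

removeMask = any(isnan(X), 2) | isnan(y);    % rows with a NaN somewhere
keepMask   = all(~isnan(X), 2) & ~isnan(y);  % rows with no NaN anywhere

% The two masks are logical complements of each other
assert(isequal(keepMask, ~removeMask));

nComplete = sum(keepMask);  % 1 for this toy data (only row 3 is complete)
```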
Yet another alternative is to append B as a new column of A and check the resulting matrix for rows that don't contain any NaN values:
tokeep = ~any(isnan([A, B]), 2);
complete = sum(tokeep);
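Since the question asks for something that applies to several vectors and/or matrices of the same length, the concatenation idea can be extended to any number of arrays. The following is a sketch under the assumption that all arrays are numeric and have the same number of rows. X1, X2, X3 and the other variable names are hypothetical:

```matlab
% Hypothetical toy arrays sharing the same number of rows
X1 = [1 2; NaN 4; 5 6; 7 8];
X2 = [0.5; 0.6; NaN; 0.8];
X3 = [9 NaN 1; 1 1 1; 2 2 2; 3 3 3];
arrays = {X1, X2, X3};

% Guard against mismatched row counts before concatenating
nRows = cellfun(@(a) size(a, 1), arrays);
assert(all(nRows == nRows(1)), 'All arrays must have the same number of rows.');

% Concatenate side by side, then flag rows without any NaN
combined  = [arrays{:}];
tokeepAll = ~any(isnan(combined), 2);
nComplete = sum(tokeepAll);  % 1 for this toy data (only row 4 is complete)
```

Collecting the arrays in a cell keeps the final expression the same no matter how many inputs there are.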
Upvotes: 2