Sardar Usama

Reputation: 19689

Removing columns whose entries are all zero, except for the first such column

I have a big matrix with 3 rows and many columns. I want to remove the columns whose entries are all zero, except for the first such column.

For example, given the following matrix:

    1 1 0 0 0 0 0 0
A = 1 0 0 1 0 0 0 0
    1 0 1 1 0 0 0 0

It would thus be transformed to:

    1 1 0 0 0 
A = 1 0 0 1 0 
    1 0 1 1 0 

Upvotes: 1

Views: 352

Answers (2)

hiandbaii

Reputation: 1331

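 b = any(A, 1);           % true for columns with at least one non-zero entry
 b(find(~b, 1)) = true;   % also keep the first all-zero column, if there is one
 A = A(:, b)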

Edited per the comments to match the OP's expected output.

Upvotes: 2

rayryeng

Reputation: 104504

One approach is to use all along the rows of each column (dimension 1) to test whether every element in a column equals 0, and then use find to get the indices of those columns. After that, make a copy of the original matrix and delete every all-zero column except the first one encountered:

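ind = find(all(A == 0, 1));   % indices of the all-zero columns
out = A;                      % work on a copy of the original matrix
out(:,ind(2:end)) = [];       % delete every all-zero column except the first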

With your example, we get:

>> out
out =
     1     1     0     0     0
     1     0     0     1     0
     1     0     1     1     0

A nice property of this approach is that it still works when there are no all-zero columns: find returns an empty array, and indexing into an empty array with 2:end also produces an empty array. The removal step on the last line therefore has no effect and you keep the original matrix unchanged.
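As a quick check of that edge case, here is a minimal sketch using a small made-up matrix B (an illustrative input, not taken from the question) that has no all-zero columns:

```matlab
% Hypothetical input with no all-zero columns (B is an illustrative name).
B = [1 0 2;
     0 3 0;
     4 0 5];

ind = find(all(B == 0, 1));   % no all-zero columns, so ind is empty
out = B;
out(:, ind(2:end)) = [];      % ind(2:end) is empty too, so nothing is deleted

disp(isempty(ind));           % displays 1
disp(isequal(out, B));        % displays 1
```

The same reasoning should apply when there is exactly one all-zero column: ind then has a single element, ind(2:end) is empty, and that column is kept.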


If you can guarantee that all-zero columns only appear at the end of your matrix and never in between valid data, you can do this in one line by combining any and all with logical indexing:

out = A(:,any(A,1) | diff([false all(A == 0, 1)]));

We build a mask in two parts. The first part, any(A,1), is true for every column that contains at least one non-zero element. Under the assumption above, these columns all sit at the start of your data, so this forms the first part of the mask. The second part applies diff, which computes pairwise differences, to the output of the same all call we saw before, with false prepended. We are assuming that the first column is never all zero (which is your case); prepending false keeps the diff output the same length as the number of columns, and there is exactly one position where the difference is non-zero: the first all-zero column. We set this position in the mask to true along with every column that is non-zero, and finally index into your matrix with the mask to get the result.
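To make the mask construction easier to follow, here is a sketch that splits the one-liner into named intermediate steps on the example matrix from the question. The variable names (nonzeroCols, zeroCols, firstZero, mask) are illustrative, and the explicit comparison with 1 is an addition so that the intermediate is a logical vector; the original one-liner instead relies on | treating any non-zero value as true.

```matlab
% Example matrix from the question.
A = [1 1 0 0 0 0 0 0;
     1 0 0 1 0 0 0 0;
     1 0 1 1 0 0 0 0];

nonzeroCols = any(A, 1);                     % [1 1 1 1 0 0 0 0]
zeroCols    = all(A == 0, 1);                % [0 0 0 0 1 1 1 1]
firstZero   = diff([false zeroCols]) == 1;   % [0 0 0 0 1 0 0 0]

mask = nonzeroCols | firstZero;              % [1 1 1 1 1 0 0 0]
out  = A(:, mask);                           % keeps the first five columns
```

Note that if an all-zero column did appear between non-zero columns, diff would flag the start of each run of zero columns, so this variant would keep one zero column per run. That is why it depends on the trailing-zeros assumption.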

Upvotes: 3
