Alex Lizz

Reputation: 445

Replacing consecutive zeros with a desired value

Let's say I have a matrix (or a vector) of the form

>set.seed(1)
>X=ifelse(matrix((runif(30)),ncol = 2)>0.4,0,1)

      [,1] [,2]
 [1,]    1    1
 [2,]    1    1
 [3,]    0    1
 [4,]    0    0
 [5,]    1    1
 [6,]    0    0
 [7,]    0    0
 [8,]    0    0
 [9,]    0    1
[10,]    1    0
[11,]    1    0
[12,]    1    0
[13,]    0    1
[14,]    1    0
[15,]    0    0
    ...
    etc

How can I count the number of consecutive zeros between ones in each column and replace the zeros with 1 in every run whose length is at most a predefined constant k? Failing that, I would at least like to get the start index and the number of elements of each run of zeros. There are generally many more zeros than ones in this data set, and most runs are longer than k.

So, for example, if k=1, then [4,2], [13,1] and [15,1] are going to be replaced by 1. If k=2, then in addition to [4,2], [13,1] and [15,1], the zeros in [3,1], [4,1], [14,2] and [15,2] are going to be replaced by 1 as well.

Of course, I can just run a loop and go through all the rows. I wonder if there is a package, or a neat vectorization trick that can do it.

Update:

Desired output for k=1

      [,1] [,2]
 [1,]    1    1
 [2,]    1    1
 [3,]    0    1
 [4,]    0    1
 [5,]    1    1
 [6,]    0    0
 [7,]    0    0
 [8,]    0    0
 [9,]    0    1
[10,]    1    0
[11,]    1    0
[12,]    1    0
[13,]    1    1
[14,]    1    0
[15,]    1    0

Desired output for k=2

      [,1] [,2]
 [1,]    1    1
 [2,]    1    1
 [3,]    1    1
 [4,]    1    1
 [5,]    1    1
 [6,]    0    0
 [7,]    0    0
 [8,]    0    0
 [9,]    0    1
[10,]    1    0
[11,]    1    0
[12,]    1    0
[13,]    1    1
[14,]    1    1
[15,]    1    1

Upvotes: 2

Views: 499

Answers (1)

Frank

Reputation: 66819

The run-length tool rle works here:

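fill_shortruns <- function(X, k = 1, badval = 0, newval = 1) {
    # Process each column independently
    apply(X, 2, function(x) {
        r <- rle(x)  # run-length encode the column
        # Overwrite runs of badval that are at most k long
        r$values[r$lengths <= k & r$values == badval] <- newval
        inverse.rle(r)  # expand the runs back to a full-length vector
    })
}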
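
To see why this works, it may help to trace a single column by hand. The sketch below uses the first column of X from the question, typed in literally so it does not depend on the random number generator, with k = 2. The variable names are only for illustration.

```r
# First column of X from the question, entered by hand
x <- c(1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0)

r <- rle(x)
r$lengths
# [1] 2 2 1 4 3 1 1 1
r$values
# [1] 1 0 1 0 1 0 1 0

# Flag the runs of zeros that are no longer than k
k <- 2
short <- r$values == 0 & r$lengths <= k
r$values[short] <- 1

# inverse.rle() expands the edited runs back to a full-length vector
filled <- inverse.rle(r)
filled
# [1] 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1
```

The result agrees with the first column of the desired output for k=2 in the question. Note that the trailing zero in row 15 is filled too, even though no 1 follows it.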

# smaller example

set.seed(1)
X0=ifelse(matrix((runif(20)),ncol = 4)>0.4,0,1)
#      [,1] [,2] [,3] [,4]
# [1,]    1    0    1    0
# [2,]    1    0    1    0
# [3,]    0    0    0    0
# [4,]    0    0    1    1
# [5,]    1    1    0    0

fill_shortruns(X0,2)
#      [,1] [,2] [,3] [,4]
# [1,]    1    0    1    0
# [2,]    1    0    1    0
# [3,]    1    0    1    0
# [4,]    1    0    1    1
# [5,]    1    1    1    1
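
The question also asks for the start index and length of each run of zeros. fill_shortruns does not return these, but they can probably be derived from the same rle output. The following is a rough sketch rather than a tested utility; the helper name zero_runs and the shape of its return value are my own assumptions.

```r
# Hypothetical helper: start position and length of every run of badval in x
zero_runs <- function(x, badval = 0) {
    r <- rle(x)
    ends <- cumsum(r$lengths)        # last index of each run
    starts <- ends - r$lengths + 1   # first index of each run
    keep <- r$values == badval       # keep only the runs of badval
    data.frame(start = starts[keep], length = r$lengths[keep])
}

# First column of X from the question, entered by hand
x <- c(1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0)
runs <- zero_runs(x)
runs
#   start length
# 1     3      2
# 2     6      4
# 3    13      1
# 4    15      1
```

For a matrix, something like lapply(seq_len(ncol(X)), function(j) zero_runs(X[, j])) should give one such table per column.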

Upvotes: 4
