Joshua Rosenberg

Reputation: 4226

Recode a value in a vector based on surrounding values

I'm trying to programmatically change a value in a vector from 0 to 1 if it is preceded and followed by three 1s.

For example, if the values in a vector were 1, 1, 1, 0, 1, 1, 1, then I want to change the 0 to a 1.

Here is the data, in the vector dummy_code in the data.frame original_df:

original_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1))

Here is the result I want after recoding:

desired_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1))

I tried to use the function fill in the package tidyr, but this fills in missing values, so it won't work. If I were to recode the 0 values to be missing, then that would not work either, because it would simply code every NA as 1, when I would only want to code every NA surrounded by three 1s as 1.

Is there an efficient way to do this programmatically?

Upvotes: 2

Views: 130

Answers (2)

Henrik

Reputation: 67778

An rle alternative, using the x from @G. Grothendieck's answer:

r <- rle(x)

Find the indexes of runs of three 1s:

i1 <- which(r$lengths == 3 & r$values == 1)

Check which of these runs of 1s surround a 0, and get the indexes of the 0 runs to be replaced:

i2 <- i1[which(diff(i1) == 2)] + 1

Replace relevant 0 with 1:

r$values[i2] <- 1

Reverse the rle operation on the updated runs:

inverse.rle(r)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
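
The rle steps above can be bundled into a small function. This is only a sketch: the name fill_between_runs and the run_length argument are made up for illustration and are not part of the original answer. Like the steps above, it matches runs of exactly run_length 1s and converts whatever run of 0s sits between two such runs.

```r
# Hypothetical helper wrapping the rle steps; name and argument are illustrative.
fill_between_runs <- function(x, run_length = 3) {
  r <- rle(x)
  # Indexes of runs of 1s that have exactly `run_length` elements
  i1 <- which(r$lengths == run_length & r$values == 1)
  # Two such runs that are two positions apart enclose one run of 0s;
  # take the index of that enclosed run
  i2 <- i1[which(diff(i1) == 2)] + 1
  # Turn the enclosed run into 1s and rebuild the vector
  r$values[i2] <- 1
  inverse.rle(r)
}

x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)
result <- fill_between_runs(x)
print(result)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
```

If no qualifying runs exist, i2 is empty and the input comes back unchanged.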

A similar solution based on data.table::rleid, slightly more compact and perhaps easier to read:

library(data.table)
d <- data.table(x)

Calculate length of each run:

d[ , n := .N, by = rleid(x)]

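For "x" values that are a single zero where the preceding and subsequent runs of 1 are both of length 3, set "x" to 1 (the n == 1 condition keeps a zero in the middle of a run of three zeros from matching):

d[x == 0 & n == 1 & shift(n) == 3 & shift(n, type = "lead") == 3, x := 1]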
d$x
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1 

Upvotes: 3

G. Grothendieck

Reputation: 269501

Here is a one-liner using rollapply from zoo:

library(zoo)

rollapply(c(0, 0, 0, x, 0, 0, 0), 7, function(x) if (all(x[-4] == 1)) 1 else x[4])
##  [1] 1 0 0 1 1 1 1 1 1 1 0 0 1

Note: Input used was:

x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)
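
The same rolling-window check can also be sketched in base R, without zoo, by building all length-7 windows at once with embed. This is a swapped-in technique, not the answer's method; the names padded, w, and surrounded are illustrative.

```r
x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)

# Pad with three 0s on each side so every element has a full window
padded <- c(0, 0, 0, x, 0, 0, 0)

# One row per window of width 7; embed() reverses the order within a row,
# but column 4 is still the centre element, i.e. x[i] for row i
w <- embed(padded, 7)

# TRUE where all six neighbours (every column except the centre) equal 1
surrounded <- rowSums(w[, -4, drop = FALSE] == 1) == 6

result <- ifelse(surrounded, 1, x)
print(result)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
```

Like the rollapply version, this treats "three 1s on each side" as at least three, since it only inspects the three nearest neighbours in each direction.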

Upvotes: 3
