Reputation: 4226
I'm trying to programmatically change a value from 0 to 1 if there are three 1s before and after the 0.
For example, if the numbers in a vector were 1, 1, 1, 0, 1, 1, and 1, then I want to change the 0 to a 1.
Here is the data in the vector dummy_code in the data.frame original_df:
original_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1))
Here is how I'm trying to have the values be recoded:
desired_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1))
I tried to use the function fill in the package tidyr, but this fills in missing values, so it won't work. If I were to recode the 0 values to be missing, then that would not work either, because it would simply code every NA as 1, when I would only want to code every NA surrounded by three 1s as 1.
Is there an efficient way to do this programmatically?
Upvotes: 2
Views: 130
Reputation: 67778
An rle alternative, using the x from @G. Grothendieck's answer:
r <- rle(x)
Find the indexes of runs of three 1s:
i1 <- which(r$lengths == 3 & r$values == 1)
Check which of the "1 indexes" surround a 0, and get the indexes of the 0 to be replaced:
i2 <- i1[which(diff(i1) == 2)] + 1
Replace the relevant 0 with 1:
r$values[i2] <- 1
Reverse the rle operation on the updated runs:
inverse.rle(r)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
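The rle steps above can be wrapped in a small function and applied to the data frame column from the question. This is only a sketch: the name fill_lone_zero is hypothetical, and the extra length check assumes that only a single 0 between two runs of three 1s should be filled (the steps above would fill a longer run of 0 as well).

```r
# Hypothetical helper; the name is illustrative and not part of the answer above.
fill_lone_zero <- function(x) {
  r <- rle(x)
  # Runs of exactly three 1s
  i1 <- which(r$lengths == 3 & r$values == 1)
  # Runs sitting between two consecutive runs of three 1s
  i2 <- i1[which(diff(i1) == 2)] + 1
  # Assumption: only a lone 0 should be filled, so drop longer runs of 0
  i2 <- i2[r$lengths[i2] == 1]
  r$values[i2] <- 1
  inverse.rle(r)
}

original_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1))
original_df$dummy_code <- fill_lone_zero(original_df$dummy_code)
original_df$dummy_code
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
```

If runs of several 0s should also be filled, the length check can simply be removed.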
A similar solution based on data.table::rleid, slightly more compact and perhaps easier to read:
library(data.table)
d <- data.table(x)
Calculate length of each run:
d[ , n := .N, by = rleid(x)]
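For each "x" that is a lone zero (a run of length 1) whose preceding and subsequent runs of 1 are of length 3, set "x" to 1. The n == 1 condition keeps the middle value of a 0, 0, 0 run from matching, since its neighbours also have n equal to 3:
d[x == 0 & n == 1 & shift(n) == 3 & shift(n, type = "lead") == 3, x := 1]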
d$x
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
Upvotes: 3
Reputation: 269501
Here is a one-liner using rollapply from zoo:
library(zoo)
rollapply(c(0, 0, 0, x, 0, 0, 0), 7, function(x) if (all(x[-4] == 1)) 1 else x[4])
## [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
Note: Input used was:
x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)
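For readers without zoo installed, the same sliding-window idea can probably be sketched in base R with embed(), which lays out every window of width 7 as a matrix row. This is an illustrative alternative rather than part of the answer above; the names w, flanked and out are made up for the example.

```r
x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)

# Pad with three 0s on each side so every element of x gets a full window.
# Each row of w is one window of width 7 (in reversed order); column 4 is
# the centre of the window either way.
w <- embed(c(0, 0, 0, x, 0, 0, 0), 7)

# TRUE where all six neighbours of the centre are 1
flanked <- rowSums(w[, -4, drop = FALSE] == 1) == 6

# Set the centre to 1 when flanked, otherwise keep its original value
out <- ifelse(flanked, 1, w[, 4])
out
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
```

Like the rollapply version, this looks at the three values on each side of a position, so it also fills a 0 whose neighbouring runs of 1 are longer than three.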
Upvotes: 3