Reputation: 11
I am trying to create a new variable whose values are assigned as a function of a specific sequence of values in another column. Below is an example reporting the status (positive or negative) of 10 tests:
df <- data.frame(Trank = 1:10, status = c(0, 1, 0, 0, 1, 1, 0, 1, 0, 1))
Now, the values in the new column "class" should be assigned according to rules such as: class = "a" if the current test is negative but the previous two tests were positive, class = "b" if the current test is positive but the previous one was negative, and class = "c" otherwise. In this example I would get something like:
Trank status class
1 0 c
2 1 b
3 0 c
4 0 c
5 1 b
6 1 c
7 0 a
8 1 b
9 0 c
10 1 b
I can't figure out how to write the conditional logic that produces this output. I apologize for not posting an attempt, but I am really stuck. Any advice or suggestion would be highly appreciated! Many thanks!
Upvotes: 0
Views: 47
Reputation: 50668
We can use dplyr::lag with dplyr::case_when to encode the different conditions:
library(tidyverse)
df %>%
    mutate(class = case_when(
        status == 0 & lag(status) == 1 & lag(status, n = 2L) == 1 ~ "a",
        status == 1 & lag(status) == 0 ~ "b",
        TRUE ~ "c"))
# Trank status class
#1 1 0 c
#2 2 1 b
#3 3 0 c
#4 4 0 c
#5 5 1 b
#6 6 1 c
#7 7 0 a
#8 8 1 b
#9 9 0 c
#10 10 1 b
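If you would rather not load the tidyverse, the same rules can probably be written in base R as well. This is only a sketch under the question's assumptions: prev1 and prev2 are hypothetical helper names for the status shifted by one and two rows, and rows without enough history fall through to "c", which matches the case_when result above.

```r
# Example data from the question
df <- data.frame(Trank = 1:10, status = c(0, 1, 0, 0, 1, 1, 0, 1, 0, 1))

# Shifted copies of status (hypothetical helper names);
# NA marks tests that have no earlier result to compare against
prev1 <- c(NA, head(df$status, -1))
prev2 <- c(NA, NA, head(df$status, -2))

# Start from the default class, then overwrite where a rule matches.
# which() drops the NA comparisons produced by the first rows.
df$class <- "c"
df$class[which(df$status == 0 & prev1 == 1 & prev2 == 1)] <- "a"
df$class[which(df$status == 1 & prev1 == 0)] <- "b"

df
```

The two rules cannot both match the same row (one requires status 0, the other status 1), so the order of the two assignments should not matter here.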
Upvotes: 1