Mat Teo

Reputation: 11

Create a new variable using specific sequences of values in another column

I am trying to create a new variable whose values are assigned as a function of specific sequences of consecutive values in another column. Below is an example reporting the status (positive = 1, negative = 0) of 10 tests:

    df<-data.frame(Trank=c(1:10), status=c(0,1,0,0,1,1,0,1,0,1))

Now, the values in the new column "class" should be assigned according to rules such as: class = "a" if the current test is negative but the previous two tests were positive; class = "b" if the current test is positive but the previous one was negative; otherwise class = "c". In this example I would get something like:

    Trank status class
        1      0     c
        2      1     b
        3      0     c
        4      0     c
        5      1     b
        6      1     c
        7      0     a
        8      1     b
        9      0     c
       10      1     b

I can't figure out what the conditional function to obtain this output should look like. I apologize for not posting an attempt, but I am really stuck on this. Any advice or suggestion would be highly appreciated. Many thanks!

Upvotes: 0

Views: 47

Answers (1)

Maurits Evers

Reputation: 50668

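We can use dplyr::lag together with dplyr::case_when to encode the different conditions. For the first rows, lag returns NA because there is no previous test; case_when does not treat an NA condition as a match, so those rows fall through to "c":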

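    library(tidyverse)
    df %>%
        mutate(class = case_when(
            # current test negative, previous two tests positive
            status == 0 & lag(status) == 1 & lag(status, n = 2L) == 1 ~ "a",
            # current test positive, previous test negative
            status == 1 & lag(status) == 0 ~ "b",
            # everything else, including rows without enough history
            TRUE ~ "c"))
    #   Trank status class
    #1      1      0     c
    #2      2      1     b
    #3      3      0     c
    #4      4      0     c
    #5      5      1     b
    #6      6      1     c
    #7      7      0     a
    #8      8      1     b
    #9      9      0     c
    #10    10      1     b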
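
If you would rather not load dplyr, the same rules can be sketched in base R. This is only an illustrative alternative, not part of the solution above: `shift_down` is a hypothetical helper written here to mimic what `lag` does (shift the vector and pad the start with NA), and `prev1`, `prev2`, `is_a` and `is_b` are names made up for this example.

```r
df <- data.frame(Trank = 1:10, status = c(0, 1, 0, 0, 1, 1, 0, 1, 0, 1))

# Hypothetical helper: shift x down by n positions, padding the start with NA
shift_down <- function(x, n = 1L) c(rep(NA, n), head(x, -n))

prev1 <- shift_down(df$status, 1L)  # status of the previous test
prev2 <- shift_down(df$status, 2L)  # status of the test before that

# Logical vectors for each rule; they are NA where there is not enough history
is_a <- df$status == 0 & prev1 == 1 & prev2 == 1
is_b <- df$status == 1 & prev1 == 0

# Start with the default class, then overwrite where a rule holds.
# which() drops NA, so rows without enough history keep "c".
df$class <- "c"
df$class[which(is_b)] <- "b"
df$class[which(is_a)] <- "a"

print(df)
```

The two rules cannot both hold for the same row (one requires status 0, the other status 1), so the order of the two assignments does not matter here. With overlapping rules you would need to assign the lower-priority class first.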

Upvotes: 1
