Sashasaurus

Reputation: 1

New column for sequential rows in a loop in R

I would like to create a new column based on values in sequential rows. I have the following data frame where there are two subjects in the subject column and I would like to compare each row in the trial column to the row preceding it for each subject.

df <- data.frame(subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                 trial = c("switch", "noswitch", "switch", "switch", "noswitch",
                           "switch", "switch", "noswitch", "noswitch", "noswitch"))

I'm trying to use an if statement to look back one row, then create a new column based on the 4 possible configurations of two sequential rows, written here as current-previous: switch-switch, noswitch-noswitch, noswitch-switch, switch-noswitch. The new column would hold a name for each of those configurations: comp, incomp, noswitch_incomp, switch_comp, respectively. The comparison should start over for each subject, so the first row of every subject would be NA because it has no preceding value. So far I have the following:

for(i in seq.int(unique(df$subject))){ 
  df$results <- if(df$switch == "switch" & lag(df$switch, 1) == "switch"){
    "comp" 
    } else if (df$switch == "noswitch" & lag(df$switch, 1) == "noswitch"){
      "incomp"
    } else if (df$switch == "noswitch" & lag(df$switch, 1) == "switch"){
      "noswitch_incomp"
    } else {
      "switch_comp" 
    }
}

I'm getting the following error, which I think has something to do with the if statement not evaluating its condition:

   Error in if (df$switch == "switch" & lag(df$switch, 1) == "switch") { : 
  argument is of length zero

I've tried using mutate() with dplyr to match conditions but similar errors occur. Is there another function I could try to evaluate these conditions?

Upvotes: 0

Views: 118

Answers (1)

hpesoj626

Reputation: 3619

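The error comes from df$switch: the data frame has no switch column (it is called trial), so df$switch is NULL and the if condition has length zero. Beyond that, if expects a single TRUE or FALSE rather than a whole column, so a vectorised function is a better fit. You can use dplyr::case_when. Grouping by subject first makes lag() restart for each subject, which leaves NA in each subject's first row.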

library(dplyr)

df %>%
  group_by(subject) %>%   # lag() now looks back only within the same subject
  mutate(results = case_when(
    trial == 'switch' & lag(trial) == 'switch' ~ 'comp',
    trial == 'noswitch' & lag(trial) == 'noswitch' ~ 'incomp',
    trial == 'noswitch' & lag(trial) == 'switch' ~ 'noswitch_incomp',
    trial == 'switch' & lag(trial) == 'noswitch' ~ 'switch_comp'
    # rows matching no condition (the first row of each subject) get NA
  ))

# # A tibble: 10 x 3
# # Groups:   subject [2]
#    subject trial    results     
#      <dbl> <chr>    <chr>          
#  1      1. switch   NA             
#  2      1. noswitch noswitch_incomp
#  3      1. switch   switch_comp    
#  4      1. switch   comp           
#  5      1. noswitch noswitch_incomp
#  6      2. switch   NA             
#  7      2. switch   comp           
#  8      2. noswitch noswitch_incomp
#  9      2. noswitch incomp         
# 10      2. noswitch incomp
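
If you would rather avoid dplyr, the same idea can be sketched in base R. This is only one possible approach, not something from the original question: the helper names prev, labels and key are made up for illustration, and the lookup keys follow the current-previous order used in the case_when call above. ave() applies the one-row shift separately within each subject, and a named vector maps each pair of trials to its label.

```r
df <- data.frame(subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                 trial = c("switch", "noswitch", "switch", "switch", "noswitch",
                           "switch", "switch", "noswitch", "noswitch", "noswitch"))

# Previous trial within each subject; the first row of a subject gets NA.
# as.character() guards against trial being a factor (the default before R 4.0).
prev <- ave(as.character(df$trial), df$subject,
            FUN = function(x) c(NA, head(x, -1)))

# Hypothetical lookup table: names are "current-previous" pairs.
labels <- c("switch-switch"     = "comp",
            "noswitch-noswitch" = "incomp",
            "noswitch-switch"   = "noswitch_incomp",
            "switch-noswitch"   = "switch_comp")

# Pairs with no previous trial (e.g. "switch-NA") are not in the table,
# so indexing by name returns NA for them.
key <- paste(df$trial, prev, sep = "-")
df$results <- unname(labels[key])

df
```

A lookup table like this keeps the four labels in one place, which may be easier to extend than a chain of conditions if more trial types are added later.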

Upvotes: 1
