bvowe
bvowe

Reputation: 3384

R Data.Table Check Group Condition

data=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2))

I have 'data' and want to make 'data1' where

data1=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2),
"Flag"=c(0,0,0,1,1,1,2,2,2))

And what Flag does is it indicates if any Score at Grade G for a StudentID is higher at G-1. In other words we expect Score to only go up across grades.

Using @akron perfect answer, library(data.table) setDT(data)[, flag := fifelse(any(diff(Score) < 0), 1, fifelse(anyDuplicated(Score) > 0, 2, 0)) , .(StudentID)]

Now say I there is a flag 2. for a student. How can I update their SECOND consecutive score by adding 1.

Using data1 above

data1=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2),
"Flag"=c(0,0,0,1,1,1,2,2,2),
"Score2" = c(1,2,5,2,4,3,1,2,3))

Upvotes: 0

Views: 500

Answers (1)

akrun
akrun

Reputation: 887118

We convert to 'data.table' (setDT, grouped by 'StudentID', use fifelse to create the 'flag' by checking for any difference in 'Score' less than 0 (basically checks for cases where there is a decrease in values), specify it as 1, if there any duplicates, then it would be 2 and the rest as 0

library(data.table)
setDT(data)[, flag := fifelse(any(diff(Score) < 0), 1, 
      fifelse(anyDuplicated(Score) > 0, 2, 0)) , .(StudentID)]

For the updated case

setDT(data1)[Score2 := Score][Flag == 2, Score2 := seq(Score[1], 
          length.out = .N, by = 1), StudentID]

Or with dplyr

library(dplyr)
data %>% 
  group_by(StudentID) %>% 
  mutate(flag = case_when(any(diff(Score) < 0) ~ 1,  
                anyDuplicated(Score) > 0 ~ 2,  TRUE ~ 0))

Upvotes: 1

Related Questions