Reputation: 3394
data1=data.frame("Student"=c(1, 1, 1, 2, 2, 2, 3, 3, 3),
"Grade"=c(1, 2, 3, 1, 2, 3, 1, 2, 3),
"Score"=c(5, 7, 9, 2, 2, 3, 10, NA, 3))
data2=data.frame("Student"=c(1, 1, 1, 3, 3, 3),
"Grade"=c(1, 2, 3, 1, 2, 3),
"Score"=c(5, 7, 9, 10, NA, 3))
I have 'data1' and wish for 'data2' where I ONLY include 'Student' if 'Score' at 'Grade' = 1 is at least 4.
My only knowledge of how to do this is doing it by 'Grade' and 'Score' but that does not give desired output.
library(data.table)
setDT(data1)
data1=data1[Grade==1 & Score >=4)
Upvotes: 1
Views: 81
Reputation: 1595
You just need to do a join with your desired conditions to retain the Student
id.
Does this work?
library(data.table)
data1 <- data.frame("Student"=c(1, 1, 1, 2, 2, 2, 3, 3, 3),
"Grade"=c(1, 2, 3, 1, 2, 3, 1, 2, 3),
"Score"=c(5, 7, 9, 2, 2, 3, 10, NA, 3))
data2 <- data.frame("Student"=c(1, 1, 1, 3, 3, 3),
"Grade"=c(1, 2, 3, 1, 2, 3),
"Score"=c(5, 7, 9, 10, NA, 3))
setDT(data1)
setDT(data2)
wanted <- data1[ Grade == 1 & Score >= 4, .( Student ) ]
setkey( wanted, Student )
setkey( data1, Student )
data3 = data1[ wanted ]
data2
#> Student Grade Score
#> 1: 1 1 5
#> 2: 1 2 7
#> 3: 1 3 9
#> 4: 3 1 10
#> 5: 3 2 NA
#> 6: 3 3 3
data3
#> Student Grade Score
#> 1: 1 1 5
#> 2: 1 2 7
#> 3: 1 3 9
#> 4: 3 1 10
#> 5: 3 2 NA
#> 6: 3 3 3
Created on 2020-04-29 by the reprex package (v0.3.0)
Upvotes: 2