Reputation: 53
I have a data.table in R that I need to subset based on certain conditions
dt <- data.table(X_ID=c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4), Y_ID=c(1,2,3,4,5,6, 1,2,3,4,5,6, 1,2,3,4,5,6, 1,2,3,4,5,6), Value1=c(6,4,8,9,10,12,5,7,8,3,4,8,7,3,9,7,4,12,45,3,2,1,2,5),Value2=c(6,6,8,9,10,12,5,4,8,3,4,8,7,5,9,7,4,12,45,1,2,1,2,5))
I want to retain all X_ID's in which value 1 is greater than value 2 when Y_ID is equal to 2. E.g., for X_ID=1, value 1 is greater than value 2 when Y_ID=2. In this case, I want to retain all rows that have X_ID=1.
I am currently using the filter function (shown below) to get at the solution.
dt2 <- filter(dt,is.na(match(dt$X_ID,filter(dt, Y_ID==2&Value1<Value2)$X_ID)))
But is there a way to make this code more readable? Or use something native to data.table to subset the rows I require?
Upvotes: 3
Views: 939
Reputation: 4480
Using data.table
:
dt<-dt[X_ID %in% dt[(Y_ID == 2 & (Value1 > Value2))]$X_ID]
# X_ID Y_ID Value1 Value2
#1: 2 1 5 5
#2: 2 2 7 4
#3: 2 3 8 8
#4: 2 4 3 3
#5: 2 5 4 4
#6: 2 6 8 8
#7: 4 1 45 45
#8: 4 2 3 1
#9: 4 3 2 2
#10: 4 4 1 1
#11: 4 5 2 2
#12: 4 6 5 5
Upvotes: 3
Reputation: 389047
Using dplyr
we can group_by
X_ID
and filter
for groups where there is at least one value for Y_ID == 2
and Value1 > Value2
.
library(dplyr)
dt %>%
group_by(X_ID) %>%
filter(any(Y_ID == 2 & Value1 > Value2))
# X_ID Y_ID Value1 Value2
# <dbl> <dbl> <dbl> <dbl>
# 1 2 1 5 5
# 2 2 2 7 4
# 3 2 3 8 8
# 4 2 4 3 3
# 5 2 5 4 4
# 6 2 6 8 8
# 7 4 1 45 45
# 8 4 2 3 1
# 9 4 3 2 2
#10 4 4 1 1
#11 4 5 2 2
#12 4 6 5 5
Upvotes: 2