Reputation: 907
Let's say we have
library(data.table)
dt <- data.table(Date = c(201405,201405,201504,201505, 201505,201505), ID = c(500,500,600,700,500, 700), INC = c(20,30,50,75,80,90))
return,
Date ID INC
1: 201405 500 20
2: 201405 500 30
3: 201504 600 50
4: 201505 700 75
5: 201505 500 80
6: 201505 700 90
I want to remove all IDs that are in the same Date. The return should be
Date ID INC
1: 201504 600 50
2: 201505 500 80
Could you please suggest?
Upvotes: 5
Views: 2179
Reputation: 887971
We group by 'ID', get a logical index with duplicated
on the 'Date', and negate so that all the unique elements are now TRUE, use .I
to get the row index, extract the index column 'V1' and use that to subset 'dt'.
dt[dt[, .I[!(duplicated(Date)|duplicated(Date, fromLast=TRUE))], ID]$V1]
# Date ID INC
#1: 201505 500 80
#2: 201504 600 50
Or another option would be to group by 'Date', 'ID' and if
the nrow is equal to 1 (.N==1
), we get the Subset of Data.table (.SD
).
dt[, if(.N==1) .SD, .(Date, ID)]
# Date ID INC
#1: 201504 600 50
#2: 201505 500 80
Or as @Frank mentioned, we can use a data.table/base R combo
DT[ave(seq(.N), Date, ID, FUN = function(x) length(x) == 1L)]
Upvotes: 7