Reputation: 4555
I am trying to find the number of unique rows in a data.table for each unique element in "A". Here's what I did:
DT <- data.table(A = rep(1:3, each=4), B = rep(1:4, each=3), C = rep(1:2, 6), key = "A")
unique(DT,by=names(DT)) #Gives me each unique row in DT
# A B C
# 1: 1 1 1
# 2: 1 1 2
# 3: 1 2 2
# 4: 2 2 1
# 5: 2 2 2
# 6: 2 3 1
# 7: 2 3 2
# 8: 3 3 1
# 9: 3 4 2
#10: 3 4 1
nrow(unique(DT,by=names(DT))) #Gives me the number of unique rows in DT
# [1] 10
DT[,nrow(unique(DT,by=names(DT))),by=A] #Doesn't give me the number of unique rows for each unique DT$A.
# A V1
# 1: 1 10
# 2: 2 10
# 3: 3 10
Can anyone see what I am doing wrong here?
Upvotes: 1
Views: 417
Reputation: 8691
I think you want to use .SD (the sub-table for each group):
DT[,nrow(unique(.SD)),by=A]
# A V1
#1: 1 3
#2: 2 4
#3: 3 3
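For comparison, here is a sketch of two other ways to get the same per-group counts. It assumes a data.table version that provides uniqueN (1.9.6 or later, as far as I know); res1 and res2 are just illustrative names.

```r
library(data.table)

DT <- data.table(A = rep(1:3, each = 4), B = rep(1:4, each = 3), C = rep(1:2, 6), key = "A")

# Count the distinct rows of each group's sub-table directly
res1 <- DT[, uniqueN(.SD), by = A]

# Or de-duplicate the whole table first, then count the remaining rows per group
res2 <- unique(DT, by = names(DT))[, .N, by = A]
```

Both should agree with nrow(unique(.SD)); the second form avoids building .SD for every group, which may matter on larger tables.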
Upvotes: 3
Reputation: 3525
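Because nrow(unique(DT,by=names(DT))) is 10 no matter which group is being processed (inside j, DT still refers to the whole table, not the current group), you are basically saying DT[,10,by=A].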
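A quick way to see the difference, as a rough sketch (cmp, whole and group are names made up for this example): compare the row count of DT with the row count of .SD inside a grouped call.

```r
library(data.table)

DT <- data.table(A = rep(1:3, each = 4), B = rep(1:4, each = 3), C = rep(1:2, 6), key = "A")

# Inside j, DT is still the full 12-row table; .SD holds only the current group's rows
cmp <- DT[, .(whole = nrow(DT), group = nrow(.SD)), by = A]
```

The whole column is the same for every value of A, while group changes with the subset, which is why the original call returned 10 three times.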
Upvotes: 2