Reputation: 3752
Let's say I have a data table like this:
library(data.table)
smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c('a','b'), times = 3),
                       value = 1:6)
That looks as follows:
group1 group2 value
     1      a     1
     1      b     2
     1      a     3
     2      b     4
     2      a     5
     2      b     6
I want to calculate the number of observed combinations of group1 and group2.
The dplyr way would be (possibly not optimal):
nrow(smalldat %>% select(group1, group2) %>% distinct())
What would be the data.table way?
Upvotes: 1
Views: 67
Reputation: 118779
Use uniqueN along with .SD and .SDcols:
smalldat[, uniqueN(.SD), .SDcols=group1:group2]
# [1] 4
Or, even more efficiently, as @DavidArenburg shows in the comments:
uniqueN(smalldat, by=c("group1", "group2"))
# [1] 4
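For anyone who wants to verify this, here is a minimal, self-contained sketch that rebuilds the table from the question and checks that both uniqueN forms agree with a plain base R count. The variable names n_sd, n_by and n_base are only illustrative, and .SDcols is given as a character vector here instead of the group1:group2 range used above.

```r
library(data.table)

# Rebuild the example table from the question
smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c("a", "b"), times = 3),
                       value  = 1:6)

# Count distinct rows of the subset of columns exposed through .SD
n_sd <- smalldat[, uniqueN(.SD), .SDcols = c("group1", "group2")]

# Count distinct combinations directly, without materialising .SD
n_by <- uniqueN(smalldat, by = c("group1", "group2"))

# Base R cross-check: unique rows of the two grouping columns
n_base <- nrow(unique(as.data.frame(smalldat)[, c("group1", "group2")]))

stopifnot(n_sd == n_by, n_by == n_base)
print(n_by)
# [1] 4
```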
Upvotes: 4
Reputation: 886938
We can use unique with the by option.
nrow(unique(smalldat, by = c('group1', 'group2')))
Or
length(smalldat[,.GRP ,.(group1, group2)]$GRP)
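If the combinations themselves are also of interest, a small variation is to group by both columns and keep the result. The sketch below is only one way to do it: combos, grp and n_grp are names made up for illustration, and the group counter is named explicitly so the code does not depend on how data.table auto-names a bare .GRP column.

```r
library(data.table)

# Rebuild the example table from the question
smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c("a", "b"), times = 3),
                       value  = 1:6)

# One row per observed combination, with the number of rows in each
combos <- smalldat[, .N, by = .(group1, group2)]
print(combos)

# The number of combinations is the number of rows of that result
n_combos <- nrow(combos)

# .GRP is a running group counter, so its maximum is the group count
n_grp <- smalldat[, .(grp = .GRP), by = .(group1, group2)][, max(grp)]

stopifnot(n_combos == n_grp)
print(n_combos)
# [1] 4
```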
Upvotes: 1