Number of unique column value combinations with data.table

Question

Let's say I have a data table like this:

library(data.table)

smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c('a','b'), times = 3),
                       value = 1:6)

That looks as follows:

group1    group2    value
1         a         1
1         b         2
1         a         3
2         b         4
2         a         5
2         b         6

I want to calculate the number of observed combinations of group1 and group2.

The dplyr way would be (possibly not optimal):

nrow(smalldat %>% select(group1, group2) %>% distinct())

What would be the data.table way?

Arun · Accepted Answer

Use uniqueN along with .SD and .SDcols:

smalldat[, uniqueN(.SD), .SDcols=group1:group2]
# [1] 4

Or, more efficiently, as @DavidArenburg points out in the comments:

uniqueN(smalldat, by=c("group1", "group2"))
# [1] 4
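
As a cross-check, here is a minimal self-contained sketch, assuming a data.table version that provides uniqueN(). It rebuilds the question's data (with the closing parenthesis on the second rep() restored) and compares both calls above against nrow(unique(...)). The names n_by, n_sd, n_unique and combo_counts are only illustrative.

```r
library(data.table)

# Same data as in the question
smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c("a", "b"), times = 3),
                       value  = 1:6)

# Count distinct (group1, group2) pairs directly on the table
n_by <- uniqueN(smalldat, by = c("group1", "group2"))

# Same count via .SD restricted to the two grouping columns
n_sd <- smalldat[, uniqueN(.SD), .SDcols = c("group1", "group2")]

# Equivalent, but materialises the de-duplicated rows before counting
n_unique <- nrow(unique(smalldat, by = c("group1", "group2")))

# If the per-combination frequencies are also wanted:
# one row per observed combination, with its row count in N
combo_counts <- smalldat[, .N, by = .(group1, group2)]

n_by
# [1] 4
```

If only the count is needed, the uniqueN(..., by=) form is the shortest; the .N version is useful when you also want to know how often each combination occurs.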
