JelenaČuklina

Reputation: 3752

Number of unique column value combinations with data.table

Let's say I have a data table like this:

library(data.table)

smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c('a', 'b'), times = 3),
                       value = 1:6)

That looks as follows:

group1    group2    value
1         a         1
1         b         2
1         a         3
2         b         4
2         a         5
2         b         6

I want to calculate the number of observed combinations of group1 and group2.

The dplyr way would be (possibly not optimal):

nrow(smalldat %>% select(group1, group2) %>% distinct())

What would be the data.table way?

Upvotes: 1

Views: 67

Answers (2)

Arun

Reputation: 118779

Use uniqueN along with .SD and .SDcols:

smalldat[, uniqueN(.SD), .SDcols=group1:group2]
# [1] 4

Or, even more efficiently, as @DavidArenburg shows in the comments:

uniqueN(smalldat, by=c("group1", "group2"))
# [1] 4

Upvotes: 4

akrun

Reputation: 886938

We can use unique with the by option.

nrow(unique(smalldat, by = c('group1', 'group2')))

Or

length(smalldat[, .GRP, by = .(group1, group2)]$GRP)
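
For what it's worth, here is a minimal self-contained sketch that runs the approaches side by side on the example data from the question. The variable names (n_uniqueN, n_unique, n_groups) are only illustrative, and the last line is a variant of the grouping idea that counts the rows of a by-group summary using .N rather than .GRP:

```r
library(data.table)

# Example data from the question
smalldat <- data.table(group1 = rep(1:2, each = 3),
                       group2 = rep(c('a', 'b'), times = 3),
                       value  = 1:6)

# Count distinct (group1, group2) pairs directly
n_uniqueN <- uniqueN(smalldat, by = c("group1", "group2"))

# Keep one row per pair, then count the rows
n_unique <- nrow(unique(smalldat, by = c("group1", "group2")))

# Summarise per pair (one row per group), then count the rows
n_groups <- nrow(smalldat[, .N, by = .(group1, group2)])

print(c(n_uniqueN, n_unique, n_groups))
# [1] 4 4 4
```

All three should agree; uniqueN avoids materialising the intermediate table, which is probably why it is the one suggested as more efficient.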

Upvotes: 1
