Trying to use .SD inside by in data.table

I'm trying to use .SD in the by argument of a data.table so that I can group by one column and apply a function to another specific column. I'll use the iris dataset as an example.

Let's say I want to know how many unique Sepal.Length values there are for each species.

library(data.table)
obj="Species"
as.data.table(iris)[,length(unique(Sepal.Length)),by=.SD,.SDcols=obj]

It is important that I can supply the column name as an object (here via .SDcols), since I'm doing this programmatically. I would also like to know whether this is possible with data.table rather than with an aggregate and/or xtabs solution.

Appreciate any help.

Upvotes: 2

Answers (1)

akrun

We can pass 'obj' directly to by, which accepts a character vector of column names, and count the unique elements of 'Sepal.Length' with uniqueN:

as.data.table(iris)[, .(uniqueLen = uniqueN(Sepal.Length)), by = obj]

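If we want to go the .SDcols route, we can restrict .SD to the grouping column and 'Sepal.Length', then group within .SD. In the inner call, .SD excludes the grouping column, so .SD[[1]] is 'Sepal.Length':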

as.data.table(iris)[, .SD[, .(uniqueLen = uniqueN(.SD[[1]])), by = obj], 
            .SDcols = c(obj, "Sepal.Length")]
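Since the question mentions doing this programmatically, here is a minimal sketch that wraps the first approach in a function. The helper name count_unique_by and its argument names are hypothetical, chosen only for illustration; it assumes both arguments are character vectors of existing column names and that neither argument name clashes with a column in the table.

```r
library(data.table)

# Hypothetical helper (not part of data.table): counts the distinct values
# of `value_col` within each group defined by `group_cols`.
count_unique_by <- function(dt, group_cols, value_col) {
  # `by` accepts a character vector held in a variable;
  # `.SDcols` limits .SD to the value column, so .SD[[1]] is that column.
  dt[, .(uniqueLen = uniqueN(.SD[[1]])), by = group_cols, .SDcols = value_col]
}

dt <- as.data.table(iris)
res <- count_unique_by(dt, "Species", "Sepal.Length")
print(res)
```

The result has one row per species, with the count in the uniqueLen column. Passing a longer character vector as group_cols should group by several columns in the same way.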

Upvotes: 4