Reputation: 1345
Let's say that I have the following data.table object:
library(data.table)
dt <- data.table(
  x = c(1, 2, 3, 4, 5),
  y = c(1, 1, 3, 4, 5),
  z = c(1, 1, 1, 4, 5)
)
I want to count the number of unique values of each of several stats (columns), raise that count to a power y, and return the result in a data.table, keeping the names of the stats.
I want to do something like the following:
foo <- function(stats, y){
  lapply(stats, function(stat){length(unique(stat))^y})
}
dt[, foo(.(x, y), 2)]
## V1 V2
## 1: 25 16
but I expect the output to be
dt[, foo(.(x, y), 2)]
## x y
## 1: 25 16
Note that doing this
dt[, foo(.(x=x, y=y), 2)]
## x y
## 1: 25 16
or this
dt[, foo(data.table(x, y), 2)]
## x y
## 1: 25 16
will work, but I think the syntax I suggested earlier looks better. Is it possible to tweak the foo function to do this, or do I have to somehow tweak the .( function directly in the data.table package?
Reputation: 11255
Here are two potential workarounds. The first one is what you are requesting:
foo <- function(stat, x){
  DF <- lapply(stat, function(stat2){length(unique(stat2))^x})
  # recover the names from the unevaluated argument, e.g. .(x, y) gives "x" and "y"
  names(DF) <- sapply(substitute(stat)[-1], deparse)
  return(DF)
}
dt[, foo(.(x, y), 2)]
x y
1: 25 16
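To see why the substitute() line recovers the names, here is a minimal base-R sketch of the same idea outside of data.table. The helper name arg_names is made up for illustration; it only assumes that the argument is written as a call such as list(x, y) with short, simple elements.

```r
# arg_names is a hypothetical helper, not part of data.table.
arg_names <- function(stat) {
  expr <- substitute(stat)   # the unevaluated call, e.g. list(x, y)
  # drop the function name (first element) and turn each argument into text
  vapply(as.list(expr)[-1], deparse, character(1))
}

# x and y are never evaluated here, so they do not even need to exist
nms <- arg_names(list(x, y))
print(nms)
```

Because the names come from the text of the call, this only helps when the columns are typed out at the call site; a list that was built beforehand and passed in as a variable would just give the name of that variable.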
I think this one is probably just as user-friendly and might be more powerful. If you are asking about data.table, you should try to use its strengths.
foo2 <- function(DT, exponent, SD_cols, by_v = NULL){
  DT[,
     lapply(.SD, function(stat) {length(unique(stat))^exponent}),
     .SDcols = SD_cols,
     by = by_v]
}
foo2(dt, 2, c('x','y'), by_v = 'z')
z x y
1: 1 9 4
2: 4 1 1
3: 5 1 1
foo2(dt, 2, c('x', 'y'))
x y
1: 25 16
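As a possible variation on the second approach, data.table ships uniqueN(), which counts distinct values directly and can stand in for length(unique(...)). The sketch below assumes the same dt as in the question; foo3 is just a hypothetical name for the variant.

```r
library(data.table)

dt <- data.table(
  x = c(1, 2, 3, 4, 5),
  y = c(1, 1, 3, 4, 5),
  z = c(1, 1, 1, 4, 5)
)

# foo3 is a hypothetical variant of foo2 that uses data.table::uniqueN()
foo3 <- function(DT, exponent, SD_cols, by_v = NULL) {
  DT[,
     lapply(.SD, function(stat) uniqueN(stat)^exponent),
     .SDcols = SD_cols,
     by = by_v]
}

res <- foo3(dt, 2, c("x", "y"))
res_by_z <- foo3(dt, 2, c("x", "y"), by_v = "z")
```

The column names are kept because lapply(.SD, ...) returns a list named after the columns in .SDcols.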