Reputation: 18496
This question is similar, but not identical, to "Add multiple columns to R data.table in one function call?"
Let's say I have a data.table:
library(data.table)
ex <- data.table(AAA = runif(100000), BBBB = runif(100000), CCC = runif(100000),
                 DDD = runif(100000), EEE = runif(100000), FFF = runif(100000),
                 HHH = runif(100000), III = runif(100000),
                 FLAG = rep(c("a", "b", "c", "d", "e"), 20000))
I can get the sum and mean of all the columns by doing
ex[,c(sum=lapply(.SD,sum),mean=lapply(.SD,mean)),by=FLAG]
The results look good: there is only one row for each value of FLAG, as expected, and each column name combines the name I specified in j with the existing column name, which makes the columns easy to identify.
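As a rough illustration of where those combined names come from, here is a small base R sketch. The values in `sums` and `means` are made up for the example and stand in for what `lapply(.SD, sum)` and `lapply(.SD, mean)` would return for one group; the point is only that `c()` joins each argument tag to the names of the list it is given.

```r
# Hypothetical per-group results for two columns (values are invented).
sums  <- list(AAA = 1.5, BBBB = 2.5)
means <- list(AAA = 0.5, BBBB = 0.7)

# c() with tagged list arguments builds names of the form tag.name.
combined <- c(sum = sums, mean = means)
names(combined)
```

Because the result is a flat list of length-one elements, data.table can turn it into a single row per group.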
However, let's say I have a function that returns a list such as
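sk <- function(x) {
  meanx <- mean(x)
  lenx <- length(x)
  difxmean <- x - meanx            # deviations from the mean, computed once
  m4 <- sum(difxmean^4) / lenx     # fourth central moment
  m3 <- sum(difxmean^3) / lenx     # third central moment
  m2 <- sum(difxmean^2) / lenx     # second central moment (divides by n)
  list(mean = meanx, len = lenx, sd = m2^.5,
       skew = m3 / m2^(3/2), kurt = (m4 / m2^2) - 3)
}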
If I do
ex[,lapply(.SD,sk),by=FLAG]
I get a separate row for each element of the returned list, so every value of FLAG spans several rows. I'd like to still have just 1 row per value of FLAG, with a column for each combination of original column and function result.
For example the output columns should be
AAA.mean AAA.len AAA.sd AAA.skew AAA.kurt BBBB.mean BBBB.len BBBB.sd BBBB.skew BBBB.kurt .... III.mean III.len III.sd III.skew III.kurt
Is there a way to do this?
I know I could just put all of the individual functions in j and get the columns that way, but using this single function for all the moments is a good bit faster than calling the individual functions:
x <- runif(10000000)
system.time({
  mean(x)
  length(x)
  sd(x)
  skewness(x)
  kurtosis(x)
})
   user  system elapsed
   5.84    0.47    6.30
system.time(sk(x))
   user  system elapsed
    3.9     0.1     4.0
Upvotes: 4
Views: 299
Reputation: 269774
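Try flattening the per-column lists and converting the result back into a list, so that each group returns a single row: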
ex[, as.list(unlist(lapply(.SD, sk))), by = FLAG]
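To see why this produces the column names asked for, here is a minimal base R sketch of what happens inside one group. `sd_like` is a hypothetical two-column stand-in for `.SD`, and `sk` is copied from the question; nothing here depends on data.table.

```r
# Same moment function as in the question.
sk <- function(x) {
  meanx <- mean(x)
  lenx <- length(x)
  difxmean <- x - meanx
  m4 <- sum(difxmean^4) / lenx
  m3 <- sum(difxmean^3) / lenx
  m2 <- sum(difxmean^2) / lenx
  list(mean = meanx, len = lenx, sd = m2^0.5,
       skew = m3 / m2^(3/2), kurt = m4 / m2^2 - 3)
}

# Hypothetical stand-in for .SD within a single group.
set.seed(1)
sd_like <- list(AAA = runif(10), BBBB = runif(10))

nested  <- lapply(sd_like, sk)  # list of 2 elements, each a list of 5
flat    <- unlist(nested)       # named numeric vector of length 10
one_row <- as.list(flat)        # list of 10 length-one elements

names(one_row)
```

`unlist()` joins the outer and inner names with a dot (AAA.mean, AAA.len, and so on), and `as.list()` gives data.table a flat list that it treats as one row per group. One side effect worth knowing: `unlist()` coerces everything to a common type, so the integer `len` comes back as a double.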
Upvotes: 5