Dean MacGregor

Reputation: 18496

How to get multiple column results from function that generates a list

This question is similar but not identical to "Add multiple columns to R data.table in one function call?"

Let's say I have a data.table

ex<-data.table(AAA=runif(100000),BBBB=runif(100000),CCC=runif(100000),DDD=runif(100000),EEE=runif(100000),FFF=runif(100000),HHH=runif(100000),III=runif(100000),FLAG=rep(c("a","b","c","d","e"),20000))

I can get the sum and mean of all the columns by doing

ex[,c(sum=lapply(.SD,sum),mean=lapply(.SD,mean)),by=FLAG]

The results look good: the names I specified in j are combined with the existing column names (sum.AAA, mean.AAA, and so on) for easy identification, and there is only 1 row for each value of FLAG, as expected.

However, let's say I have a function that returns a list such as

sk<-function(x){
  meanx<-mean(x)
  lenx<-length(x)
  difxmean<-x-meanx
  m4<-sum((difxmean)^4)/lenx
  m3<-sum((difxmean)^3)/lenx
  m2<-sum((difxmean)^2)/lenx
  list(mean=meanx,len=lenx,sd=m2^.5,skew=m3/m2^(3/2),kurt=(m4/m2^2)-3)
}

If I do

ex[,lapply(.SD,sk),by=FLAG]

I get one row per element of the returned list, so five rows for each value of FLAG. I'd like to still have just 1 row per FLAG, with a column for each combination of original column and function result.

For example the output columns should be

AAA.mean    AAA.len     AAA.sd     AAA.skew    AAA.kurt       BBBB.mean    BBBB.len     BBBB.sd     BBBB.skew    BBBB.kurt    ....    III.mean    III.len     III.sd     III.skew    III.kurt

Is there a way to do this?

I know I could put each of these individual functions in j and get the columns that way, but I find that computing all the moments in this one function is a good bit faster than calling the individual functions.

x<-runif(10000000)
system.time({
mean(x)
length(x)
sd(x)
skewness(x)
kurtosis(x)
})
user  system elapsed 
5.84    0.47    6.30

system.time(sk(x))
user  system elapsed 
3.9     0.1     4.0 

Upvotes: 4

Views: 299

Answers (1)

G. Grothendieck

Reputation: 269774

Try this:

ex[, as.list(unlist(lapply(.SD, sk))), by = FLAG]
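To see why this gives one row per group: lapply(.SD, sk) returns a list of lists named by column, unlist flattens it into a single named vector whose names join the outer and inner names with a dot (AAA.mean, AAA.len, ...), and as.list turns that vector back into a list so data.table treats each element as a column. The following is a minimal sketch on a small made-up table; the name `small`, the seed, and the sizes are illustrative assumptions, and `sk` is copied from the question.

```r
library(data.table)

# sk() as defined in the question: returns a named list of moments.
sk <- function(x) {
  meanx <- mean(x)
  lenx <- length(x)
  difxmean <- x - meanx
  m4 <- sum(difxmean^4) / lenx
  m3 <- sum(difxmean^3) / lenx
  m2 <- sum(difxmean^2) / lenx
  list(mean = meanx, len = lenx, sd = m2^0.5,
       skew = m3 / m2^(3/2), kurt = m4 / m2^2 - 3)
}

# Small illustrative table (hypothetical; sizes and seed are arbitrary).
set.seed(1)
small <- data.table(AAA = runif(20), BBBB = runif(20),
                    FLAG = rep(c("a", "b"), 10))

# Step by step, on the whole table rather than per group:
nested <- lapply(small[, .(AAA, BBBB)], sk)  # list of lists, named by column
flat   <- unlist(nested)                     # named numeric vector
names(flat)                                  # "AAA.mean", "AAA.len", ...

# The same idea inside j, evaluated once per FLAG group.
res <- small[, as.list(unlist(lapply(.SD, sk))), by = FLAG]
names(res)
```

One caveat: unlist coerces everything to a common type, so this works cleanly here because every element sk returns is numeric (the integer len becomes a double). If the function returned mixed types, such as a character label alongside numbers, all of the resulting columns would be coerced to character.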

Upvotes: 5
