Alex

Reputation: 19803

naming a list when returning values from data.table

When returning items to data.table, it would be nice if they automatically took on the names of the variables. How does one do this? This is what I mean:

require(data.table)
x = data.table(a=1:10, id=1:2)
x[,{s = sum(a); p=prod(a); y = sqrt(abs(s*p)); z = y+1; list(y, z)},by=id]

#   id V1   V2
#1:  1 25  945
#2:  2 30 3840

Instead of V1 and V2, it would be nice if the columns were labeled s and p. It's no big deal to name them by hand here, but with 20 columns it becomes a real pain. Any ideas on how to do this?

EDIT: I changed the question to make clear why I don't just do list(name = value)

Upvotes: 1

Views: 109

Answers (3)

Ricardo Saporta

Reputation: 55340

If you have a large number of variables and you are looking for a programmatic way to approach this, you can put the names of the columns in a vector and then use sapply with .SDcols, e.g.:

## sample data
set.seed(7)
DT <- as.data.table(matrix(round(runif(130, 1, 100)), ncol=26))
setnames(DT, LETTERS)


## These are the columns we will compute on
Cols <- c("A", "G", "M", "W", "Z")

DT[,sapply(.SD,mean),.SDcols=Cols]

#    A    G    M    W    Z 
# 25.0 41.2 55.6 43.0 56.0     

If you want to apply a different function to each variable, then use the standard list(nm=function(x)) syntax.
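A small sketch of both routes, on a made-up three-row table (the `grp` column and the `mean_A` / `max_G` names are only for illustration). Swapping `sapply` for `lapply` should give back a data.table with the column names kept, and it combines with `by`:

```r
library(data.table)

# Toy data, invented for this example
DT2 <- data.table(A = c(1, 2, 3), G = c(4, 5, 6), grp = c("u", "u", "v"))
Cols <- c("A", "G")

# Same function on every column in Cols: result columns keep the names A and G
same_fun <- DT2[, lapply(.SD, mean), by = grp, .SDcols = Cols]

# Different function per column: name each element of the list explicitly
diff_fun <- DT2[, list(mean_A = mean(A), max_G = max(G)), by = grp]
```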

Upvotes: 1

eddi

Reputation: 49448

A remix of the other two answers - name them in the list:

x[,{s = sum(a); p=prod(a); y = sqrt(abs(s*p)); z = y+1;
    list(s = y, p = z)}, by=id]

or construct a data.table

x[,{s = sum(a); p=prod(a); y = sqrt(abs(s*p)); z = y+1;
    data.table(y, z)}, by=id]

And here's another option using llist from Hmisc (this is slower than naming manually, but probably faster than constructing a data.table for each group):

library(Hmisc)
x[,{s = sum(a); p=prod(a); y = sqrt(abs(s*p)); z = y+1;
    llist(y, z)}, by=id]
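If you'd rather not pull in Hmisc just for this, here is a minimal sketch of a similar helper in base R. `named_list` is a hypothetical name (it is not part of data.table or Hmisc); it labels each unnamed argument with the expression that was passed in:

```r
library(data.table)

# Hypothetical helper: like list(), but unnamed arguments are named
# after the expression used in the call (e.g. y -> "y").
named_list <- function(...) {
  vals  <- list(...)
  exprs <- as.list(substitute(list(...)))[-1L]
  nms   <- vapply(exprs, function(e) paste(deparse(e), collapse = ""),
                  character(1))
  given <- names(vals)
  if (!is.null(given)) {
    # Explicit names such as named_list(s = y) take precedence
    nms[nzchar(given)] <- given[nzchar(given)]
  }
  names(vals) <- nms
  vals
}

x <- data.table(a = 1:10, id = 1:2)
res <- x[, {s = sum(a); p = prod(a); y = sqrt(abs(s*p)); z = y + 1;
            named_list(y, z)}, by = id]
```

The result should carry the columns id, y and z, with no V1/V2 and without typing each name twice.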

Upvotes: 2

Justin

Reputation: 43245

Forgive me if I'm missing something... but isn't the standard list syntax for data.table what you're looking for? It is more concise and clearer IMHO.

x[, 
  list(s = sum(a),
       p = prod(a)),
  by=id] 

#    id  s    p
# 1:  1 25  945
# 2:  2 30 3840

You can also build up this list as an expression and eval it.

foo <- expression(list(s=sum(a), p=prod(a)))

x[, eval(foo), by=id]

This can then be extended to a function (using as.quoted from plyr because it's handy):

expression_maker <- function(funs, cols, names) {
   require(plyr)
   list_contents <- paste0(names, '=', funs, '(', cols, ')', collapse=',')
   as.quoted(paste('list(', list_contents, ')'))[[1]]
}

output <- expression_maker(funs=c('sum', 'prod'), cols=c('a', 'a'), names=c('s', 'p'))
x[, eval(output), by=id]

... But there be dragons!
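One way to sidestep some of those dragons is to skip the string pasting and assemble the call from language objects in base R. This is only a sketch: `make_j` is a hypothetical name, and it assumes each function takes a single column as its only argument.

```r
library(data.table)

# Hypothetical helper: builds the call list(<name> = <fun>(<col>), ...)
# without pasting and parsing strings.
make_j <- function(funs, cols, names) {
  calls <- Map(function(f, col) call(f, as.name(col)), funs, cols)
  names(calls) <- names
  as.call(c(as.name("list"), calls))
}

x <- data.table(a = 1:10, id = 1:2)
j <- make_j(funs = c("sum", "prod"), cols = c("a", "a"), names = c("s", "p"))
res <- x[, eval(j), by = id]
```

Printing `j` should show `list(s = sum(a), p = prod(a))`, so the names travel with the expression.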


Per OP's edit:

x[,{s = sum(a); p=prod(a); y = sqrt(abs(s*p)); z = y+1; list(y, z)},by=id]

I would do this in a function and return a data.table directly:

yourfun <- function(a) {
  s <- sum(a)
  p <- prod(a)
  y <- sqrt(abs(s*p))
  z <- y+1
  data.table(y, z)
}

x[, yourfun(a), by=id]
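A variation on the same idea, offered as a sketch rather than a benchmarked recommendation: inside an ordinary function, base R's `mget` returns a list already named after the variables, so the function can hand back a plain named list instead of building a data.table for every group. `yourfun2` is a hypothetical name.

```r
library(data.table)

# Same computation as above, but returning a named list via mget()
yourfun2 <- function(a) {
  s <- sum(a)
  p <- prod(a)
  y <- sqrt(abs(s * p))
  z <- y + 1
  mget(c("y", "z"))  # list(y = y, z = z), looked up in this function's frame
}

x <- data.table(a = 1:10, id = 1:2)
res <- x[, yourfun2(a), by = id]
```

With 20 outputs you would only need to keep the character vector passed to `mget` up to date.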

Upvotes: 3
