Rentrop

Reputation: 21507

data.table - get molten data by j-function

I'd like to get the same long-format output that melt produces, but directly in data.table's j expression, without calling melt and without labeling the rows manually.

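library(data.table)
# datetime was not defined in the original post; two consecutive dates are assumed, matching the output below
datetime <- as.Date("2001-01-01") + 0:1
DT <- data.table(date=as.IDate(datetime), value=rnorm(10))
DT_melt <- DT[,as.list(summary(value)), by=date]
melt(DT_melt,"date")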

Result:

          date variable   value
 1: 2001-01-01     Min. -0.9122
 2: 2001-01-02     Min. -1.2220
 3: 2001-01-01  1st Qu.  0.3462
 4: 2001-01-02  1st Qu. -0.8932
 5: 2001-01-01   Median  0.6230
 6: 2001-01-02   Median -0.2470
 7: 2001-01-01     Mean  0.4189
 8: 2001-01-02     Mean -0.3418
 9: 2001-01-01  3rd Qu.  0.7913
10: 2001-01-02  3rd Qu.  0.2526
11: 2001-01-01     Max.  1.2460
12: 2001-01-02     Max.  0.4010

I'd like to achieve this without melt.

So far I have only managed it by adding the labels manually, as follows:

labels <- names(summary(1))
DT[,list(labels,value=unclass(summary(value))), by=date]

But is there a way to use the names of unclass(summary(value)) directly within data.table? Something like this (hypothetical syntax):

DT[,c("labels","value"):=unclass(summary(value)), by=date, use.names = TRUE]

Upvotes: 0

Views: 74

Answers (1)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193637

You could create a function like the following:

myFun <- function(x) {
  A <- summary(x)
  list(variable = names(A), 
       value = unlist(A, use.names = FALSE))
}

Here's an example of the function in use.

First, some reproducible data:

datetime <- as.Date("2001-01-01") + 0:1
set.seed(1)
DT <- data.table(date=as.IDate(datetime), value=rnorm(10))

Second, applying the function:

DT[, myFun(value), by = date]
#           date variable    value
#  1: 2001-01-01     Min. -0.83560
#  2: 2001-01-01  1st Qu. -0.62650
#  3: 2001-01-01   Median  0.32950
#  4: 2001-01-01     Mean -0.01387
#  5: 2001-01-01  3rd Qu.  0.48740
#  6: 2001-01-01     Max.  0.57580
#  7: 2001-01-02     Min. -0.82050
#  8: 2001-01-02  1st Qu. -0.30540
#  9: 2001-01-02   Median  0.18360
# 10: 2001-01-02     Mean  0.27830
# 11: 2001-01-02  3rd Qu.  0.73830
# 12: 2001-01-02     Max.  1.59500
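
The same pattern should work for any function that returns a named numeric vector, not just summary. Here is a minimal sketch, assuming a hypothetical helper called long_stats (the name and its f argument are mine, not part of data.table):

```r
library(data.table)

# Hypothetical helper: f is any function that returns a named numeric vector
long_stats <- function(x, f = summary, ...) {
  s <- f(x, ...)
  # One column for the names, one for the plain numeric values
  list(variable = names(s), value = as.numeric(s))
}

# Same reproducible data as above
datetime <- as.Date("2001-01-01") + 0:1
set.seed(1)
DT <- data.table(date = as.IDate(datetime), value = rnorm(10))

# Two quantiles per date, returned in long format
res <- DT[, long_stats(value, quantile, probs = c(0.1, 0.9)), by = date]
res
```

Because j returns a list of equal-length vectors, data.table stacks the per-group results and repeats the by column for each row.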

Other alternatives might be:

DT[, stack(summary(value)), by = date]

DT[, as.list(summary(value)), by = date][, list(
  variable = names(.SD), value = unlist(.SD)), by = date]

DT[, list(labels = names(summary(1)), 
          value = summary(value)), by = date]

I'm not sure why you wouldn't want to just use melt though. melt on a data.table is quite efficient.
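
For comparison, here is a rough sketch that runs both routes on the same data and checks that they agree. It assumes the only difference is row order (melt groups rows by statistic, the j approach by date), so both results are keyed before comparing:

```r
library(data.table)

# Same reproducible data as above
datetime <- as.Date("2001-01-01") + 0:1
set.seed(1)
DT <- data.table(date = as.IDate(datetime), value = rnorm(10))

# Route 1: wide summary per date, then reshape to long with melt()
wide <- DT[, as.list(summary(value)), by = date]
long_melt <- melt(wide, id.vars = "date", variable.factor = FALSE)

# Route 2: build the long format directly in j
long_j <- DT[, {
  s <- summary(value)
  list(variable = names(s), value = as.numeric(s))
}, by = date]

# Sort both the same way so that row order does not affect the comparison
setkey(long_melt, date, variable)
setkey(long_j, date, variable)
same <- isTRUE(all.equal(long_melt, long_j))
same
```

variable.factor = FALSE keeps the variable column as character, so that it matches the column built from names(s).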

Upvotes: 2
