Mohamed

Reputation: 105

data.table apply a user defined function over rows

I have a huge data.table dt (almost 1.5 million rows). Let's say I want to apply a user-defined function growth.ls to each of its rows, where scols (some columns in dt) supply the arguments, as follows:

growth.ls <- function(values) {
  if (any(!is.finite(values)) || any(values <= 0)) return(NA_real_)
  exp(lm(log(values) ~ (seq_along(values)))$coefficients[[2]] - 1) * 100
}
dt[, `:=`(var = growth.ls(as.numeric(.SD))), .SDcols = scols, by = 1:nrow(dt)]

This process takes a very long time. I do not know whether the problem is growth.ls itself or the fact that I am using by = 1:nrow(dt).

Upvotes: 1

Views: 690

Answers (1)

YOLO

Reputation: 21749

What about this (using multiple cores with data.table):

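library(parallel)
cl = makeCluster(detectCores())
# assumes the columns of interest start with 'x'; otherwise use scols from the question
choose_cols = startsWith(colnames(dt), 'x')

dt[, growth := unlist(parApply(cl, .SD, 1, growth.ls)), .SDcols = choose_cols]
stopCluster(cl)  # release the worker processes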
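
If the per-row lm() call turns out to be the bottleneck, a vectorized version may be worth trying before reaching for more cores. What follows is only a minimal sketch, assuming every column in scols is numeric and there are at least two of them. growth.ls.vec is a hypothetical helper name (it is not part of data.table), and the toy dt and scols are made up for illustration. It computes the closed-form least-squares slope of log(values) on 1..k for all rows at once and then reuses the final expression from growth.ls in the question unchanged.

```r
library(data.table)

# Hypothetical helper: same result as growth.ls, but for all rows in one pass.
growth.ls.vec <- function(m) {
  m <- as.matrix(m)
  # centred time index 1..k; the OLS slope is sum(t_c * y) / sum(t_c^2)
  t_centered <- seq_len(ncol(m)) - (ncol(m) + 1) / 2
  # rows that growth.ls would reject (non-finite or non-positive values)
  bad <- rowSums(!is.finite(m) | m <= 0) > 0
  m[bad, ] <- 1                       # placeholder so log() stays quiet
  slope <- as.vector(log(m) %*% t_centered) / sum(t_centered^2)
  out <- exp(slope - 1) * 100         # same expression as in growth.ls
  out[bad] <- NA_real_
  out
}

# Toy data (assumed for illustration); the third row contains a negative value
dt <- data.table(x1 = c(1, 2, 5), x2 = c(2, 4, -1), x3 = c(4, 8, 3))
scols <- c("x1", "x2", "x3")
dt[, var := growth.ls.vec(.SD), .SDcols = scols]
```

Because there is no by = 1:nrow(dt) and no lm() fit per row, the work reduces to one matrix product over the selected columns. Comparing the result against growth.ls on a small sample of rows is a reasonable sanity check before running it on the full table.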

Upvotes: 1
