Vasily A

Reputation: 8626

Accessing column's data where column's name is passed by argument

Let's say I want to write a function that sums chosen fields of a data.table. Its arguments are: dtInput, the data.table to process; fldID, the column holding a unique row id; flds2Sum, a vector of names of the fields to sum; and fldsRes, the name of the field to store the result in. Here is an example:

library(data.table)

dt1 <- fread(
  "id,a,b,c,d
       id1,1,10,2,1
       id2,2,30,5,0
       id3,3,40,6,2
       id4,4,25,6,3
     ")

sumflds <- function(dtInput, fldID, flds2Sum, fldsRes) {
  dtInput[, fldsRes:={
    as.character(sum(mget(flds2Sum))) # this doesn't work correctly
  }, by=fldID, with=FALSE]
  return(dtInput);
}

dt2 <- sumflds(dt1, "id", c("c","a","d"), "res")

Since I use with=FALSE, references such as fldID and fldsRes are resolved correctly. But inside the :={} block I can't access the column values the way I want. I would be grateful for any advice.

Upvotes: 0

Views: 75

Answers (1)

mnel

Reputation: 115392

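get is not vectorized, so you could use mget instead when you have several column names. Because mget returns a list, you will need do.call(sum, ...) rather than calling sum on the list directly. Note that I have explicitly copied the input data.table; otherwise the original dt1 would be altered by reference. I have also wrapped fldsRes in parentheses, (fldsRes), to force its evaluation, so the new column takes its name from the value of fldsRes.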
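A minimal sketch of the mget variant described above, in case it helps to see it spelled out. The function name sumflds_mget is made up for illustration, and dt1 is rebuilt with data.table() instead of fread() only to keep the snippet self-contained.

```r
library(data.table)

# Same data as in the question, built directly (assumption: numeric columns).
dt1 <- data.table(
  id = c("id1", "id2", "id3", "id4"),
  a  = c(1, 2, 3, 4),
  b  = c(10, 30, 40, 25),
  c  = c(2, 5, 6, 6),
  d  = c(1, 0, 2, 3)
)

# Hypothetical name; not part of the original answer.
sumflds_mget <- function(dtInput, fldID, flds2Sum, fldsRes) {
  # mget() returns a named list of the requested columns for each group;
  # do.call() passes the list elements to sum() as separate arguments.
  copy(dtInput)[, (fldsRes) := do.call(sum, mget(flds2Sum)), by = fldID]
}

dt2 <- sumflds_mget(dt1, "id", c("c", "a", "d"), "res")
dt2
```

One thing to watch with this form: if an argument name such as flds2Sum happened to match a column name in the table, the column would be picked up inside j instead of the argument.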

In this case, I think it is easier to use .SD and .SDcols, e.g.:

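sumflds <- function(dtInput, fldID, flds2Sum, fldsRes) {
  # copy() leaves the caller's table untouched; (fldsRes) makes := use the
  # value of fldsRes as the new column name; .SD holds only the .SDcols columns
  copy(dtInput)[, (fldsRes) := do.call(sum, .SD), by = fldID, .SDcols = flds2Sum]
}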

dt2 <- sumflds(dt1, "id", c("c","a","d"), "res")
dt2
#       id a  b c d res
# 1:   id1 1 10 2 1   4
# 2:   id2 2 30 5 0   7
# 3:   id3 3 40 6 2  11
# 4:   id4 4 25 6 3  13
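The remark about copy() can be checked with a small experiment. This is only a sketch: the table dt and the helpers add_with_copy and add_no_copy are made up for illustration.

```r
library(data.table)

dt <- data.table(id = c("id1", "id2"), a = c(1, 2), c = c(2, 5))

# Hypothetical helper: works on a copy, so the caller's table is not changed.
add_with_copy <- function(dtInput, fldsRes) {
  copy(dtInput)[, (fldsRes) := a + c]
}

# Hypothetical helper: := modifies the caller's table by reference.
add_no_copy <- function(dtInput, fldsRes) {
  dtInput[, (fldsRes) := a + c]
}

res1 <- add_with_copy(dt, "res")
has_res_after_copy <- "res" %in% names(dt)     # FALSE: dt is untouched

res2 <- add_no_copy(dt, "res")
has_res_after_no_copy <- "res" %in% names(dt)  # TRUE: dt gained the column
```

Whether to copy is a design choice: copying is safer for callers who do not expect side effects, while updating by reference avoids duplicating a large table.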

Upvotes: 1
