R: Setting new values in a data.table quickly

Question

I am trying to set values in a data.table efficiently. The following code does what I want, but it is too slow for large datasets:

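library(data.table)

DTcars <- as.data.table(mtcars)
n <- nrow(DTcars)
# Visit every cell except those in the last row
for (i in seq_len(n - 1)) {
  for (j in seq_len(ncol(DTcars))) {
    if (DTcars[i, j, with = FALSE] > 10) {
      # Overwrite the cell with the last row's value from the same column
      set(DTcars,
          i = i,
          j = j,
          value = DTcars[n, j, with = FALSE])
    }
  }
}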

What I would like is something like the following. The code is wrong, but it expresses what I need, and I think this approach would be faster: for each column, subset the rows of the data.table that meet the condition and assign the same value to all of them.

DTcars<-as.data.table(mtcars)
ns<-names(DTcars)
for(j in 1:length(ns)){
  DTcars[ns[j]>10]<-DTcars[20,ns[j]]
}

eddi · Accepted Answer

IMO set should be used sparingly, and regular := is sufficient almost always:

for (col in names(DTcars))
  DTcars[get(col) > 10, (col) := get(col)[.N]]
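
One detail worth checking in the loop above: when i is supplied, data.table evaluates j on the matching rows only, so get(col)[.N] is the last value among the rows that exceed 10, not necessarily the value in the table's last row. For mtcars the two coincide, because every column that has values above 10 also exceeds 10 in its last row. If the replacement should always come from the last row of the full table, as in the question's loop, a minimal sketch could capture that value before subsetting (the names n and last_val are illustrative, not from the original posts):

```r
library(data.table)

DTcars <- as.data.table(mtcars)
n <- nrow(DTcars)

for (col in names(DTcars)) {
  # Take the replacement from the last row of the full table,
  # before any subsetting happens.
  last_val <- DTcars[[col]][n]
  # Update, by reference, every row where this column exceeds 10.
  DTcars[get(col) > 10, (col) := last_val]
}
```

This keeps one vectorised update per column. It assumes no column in the table is itself called last_val, since a column of that name would shadow the variable inside the brackets.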
