Reputation: 9421
I am trying to set values in a data.table efficiently. The following code does what I want, but it is too slow for large datasets:
library(data.table)
DTcars <- as.data.table(mtcars)
for (i in 1:(dim(DTcars)[1] - 1)) {
  for (j in 1:dim(DTcars)[2]) {
    if (DTcars[i, j, with = FALSE] > 10) {
      set(DTcars,
          i = as.integer(i),
          j = as.integer(j),
          value = DTcars[dim(DTcars)[1], j, with = FALSE])
    }
  }
}
What I want is something like the following. The code is wrong, but it expresses the idea, and I think an approach like this would be faster: subset the data.table by a condition on one column, assign the same value to the matching rows of that column, and repeat for each column.
DTcars<-as.data.table(mtcars)
ns<-names(DTcars)
for(j in 1:length(ns)){
DTcars[ns[j]>10]<-DTcars[20,ns[j]]
}
Upvotes: 2
Views: 89
Reputation: 49448
IMO set should be used sparingly, and regular := is almost always sufficient:
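for (col in names(DTcars))
  # take the value from the table's last row; inside j, .N counts only the rows matched by i
  DTcars[get(col) > 10, (col) := DTcars[[col]][nrow(DTcars)]]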
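A note on .N when i filters rows: inside j, .N and the column names refer only to the rows matched by i, so x[.N] there is the last matching value, which is not necessarily the value in the table's last row. A minimal sketch on a made-up table (toy and x are hypothetical names, not part of the question):

```r
library(data.table)

# Hypothetical toy table: the last row's value (3) does not pass the filter x > 10.
toy <- data.table(x = c(50, 20, 3))

n_matched    <- toy[x > 10, .N]       # number of rows matched by i, not nrow(toy)
last_matched <- toy[x > 10, x[.N]]    # last value among the matched rows
last_row     <- toy$x[nrow(toy)]      # value in the table's actual last row

print(c(n_matched = n_matched, last_matched = last_matched, last_row = last_row))
```

If the replacement value should come from the table's last row regardless of the filter, indexing the full column is the safer choice.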
Upvotes: 2
Reputation: 66819
I think you're looking for
for (j in names(DTcars)) set(DTcars,
i = which(DTcars[[j]]>10),
j = j,
value = tail(DTcars[[j]],1)
)
Either the column numbers or the column names can be used as the for iterator here.
The value differs between the two pieces of code in the OP (the last row in the first, row 20 in the second), so I'm not sure which one is intended.
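As a rough sanity check, and assuming the last-row value from the first snippet is the one wanted, the column-wise loop can be compared against the slow double loop on mtcars. This is only a sketch; slow and fast are names made up for the comparison:

```r
library(data.table)

# Reference: the double loop from the question, run on its own copy of mtcars.
slow <- as.data.table(mtcars)
n <- nrow(slow)
for (i in seq_len(n - 1L)) {
  for (j in seq_along(slow)) {
    if (slow[[j]][i] > 10) {
      set(slow, i = i, j = j, value = slow[[j]][n])
    }
  }
}

# Column-wise version: one set() call per column instead of one per cell.
fast <- as.data.table(mtcars)
for (j in names(fast)) {
  set(fast,
      i = which(fast[[j]] > 10),
      j = j,
      value = fast[[j]][nrow(fast)])
}

# Both approaches should produce the same table.
stopifnot(isTRUE(all.equal(slow, fast)))
```

The last row can itself match the condition, but it is then overwritten with its own value, so it stays unchanged in both versions.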
Upvotes: 3